WO2007117039A1 - Method to isolate 5' ends of nucleic acid and its application - Google Patents

Method to isolate 5' ends of nucleic acid and its application

Info

Publication number
WO2007117039A1
Authority
WO
WIPO (PCT)
Prior art keywords
rna
nucleic acid
cdna
dna
molecules
Prior art date
Application number
PCT/JP2007/058126
Other languages
French (fr)
Inventor
Yoshihide Hayashizaki
Piero Carninci
Charles Plessy
Matthias T. Harbers
Original Assignee
Riken
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Riken
Publication of WO2007117039A1

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/10: Processes for the isolation, preparation or purification of DNA or RNA
    • C12N15/1096: Processes for the isolation, preparation or purification of DNA or RNA; cDNA synthesis; Subtracted cDNA library construction, e.g. RT, RT-PCR

Definitions

  • the present invention relates to a method to capture nucleic acid ends that is suitable for a limited amount of starting material and is based on sequential addition of reagents in compatible buffers in a single tube.
  • Genomes contain the essential genetic information for the development and homeostasis of any living organism.
  • knowledge is required on how such genetic information is utilized in a cell or tissue at a given time point.
  • Many cases are known where mistakes in the utilization of genetic information and related regulatory pathways have caused disease in humans, plants, and animals.
  • methods are needed that allow for expression profiling and annotation of the identified transcripts, as well as for characterizing the genetic elements that control the genetic information.
  • Most expression studies nowadays use approaches based either on in situ hybridization, e.g. microarrays, or on high-throughput sequencing of short tags, e.g. SAGE, CAGE, MPSS; each type of experiment has distinct advantages over the other.
  • it can be desirable to also obtain information on the genetic elements, which control gene expression.
  • SAGE Serial Analysis of Gene Expression
  • This method forms DNA concatemers by ligating multiple short DNA fragments (initially about 10 bp) containing information on the base sequences at the 3'-end of multiple mRNAs, and determines the base sequences in these DNA concatemers.
  • SAGE: Short S. et al., Nat. Biotechnol. 20, 508-12 (2002); US patent applications 20030008290 and 20030049653, all hereby incorporated herein by reference.
  • the SAGE method is currently in wide use as an important method for analyzing genes expressed in specific cells, tissues or organisms; and SAGE tags are available for reference in the public domain, e.g. under http://cgap.nci.nih.gov/SAGE.
  • Cap-Analysis-Gene-Expression allows for the cloning of 5' -end specific tags into concatemers similar to the SAGE technology, where the so-called CAGE tags enable not only the detection of transcripts and their expression profiling, but further provide information on transcriptional start sites to allow for mechanistic studies on the regulation of transcription or a higher annotation of transcripts.
  • Similar approaches for the cloning of concatemers comprising 5'-end specific sequence information have more recently also been published by a number of other laboratories, for example in Hwang B.J. et al., Proc. Natl. Acad. Sci. USA 101, 1650-1655 (2004), Hashimoto S. et al., Nat. Biotechnol.
  • the invention provides innovative solutions on how to obtain DNA fragments for single molecule detection and sequencing, which can also be used for other applications.
  • the invention provides means to modify an RNA molecule in such a way that sequence information specific for the 5' -end of such modified RNA molecule can be obtained. Therefore the invention will also enable new high-throughput sequencing approaches and their use in for example expression profiling.
  • the invention provides further means of high value to studies including but not limited to expression profiling studies based on 5' -end specific sequence tags, and beyond, which are essential components of commercial applications and services including but not limited to drug development, diagnostics, or forensic studies.
  • obtaining the full-length cDNA has been instrumental to express the encoded proteins, which then can be transferred into other functional vectors.
  • proteins can be cloned into Gateway vectors (Gateway is a trademark of Invitrogen) and used for protein expression, purification, and a number of other functional studies.
  • the analysis of 5' ends has further allowed measuring gene expression in association with promoter identification (Carninci et al, Nature Genetics in press). By capturing the 5' end tags (Harbers and Carninci, Nat Methods. 2005 Jul;2(7):495-502), it has been possible to identify the transcription starting site, and therefore the promoter regions, which are the regions that control gene expression.
  • the promoter regions can be divided into core promoters and long-range controlling elements. The core promoters (those in the vicinity of the start site) can be identified very efficiently, thus facilitating further functional analysis, such as that of transcriptional networks (the connections between transcriptional controlling elements) and the expressed RNAs.
  • RNA expression is very different in the various cells that compose a given tissue. Tissues are composed of multiple cells expressing some common RNAs and many very different, cell-specific RNAs. Therefore, there is a need for novel technologies that enable the preparation and capture of 5' ends/full-length cDNAs from small numbers of cells, so that the different cells that compose a tissue can be analyzed separately for their expressed RNAs and their promoter usage. This is essential to the understanding of biological systems and to downstream applications, including the identification of transcriptional networks and their perturbation/adjustment with drugs.
  • the invention relates to methods for the isolation of fragments from nucleic acid molecules for the purpose of cloning and analysis .
  • the invention relates to the conversion of a sample containing one or more nucleic acid molecules, where such nucleic acid molecules or any mixture of nucleic acid molecules would be converted into DNA.
  • the invention relates to the manipulation of nucleic acid molecules, where such nucleic acid molecules would be prepared in the form of linear single-stranded DNA.
  • the invention relates to the preparation and manipulation of linear single-stranded DNA.
  • the invention relates to the modification of an RNA molecule or a plurality of RNA molecules to introduce sequence information at the 5' -end of the RNA molecule or RNA molecules.
  • the invention relates to the modification of RNA, where the information added to the RNA molecule is used for the manipulation and/or analysis of the RNA molecule.
  • the invention relates to the conversion of native and/or modified RNA molecules by transcribing the RNA into cDNA. Hence, the invention relates to the synthesis and preparation of single-stranded
  • the invention relates to the use of single-stranded DNA molecules for directly obtaining sequence information thereof.
  • the invention relates to obtaining sequence information from defined regions of single-stranded DNA fragments, said
  • the 5' -end specific sequence information of a DNA fragment relates to the 5' -end sequence of an RNA molecule.
  • the invention relates to obtaining sequence information from an RNA molecule.
  • the invention relates to the sequencing of the tags to allow for their annotation by computational means and their statistical analysis.
  • the invention relates to means for gene discovery, gene identification, gene expression profiling, and annotation.
  • the invention relates to the sequencing of the tags to allow for their annotation by computational means and their statistical analysis, where such tags would be derived from regions within genomes.
  • the invention relates to the characterization of genetic elements within genomes.
  • the invention relates to the preparation of hybridization probes from the ends of nucleic acid molecules, where such regions would be analyzed by means of in situ hybridization.
  • the in situ hybridization experiment would make use of a tiled array.
  • the invention relates to the full-length cloning of nucleic acid molecules.
  • sequence information as obtained from the tags is used for primer design, and where such primers are used to amplify the nucleic acid molecule in an amplification reaction. It is within the scope of the invention to amplify and clone in such a way transcribed regions as well as genomic fragments, where such fragments can contain genetic elements, said promoter regions.
  • the invention provides means for the analysis of nucleic acid molecules and short fragments thereof as needed for example for the characterization of biological samples.
  • Figure 1 shows one of the embodiments of a flow of the present invention.
  • Figure 2 shows the activity of each of the enzymes in the various steps and buffers.
  • Figure 3 shows the results of ONE-Tube oligo-capping and amplification with 5' end RACE.
  • Figure 4 is a graph showing length of sequenced concatamers with the 454 Life Sciences sequencers.
  • the present invention relates to the analysis of RNA/nucleic acids from biological samples.
  • biological samples include any kind of material obtained from living organisms including microorganisms, animals, and plants, as well as any kind of infectious particles including viruses and prions, which depend on a host organism for their replication.
  • biological samples include any kind of material obtained from a patient, animal, plant or infectious particle for the purpose of research, development, diagnostics or therapy.
  • the invention is not limited to the use of any particular nucleic acid molecules or their origin, but the invention provides general means to be applied to and used for the work on and the manipulation of any given nucleic acid.
  • nucleic acid molecules as applied to perform the invention can be obtained or prepared by any method known to a person skilled in the art including but not limited to those described by Sambrook J. and Russell D.W., Molecular Cloning, A Laboratory Manual, Cold Spring Harbor Laboratory Press, New York, 2001, hereby incorporated herein by reference.
  • RNA for the purpose of the invention, is considered a single-stranded nucleic acid molecule even where such a molecule may form secondary structures comprising double-stranded RNA portions.
  • RNA encompasses, for the purpose of the invention, any form of nucleic acid molecule comprised of ribonucleotides, and does not relate to a particular sequence or origin of the RNA.
  • RNA can be transcribed in vivo or in vitro by artificial systems, or non-transcribed, spliced or not spliced, incompletely spliced or processed, independent from its natural origin or derived from artificially designed templates, mRNA, tRNA, rRNA, obtained by means of synthesis, or any mixture thereof.
  • sequence encompass nucleic acid materials themselves and are thus not restricted to particular sequence information, vector, phagemid or any other specific nucleic acid molecule.
  • A major division of RNA is between capped RNA and non-capped RNA.
  • Non-capped RNA, which is not recovered, constitutes the majority of the RNA in a sample (at least 95% of the species).
  • the starting RNA type should be divided into groups depending on the presence of a cap site.
  • mRNA usually refers to RNAs having cap sites and a polyA tail; it constitutes 1-3% of the total RNA and encodes proteins.
  • Another class of capped molecules is the nc-RNA, which is non-coding and poorly polyadenylated, but does have a cap structure. Altogether, these two classes do not constitute more than 5% of the RNA molecules in a cell. Therefore, we will refer to "total RNA" and "capped RNA", where total RNA contains up to some 5% of capped RNAs. Roughly speaking, as an example, about 1 in 20 RNA molecules carries a cap.
  • "total RNA" rather than "capped RNA" fractions (some 20 fold).
  • a solid matrix is intended throughout this document as a solid body of indefinite size that can have any shape, which includes beads and surfaces. Examples include, but are not limited to, beads (including magnetic beads and coated beads) and slides made of glass, plastic, silica, Teflon or any other suitable material. These matrixes can be inert or reactive, such as being coated with streptavidin, antibodies, reactive chemical groups, or any form of nucleic acids (RNA, DNA, hybrid molecules) or derivatives of nucleic acids having one or more modifications of the bases and/or the backbone sequence.
  • "solid matrix" and "solid surface" are used interchangeably, as both refer to binding nucleic acids to an object that is not in solution, allowing nucleic acid isolation, capture, or further manipulation.
  • The amount of RNA has been a limiting factor, yet existing protocols require a minimum amount of RNA, varying from 50 micrograms of total RNA. Capture of full-length mRNA is not reported below hundreds of nanograms of capped RNA (requiring micrograms of total RNA).
  • Another method is based on the capture of the cap site, through a cap-binding protein, after first-strand cDNA synthesis.
  • the cap-binding protein is used to capture the cDNA from the 5' ends.
  • An alternative version would include capturing the RNA with cap-binding antibodies or, alternatively, with antibody fragments or even phage display antibodies.
  • Oligo-capping based methods (Suzuki and Sugano, Methods Mol Biol. 2003;221:73-91). This family of methods is based on replacing the 5' end cap with a different primer sequence (an oligonucleotide), which can be RNA, DNA, or a hybrid DNA/RNA molecule. Only cDNAs that reach the conjugated 5' ends can be subjected to further manipulation, such as their copying into second-strand cDNA and PCR. This group of methods has also achieved some popularity, with the drawback of relatively low efficiency, which requires a large number of PCR cycles before enough material is obtained for an RNA library. With these methods, no less than several micrograms of RNA have been used. Cap-switch methods (Zhu et al, Biotechniques. 2001 Apr;30(4):892-7).
  • Cap-switch methods are based on the addition, at the 5' end of RNAs and in particular at the cap site, of the trinucleotide GGG.
  • the cDNA is primed with oligo-dT to minimize the contamination by ribosomal RNAs, which do not carry a polyA tail. This method is particularly appealing because it does not require cap manipulations.
  • investigators have been able to utilize single cells to prepare cDNA probes .
  • the method, however, still has two problems. One is that the cap-switch has low efficiency and PCR is required.
  • PCR-based preparation of full-length cDNA libraries is problematic because long cDNA clones are seldom recovered: PCR amplification of long inserts is outcompeted by shorter PCR products derived from short mRNAs.
  • Another disadvantage is that three G nucleotides are added to the 5' end of the cDNAs. If these are used for 5' end tagging technologies, in which a class IIs restriction enzyme cleaves off a 5' end tag, three bases out of 20 (see later for a description of tagging cDNA ends) would be lost as useful information. This is particularly critical when mapping these tags on the genome (see Mapping of tags on the genome).
  • a true full-length method identifies with perfect accuracy the transcriptional starting site (TSS) .
  • True full-length methods are those listed above, in which the initial base is present in the cDNA and can be separated from other sequences, such as linkers or additional sequences (such as the GGG used in the cap-switch or the tailing procedures used in Okayama and Berg, Mol Cell Biol. 1982 Feb;2(2):161-70).
  • the cap-switch can be used for capturing the 5' end of reduced amount of samples, including single cells (Gustincich et al, Proc Natl Acad Sci U S A.
  • the cDNA is in fact cloned after exonuclease treatment, which has to remove the part of the first-strand cDNA that extends to the cap site.
  • Libraries obtained with this method may be quasi-full-length, but are not full-length, as they never identify the true cap site.
  • nucleic acid is also used herein to encompass naturally occurring nucleic acids, artificially synthesized or prepared nucleic acids, and any modified nucleic acids into which at least one or more modifications have been introduced by naturally occurring events or through approaches known to a person skilled in the art.
  • DNA or RNA molecules may function in a specific manner as hybridization probes, and as such are referred to as "complementary sequences" for the purpose of the invention, or in experiments where such probes are applied for the detection of a related nucleic acid molecule, even where such a probe and the target molecule may differ by naturally occurring or artificially introduced mutations in individual positions.
  • RNAs may be derived from cells, tissues, or any other biological source containing RNAs.
  • these RNAs are derived from any human, animal or plant tissue or cell line, and can consist of whole RNA or of RNA fractions selected by size, by RNA features (e.g. the presence of polyA tails in most mRNAs), or by cell compartment (e.g. nuclear RNA, cytoplasmic RNA, polysomal RNA, membrane-bound polysomal RNA, and RNA belonging to other ribonucleoprotein complexes).
  • RNA may be derived from other biological fluids, such as blood or serum.
  • it may contain RNA from viruses or other potential parasites in the blood of an individual human. Capturing the 5' end of these mRNAs would allow diagnosis of potential parasites.
  • the RNA is obtained from purified cells, including flow-sorted cells from dissected tissue. These could be cells that are labeled with a selectable fluorescent antibody and isolated by flow sorting, or cells labeled by the transgenic expression of a marker such as the green fluorescent protein (GFP), frequently used in mammalian experiments. Alternatively, these cells can be selected based on their morphology or other labeling by laser capture microdissection.
  • the invention is based on the addition of the reagents to the 5' end of the RNA in a single tube until the RNA is specifically labeled with a "nucleic acid" .
  • the mRNAs are known for having the cap site, which in this case would be selectively substituted by another nucleic acid molecule, generally an RNA, a DNA or a hybrid of DNA/RNA (but other nucleic acids may be suitable, such as RNA/peptide nucleic acid hybrids).
  • the nucleic acids attached to the 5' end of the RNA may be modified in various ways, including incorporation of unconventional bases and unconventional backbones, such as alpha-thio derivatives of nucleic acids that are already used in common DNA sequencing operations.
  • nucleic acids may also contain labeled dyes for additional selection, and other binding groups, such as biotin or digoxigenin.
  • the nucleic acid sequence attached to the 5' end of the mRNA is not restricted in length, but in most applications it is conveniently built to be at least 15 nucleotides long, to allow efficient priming, and no longer than 100 bases, as longer sequences pose challenges for oligonucleotide synthesis.
  • nucleic acids binding the 5' end of the RNA can be single-stranded, but can also be partially or completely double-stranded nucleic acids. A partial double strand may help the derivatization of the 5' end of the mRNA with the added nucleotide under certain conditions (such as described in Shibata et al, Biotechniques. 2001 Jun;30(6):1250-4).
  • these nucleic acids binding the 5' end of the RNAs can be a single type of molecule, or a multitude of different nucleic acids.
  • the target RNA is placed in one tube and is subjected, without requiring sample purification, to three conceptually different steps: (1) masking of the non-full-length mRNA molecules, (2) modification of the cap-site molecules into reactive molecules, and (3) attaching the treated RNA molecule to the priming nucleic acid.
  • step (1) above has conventionally required bacterial alkaline phosphatase (BAP), which works at an alkaline pH of around 8.
  • This enzyme has to be inactivated before proceeding to step 2 above. If not inactivated, it would remove the phosphate groups from the full-length mRNAs obtained at step 2 (see later).
  • BAP cannot be heat inactivated by mild heat shock at 65°C. Instead, the optimal activity of BAP is at 65°C (which would also cause alkali-based RNA degradation at pH 8). Therefore, previous works have required organic extraction.
  • the cap is removed by tobacco acid pyrophosphatase (TAP), which is then used at pH 6.
  • TAP removes the cap structure of the RNAs (this includes mRNA but also a novel group of non-coding RNAs) that are intact at the 5' end, and leaves a phosphate group instead.
  • the only molecules that contain a phosphate correspond to full-length RNA molecules, or at least to molecules that carry an intact sequence at their 5' ends. All of the other RNA molecules contain an OH group at the 5' end, which is not a substrate for reaction (3).
  • the TAP has to be inactivated before step 3 takes place, as it would interfere with this step by degrading the ATP, the necessary substrate for the RNA ligase at step 3 (see later). Therefore, organic extraction and subsequent precipitation are also required at the end of this step, which is time consuming and causes loss of sample.
  • reaction number 3 is the RNA ligase step.
  • RNA ligase does add an RNA molecule, or RNA/DNA hybrid or
  • RNA ligase requires ATP as substrate to provide the necessary energy to catalyze the ligation between the full-length RNA 5' end and the 3' end OH group of the added nucleic acid (usually an oligonucleotide).
  • This reaction is carried out at various temperatures, usually low to minimize the DNA damage at pH 7.5 or 8.
  • as the enzyme, we successfully use a phosphatase which is thermosensitive.
  • the enzyme we used at step 1 is an Antarctic phosphatase, which can be inactivated at 65°C after step 1, in a buffer that does not damage the RNA even at the high temperature used for inactivation; the method is, however, not limited to this enzyme.
  • Such a buffer turned out to have a pH lower than 7.5 and to be devoid of divalent ions.
  • a low saline acidic buffer corresponding to these requirements would be particularly useful.
  • a modified buffer that is compatible with the subsequent reaction, which involves addition of a pyrophosphatase activity or any equivalent activity.
  • the invention refers to the process of inactivating the pyrophosphatase activity without degrading the RNA, for instance by heat inactivation in another buffer that does not damage the RNA.
  • the conditions for the activity of an enzyme that ligates an exogenous nucleic acid to the 5' end of the RNAs may be achieved by the addition of another modifier buffer.
  • the added exogenous nucleic acids can be of various types, but could for instance be an RNA oligonucleotide.
  • the preferred enzyme to carry out the addition of the exogenous nucleic acid to the 5' end of the mRNA is a nucleic acid ligase.
  • RNA ligase such as the T4 RNA ligase or a thermostable RNA ligase.
  • This can promote, for instance, the final step of tagging the RNA with an oligonucleotide.
  • RNA can then be used for a variety of downstream applications.
  • full-length cDNA can be obtained by priming the RNA on the 3 ' end tails with an oligo-dT primer. Only the molecules that reach the 5' end oligonucleotide can then be primed for the second strand cDNA, therefore, obtaining cDNA containing both the 5' and 3' ends.
  • These full-length cDNAs can then be further treated and cloned following standard procedures.
  • the tagging nucleic acid can carry various sequence moieties that allow further functional applications. For instance, it can contain restriction enzyme sites, and it can contain polymerase sequences or recombination sequences. These sequences can also be used for further purification of the RNA/nucleic acid hybrid, and its derivatives, by means of nucleic acid hybridization, protein-epitope interaction, or any other chemical interaction.
  • the methods to bind nucleic acids or derivatives are not limited to the above methods, which are cited simply as examples.
  • the 5' end oligonucleotide could be an RNA oligonucleotide which contains a restriction site of a class IIs restriction enzyme.
  • These enzymes recognize a sequence but cleave outside their recognition sequence; with appropriate design, the class IIs restriction enzyme MmeI can cleave 20/18 bases within the cDNA and produce 5' end tags (otherwise named 5'-SAGE or CAGE) (reference: Hashimoto et al, Nat Biotechnol. 2004 Sep;22(9):1146-9; Kodzius et al, Nat Methods. 2006 Mar;3(3):211-22).
  • the 5' end tags are isolated and joined to form concatamers for high throughput sequencing; concatamers allow a much more cost effective utilization of sequencing.
  • concatamers can either be sequenced directly after their ligation with a linker suitable for the process of the 454 Life Sciences sequencing instrument based on pyrosequencing, or can be cloned into appropriate plasmid cloning vectors, such as pZero plasmids from Invitrogen, for conventional sequencing with Sanger-based methods or any appropriate sequencer.
  • a "tag" according to the invention can be any region of a nucleic acid molecule as prepared by the means of the invention, where the term "tag" as used herein encompasses any nucleic acid fragment, no matter whether it is derived from naturally occurring, artificially synthesized or prepared nucleic acids, or any modified nucleic acids into which at least one or more modifications have been introduced by naturally occurring events or through approaches known to a person skilled in the art. Furthermore, the term "tag" does not relate to any particular sequence information or its composition but to the nucleic acid molecules as such.
  • the first-strand primer can be not only an oligo-dT primer, but also a primer containing random sequences, for instance 6 or 9 random bases in the 3' position.
  • This nucleic acid primer can also contain at its 5' end sequence-specific oligonucleotides that can in turn be used for specific priming.
  • such modified RNA can be used to determine the real 5' end sequence in protocols known as
  • these modified RNA molecules can bind directly to a solid support, or after their modification such as transcription into first strand full-length cDNA, and after being immobilized to such solid matrix, they can further be used for subsequent operation such as those used for understanding the primary sequences of these RNA molecules.
  • the 5' end is conjugated with a nucleic acid carrying a promoter for a polymerase, such as an RNA polymerase.
  • kits containing, among other components, the reagents (chemicals) at the appropriate pH and in appropriate combinations, nucleic acids, and enzymes, in order to let unspecialized technical personnel reproducibly isolate the 5' ends of mRNAs for cDNA production or 5' end expression profiling.
  • the invention encompasses the methods for handling single strand as well as double strand
  • Double-stranded DNA means any nucleic acid molecules each of which is composed of two polymers formed by deoxyribonucleotides and in which the two polymers have substantially complementary sequences to each other allowing for their association to form a dimeric molecule.
  • the two polymers are bound to one another by specific hydrogen bonds formed between matching base pairs within the deoxyribonucleotides .
  • Any DNA molecule composed only of one polymer chain formed by two or more deoxyribonucleotides having no matching complementary DNA molecule to associate with is considered to be a single-stranded DNA molecule for the purpose of the invention, even if such a molecule may form secondary structures comprising double-stranded DNA portions.
  • "nucleic acid molecule(s)" and "polynucleotide(s)" include RNA or DNA regardless of whether single or double-stranded, coding or non-coding, complementary or not, and sense or antisense, and also include hybrid sequences thereof. In particular, it encompasses genomic DNA and complementary DNA which are transcribed or non-transcribed, spliced or not spliced, incompletely spliced or processed, independent from its origin, cloned from a biological material, or obtained by means of synthesis.
  • the invention relates to methods for the isolation of fragments from nucleic acid molecules for the purpose of analysis.
  • the invention relates to the conversion of a sample containing one or more nucleic acid molecules, where such nucleic acid molecules or any mixture of nucleic acid molecules would be converted into DNA.
  • nucleic acid molecules can be derived from any naturally occurring genomic DNA, an RNA sample, or an existing DNA library, can be of artificial origin, or can be any mixture thereof.
  • the invention is not limited to the use of an individual nucleic acid molecule or any plurality of nucleic acid molecules, but the invention can be performed on an individual nucleic acid molecule or any plurality of nucleic acid molecules regardless of whether such pluralities would occur in nature, be derived from an existing library, or be artificially created. Furthermore, the invention can process any nucleic acid molecule regardless of its origin or nature. Thus it is within the scope of the invention that the nucleic acid molecules could be full-length molecules as compared to naturally occurring nucleic acid molecules, or any fragment thereof.
  • fragments of nucleic acid molecules could be prepared by a random process or by a targeted dissection of nucleic acid molecules by means of an enzymatic activity with a preference for a certain sequence, or by means which would allow for fragmentation based on the structure of the nucleic acid molecule, including but not limited to exons and introns within transcribed regions.
  • the invention is not restricted to the use of any particular starting material .
  • the invention is not dependent on the use of DNA only, as a person familiar with the state of the art will know different approaches to convert RNA into DNA including but not limited to those approaches disclosed by Sambrook J. and Russell D.W., ibid, hereby incorporated herein by reference.
  • After conversion of RNA into DNA, a single-stranded or double-stranded DNA molecule having the same or complementary sequence to the original RNA can be obtained, said cDNA.
  • cDNA molecules are commonly prepared in the form of linear DNA, where the two open ends allow for their manipulation.
  • a person trained to the state of the art will know about the necessary means to release an insert from such a vector to convert it into linear DNA.
  • said introduction of priming sites at 5' -ends of RNA is used to later bind the resulting nucleic acids.
  • Such oligonucleotides are used to specifically capture RNA molecules for manipulation and analysis by using one part of the sequence to bind the RNA to a surface, and another part to prime sequencing.
  • the oligonucleotide can be used as a tag to link the RNA, or resulting cDNA, to a surface; sequencing primers of known sequence can then be added in solution and the resulting complex used for direct sequencing.
  • Such captured nucleic acids can be used for further operations and analysis, including but not limited to sequencing, replication, amplifications and chemical and enzymatic modifications .
  • nucleic acids consisting of the first strand cDNA from the conjugated 5' RNA/RNA can be bound, as they are or through a hybrid nucleotide, to a solid matrix for further solid phase manipulation, such as single-molecule sequencing.
  • the obtained nucleic acid hybrid is captured by another immobilized primer that can prime direct polymerization and sequencing reactions at the single-molecule level. This can provide unprecedented sequencing throughput when applied to a single cell.
  • it is important to note that the conjugation of the sequences to nucleic acids corresponding to the 5' ends of the mRNAs does not necessarily rely on nucleic acid ligases, but can be done with any other method to capture full-length cDNA, such as the biotinylated cap trapper family of technologies (Carninci and Hayashizaki, Methods Enzymol. 1999; 303: 19-44) or other equivalent methods.
  • the RNA can be self-ligated in absence of oligonucleotides, whereas the ribosomal RNA (in large excess) mostly ligates the 5' end of mRNA after dephosphorylation and decapping.
  • the ribosomal RNA sequence moiety can be used to prime the second strand cDNA and for further purification of the deriving nucleic acids, including but not limited to binding other nucleic acids for the purpose of priming polymerization, physical immobilization or isolation. These could be then used as substrate for massively parallel sequencers like 454 Life Sciences sequencing instrument .
  • nucleic acids deriving from the conjugation of a nucleic acid to 5' end of RNA molecules would be used to direct sequencing reactions . This is not limited to the process claimed in this patent, but could be obtained with any of the full-length capture methods available.
  • the nucleic acids contain two different sequences that are suitable for single-molecule emulsion PCR amplification as described in (Margulies et al, Nature. 2005 Sep 15; 437(7057): 376-80).
  • the nucleic acids carrying information including the complete 5' end of the nucleic acids are labeled with a moiety such as biotin, or other reactive chemical groups, for further binding to a matrix, which in the case of biotin will be streptavidin or avidin. The method is not limited to this, as any other reactive chemical group can be used for the coupling.
  • Such captured nucleic acids are then used for further operations and analysis, including but not limited to sequencing, replication, amplifications and chemical and enzymatic modifications.
  • Example 1: Preparation of a library of 5'-derivatized RNA molecules. This example is a typical protocol for the derivatization of five-prime ends of RNA molecules with RNA oligonucleotides. All reactions were performed in a 500 µl siliconized microtube, using a siliconized tip each time to avoid nucleic acid losses.
  • the RNA sample was at first dephosphorylated.
  • the reaction buffer was 1/10 the common concentration, or 5 mM Bis-Tris-Propane-HCl, 0.1 mM MgCl2, 0.01 mM ZnCl2, pH 6.0 at 25°C. Glycogen helped to avoid attachment of RNA to the plastic during the operation.
  • the sample was denatured at 65°C for 5 min, to expose the phosphate groups to be later removed, and after holding at 37°C for 2 min, the Antarctic phosphatase (New England Biolabs) was added (2.5 units).
  • the sample was treated for 3 hours to overnight at 37°C. Overnight dephosphorylation allowed removal of 98-99% of the phosphate groups. A short incubation could also be performed at 45°C in the presence of trehalose at 0.6 M final concentration, which increased the activity at 45°C.
  • the Antarctic phosphatase was inactivated at 65°C, but before doing this, the divalent ions had to be chelated. For this reason, 0.55 µl of a solution of (0.5 M sodium acetate (pH 6.0), 10 mM EDTA, 1% β-mercaptoethanol, and 0.1% Triton X-100) was added. This solution chelated the divalent ions and created conditions suitable for the subsequent TAP treatment. The Antarctic phosphatase was also inhibited by the EDTA of the buffer. The final inactivation was carried out at 65°C for 5 to 15 min.
  • TAP tobacco acid pyrophosphatase
  • the ligation was carried out by adding one microliter of a 100 micromolar "capping RNA" oligonucleotide of any sequence, for instance the sequence 5'-UUUGGAUUUGCUGGUGCAGUACAACUAGGCUUAAUAGGAUCCGACG-3' (SEQ ID NO: 1), at a final concentration of 5 µM oligonucleotide.
  • Denaturation of the oligonucleotides was carried out at 65°C for 5 minutes, after which the tube was transferred to the block of a thermocycler at 20°C.
  • RNA ligase buffer 500 mM HEPES-NaOH (pH 8.0 at 25°C), 100 mM MgCl2, 100 mM DTT
  • DTT inhibited the TAP.
  • HCC hexamino cobaltum chloride
  • Polyethylene glycol (PEG 8000) was then added at a final concentration of 25%, ATP at a final concentration of 125 µM, and finally 10 units of T4 RNA ligase were added. Under such conditions, the resulting mixture of previous buffers was not inhibitory for the ligation steps.
  • the sample was then ligated for 2 hours to overnight (16 hours) at 20°C.
  • the RNA was capped with an oligonucleotide and this could be used for different tests as they appear in other examples, such as full-length cDNA preparation.
  • oligoribonucleotides were subsequently radiolabelled with T4 Kinase and gamma-32P-ATP and analysed by PAGE. In the absence of prior dephosphorylation, radiolabelling is impossible due to the 5' phosphate. Positive control (9): 5' OH oligoribonucleotide. Negative control (10): 5' phosphorylated oligoribonucleotide.
  • AP buffer supplied by New England Biolabs.
  • TAP buffer supplied by Epicentre.
  • TAP Tobacco Acid Pyrophosphatase
  • a radiolabelled oligoribonucleotide was incubated in presence of an unlabelled oligoribonucleotide . Ligation results in a shift of the electromobility in polyacrylamide gel.
  • the reactions were made in presence or absence of AP, TAP and RNL.
  • Ligation occurred in the mixed buffer, and was not impaired by leftovers of TAP.
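The concentrations quoted in Example 1 can be cross-checked with the usual dilution relation C1·V1 = C2·V2. The sketch below is illustrative only: the function names are hypothetical, and both the 20 µl reaction volume and the full-strength buffer composition it derives are inferences from the stated figures (1 µl of 100 µM oligonucleotide at 5 µM final; buffer at 1/10 strength), not values given in the text.

```python
def final_concentration(stock_conc, stock_volume, total_volume):
    """Concentration reached after diluting stock_volume of stock into total_volume."""
    return stock_conc * stock_volume / total_volume


def implied_total_volume(stock_conc, stock_volume, target_conc):
    """Total volume at which the stock aliquot reaches target_conc."""
    return stock_conc * stock_volume / target_conc


# Capping oligonucleotide: 1 µl of a 100 µM stock, 5 µM final (figures from Example 1).
ligation_volume_ul = implied_total_volume(100.0, 1.0, 5.0)

# Dephosphorylation buffer used at 1/10 strength (mM, figures from Example 1).
TENTH_STRENGTH_MM = {"Bis-Tris-Propane-HCl": 5.0, "MgCl2": 0.1, "ZnCl2": 0.01}

# Implied full-strength composition: an inference, not stated in the text.
full_strength_mm = {name: round(conc * 10, 6) for name, conc in TENTH_STRENGTH_MM.items()}

if __name__ == "__main__":
    print(f"implied ligation volume: {ligation_volume_ul} µl")
    print(f"implied full-strength buffer (mM): {full_strength_mm}")
```

The same two helpers can be reused for the other final concentrations quoted in the protocol, provided the reaction volume at that step is known.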
  • Example 2 Production of full-length cDNA from the one-tube capped mRNA of Example 1.
  • the sample as above could be desalted using microcon YM-100 filter as described by the manufacturer (Millipore) .
  • water and reverse transcriptase (RT) primers which could be obtained by Invitrogen, were added.
  • Eight hundred ng of the primer AGA GAG AGA CCU CGA GCC UAG GUC CGA C (SEQ ID NO: 2) were used for a 20 µl reaction, and 3 µl of the sorbitol-trehalose mixture (3.3 M stock, final concentration of 0.5 M sorbitol and 4% trehalose) was added while making up the final RT reaction.
  • the RNA-primer mixture was heated-up for 10 min.
  • cDNA was achieved at high frequency that spanned the 5 ' end of the original mRNAs .
  • This could be further purified/processed. For instance, it can be treated with proteinase K (addition of 20 µg, together with EDTA at 10 mM final concentration), followed by RNA and proteinase inactivation at 95°C for 15 min. This sample could then be used on C14B
  • the cDNA was then amplified by PCR.
  • To the cDNA we added the Takara EX-taq buffer at a final concentration of 1×, then dNTPs were added (final concentration: 200 micromolar each), 5' oligonucleotide (sequence: acc tcg agc cta ggt ccg ac; SEQ ID NO: 3) and 3' end oligonucleotide (sequence: ca gcg tcc tca agc ggc cgc; SEQ ID NO: 4), each oligonucleotide at 400 nM concentration, MgCl2 at 2.5 mM, and KCl at a final concentration of 50 mM.
  • the components were mixed with a hot start and then, after 5 min at 94°C, samples were incubated for 30 seconds at 94°C, 30 seconds at 58°C, and 1.5 min at 68°C, for 30 cycles.
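As a planning aid, the PCR program of Example 2 can be written down as data and its programmed hold time summed. This is a minimal sketch: the `Step` class and function name are hypothetical, and the total ignores the instrument's ramping time between temperatures, which the text does not specify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: float   # block temperature in °C
    seconds: float  # programmed hold time


# Program as described in Example 2.
INITIAL_DENATURATION = Step(94.0, 5 * 60)
CYCLE = (Step(94.0, 30), Step(58.0, 30), Step(68.0, 90))
N_CYCLES = 30


def total_hold_seconds(initial, cycle, n_cycles):
    """Sum of programmed hold times; ramping between steps is not included."""
    return initial.seconds + n_cycles * sum(step.seconds for step in cycle)


if __name__ == "__main__":
    total = total_hold_seconds(INITIAL_DENATURATION, CYCLE, N_CYCLES)
    print(f"{total / 60:.0f} min of programmed hold time")
```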
  • Example 3 Application for RACE experiment
  • the capped RNA could be prepared as for the example 1, with the only difference that the RNA oligonucleotide had a different sequence as described below.
  • the experiment was performed as follows: 500ng of total RNA from liver was subjected to ONE-Tube oligo-capping, followed by the removal of the unreacted oligoribonucleotides, and reverse-transcription with random primers .
  • the 5 ' ends were amplified with a gene-specific primer (TTGGAGAGAGGGTTTCGACGAGTCA; SEQ ID NO: 7) and a primer complementary to the oligo-cap (CGACTGGAGCACGAGGACACTGA; SEQ ID NO: 8) .
  • Example 4 Application to 454 or other matrix.
  • the cDNA was prepared as in Examples 1 and 2. However, the oligonucleotides prepared for Example 4 were designed in order to have different adaptors at the 3' and 5' ends of the RNA, respectively: (Adaptor A: CCATCTCATCCCTGCGTGTCCCATCTGTTCCCTCCCTGTCTCAG; SEQ ID NO: 9; Adaptor B:
  • Adaptor B was used as the "oligo-capping" sequence, and Adaptor A was conjugated to an oligo-random primer for the first strand synthesis.
  • the material was passed through a C1-4B spin column to separate the excess of unreacted primers. Subsequently, the sample was subjected to emulsion PCR and then to sequencing reactions as described for the 454 Life Sciences sequencing instrument (Margulies et al, Nature. 2005 Sep 15; 437(7057): 376-80). This allowed hundreds of thousands of sequences to be obtained in a single run.
  • Example 5 Application for 5' end sequencing tags.
  • a cDNA was achieved as example 1 and 2 , and the sample was processed until the second strand DNA was achieved by using standard protocols obvious for a person skilled in the art, such as described in Nature Methods, R. Kodzius et al. , Mar.3(3) 211-222, 2006.
  • the cDNA was then cleaved with MmeI, whose recognition site was present on the oligonucleotide used for the oligo-capping in examples 1 and 2 (sequence: 5'-UUUGGAUUUGCUGGUGCAGUACAACUAGGCUUAAUAGGAUCCGACG-3'; SEQ ID NO: 1).
  • 100 nanograms of a linker was ligated to the cleaved cDNA (sequence upper oligonucleotide: . 5 '-Phosphate
  • Phosphate-GGATCCTCAGGACTCTTCTATAGTGTCAGTACGGA-NH2-3'; SEQ ID NO: 12; these two oligonucleotides were briefly mixed to reconstitute a linker before the ligation; NH2: amino group), and ligation proceeded with DNA ligase as described in Kodzius et al. The fragment containing the most 5' end of the mRNA (the 5' end of the cDNA) was separated as in Kodzius et al.
  • Linker A and Linker B were adapted from (Margulies et al., Nature, 2005 Sep. 15, 437(7057), p376-80), but were made compatible for ligation with the 5' end tags obtained in this experiment; their sequences were (adaptor A: upper oligo, 5'-CCATCTCATCCCTGCGTGTCCCATCTGTTCCCTCCCTGTCTCAG-3' SEQ ID NO: 15; lower oligo, 5'-Phosphate-GATCCTGAGACAGGGAGGGAACAGATGGGACACGCAGGGATGAGATGG-3' SEQ ID NO: 16; adaptor B: upper oligo, 5'-BioTEG-CCTATCCCCTGTGTGCCTTGCCTATCCCCTGTTGCGTGTCTCAG-3' SEQ ID NO: 17, lower oligo, 5'-
  • linkers A and B were ligated at a 1:20 ratio with the 5' end tags (5' end tag excess) and the reaction was allowed to proceed overnight with T4 DNA ligase under standard conditions in a 10 microliter volume. The sample was then suitable for the 454 sequencer protocol.
  • sequencing tags were used for sequencing and identification of gene borders as in Science, P. Carninci et al., Sep. 2, 309(5740), 1559-63, 2005, and for expression profiling and promoter analysis of the genes as in Nature Methods, M. Harbers and P. Carninci, Jul. 2(7), 495-502, 2005.
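The tag-generation step of Example 5 relies on an MmeI recognition site carried by the capping oligonucleotide (SEQ ID NO: 1). The sketch below locates that site and estimates how many transcript-derived bases end up in the tag. It rests on an assumption brought in from general enzyme references rather than from this text: that MmeI recognizes TCCRAC and cuts 20 nt (top strand) and 18 nt (bottom strand) downstream of the site. The function and key names are hypothetical.

```python
import re

# Capping oligonucleotide from Examples 1 and 2 (SEQ ID NO: 1), written as RNA.
CAPPING_OLIGO_RNA = "UUUGGAUUUGCUGGUGCAGUACAACUAGGCUUAAUAGGAUCCGACG"

# Assumed MmeI specificity (not stated in the text): TCCRAC(20/18), R = A or G.
MMEI_SITE = re.compile("TCC[AG]AC")
TOP_STRAND_REACH = 20
BOTTOM_STRAND_REACH = 18


def rna_to_dna(seq):
    """Write an RNA sequence in DNA letters (U -> T)."""
    return seq.upper().replace("U", "T")


def mmei_tag_geometry(oligo_rna):
    """Locate the MmeI site in the oligo and count transcript bases within reach."""
    dna = rna_to_dna(oligo_rna)
    match = MMEI_SITE.search(dna)
    if match is None:
        raise ValueError("no MmeI recognition site found in the oligonucleotide")
    trailing = len(dna) - match.end()  # oligo bases between the site and the RNA junction
    return {
        "site": match.group(),
        "site_start": match.start(),  # 0-based index in the oligo
        "oligo_bases_after_site": trailing,
        "transcript_bases_top": TOP_STRAND_REACH - trailing,
        "transcript_bases_bottom": BOTTOM_STRAND_REACH - trailing,
    }


if __name__ == "__main__":
    for key, value in mmei_tag_geometry(CAPPING_OLIGO_RNA).items():
        print(f"{key}: {value}")
```

Placing the site close to the 3' end of the capping oligonucleotide leaves most of the enzyme's reach available for transcript sequence, which is the design point this check makes visible.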
  • the sample was subjected to the emulsion PCR and then sequencing reactions as described for the 454-life science sequencing instrument (Margulies et. al., Nature, 2005 Sep. 15, 437(7057), p376-80) , as follows.
  • the library immobilization beads were washed twice with 100 µl of 2× library binding buffer using the MPC and the buffer was then removed. The beads were suspended in 25 µl of 2× library binding buffer and 50 µl of Oligo-capping library was added to a final volume of 75 µl, then the tube was placed in a rotator for 20 min. During the rotation, the neutralization solution was prepared in a 1.5 ml tube by mixing 500 µl of Qiagen's PB buffer and 3.8 µl of acetic acid. The library-carrying beads were washed twice with 100 µl of library wash buffer using the MPC, and the buffer was then removed.
  • MPC Particle Collector
  • Emulsion PCR. The content of the Clonal Amplification Reagents Kit was as follows: 10× Capture Beads Wash Buffer, DNA Capture Beads, Mock Amplification Mix, Amplification Mix, MgSO4, Amplification Primer Mix and PPiase.
  • 1 ml of 1× Capture Beads Wash Buffer was allotted to each tube, which was then vortexed, centrifuged for 10 sec, rotated 180° and centrifuged for 10 sec, after which 1 ml of the supernatant was removed without disturbing the beads.
  • the sstDNAs were allotted to each tube. Two 8-connected tubes were prepared and around 25 µl of the sample was allotted to each well, then vortexed, centrifuged for 10 sec, rotated 180° and centrifuged for 10 sec.
  • 9700 thermocycler: 80°C for 5 min, decreased to 70°C by 0.1°C/sec, maintained at 70°C for 1 min, decreased to 60°C by 0.1°C/sec, maintained at 60°C for 1 min, decreased to 50°C by 0.1°C/sec, maintained at 50°C for 1 min, decreased to 20°C by 0.1°C/sec and end.
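The annealing program above alternates holds with slow ramps of 0.1°C per second, so its run time is dominated by the ramps. The sketch below adds up the nominal duration under the assumption that each ramp runs at exactly the stated rate; the step encoding and function name are hypothetical and are not an instrument file format.

```python
# Each entry: (kind, temperature in °C, value)
#   "hold": value is the hold time in seconds at that temperature
#   "ramp": value is the ramp rate in °C per second towards that temperature
ANNEALING_PROGRAM = [
    ("hold", 80.0, 300.0),
    ("ramp", 70.0, 0.1),
    ("hold", 70.0, 60.0),
    ("ramp", 60.0, 0.1),
    ("hold", 60.0, 60.0),
    ("ramp", 50.0, 0.1),
    ("hold", 50.0, 60.0),
    ("ramp", 20.0, 0.1),
]


def program_seconds(program):
    """Nominal duration: hold times plus |temperature change| / rate for each ramp."""
    total = 0.0
    current_temp = None
    for kind, temp, value in program:
        if kind == "hold":
            total += value
        elif kind == "ramp":
            if current_temp is None:
                raise ValueError("a ramp needs a preceding step to start from")
            total += abs(current_temp - temp) / value
        else:
            raise ValueError(f"unknown step kind: {kind}")
        current_temp = temp
    return total


if __name__ == "__main__":
    seconds = program_seconds(ANNEALING_PROGRAM)
    print(f"nominal duration: {seconds / 60:.1f} min")
```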
  • Emulsion oil was vortexed and 240 µl of Mock Amplification Mix was added to each of sixteen tubes of Emulsion oil, then the tubes were set in the TissueLyser at 25/sec for 5 min to make small bubbles. While the TissueLyser was running, Amplification Mix was prepared in a 15 ml tube as follows, mixed 181.62 µl of
  • Amplification mix 10.0 µl of MgSO4, 2.08 µl of Amplification
  • Amplification Primer Mix 72.0 µl of platinum HiFi Taq polymerase (Invitrogen) and 3.6 µl of PPiase for twelve tubes, 2905.92 µl of Amplification mix, 160.0 µl of MgSO4, 33.28 µl of Amplification Primer Mix, 96.0 µl of platinum HiFi Taq polymerase (Invitrogen) and 4.8 µl of PPiase for sixteen tubes, respectively. After the annealing, the sample was centrifuged and as much of the supernatant as possible was removed. 160 µl of Amplification Mix was added to each well of the two 8-connected tubes, mixed well and left for 30 sec.
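The Amplification Mix volumes quoted for twelve and sixteen tubes are simple multiples of a per-tube recipe, which can be checked as follows. This is a sketch under stated assumptions: the per-tube polymerase and PPiase volumes (6.0 µl and 0.3 µl) are inferred by dividing the quoted totals by the tube count and are not given explicitly in the text, and the function name is hypothetical. `Decimal` is used so that the scaled volumes compare exactly.

```python
from decimal import Decimal

# Per-tube volumes in µl. The first three are quoted in the text; the last two
# are inferred from the quoted totals (96.0 / 16 and 4.8 / 16).
PER_TUBE_UL = {
    "Amplification Mix": Decimal("181.62"),
    "MgSO4": Decimal("10.0"),
    "Amplification Primer Mix": Decimal("2.08"),
    "Platinum HiFi Taq polymerase": Decimal("6.0"),
    "PPiase": Decimal("0.3"),
}


def scale_master_mix(n_tubes):
    """Scale the per-tube recipe to n_tubes, with no allowance for pipetting loss."""
    if n_tubes < 1:
        raise ValueError("n_tubes must be at least 1")
    return {name: volume * n_tubes for name, volume in PER_TUBE_UL.items()}


if __name__ == "__main__":
    for n in (12, 16):
        print(f"{n} tubes:")
        for name, volume in scale_master_mix(n).items():
            print(f"  {name}: {volume} µl")
```

Scaling to sixteen tubes reproduces the totals listed in the text, which is a useful consistency check before preparing a batch of a different size.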
  • Added 500 µl of Enhancing Fluid and mixed slightly, then placed in the MPC for 2 min and carefully removed all the supernatant to a 1.5 ml tube. After taking away the tubes from the MPC, added 1 ml of Enhancing Fluid, then placed in the MPC for 2 min. After that, removed all the supernatant and washed twice. After taking away the tubes from the MPC, added 700 µl of Melt solution, vortexed for 5 sec and placed in the MPC. Then, transferred all the supernatant to a 1.5 ml tube. Again, added 700 µl of Melt solution, vortexed for 5 sec and placed in the MPC.
  • Packing Beads three times as follows, added 1 ml of Bead Buffer 1 to the tube of Packing Beads, vortexed and centrifuged at 10,000 rpm for 5 min, then removed all the supernatant. After that, added 1 ml of Bead Buffer 1.
  • 1st Layer mixed 500 µl of Packing Beads and 500 µl of Enzyme Beads in a 2 ml tube.
  • 2nd Layer mixed 460 µl of Enzyme Beads and 1400 µl of Bead Buffer 2 in a 2 ml tube.
  • sequencing mix to the tube of DNA beads (after rotation), added 250 µl of Bead Buffer 2 and 960 µl of 1st Layer.
  • the cDNA was achieved as in examples 1 and 2, and the sample was processed until the first strand cDNA was achieved by using standard protocols obvious for a person skilled in the art, such as described in (Kodzius et al, Nat Methods.

Abstract

The invention relates to a method for isolating 5' ends of nucleic acid derived from any biological resources in a single tube and its application. It enables the construction of cDNA libraries, tagging and sequencing from small-scale or reduced quantities of RNA.

Description

DESCRIPTION
Method to isolate 5' ends of nucleic acid and its application
Field of the Invention
The present invention relates to a method to capture nucleic acid ends suitable for limited amount of starting material based on sequential addition of reagents in compatible buffer in a single tube.
Background of the Invention
Gene expression analysis
Genomes contain the essential genetic information for development and homeostasis of any living organism. For an understanding of biological phenomena, knowledge is required on how such genetic information is utilized in a cell or tissue at a given time point. Many cases are known where mistakes in the utilization of genetic information and related regulatory pathways have caused disease in humans, plants, and animals. Thus methods are needed to allow for expression profiling and annotation of the identified transcripts as well as for characterizing the genetic elements that control the genetic information. Most expression studies nowadays use either approaches based on in situ hybridization, e.g. microarrays, or high-throughput sequencing of short tags, e.g. SAGE, CAGE, MPSS, where both types of experiments have distinct advantages over each other. However, for our understanding of the regulatory principles behind gene expression, it can be desirable to also obtain information on the genetic elements which control gene expression.
Due to the limitations of DNA microarray experiments, alternative approaches are in use for gene discovery and expression profiling, which are based on partial sequences, said tags, obtained from a plurality of mRNA samples. The so-called SAGE (Serial Analysis of Gene Expression) method is known as an efficient method for obtaining partial information on the base sequences in mRNAs (Velculescu V. E. et al., Science 270, 484-487 (1995), hereby incorporated herein by reference). This method forms DNA concatemers by ligating multiple short DNA fragments (initially about 10 bp) containing information on the base sequences at the 3'-end of multiple mRNAs, and determines the base sequences in these DNA concatemers. Recently an improved version of SAGE, the so-called LongSAGE, has been published, which allows for the cloning of longer SAGE tags (Saha S. et al., Nat. Biotechnol. 20, 508-12 (2002), US patent applications 20030008290, 20030049653, all hereby incorporated herein by reference). The SAGE method is currently in wide use as an important method for analyzing genes expressed in specific cells, tissues or organisms; and SAGE tags are available for reference in the public domain, e.g. under http://cgap.nci.nih.gov/SAGE.
US patents 6,352,828; 6,306,597; 6,280,935; 6,265,163; 5,695,934, all hereby incorporated herein by reference, disclosed a different approach for the high-throughput sequencing of short sequence tags, also denoted as Massively Parallel Signature Sequencing or "MPSS". As described in further detail in Brenner S. et al., Nat. Biotechnol. 18, 630-634 (2000), and Brenner S. et al., Proc. Natl. Acad. Sci. USA 97, 1655-1670 (2000), both hereby incorporated herein by reference, preferentially short sequences from the 3'-end of transcripts are obtained in a highly parallel manner by performing cycles with different enzymatic reactions on a single layer of beads. In WO 03/091416, hereby incorporated herein by reference, modifications to the aforementioned approach were disclosed to enable also the sequencing of short sequences from the 5'-end of transcripts.
As most of the aforementioned approaches focused on the utilization of 3'-end derived sequence tags, new approaches have been developed to obtain also sequence tags from other regions, in particular the 5'-ends, of transcripts. Such an approach has been disclosed in PCT/JP03/07514, and Shiraki T. et al., Proc. Natl. Acad. Sci. USA 100, 15776-15781 (2003), both hereby incorporated herein by reference. This so-called CAGE (Cap-Analysis-Gene-Expression) approach allows for the cloning of 5'-end specific tags into concatemers similar to the SAGE technology, where the so-called CAGE tags enable not only the detection of transcripts and their expression profiling, but further provide information on transcriptional start sites to allow for mechanistic studies on the regulation of transcription or a higher annotation of transcripts. Similar approaches for the cloning of concatemers comprising 5'-end specific sequence information have since been published by a number of other laboratories, as for example in Hwang B.J. et al., Proc. Natl. Acad. Sci. USA 101, 1650-1655 (2004), Hashimoto S. et al., Nat. Biotechnol. 22, 1146-1149 (2004), Zhang Z. and Dietrich F. S., Nuc. Acids Res. 33, 2838-2851 (2005), and Wei C.L. et al., Proc. Natl. Acad. Sci. USA 101, 11701-11706 (2004), all hereby incorporated herein by reference.
All of the above approaches focus only on the cloning and sequencing of sequence tags. Hence such approaches are still limited in the possible throughput of the tag sequencing. In addition, they require many manipulation steps that can give rise to a bias. In particular, amplification steps can cause artifacts as well as a bias in the tag frequencies due to distinct amplification rates for individual DNA fragments. Recent developments in the field will open up new ways to obtain sequence information at much higher throughput than presently possible by the classical approaches. Of particular interest here are novel approaches for the detection and sequencing of single DNA molecules as recently reviewed in Metzker M. L., Genome Res. 15, 1767-1776 (2005), Kling J., Nature Biotechnology 23, 1333-1335 (2005) and Shendure J. et al., Nature Review Genetics 5, 335-344 (2004), all hereby incorporated herein by reference. This invention provides innovative solutions on how to obtain DNA fragments for single molecule detection and sequencing, which can be used also for other applications. Hence, the invention provides means to modify an RNA molecule in such a way that sequence information specific for the 5'-end of such modified RNA molecule can be obtained. Therefore the invention will also enable new high-throughput sequencing approaches and their use in, for example, expression profiling. Moreover, the invention provides further means of high value to studies including but not limited to expression profiling studies based on 5'-end specific sequence tags, and beyond, which are essential components of commercial applications and services including but not limited to drug development, diagnostics, or forensic studies.
Altogether, these limitations suggest the need for the selection of 5' end regions of mRNAs and cDNAs, that can be suitable for expression profiling and then for other applications.
Isolation of full-length cDNA molecules
Analysis and cloning of 5' ends of mRNAs have been basic to the preparation of full-length cDNA collections and to discovering the expressed part of the genome, including protein-coding mRNAs and non-coding RNAs (Carninci et al, Science. 2005 Sep 2; 309(5740): 1559-63, Gerhard et al, Genome Res. 2004 Oct; 14(10B): 2121-7, Imanishi et al, PLoS Biol. 2004 Jun; 2(6): e162). Obtaining full-length cDNAs has been instrumental in understanding the genome structure and function and in mapping the genes on the genome. Additionally, obtaining the full-length cDNA has been instrumental in expressing the encoded proteins, which then can be transferred into other functional vectors. For instance, proteins can be cloned into Gateway vectors (Gateway is a trademark of Invitrogen) and used for protein expression, purification, and a number of other functional studies. Additionally, the analysis of 5' ends has further allowed measuring gene expression associated with the promoter identification (Carninci et al, Nature Genetics in press). By capturing the 5' end tags (Harbers and Carninci, Nat Methods. 2005 Jul; 2(7): 495-502), it has been possible to identify the transcription start site, and therefore the promoter regions, that is, the regions that control gene expression. Although the promoter regions can be divided into core promoters and long-range controlling elements, the core promoters (those in the vicinity of the start site) can be identified very efficiently, thus facilitating further functional analysis such as of transcriptional networks (the connection between transcriptional controlling elements) and the expressed RNAs. This analysis is a part of a novel field that is otherwise called "systems biology". Following this philosophy, to understand a biological system it is essential to take into account all of the components of this system and integrate the knowledge.
So far, however, these technologies have required relatively large amounts of starting material (Carninci et al, Genome Res. 2003 Jun; 13(6B): 1273-89; Shiraki et al., Proc Natl Acad Sci U S A. 2003 Dec 23; 100(26): 15776-81; Kodzius et al, Genome Res. 2003 Jun; 13(6B): 1273-89; Hashimoto et al, Nat Biotechnol. 2004 Sep; 22(9): 1146-9). Other technologies to capture the 5' ends require extensive amplification (such as PCR; Suzuki and Sugano, Methods Mol Biol. 2003; 221: 73-91), thus causing misrepresentation of the frequency of cDNAs, as reviewed in (Carninci et al, Genome Res. 2003 Jun; 13(6B): 1273-89).
However, RNA expression is very different in the various cells that compose a given tissue. Tissues are composed of multiple cells expressing some common RNAs and many very different, cell-specific RNAs. Therefore, there is a need for novel technologies that enable the preparation and the capture of 5' ends/full-length cDNAs from small numbers of cells, so that the different cells that compose a tissue can be separately analyzed for their expressed RNAs and their promoter usage. This is essential to the understanding of biological systems and downstream applications, including the identification of transcriptional networks and their perturbation/adjustments with drugs.
In the current invention, we describe a method that simplifies achieving full-length cDNA molecules or 5' tags that start at the beginning of the 5' mRNA.
Summary of the Invention
The invention relates to methods for the isolation of fragments from nucleic acid molecules for the purpose of cloning and analysis . Thus the invention relates to the conversion of a sample containing one or more nucleic acid molecules, where such nucleic acid molecules or any mixture of nucleic acid molecules would be converted into DNA.
In one embodiment, the invention relates to the manipulation of nucleic acid molecules, where such nucleic acid molecules would be prepared in the form of linear single-stranded DNA. Thus the invention relates to the preparation and manipulation of linear single-stranded DNA.
In a particular embodiment, the invention relates to the modification of an RNA molecule or a plurality of RNA molecules to introduce sequence information at the 5' -end of the RNA molecule or RNA molecules. Hence the invention relates to the modification of RNA, where the information added to the RNA molecule is used for the manipulation and/or analysis of the RNA molecule.
In another embodiment, the invention relates to the conversion of native and/or modified RNA molecules by transcribing the RNA into cDNA. Hence, the invention relates to the synthesis and preparation of single-stranded DNA molecules.
In just another embodiment, the invention relates to the use of single-stranded DNA molecules for directly obtaining sequence information thereof. Hence the invention relates to obtaining sequence information from defined regions of single-stranded DNA fragments, said 5'-ends. In one particular embodiment, the 5'-end specific sequence information of a DNA fragment relates to the 5'-end sequence of an RNA molecule. Hence the invention relates to obtaining sequence information from an RNA molecule.
In another embodiment, the invention relates to the sequencing of the tags to allow for their annotation by computational means and their statistical analysis. Thus the invention relates to means for gene discovery, gene identification, gene expression profiling, and annotation.
In just another embodiment, the invention relates to the sequencing of the tags to allow for their annotation by computational means and their statistical analysis, where such tags would be derived from regions within genomes . Thus the invention relates to the characterization of genetic elements within genomes .
In just a different embodiment, the invention relates to the preparation of hybridization probes from the ends of nucleic acid molecules, where such regions would be analyzed by the means of in situ hybridization.
In a preferred embodiment, the in situ hybridization experiment would make use of a tiled array.
In just one more embodiment, the invention relates to the full-length cloning of nucleic acid molecules. In another embodiment, the sequence information as obtained from the tags is used for primer design, and where such primers are used to amplify the nucleic acid molecule in an amplification reaction. It is within the scope of the invention to amplify and clone in such a way transcribed regions as well as genomic fragments, where such fragments can contain genetic elements, said promoter regions.
Thus the invention provides means for the analysis of nucleic acid molecules and short fragments thereof as needed for example for the characterization of biological samples.
Applicability. The applicability of this invention is to prepare full-length cDNAs and CAGE (cap-analysis gene expression) libraries from limited amounts of tissue, because the three steps involved in the derivatization of the RNA do not involve any precipitation, and therefore no nucleic acid loss takes place. In another embodiment, this protocol could also be used for general cDNA preparation, as the steps are simplified and allow great time and workload savings during the preparation of cDNA/5' end tags. A combination of all steps of this invention, from the step of isolating the 5' ends of RNA to determining the sequence, can be used as a kit or instrument. One of the embodiments of a flow of this current invention is shown in Fig. 1.
Brief Description of the Drawings
The objects and features of the invention can be better understood with reference to the following detailed description and accompanying drawings.
Figure 1 shows one of the embodiments of a flow of the present invention.
Figure 2 shows the activity of each of the enzymes in the various steps and buffer.
Figure 3 shows the results of ONE-Tube oligo-capping and amplification with 5' end RACE.
Figure 4 is a graph showing length of sequenced concatamers with the 454 Life Sciences sequencers.
Detailed Description of the Invention
The present invention relates to the analysis of RNA/nucleic acids from biological samples. The term "biological samples" includes any kind of material obtained from living organisms including microorganisms, animals, and plants, as well as any kind of infectious particles including viruses and prions, which depend on a host organism for their replication. As such, "biological samples" include any kind of material obtained from a patient, animal, plant or infectious particle for the purpose of research, development, diagnostics or therapy. Thus, the invention is not limited to the use of any particular nucleic acid molecules or their origin, but provides general means to be applied to and used for the work on and the manipulation of any given nucleic acid. Any such nucleic acid molecules as applied to perform the invention can be obtained or prepared by any method known to a person skilled in the art, including but not limited to those described by Sambrook J. and Russell D.W., Molecular Cloning, A Laboratory Manual, Cold Spring Harbor Laboratory Press, New York, 2001, hereby incorporated herein by reference.
RNA, for the purpose of the invention, is considered a single-stranded nucleic acid molecule even where such a molecule may form secondary structures comprising double-stranded RNA portions. In particular, RNA encompasses for the purpose of the invention any form of nucleic acid molecule comprised of ribonucleotides, and does not relate to a particular sequence or origin of the RNA. Thus RNA can be transcribed in vivo or in vitro by artificial systems, or non-transcribed, spliced or not spliced, incompletely spliced or processed, independent from its natural origin or derived from artificially designed templates, mRNA, tRNA, rRNA, obtained by means of synthesis, or any mixture thereof. More precisely, the expressions "DNA", "RNA", "nucleic acid", and "sequence" encompass nucleic acid materials themselves and are thus not restricted to particular sequence information, vector, phagemid or any other specific nucleic acid molecule.
RNA can be broadly divided into capped RNA and non-capped RNA. Non-capped RNA, which is not recovered, constitutes the majority of the RNA in a sample (at least 95% of the species). For this reason, for the purpose of this invention, the starting RNA should be divided into groups depending on the presence of a cap site.
The term mRNA usually refers to RNAs having a cap site and a polyA tail; these constitute 1-3% of the total and encode proteins. Another class of capped molecules is the nc-RNA, which is non-coding and poorly polyadenylated, but does have a cap structure. Altogether, these two classes do not constitute more than 5% of the RNA molecules in a cell. Therefore, we will refer to "total RNA" and "capped RNA", where total RNA contains up to some 5% of capped RNAs. Roughly speaking, as an example, about 1 molecule in 20 molecules of RNA contains a cap. Therefore, to treat a similar number of capped molecules, one has to use a much larger amount (some 20 fold) of "total RNA" than of "capped RNA" fractions. This distinction is important to distinguish the use of total RNA from the use of cap- or polyA-selected RNA, which contains a much larger fraction of usable molecules.
A solid matrix is intended throughout this document as a solid body of indefinite size that can have any shape, which includes beads and surfaces. Examples include, but are not limited to, beads, including magnetic beads and coated beads, and slides made of glass, plastic, silica, teflon or any other suitable material. These matrixes can be inert or reactive, such as being coated with streptavidin, antibodies, reactive chemical groups, or any form of nucleic acids (RNA, DNA, hybrid molecules) or derivatives of nucleic acids having one or more modifications of the bases and/or the backbone. In this document, the terms "matrix", "surface" and "solid surface" are used interchangeably, as they refer to binding nucleic acids to an object that is not in solution and that allows nucleic acid isolation, capture or further manipulation.
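As a rough illustration of the arithmetic above, the following Python sketch converts a desired number of capped molecules into the number of total RNA molecules that would have to be processed. The function name and the default ratio of 1 capped molecule in 20 are illustrative assumptions based on the approximate figure given in the text, not measured values.

```python
def total_molecules_needed(capped_molecules: int, one_in: int = 20) -> int:
    """Number of total RNA molecules expected to contain `capped_molecules`
    capped molecules, assuming 1 molecule in `one_in` carries a cap.

    The default of 20 reflects the approximate "1 molecule/20 molecules"
    figure quoted in the text; it is an assumption, not a measurement.
    """
    if capped_molecules < 0 or one_in < 1:
        raise ValueError("capped_molecules must be >= 0 and one_in >= 1")
    return capped_molecules * one_in


# Processing the same number of capped molecules from total RNA requires
# about 20 times more input than from a cap-selected fraction.
print(total_molecules_needed(1000))
```

The same ratio applies only approximately to masses, since capped and non-capped RNAs differ in length.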
Current protocols for cloning
However, one big limitation of the above technology is its limited capacity to work on small-scale samples, which reduces the applicability of this method. RNA has been a limiting factor, as protocols require a minimum amount of RNA, varying from 50 micrograms of total RNA. Capture of full-length mRNA has not been reported below hundreds of nanograms of capped RNA (requiring micrograms of total RNA).
Several methods have been used so far to capture full-length cDNAs through their cap site, and a comprehensive literature review is not possible within this description. Here only the main, representative methods will be described. Most methods take advantage of the presence of the cap structure of the mRNA (or of some other noncoding RNAs), which is present at the 5' end. Selecting cDNAs through this moiety allows the capture of full-length cDNA when the preparation of the cDNA starts at the 3' ends from the polyA tails. This may be facilitated by RNase digestion of the RNAs in the absence of full-length cDNA, which protects the RNA hybridized to the cDNAs.
Cap-modifying agents through derivatization of the cap moiety on RNAs
These methods include cap-trapping (Carninci and Hayashizaki, Methods Enzymol. 1999; 303: 19-44.) and the patents of Genset (U.S. Pat. No. 5,962,272, US 6,022,715). These methods have been very effective in capturing the 5' ends of mRNAs/RNAs at high efficiency (reportedly up to 95% efficiency, Ref . ). These methods start from a relatively large amount of starting RNA material, which is larger than 1 microgram of total RNA (Carninci et al., Genome Res. 2003 Jun; 13(6B):1273-89). This group of methods cannot be used for less than 1 microgram of total RNA.
Cap binding proteins/peptides (Edery et al, Mol Cell Biol. 1995 Jun;15(6):3363-71)
Another method is based on the capture of the cap site through a cap-binding protein after first strand cDNA synthesis. In this regard, the cap-binding protein is used to capture the cDNA from the 5' ends. Alternative versions would include capturing the RNA with cap-binding antibodies or, alternatively, with other antibody fragments or even phage display antibodies.
This method (Edery et al, Mol Cell Biol. 1995 Jun; 15(6):3363-71) was used with several micrograms of mRNA only and did not become widespread because of the difficulty of binding the cap-binding proteins to a solid supporting matrix.
Oligo-capping based methods (Suzuki and Sugano, Methods Mol Biol. 2003; 221:73-91)
This family of methods is based on replacing the 5' end cap with a different primer sequence (an oligonucleotide), which can be RNA, DNA, or a hybrid DNA/RNA molecule. Only cDNAs that reach the conjugated 5' ends can be subjected to further manipulation, such as copying into second strand cDNA and PCR. This group of methods has also achieved some popularity, with the drawback of relatively low efficiency, which requires a large number of PCR cycles before enough material is obtained for a RNA library. With these methods, no less than several micrograms of RNA have been used.
Cap-switch methods (Zhu et al, Biotechniques. 2001 Apr; 30(4):892-1.)
Cap-switch methods are based on the addition of a GGG trinucleotide at the 5' end of RNAs, and in particular at the cap site. The cDNA is primed with oligo-dT to minimize the contamination by ribosomal RNAs, which do not carry a polyA tail. This method is particularly appealing because it does not require cap manipulations. Using this method, investigators have been able to utilize single cells to prepare cDNA probes. The method, however, still has two problems. One is that the cap-switch has low efficiency and PCR is required. PCR-based preparation of full-length cDNA libraries is problematic, because long cDNA clones are seldom recovered: PCR amplification of long inserts is outcompeted by shorter PCR products deriving from short mRNAs. Another disadvantage is that three G nucleotides are added to the 5' end of the cDNAs. If these cDNAs are used for 5' end tagging technologies, in which a class IIs restriction enzyme is used to cleave a 5' end tag, three bases (out of 20 bases, see later for a description of tagging cDNA ends) would be lost as useful information. This is particularly critical when mapping these tags on genomes (see Mapping of tags on the genome).
Other methods for full-length cDNA synthesis consist of general improvements of the conventional technologies. A simple method is to select the cDNAs based on their size. Long cDNA molecules have a much larger chance of being non-truncated cDNA molecules than short cDNA molecules (see for example Nomura et al. DNA Res. 1994; 1(5):223-9.).
As another example, the classic methods to tail the cDNA (Okayama and Berg, Mol Cell Biol. 1982 Feb; 2(2):161-70) are useful to capture full-length cDNAs. Alternatively, or in combination with any of the above technologies, improved reverse transcriptases involved in the first strand synthesis of full-length cDNA, such as Superscript II and Superscript III (Invitrogen), are useful to prepare cDNA libraries enriched in full-length cDNAs. However, it is important to distinguish libraries enriched in full-length cDNAs (where up to 60-70% of the cDNA may be full-length) from cap-selected full-length cDNAs, because the latter may contain a much higher percentage of full-length clones (95% or better), which gives a much larger certainty that redundantly identified 5' ends are true transcriptional starting sites. Therefore, we need to refer to "cap-selected" or "full-length selected"
Another important definition is the accuracy with which the 5' end of true full-length molecules is identified. A true full-length method identifies the transcriptional starting site (TSS) with perfect accuracy. True full-length methods are those listed above in which the initial base is present in the cDNA and can be separated from other sequences, such as linkers or additional sequences (such as the GGG used in the cap-switch or the tailing procedures used in Okayama and Berg, Mol Cell Biol. 1982 Feb; 2(2):161-70). Although the cap-switch can be used for capturing the 5' end from reduced amounts of sample, including single cells (Gustincich et al, Proc Natl Acad Sci U S A. 2004 Apr 6; 101(14):5069-74.), it adds undesired sequences (the GGG), which reduces the usefulness of the approach. For instance, if the 5' end cDNA is used for tag-based profiling, where the 5' end of the cDNA is used for producing 20-base tags for later concatenation and high-throughput sequencing (as reviewed by Harbers and Carninci, Nat Methods. 2005 Jul; 2(7):495-502), these extra three bases would disadvantageously replace the sequence of the mRNA in such tags. This would largely impair mapping onto the genome, because 17 bases are seldom unique in the mammalian genome, whereas 20 bases tend to be mostly unique (the relationship between tag length and mapping efficiency has been discussed in the review of Harbers and Carninci, Nat Methods. 2005 Jul; 2(7):495-502).
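The effect of losing three bases of a 20-base tag can be sketched with a simple calculation. The Python example below estimates how many chance matches a tag of a given length would have in a genome under an idealized model in which the genome is a random sequence with equal base frequencies. The function name, the random-sequence model and the genome size of 3 x 10^9 base pairs are assumptions made for illustration; real genomes are repetitive, so the model understates how often short tags map to multiple places. What it shows reliably is the relative effect: each lost base increases the expected number of chance matches fourfold, so three lost bases increase it 64-fold.

```python
def expected_chance_hits(tag_length: int, genome_size: int = 3_000_000_000) -> float:
    """Expected number of chance occurrences of one specific tag sequence
    in a random genome of `genome_size` base pairs, counting both strands.

    Assumes independent, equally frequent bases (an idealization).
    """
    if tag_length < 1:
        raise ValueError("tag_length must be at least 1")
    positions = 2 * genome_size  # both strands can carry a match
    return positions / 4 ** tag_length


for k in (17, 20):
    print(f"{k}-base tag: {expected_chance_hits(k):.4f} expected chance hits")

# Relative penalty of losing 3 informative bases (for example to a GGG prefix)
print(expected_chance_hits(17) / expected_chance_hits(20))
```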
Frequently used methods for cloning include the Gubler-Hoffman method (Gubler and Hoffman, Gene. 1983 Nov; 25(2-3):263-9). In this case, even when the cDNA is synthesized up to the 5' end of the mRNA, the preparation of the second strand requires RNaseH cleavage, followed by second strand synthesis with DNA polymerase I and ligation of the obtained DNA fragments. The mechanism of polymerization of the second strand requires a primer at the 5' end having an extensible 3'-OH group. This is provided by the residual RNA, but there is no 3'-OH group that is suitable to prime the most upstream cDNA (close to the cap site). Therefore, the initial 10 to 50 base pairs, which identify the cap site, are lost when using the Gubler-Hoffman protocol, which is otherwise appealing because of its simplicity. The cDNA is in fact cloned after exonuclease treatment, which has to remove the part of the first-strand cDNA that extends to the cap site. Libraries obtained with this method may be quasi-full-length, but are not full-length, as they never identify the true cap site.
This brief summary of the existing methods highlights the lack of a method that can produce cDNA, or 5' end cDNAs, suitable for capturing full-length cDNAs at high efficiency for cDNA cloning or tagging approaches. Existing methods lack at least one of the following features, all of which are required in combination:
- Capability to capture full-length cDNAs from nanograms of RNA or less,
- No addition of undesired homopolymer tails (such as GGG),
- Capturing the exact 5' end,
- Simplified protocol in a single tube to avoid loss, or
- Simplified protocol scalability of the operations.
Novel ONE-tube invention
The term "nucleic acid" is also used herein to encompass naturally occurring nucleic acids, artificially synthesized or prepared nucleic acids, and any modified nucleic acids into which one or more modifications have been introduced by naturally occurring events or through approaches known to a person skilled in the art. Similarly, DNA or RNA molecules may function in a specific manner as hybridization probes, and as such are referred to as "complementary sequences" for the purpose of the invention, or in experiments where such probes are applied for the detection of a related nucleic acid molecule, even where such a probe and the target molecule may differ by naturally occurring or artificially introduced mutations in individual positions.
We have developed a novel ONE-TUBE principle to add one nucleic acid to nucleic acids representing the 5' end of full-length mRNA molecules, which is suitable for full-length cDNA isolation, 5' end tag isolation, 5' end enrichment mRNA/cDNA isolation and other analytical applications not restricted to these descriptions. The principle is based on having a series of reagents added to an RNA, or a mixture of RNAs, in a tube.
These mRNAs may be derived from cells, tissues, or any other biological source containing RNAs. In one embodiment, these RNAs are derived from any human, animal or plant tissue or cell line, and they could consist of the whole RNA or of RNA fractions, selected by size, by RNA features (e.g. the presence of polyA tails in most of the mRNAs), or by cell compartment (e.g. nuclear RNA, cytoplasmic RNA, polysomal RNA, membrane-bound polysomal RNAs, and RNA belonging to other ribonucleoprotein complexes).
In another embodiment, RNA may be derived from other biological fluids, such as blood or serum. For instance, it may contain viral RNA or RNA of other potential parasites from the blood of an individual human. Capturing the 5' end of these mRNAs would allow diagnosis of potential parasites.
In another embodiment, the RNA is obtained from purified cells, including flow-sorted cells from dissected tissue. These could either be cells that are labeled with a selectable fluorescent antibody for flow sorting, or cells labeled by the transgenic expression of a marker such as the green fluorescent protein (GFP), frequently used in mammalian experiments. Alternatively, these cells can be selected based on their morphology or other labeling by laser capture microdissection.
These are not the only ways to extract RNAs for this application and are just examples of the potential applicability of the invention.
The invention is based on the addition of the reagents to the 5' end of the RNA in a single tube until the RNA is specifically labeled with a "nucleic acid". The mRNAs are known for having the cap site, which in this case would be selectively substituted by another nucleic acid molecule, generally an RNA, a DNA or a hybrid of DNA/RNA (but other nucleic acids may be suitable, such as RNA/peptide nucleic acid hybrids). The nucleic acids attached to the 5' end of the RNA may be modified in various ways, including incorporation of unconventional bases and unconventional backbones, such as alpha-thio derivatives of nucleic acids that are already used in common DNA sequencing operations. These nucleic acids may also contain labeling dyes for additional selection, and other binding groups, such as biotin or digoxigenin. The nucleic acid sequence attached to the 5' end of the mRNA does not have any restriction in length, but in most applications it is conveniently built to be at least 15 nucleotides long, to allow efficient priming, and no longer than 100 bases, as longer sequences pose challenges for oligonucleotide synthesis. However, these are not restrictions on the finding, but practical limitations of accessory operations (such as priming) and oligonucleotide RNA synthesis.
The nucleic acids binding the 5' end of the RNA may be single-stranded, but may also be partially or completely double-stranded nucleic acids. A partially double-stranded structure may help the derivatization of the 5' end of the mRNA with the added nucleotide under certain conditions (such as described in Shibata et al, Biotechniques. 2001 Jun; 30(6):1250-4).
Ultimately, these nucleic acids binding the 5' end of the RNAs can be a single type of molecules, or a multitude of different nucleic acids .
In one embodiment, the target RNA is placed in one tube and is subjected, without requiring sample purification, to three conceptually different steps: (1) masking of the non-full-length mRNA molecules, (2) modification of the cap-site molecules into reactive molecules, and (3) attaching the treated RNA molecule to the priming nucleic acid .
The terms used here to name an enzyme, or an enzymatic activity, are used to describe the function or activity of such a component, but do not require the absolute purity of such a component. Thus any mixture containing such an enzyme or enzymatic activity, or mixtures thereof with other components of the same, related or unrelated function, are within the scope of the invention.
Limitation of traditional protocols for oligocapping and rationale
In the oligo-capping protocol, the reactions (1)-(3) above are usually achieved with extensive purification at each stage, by protease digestion and/or extraction with phenol and other organic solvents followed by alcohol precipitation. Alternative and less toxic methods may be available, but all purification steps cause unacceptable sample loss. For this reason, this procedure (also known as the standard oligo-capping method) has not been usable with very low RNA concentrations, as small amounts of RNA are lost during the organic (phenol, chloroform) extraction and subsequent ethanol precipitation.
Such tedious and cumbersome extractions have been made necessary by the use of enzymes that work at different pH, buffer and other conditions, and by the inability to inactivate them, with consequent interference in the subsequent step. For instance, step (1) above has required bacterial alkaline phosphatase (BAP), which works at an alkaline pH of around 8. This enzyme has to be inactivated before proceeding to step 2 above. If not inactivated, it would remove the phosphate groups from the full-length mRNAs obtained at step 2 (see later). However, this cannot be achieved easily, because BAP cannot be heat inactivated by mild heat shock at 65°C. Instead, the optimal activity of BAP is at 65°C (a temperature that would also cause alkali-based RNA degradation at pH 8). Therefore, previous works have required organic extraction.
In step 2, the cap is removed by tobacco acid pyrophosphatase (TAP), which is used at pH 6. TAP removes the cap structure of the RNAs (this includes mRNA but also a novel group of non-coding RNAs) that are intact at the 5' end, and leaves a phosphate group instead. After steps 1 and 2, the only molecules that contain a 5' phosphate correspond to full-length RNA molecules, or at least to molecules that carry an intact sequence at their 5' ends. All of the other RNA molecules contain an OH group at the 5' end, which is not a substrate for reaction (3). The TAP has to be inactivated before step 3 takes place, as it would interfere with this step by degrading the ATP, the necessary substrate for the RNA ligase at step 3 (see later). Therefore, organic extraction and subsequent precipitation are also required at the end of this step, which is time consuming and causes loss of sample.
Reaction number 3 is the RNA ligase reaction. The enzyme RNA ligase adds an RNA molecule, an RNA/DNA hybrid or a DNA molecule at the 5' ends. As mentioned, RNA ligase requires ATP as a substrate to provide the necessary energy to catalyze the ligation between the 5' end of the full-length RNA and the 3'-OH group of the added nucleic acid (usually an oligonucleotide). This reaction is carried out at various temperatures, usually low to minimize the DNA damage, at pH 7.5 or 8.
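The logic of reactions (1)-(3) can be summarized as a small state model of the RNA 5' end. The Python sketch below is only an illustration of the selection principle described above, not a simulation of enzyme kinetics; the class, function and parameter names are hypothetical. It also shows why the phosphatase must be inactive before the pyrophosphatase step is completed: with phosphatase carryover, capped molecules lose their newly exposed phosphate and are no longer ligated.

```python
from enum import Enum


class FivePrimeEnd(Enum):
    CAP = "cap"              # intact, full-length 5' end
    PHOSPHATE = "phosphate"  # truncated or degraded RNA with a 5' phosphate
    HYDROXYL = "hydroxyl"    # 5'-OH, not a substrate for the ligase
    OLIGO = "oligo"          # exogenous nucleic acid ligated to the 5' end


def phosphatase(end: FivePrimeEnd) -> FivePrimeEnd:
    """Step 1: remove exposed 5' phosphates; the cap is left untouched."""
    return FivePrimeEnd.HYDROXYL if end is FivePrimeEnd.PHOSPHATE else end


def pyrophosphatase(end: FivePrimeEnd) -> FivePrimeEnd:
    """Step 2: remove the cap, leaving a 5' phosphate."""
    return FivePrimeEnd.PHOSPHATE if end is FivePrimeEnd.CAP else end


def ligase(end: FivePrimeEnd) -> FivePrimeEnd:
    """Step 3: ligate the oligonucleotide to 5' phosphate ends only."""
    return FivePrimeEnd.OLIGO if end is FivePrimeEnd.PHOSPHATE else end


def one_tube(end: FivePrimeEnd, phosphatase_carryover: bool = False) -> FivePrimeEnd:
    """Apply the three reactions in order; optionally model a phosphatase
    that was not inactivated and is still active after step 2."""
    end = phosphatase(end)
    end = pyrophosphatase(end)
    if phosphatase_carryover:
        end = phosphatase(end)
    return ligase(end)


for start in (FivePrimeEnd.CAP, FivePrimeEnd.PHOSPHATE, FivePrimeEnd.HYDROXYL):
    print(start.value, "->", one_tube(start).value)
```

Only molecules that start with a cap end up carrying the oligonucleotide, which is the basis of the selection for full-length 5' ends.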
Overcoming the above limitations of reactions (1)-(3)
In our process, (a) we introduce concepts to overcome the requirement for multiple extraction/purification steps with phenol/chloroform and ethanol, without substituting them with other nucleic acid purification steps, which are unwelcome because they are time consuming, cause sample loss, or are expensive; (b) we introduce concepts to overcome the requirement for multiple buffer changes; and (c) we introduce concepts to overcome the need to change the enzymes, by identifying conditions that inactivate the enzymes without removing them from the solution. This has permitted the "one tube" full-length RNA conjugation, which is also useful because it allows working with small samples, simplifies laboratory operations with any size of sample, and reduces personnel and material costs.
The steps we took to make the reactions (1)-(3) possible in a single tube are the following.
1) Contrary to common laboratory procedures, we have identified principles, including that the enzymes used at the beginning of this procedure work well at reduced buffer and salt concentrations.
2) We have established the concept of using a series of conditions including buffers and modifier buffers. This allows the buffer concentration to be increased gradually throughout steps (1)-(3), in a way that is surprisingly optimal for all of the enzymes. In other words, we have tested and found that the remaining traces of buffers and inactivated enzymes from upstream reactions do not inhibit downstream reactions.
3) We have established the concept of using conditions, including enzyme combinations, that allow the enzymes necessary at steps (1) and (2) to be inactive during step 3.
4) We have established the concept of using conditions under which such a combination of reagents works at high enough efficiency to allow use of the method with total RNA instead of the purified mRNA fraction used in the original oligo-capping methods. This allows obtaining cDNA even from very small quantities of RNA.
5) We have established the concept of finding the optimal buffer and pH conditions under which RNA is not degraded at 65°C during enzyme inactivation, as degradation is an additional issue when heating RNA in the presence of buffers that often contain divalent metals and have a high pH.
In one embodiment, the enzyme we successfully use in step 1 is a thermosensitive phosphatase. Preferably, although the invention is not limited to this enzyme, the enzyme used in step 1 is an Antarctic phosphatase, which can be inactivated at 65°C after step 1 in a buffer that does not damage the RNA even at the high temperature used for inactivation. Such a buffer was found to have a pH lower than 7.5 and to be devoid of divalent ions. In particular, a low-salt acidic buffer meeting these requirements would be particularly useful. After inactivation of the phosphatase, without any further purification, a modifier buffer is added that is compatible with the subsequent reaction, which involves the addition of a pyrophosphatase activity or any equivalent activity. Preferably, we have been using the TAP enzyme (see example 1). After the reaction, the invention refers to the process of inactivating the pyrophosphatase activity without degrading the RNA, for instance by heat inactivation in another buffer that does not damage the RNA. Finally, we create the conditions for the activity of an enzyme that ligates an exogenous nucleic acid to the 5' end of the RNAs. These conditions may be achieved by the addition of another modifier buffer. The added exogenous nucleic acids can be of various types, but could for instance be an RNA oligonucleotide. The enzyme used to add the exogenous nucleic acid to the 5' end of the mRNA is preferentially a nucleic acid ligase. This could be a DNA ligase but is preferentially an RNA ligase, such as T4 RNA ligase or a thermostable RNA ligase. This can promote, for instance, the final step of tagging the RNA with an oligonucleotide. In a preferred embodiment, we avoid degrading the RNA throughout the process by using low-concentration, mildly acidic buffers.
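The buffer requirement stated above for heat inactivation (pH lower than 7.5 and no divalent ions) can be written as a simple check. The Python sketch below is a hypothetical illustration; the class name, field names and the example pH values are assumptions, and only the two criteria themselves come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Buffer:
    ph: float
    has_divalent_ions: bool


def safe_for_heat_inactivation(buffer: Buffer, max_ph: float = 7.5) -> bool:
    """True if the buffer meets the two criteria given in the text for
    heating RNA without damage: pH below 7.5 and no divalent ions."""
    return buffer.ph < max_ph and not buffer.has_divalent_ions


# Illustrative values only: a mildly acidic buffer without divalent ions,
# and an alkaline buffer containing divalent ions.
mild_acidic = Buffer(ph=6.0, has_divalent_ions=False)
alkaline_with_metals = Buffer(ph=8.0, has_divalent_ions=True)

print(safe_for_heat_inactivation(mild_acidic))
print(safe_for_heat_inactivation(alkaline_with_metals))
```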
Such modified RNA can then be used for a variety of downstream applications. In one application, full-length cDNA can be obtained by priming the RNA on the 3' end tails with an oligo-dT primer. Only the molecules that reach the 5' end oligonucleotide can then be primed for the second strand cDNA, thereby yielding cDNA containing both the 5' and 3' ends. These full-length cDNAs can then be further treated and cloned following standard procedures.
The tagging nucleic acid can carry various sequence moieties that allow further functional applications. For instance, it can contain restriction enzyme sites, and it can contain polymerase sequences or recombination sequences. These sequences can also be used for further purification of the RNA/nucleic acid hybrid, and its derivatives, by means of nucleic acid hybridization, protein-epitope interaction, or any other chemical interaction. The methods to bind nucleic acids or their derivatives are not limited to the above methods, which are cited simply as examples.
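The selection for full-length molecules at the second strand step can be illustrated with sequences. In the Python sketch below, the tagged RNA is written in the DNA alphabet for simplicity, and the oligonucleotide, transcript and function names are made-up examples. A first strand cDNA can be primed for second strand synthesis only if its 3' end carries the complement of the 5' oligonucleotide, which happens only when reverse transcription has reached the 5' end.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA-alphabet sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def first_strand(tagged_rna: str, extension_len: int) -> str:
    """First strand cDNA obtained by copying `extension_len` bases of the
    template, starting from its 3' end (oligo-dT priming on the polyA tail)."""
    template = tagged_rna[len(tagged_rna) - extension_len:]
    return revcomp(template)


def can_prime_second_strand(cdna: str, oligo: str) -> bool:
    """A primer with the oligo sequence anneals only if the 3' end of the
    first strand is the complement of the 5' oligonucleotide."""
    return cdna.endswith(revcomp(oligo))


# Hypothetical example sequences
OLIGO = "GATTACAGATTACAG"       # 15-nt tag ligated to the 5' end
TRANSCRIPT = "ATGGCCAAGTTTCGC"  # body of the RNA
TAGGED = OLIGO + TRANSCRIPT + "A" * 12

full = first_strand(TAGGED, len(TAGGED))  # reverse transcription reached the 5' end
truncated = first_strand(TAGGED, 30)      # reverse transcription stopped early

print(can_prime_second_strand(full, OLIGO))
print(can_prime_second_strand(truncated, OLIGO))
```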
Alternatively, the 5' end oligonucleotide could be an RNA oligonucleotide which contains a restriction site of a class IIs restriction enzyme. These enzymes recognize a sequence but cleave outside their recognition sequence; with appropriate design, the class IIs restriction enzyme MmeI can cleave 20/18 bases within the cDNA and produce 5' end tags (otherwise named 5'-SAGE or CAGE) (reference: Hashimoto et al, Nat Biotechnol. 2004 Sep; 22(9):1146-9; Kodzius et al, Nat Methods. 2006 Mar; 3(3):211-22). In this embodiment, the 5' end tags are isolated and joined to form concatamers for high-throughput sequencing; concatamers allow a much more cost-effective utilization of sequencing. These concatamers can either be sequenced directly, after their ligation with a linker suitable for the process of the 454 Life Sciences sequencing instrument based on pyrosequencing, or can be cloned into appropriate plasmid cloning vectors, such as pZero plasmids from Invitrogen, for conventional sequencing with Sanger-based methods or any appropriate sequencer. A "tag" according to the invention can be any region of a nucleic acid molecule as prepared by the means of the invention, where the term "tag" as used herein encompasses any nucleic acid fragment, no matter whether it is derived from naturally occurring, artificially synthesized or prepared nucleic acids, or from any modified nucleic acids into which one or more modifications have been introduced by naturally occurring events or through approaches known to a person skilled in the art. Furthermore, the term "tag" does not relate to any particular sequence information or composition but to the nucleic acid molecules as such.
In one embodiment, the first strand primer can be not only an oligo-dT primer, but also a primer containing random sequences, for instance 6 or 9 random bases in the 3' position. This nucleic acid primer can also contain at its 5' end sequence-specific oligonucleotides that can in turn be used for specific priming.
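The tag-generation step can be sketched in a few lines of Python. MmeI recognizes the sequence TCCRAC (R being A or G) and cuts 20 bases downstream on the top strand, so a linker that ends with this site yields a tag consisting of the first 20 bases of the cDNA. The linker and cDNA sequences and the function name below are made-up examples, and the sketch considers only the top strand; it is an illustration of the principle, not the protocol of the invention.

```python
import re
from typing import Optional

# MmeI recognition sequence TCCRAC, where R is A or G
MMEI_SITE = re.compile(r"TCC[AG]AC")


def five_prime_tag(molecule: str, reach: int = 20) -> Optional[str]:
    """Return the `reach` bases immediately downstream of the first MmeI
    site (top strand only), or None if there is no site or too little
    sequence downstream of it."""
    match = MMEI_SITE.search(molecule)
    if match is None:
        return None
    tag = molecule[match.end():match.end() + reach]
    return tag if len(tag) == reach else None


# Hypothetical sequences: a linker ending with an MmeI site, and a cDNA
LINKER = "ACGTGCTCCGAC"
CDNA = "ATGGCCAAGTTTCGCAGGCTTACCGGA"

print(five_prime_tag(LINKER + CDNA))

# If three extra G bases precede the cDNA (as with cap-switch methods),
# only 17 of the 20 tag bases come from the transcript.
print(five_prime_tag(LINKER + "GGG" + CDNA))
```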
In another embodiment, such modified RNA can be used to determine the real 5' end sequence in protocols known as RACE (rapid amplification of cDNA ends), with the obvious advantage of being much faster than normal RACE procedures. In an alternative embodiment, these modified RNA molecules can bind to a solid support, either directly or after modification such as transcription into first strand full-length cDNA, and after being immobilized to such a solid matrix they can further be used for subsequent operations such as those used for determining the primary sequences of these RNA molecules.
In another embodiment, the 5' end is conjugated with a nucleic acid carrying a promoter for a polymerase, such as an RNA polymerase. This allows the hybrid molecules, or their derivatives, to promote RNA transcription and thus to replicate their own molecule, or a part of it. This molecule can include part of or the complete RNA molecule, and can also contain additional nucleic acid sequences.
This method can then be used to produce kits, containing among other components the reagents (chemicals) at the appropriate pH and in the appropriate combinations, nucleic acids and enzymes, in order to let unspecialized technical personnel reproducibly isolate the 5' ends of mRNAs for cDNA production or 5' end expression profiling.
In another embodiment, the invention encompasses the methods for handling single-stranded as well as double-stranded DNA molecules. Double-stranded DNA means any nucleic acid molecules each of which is composed of two polymers formed by deoxyribonucleotides and in which the two polymers have substantially complementary sequences to each other allowing for their association to form a dimeric molecule. The two polymers are bound to one another by specific hydrogen bonds formed between matching base pairs within the deoxyribonucleotides. Any DNA molecule composed only of one polymer chain formed by two or more deoxyribonucleotides having no matching complementary DNA molecule to associate with is considered to be a single-stranded DNA molecule for the purpose of the invention, even if such a molecule may form secondary structures comprising double-stranded DNA portions. As used interchangeably herein, the terms "nucleic acid molecule(s)" and "polynucleotide(s)" include RNA or DNA regardless of single or double-stranded, coding or non-coding, complementary or not, and sense or antisense, and also include hybrid sequences thereof. In particular, it encompasses genomic DNA and complementary DNA which are transcribed or non-transcribed, spliced or not spliced, incompletely spliced or processed, independent from its origin, cloned from a biological material, or obtained by means of synthesis.
In a different aspect, the invention relates to methods for the isolation of fragments from nucleic acid molecules for the purpose of analysis. Thus the invention relates to the conversion of a sample containing one or more nucleic acid molecules, where such nucleic acid molecules or any mixture of nucleic acid molecules would be converted into DNA. To perform the invention, nucleic acid molecules can be derived from any naturally occurring genomic DNA, an RNA sample, an existing DNA library, an artificial source, or any mixture thereof. The invention is not limited to the use of an individual nucleic acid molecule or any plurality of nucleic acid molecules, but can be performed on an individual nucleic acid molecule or any plurality of nucleic acid molecules regardless of whether such pluralities would occur in nature, be derived from an existing library, or be artificially created. Furthermore, the invention can process any nucleic acid molecule regardless of its origin or nature. Thus it is within the scope of the invention that the nucleic acid molecules could be full-length molecules as compared to naturally occurring nucleic acid molecules, or any fragment thereof. Furthermore, it can be envisioned that such fragments of nucleic acid molecules could be prepared by a random process or by a targeted dissection of nucleic acid molecules by means of an enzymatic activity with a preference for a certain sequence, or by means which would allow for fragmentation based on the structure of the nucleic acid molecule, including but not limited to exons and introns within transcribed regions. Thus the invention is not restricted to the use of any particular starting material. The invention is not dependent on the use of DNA only, as a person familiar with the state of the art will know different approaches to convert RNA into DNA, including but not limited to those approaches disclosed by Sambrook J. and Russell D.W., ibid, hereby incorporated herein by reference. After conversion of RNA into DNA, a single-stranded or double-stranded DNA molecule having the same or complementary sequence to the original RNA can be obtained, referred to as cDNA. Such cDNA molecules are commonly prepared in the form of linear DNA, where the two open ends allow for their manipulation. However, even where cDNAs are cloned into a vector, a person skilled in the art will know the necessary means to release an insert from such a vector to convert it into linear DNA.
In another embodiment, said introduction of priming sites at 5'-ends of RNA is used to later bind the resulting nucleic acids. Such oligonucleotides are used to specifically capture RNA molecules for manipulation and analysis by using one part of the sequence to bind the RNA to a surface, and another part to prime sequencing. Alternatively, the oligonucleotide can be used as a tag to link the RNA, or the resulting cDNA, to a surface; sequencing primers of known sequence are then added in solution and the resulting complex is used for direct sequencing. Such captured nucleic acids can be used for further operations and analysis, including but not limited to sequencing, replication, amplification, and chemical and enzymatic modifications.
Alternatively, the nucleic acids consisting of the first strand cDNA from the conjugated 5' RNA/RNA can be bound, as they are or through a hybrid nucleotide, to a solid matrix for further solid phase manipulation, such as single-molecule sequencing.
In another preferred embodiment, the obtained nucleic acid hybrid is captured by another immobilized primer that can prime direct polymerization and sequencing reactions at the single molecule level. This can provide unprecedented sequencing throughput when applied to a single cell.
It is important to note that the conjugation of the sequences to nucleic acids corresponding to the 5' ends of the mRNAs does not necessarily rely on nucleic acid ligases, but can be done with any other method to capture full-length cDNA, such as the biotinylated cap trapper family of technologies (Carninci and Hayashizaki, Methods Enzymol. 1999; 303: 19-44) or other equivalent methods.
In another embodiment, the RNA can be self-ligated in the absence of oligonucleotides, whereby the ribosomal RNA (in large excess) mostly ligates to the 5' end of mRNA after dephosphorylation and decapping. The ribosomal RNA sequence moiety can be used to prime the second strand cDNA and for further purification of the derived nucleic acids, including but not limited to binding other nucleic acids for the purpose of priming polymerization, physical immobilization or isolation. These could then be used as substrate for massively parallel sequencers like the 454 Life Sciences sequencing instrument.
In a further embodiment, the nucleic acids derived from the conjugation of a nucleic acid to the 5' end of RNA molecules would be used to direct sequencing reactions. This is not limited to the process claimed in this patent, but could be achieved with any of the full-length capture methods available.
In a preferred embodiment, the nucleic acids contain two different sequences that are suitable for single-molecule emulsion PCR amplification as described in (Margulies et al, Nature. 2005 Sep 15; 437(7057): 376-80). In a further embodiment, the nucleic acids carrying information including the complete 5' end of the nucleic acids are labeled with a moiety such as biotin, or other reactive chemical groups, for further binding to a matrix, which in the case of biotin will be streptavidin or avidin. The coupling is not limited to this, as any other reactive chemical group can be used. Such captured nucleic acids are then used for further operations and analysis, including but not limited to sequencing, replication, amplification, and chemical and enzymatic modifications.
The examples which follow are set forth to illustrate the present invention, and are not to be construed as limiting thereof.
Embodiments
Example 1: Preparation of a library of 5'-derivatized RNA molecules
This example is a typical protocol for the derivatization of five-prime ends of RNA molecules with RNA oligonucleotides. All reactions were performed in a 500 μl siliconized microtube, using a siliconized tip each time to avoid nucleic acid losses.
The RNA sample was first dephosphorylated. The RNA (for instance 1 ng to 1 μg) was added to a tube, together with 2 μg of glycogen, in a total volume of 5 μl. The reaction buffer was at 1/10 the common concentration, i.e. 5 mM Bis-Tris-Propane-HCl, 0.1 mM MgCl2, 0.01 mM ZnCl2, pH 6.0 at 25°C. Glycogen helped to avoid attachment of RNA to the plastic during the operation. The sample was denatured at 65°C for 5 min. to expose the phosphate groups to be removed later and, after holding at 37°C for 2 min., the Antarctic phosphatase (New England Biolabs) was added (2.5 units). The sample was treated for 3 hours to overnight at 37°C. Overnight dephosphorylation allowed removal of 98-99% of the phosphate groups. A short incubation could also be performed at 45°C in the presence of trehalose at 0.6 M final concentration, which increased the activity at 45°C.
Then, the Antarctic phosphatase was inactivated at 65°C, but before doing this, the divalent ions had to be chelated. For this reason, 0.55 μl of a solution of 0.5 M sodium acetate (pH 6.0), 10 mM EDTA, 1% β-mercaptoethanol, and 0.1% Triton X-100 was added. This solution chelated the divalent ions and created conditions suitable for the subsequent TAP treatment. The Antarctic phosphatase was also inhibited by the EDTA in the buffer. The final inactivation was carried out at 65°C for 5 to 15 min.
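The chelation step can be checked with a short calculation. The following is a minimal sketch, assuming that the volume of the added phosphatase is negligible and that one EDTA molecule binds one divalent ion (both are simplifying assumptions, not statements from the protocol); the function and variable names are illustrative only.

```python
def diluted_mM(stock_mM, added_ul, total_ul):
    """Concentration after dilution: C_final = C_stock * V_added / V_total."""
    return stock_mM * added_ul / total_ul

# Volumes and concentrations taken from the protocol text.
REACTION_UL = 5.0        # dephosphorylation reaction volume
STOP_SOLUTION_UL = 0.55  # acetate/EDTA solution added before heat inactivation
EDTA_STOCK_MM = 10.0
MG_MM, ZN_MM = 0.1, 0.01  # divalent ions in the 1/10-strength buffer

# Assumption: the enzyme volume is ignored, so the total is 5.55 microliters.
total_ul = REACTION_UL + STOP_SOLUTION_UL

edta_final = diluted_mM(EDTA_STOCK_MM, STOP_SOLUTION_UL, total_ul)
divalent_final = diluted_mM(MG_MM + ZN_MM, REACTION_UL, total_ul)
molar_excess = edta_final / divalent_final

if __name__ == "__main__":
    print(f"EDTA: {edta_final:.3f} mM, divalent ions: {divalent_final:.4f} mM")
    print(f"EDTA molar excess over Mg2+ + Zn2+: {molar_excess:.1f}-fold")
```

Under these assumptions the EDTA is present in roughly tenfold molar excess over the divalent ions, which is consistent with the text's statement that the added solution chelates them.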
This was then followed by decapping, by simple addition of 0.2 μl (2 units) of tobacco acid pyrophosphatase (TAP). It was also possible to increase the quantity of TAP up to 20 units/experiment. The reaction was carried out for 2 hours, followed by heat inactivation in this buffer at 65°C for 15 min., after which the sample was placed on ice. Optionally, betaine can also be added (1 M), which helps to melt GC-rich secondary structures in RNAs. After this treatment, the TAP did not degrade ATP, which is necessary for the subsequent step.
Then, the ligation was carried out by adding 1 μl of a 100 μM "capping RNA" oligonucleotide of any sequence, for instance the sequence 5'-UUUGGAUUUGCUGGUGCAGUACAACUAGGCUUAAUAGGAUCCGACG-3' (SEQ ID NO: 1), at a final concentration of 5 μM oligonucleotide. Denaturation of the oligonucleotides was carried out at 65°C for 5 minutes, after which the tube was transferred to the block of a thermocycler at 20°C. To 6.75 μl of the reaction mixture, 2 μl of RNA ligase buffer (500 mM HEPES-NaOH (pH 8.0 at 25°C), 100 mM MgCl2, 100 mM DTT) were added. DTT inhibited the TAP. Optionally, hexammine cobalt chloride (HCC) can also be added at a final concentration of 1 mM. Polyethylene glycol (PEG 8000) was then added at a final concentration of 25%, ATP at a final concentration of 125 μM, and finally 10 units of T4 RNA ligase were added. Under such conditions, the resulting mixture of previous buffers was not inhibitory for the ligation step.
The sample was then ligated for 2 hours to overnight (16 hours) at 20°C. At this point, the RNA was capped with an oligonucleotide and could be used for different tests as they appear in other examples, such as full-length cDNA preparation.
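The final concentrations in the ligation step follow from simple dilution arithmetic. The sketch below is illustrative only: the 20 μl final ligation volume is not stated in the text but is inferred here from the oligonucleotide figures (1 μl of a 100 μM stock giving 5 μM), and the helper names are hypothetical.

```python
def final_concentration(stock_conc, stock_volume_ul, total_volume_ul):
    """C_final = C_stock * V_stock / V_total, in the same unit as the stock."""
    if total_volume_ul <= 0:
        raise ValueError("total volume must be positive")
    return stock_conc * stock_volume_ul / total_volume_ul

# Assumption: final ligation volume inferred from 1 ul x 100 uM -> 5 uM.
TOTAL_UL = 20.0

# name: (stock concentration, unit, volume added in microliters), from the text
LIGATION_COMPONENTS = {
    "capping RNA oligonucleotide": (100.0, "uM", 1.0),
    "HEPES-NaOH": (500.0, "mM", 2.0),
    "MgCl2": (100.0, "mM", 2.0),
    "DTT": (100.0, "mM", 2.0),
}

def ligation_final_concentrations(total_ul=TOTAL_UL):
    """Return {component: (final concentration, unit)} for the assumed volume."""
    return {
        name: (final_concentration(conc, vol, total_ul), unit)
        for name, (conc, unit, vol) in LIGATION_COMPONENTS.items()
    }

if __name__ == "__main__":
    for name, (conc, unit) in ligation_final_concentrations().items():
        print(f"{name}: {conc:g} {unit}")
```

If the actual ligation volume differs from the assumed 20 μl, only `TOTAL_UL` needs to change.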
The activity of each of the enzymes in the various steps and buffers is described in Figure 2, where: (A): Evaluation of the activity of the Antarctic Phosphatase (New England Biolabs). 5'-phosphorylated oligoribonucleotides were dephosphorylated for 120 min. at 37°C in the following buffers. (1): Antarctic phosphatase (AP) 1x. (2): AP 0.5x. (3): AP 0.1x. (4): H2O. (5): Tobacco Acid Pyrophosphatase (TAP) 0.1x. (6): TAP 0.5x. (7): TAP 1x. (8): TAP 0.5x + AP 0.5x. The oligoribonucleotides were subsequently radiolabelled with T4 kinase and gamma-32P-ATP and analysed by PAGE. In the absence of prior dephosphorylation, radiolabelling is impossible due to the 5' phosphate. Positive control (9): 5'-OH oligoribonucleotide. Negative control (10): 5'-phosphorylated oligoribonucleotide. AP buffer: supplied by New England Biolabs. TAP buffer: supplied by Epicentre.
(B): Evaluation of the activity of the Tobacco Acid Pyrophosphatase (TAP) (Epicentre). Gamma-32P-ATP was incubated with 2 U TAP (except control lanes 2, 5, and 7) in the following buffers. (1): H2O. (2), (3) and (4): TAP 1x. (5), (6) and (7): T4 RNA ligase 1x (Fermentas) + TAP 0.25x + AP 0.25x. Negative controls: (2) and (5), no enzyme. Heat-inactivation control (7): the TAP was heated for 15 min. before incubation with radioactive ATP. The ATP was degraded by TAP in lanes 3 and 4 (replicates), but not in other buffers or after heat-inactivation.
(C): Evaluation of the activity of the T4 RNA ligase (RNL, Fermentas) in the presence of traces of AP and TAP buffers. A radiolabelled oligoribonucleotide was incubated in the presence of an unlabelled oligoribonucleotide. Ligation results in a shift of the electromobility in polyacrylamide gel. The reactions were made in the presence or absence of AP, TAP and RNL. (1): AP, TAP, RNL. (2): TAP, RNL. (3): AP, RNL. (4): AP, TAP. (5): RNL. (6): TAP. (7): AP. (8): no enzyme. Ligation occurred in the mixed buffer, and was not impaired by leftovers of TAP.
Example 2: Production of full-length cDNA from the one-tube capped mRNA of Example 1.
The sample as above could be desalted using a Microcon YM-100 filter as described by the manufacturer (Millipore). To the ligated RNA, water and reverse transcriptase (RT) primers, which could be obtained from Invitrogen, were added. Eight hundred ng of the primer AGA GAG AGA CCU CGA GCC UAG GUC CGA C (SEQ ID NO: 2) were used for a 20 μl reaction, and 3 μl of the sorbitol-trehalose mixture (3.3 M stock, final concentration of 0.5 M sorbitol and 4% trehalose) was added when making up the final RT reaction. The RNA-primer mixture was heated for 10 min. at 65°C and then stored on ice during preparation of the remaining reagents. Then, a premix was prepared from 11 μl of 2x GC buffer (described in Carninci, Shiraki et al, Biotechniques), 1 μl of 10 mM dNTPs stock and, finally, 1 μl of MMLV reverse transcriptase (RNase H minus, Fermentas). The GC buffer system could be replaced by any buffer recommended by the manufacturer. This premix was mixed with the RNA sample and incubated for 2 min. at 25°C (to anneal the samples), 30 min. at 42°C, 10 min. at 52°C and 10 min. at 56°C before stopping the reaction. In this way, cDNA that spanned the 5' end of the original mRNAs was obtained at high frequency. This could be further purified or processed. For instance, it can be treated with proteinase K (addition of 20 μg, together with EDTA at 10 mM final concentration), followed by RNA and proteinase inactivation at 95°C for 15 min. This sample could then be applied to CL-4B (Amersham-Pharmacia) to fractionate by size or to eliminate the primers.
The cDNA was then amplified by PCR. To the cDNA, Takara EX-taq buffer was added at a final concentration of 1x, followed by dNTPs (final concentration: 200 μM each), the 5' oligonucleotide (sequence: acc tcg agc cta ggt ccg ac; SEQ ID NO: 3) and the 3' end oligonucleotide (sequence: ca gcg tcc tca agc ggc cgc; SEQ ID NO: 4), each oligonucleotide at 400 nM, MgCl2 at 2.5 mM, and KCl at a final concentration of 50 mM. The components were mixed with a hot start and then, after 5 min. at 94°C, samples were incubated for 30 seconds at 94°C, 30 seconds at 58°C, and 1.5 min. at 68°C, for 30 cycles.
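The cycling conditions above can be written down as a small data structure, which also gives the total programmed hold time. This is a minimal sketch: the class and function names are illustrative, and ramp times between temperatures are deliberately ignored, so the real run time on an instrument will be longer.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Step:
    temp_c: float   # block temperature in degrees Celsius
    seconds: int    # programmed hold time

# Values taken from the PCR description in Example 2.
INITIAL_DENATURATION = Step(94, 5 * 60)
CYCLE = (Step(94, 30), Step(58, 30), Step(68, 90))
N_CYCLES = 30

def total_hold_seconds(initial, cycle, n_cycles):
    """Sum of programmed hold times, excluding temperature ramps."""
    return initial.seconds + n_cycles * sum(step.seconds for step in cycle)

def expand_program(initial, cycle, n_cycles):
    """Flatten the program into the ordered list of steps the block executes."""
    return [initial] + [step for _ in range(n_cycles) for step in cycle]

if __name__ == "__main__":
    total = total_hold_seconds(INITIAL_DENATURATION, CYCLE, N_CYCLES)
    print(f"Programmed hold time: {total} s ({total / 60:.0f} min)")
```

With the stated values the programmed hold time is 80 minutes before ramping is taken into account.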
This produced 5' end cDNA that were complete and could be blunted and cloned following standard techniques into a plasmid vector (refer to Sambrook et al for general information about molecular cloning and sequencing) .
Here is one example of a complete 5'-end sequence of a cDNA clone isolated with this method:
>1
tgtaaaacnacggCCaGtGaATgtaaaACGACGgcCAgtGAATtgTAATACGACTCACTA
TAgggCgaaTtGggccgctAccggccgccatggccgcgggTattacctcgagccta
ggtcgacacctcgagcctaggtccgacatcgcttctcggccttttggctaagatca
agtgtagtatctgttcttatcagtttaatatctgatacgtcctctatccgaggaca
atatattaaatggatttttggaagtaggagttggaataggagcttgctccgtccac
tccacgcatcgaacctggcgGccgcttgaggacgctgtgcgaggtggtgtgttgag
tagcgtgtcgtgaatcactagtgcgGccgcctgcaggtcgaccatatggGagagct
cccaacgcgttggaTgCaTagCTtgagtattcTAtagtgtcacctaaatagcttgg
cgtaatcatggtcatagctgtttcctgtgtgaaattgttatccgcta (SEQ ID NO: 5)
This contains vector sequences (1-104), a duplicated 5' primer and the U2 snRNA (capped, from base 145 to 297) followed by the 3' primer (ACTCAACACACCACCTCGCACAGC; SEQ ID NO: 6), and vector sequences.
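Annotating a clone in this way amounts to locating each primer, on either strand, within the read. The following is a minimal sketch of such a search using exact matching only; the function names and the short spacer are hypothetical, and the toy insert is assembled from the primer sequences rather than copied from SEQ ID NO: 5. The 5' primer is written as it appears within the clone sequence above.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(seq):
    """Reverse complement of a DNA string (other symbols such as N are kept)."""
    return seq.translate(_COMPLEMENT)[::-1]

def find_primer(sequence, primer):
    """Return (0-based start, strand) of the first exact match, or None.

    Strand "+" means the primer itself was found; "-" means its reverse
    complement was found, as expected for a 3' primer in a sense-strand read.
    """
    seq, query = sequence.upper(), primer.upper()
    pos = seq.find(query)
    if pos != -1:
        return pos, "+"
    pos = seq.find(reverse_complement(query))
    if pos != -1:
        return pos, "-"
    return None

FIVE_PRIME_PRIMER = "ACCTCGAGCCTAGGTCCGAC"       # 5' primer as seen in the clone
THREE_PRIME_PRIMER = "ACTCAACACACCACCTCGCACAGC"  # SEQ ID NO: 6 as printed

# Hypothetical toy insert: 5' primer + arbitrary spacer + reverse-complemented 3' primer.
SPACER = "ATCGCTTCTCGGCC"
TOY_INSERT = FIVE_PRIME_PRIMER + SPACER + reverse_complement(THREE_PRIME_PRIMER)

if __name__ == "__main__":
    print(find_primer(TOY_INSERT, FIVE_PRIME_PRIMER))
    print(find_primer(TOY_INSERT, THREE_PRIME_PRIMER))
```

Real reads may contain sequencing errors or ambiguous bases, in which case an alignment tool tolerant of mismatches would be more appropriate than exact string search.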
Example 3: Application for RACE experiment
The capped RNA could be prepared as in Example 1, with the only difference that the RNA oligonucleotide had a different sequence, as described below. By using the process of Example 1 followed by PCR, it was possible to perform 5' RACE (rapid amplification of 5' ends). The experiment was performed as follows: 500 ng of total RNA from liver was subjected to ONE-Tube oligo-capping, followed by removal of the unreacted oligoribonucleotides and reverse transcription with random primers. The 5' ends were amplified with a gene-specific primer (TTGGAGAGAGGGTTTCGACGAGTCA; SEQ ID NO: 7) and a primer complementary to the oligo-cap (CGACTGGAGCACGAGGACACTGA; SEQ ID NO: 8).
The experiment is described in Figure 3. (1): Phenol-chloroform purification after ONE-Tube capping. (2): Microcon YM-100 (Millipore) purification after ONE-Tube capping. (3): Phenol-chloroform purification after ONE-Tube capping, no TAP during ONE-Tube capping. (4): Negative control of the PCR reaction.
Example 4: Application to 454 or other matrix.
The cDNA was prepared as in Examples 1 and 2. However, the oligonucleotides prepared for Example 4 were designed in order to have different adaptors at the 3' and 5' ends of the RNA, respectively (Adaptor A: CCATCTCATCCCTGCGTGTCCCATCTGTTCCCTCCCTGTCTCAG; SEQ ID NO: 9; Adaptor B: /BBioTEG/CCUAUCCCCUGUGUGCCUUGCCUAUCCCCUGUUGCGUGUCUCAG; SEQ ID NO: 10). Adaptor B was used as the "oligo-capping" sequence, and Adaptor A was conjugated to an oligo-random primer for the first strand synthesis. After the first strand synthesis, the material was passed through a CL-4B spin column to remove the excess of unreacted primers. Subsequently, the sample was subjected to emulsion PCR and then sequencing reactions as described for the 454 Life Sciences sequencing instrument (Margulies et al, Nature. 2005 Sep 15;437(7057):376-80). This allowed hundreds of thousands of sequences to be obtained in a single run.
Example 5: Application for 5' end sequencing tags.
A cDNA was obtained as in Examples 1 and 2, and the sample was processed until the second strand DNA was obtained by using standard protocols obvious to a person skilled in the art, such as described in Nature Methods, R. Kodzius et al., Mar. 3(3) 211-222, 2006. The cDNA was then cleaved with MmeI, whose recognition site was present on the oligonucleotide used for the oligo-capping in Examples 1 and 2 (sequence: 5'-UUUGGAUUUGCUGGUGCAGUACAACUAGGCUUAAUAGGAUCCGACG-3'; SEQ ID NO: 1). After the MmeI cleavage, 100 nanograms of a linker was ligated to the cleaved cDNA (upper oligonucleotide: 5'-Phosphate-GGATCCTCAGGACTCTTCTATAGTGTCAGTACGGA-NH2-3'; SEQ ID NO: 11; lower oligonucleotide: Phosphate-GGATCCTCAGGACTCTTCTATAGTGTCAGTACGGA-NH2-3'; SEQ ID NO: 12; these two oligonucleotides were briefly mixed to reconstitute a linker before the ligation; NH2 = amino group), and the ligation proceeded with DNA ligase as described in Kodzius et al. The fragment containing the most 5' end of the mRNA (the 5' end of the cDNA) was separated as in Kodzius et al., and it was amplified as described, with the exception that the primers used for the PCR were 5'-DualBio GCAGTACAACTAGGCTTAATA-3' (SEQ ID NO: 13) and 5'-DualBio GACACTATAGAAGAGTCCTGA-3' (SEQ ID NO: 14). After purification of the selected band (77 bp), the samples were digested with the restriction enzyme BamHI, and then the tags containing the 5' end of cDNA (29 nt) were purified from an acrylamide gel. Detailed protocols for the production of the 20 nt tags, corresponding to the 5' end of mRNAs, are described in Nature Methods, R. Kodzius et al., Mar. 3(3) 211-222, 2006. After the purification of the 34 nt tags, these were mixed with two linkers, named Linker A and Linker B, that were adapted from Margulies et al. (Nature, 2005 Sep. 15, 437(7057), p376-80) but were made compatible for ligation with the 5' end tags obtained in this experiment. Their sequences were: adaptor A: upper oligo, 5'-CCATCTCATCCCTGCGTGTCCCATCTGTTCCCTCCCTGTCTCAG-3' (SEQ ID NO: 15); lower oligo, 5'-Phosphate-GATCCTGAGACAGGGAGGGAACAGATGGGACACGCAGGGATGAGATGG-3' (SEQ ID NO: 16); adaptor B: upper oligo, 5'-BioTEG-CCTATCCCCTGTGTGCCTTGCCTATCCCCTGTTGCGTGTCTCAG-3' (SEQ ID NO: 17); lower oligo, 5'-Phosphate-GATCCTGAGACACGCAACAGGGGATAGGCAAGGCACACAGGGGATAGG-3' (SEQ ID NO: 18). During the ligation, linkers A and B were ligated at a 1:20 ratio with the 5' end tags (5' end tag excess) and the reaction was allowed to proceed overnight with T4 DNA ligase under standard conditions in a 10 microliter volume. The sample was then suitable for the 454 sequencer protocol.
These sequencing tags were used for sequencing and identification of gene borders as in Science, P. Carninci et al., Sep. 2, 309(5740), 1559-63, 2005, and for expression profiling and analysis of gene promoters as in Nature Methods, M. Harbers and P. Carninci, Jul. 2(7), 495-502, 2005.
In this example, the sample was subjected to emulsion PCR and then sequencing reactions as described for the 454 Life Sciences sequencing instrument (Margulies et al., Nature, 2005 Sep. 15, 437(7057), p376-80), as follows.
1) Usage of 5 ' -end concatamers for pyrosequencing (454 Life Sciences sequencing instrument) .
Details are available with the 454 sequencing kit. The main modification, if compared to Margulies et al, was that the sample introduced with this example contains already the linker A and B and therefore was suitable for being processed from an intermediate stage (described in Margulies) .
Transferred 50 μl of library immobilization beads to a fresh 1.5 ml tube, set the tube into a Magnetic Particle Collector (MPC), then removed the buffer. The library immobilization beads were washed twice with 100 μl of 2x library binding buffer using the MPC, and the buffer was then removed. Suspended the beads in 25 μl of 2x library binding buffer and added 50 μl of Oligo-capping library to a final volume of 75 μl, then the tube was placed in a rotator for 20 min. During the rotation, the neutralization solution was prepared in a 1.5 ml tube by mixing 500 μl of Qiagen's PB buffer and 3.8 μl of acetic acid. The library-carrying beads were washed twice with 100 μl of library wash buffer using the MPC, and the buffer was then removed. Added 50 μl of melt solution (0.125 ml of 10 N NaOH in 9.875 ml of sterilized distilled water (SDW)) to the washed sample, vortexed it and, using the MPC, removed and transferred 50 μl of supernatant to the neutralization solution, then repeated the above steps once more. The neutralized single-stranded template DNA (sstDNA) library was purified using Qiagen's MinElute PCR Purification Kit. Added 750 μl of PE to the tube containing 100 μl of sample and 503.8 μl of neutralization solution, placed it in the centrifuge for 1 min, then removed and transferred the supernatant to a 1.5 ml tube. After that, it was placed in the centrifuge for 1 min again, 15 μl of EB buffer was added and the tube was centrifuged for 1 min.
2) Emulsion PCR
The content of the Clonal Amplification Reagents Kit was as follows: 10x Capture Beads Wash Buffer, DNA Capture Beads, Mock Amplification Mix, Amplification Mix, MgSO4, Amplification Primer Mix and Ppiase.
Prepared 1x Capture Beads Wash Buffer: mixed 1 ml of 10x Capture Beads Wash Buffer and 9.0 ml of SDW in a 15 ml tube.
Vortexed the DNA capture beads and allotted 480 μl to each of four 0.2 ml tubes. Then vortexed, centrifuged for 10 sec, rotated 180° and centrifuged for 10 sec. One ml of the supernatant was removed without disturbing the beads. 1 ml of 1x Capture Beads Wash Buffer was allotted to each tube, then vortexed, centrifuged for 10 sec, rotated 180°, centrifuged for 10 sec, and then 1 ml of the supernatant was removed without disturbing the beads. Again, 1 ml of 1x Capture Beads Wash Buffer was allotted to each tube, then vortexed, centrifuged for 10 sec, rotated 180°, centrifuged for 10 sec, and then 1 ml of the supernatant was removed without disturbing the beads. The sstDNAs were allotted to each tube. Prepared two 8-connected tubes and around 25 μl of the sample was allotted to each well, then vortexed, centrifuged for 10 sec, rotated 180° and centrifuged for 10 sec.
Annealed the sstDNA to the capture beads using a 9700 thermocycler: 80°C for 5 min, decreased to 70°C at 0.1°C/sec, maintained at 70°C for 1 min, decreased to 60°C at 0.1°C/sec, maintained at 60°C for 1 min, decreased to 50°C at 0.1°C/sec, maintained at 50°C for 1 min, decreased to 20°C at 0.1°C/sec, and end.
For the preparation of emulsion PCR, one box of Emulsion oil was vortexed and 240 μl of Mock Amplification Mix was added to each of sixteen tubes of Emulsion oil, then the tubes were set in the TissueLyser at 25/sec for 5 min to make small bubbles. While the TissueLyser was running, the Amplification Mix was prepared in a 15 ml tube as follows: for one tube, 181.62 μl of Amplification Mix, 10.0 μl of MgSO4, 2.08 μl of Amplification Primer Mix, 6.0 μl of Platinum HiFi Taq polymerase (Invitrogen) and 0.3 μl of Ppiase; for four tubes, 726.48 μl of Amplification Mix, 40.0 μl of MgSO4, 8.32 μl of Amplification Primer Mix, 24.0 μl of Platinum HiFi Taq polymerase (Invitrogen) and 1.2 μl of Ppiase; for twelve tubes, 2179.44 μl of Amplification Mix, 120.0 μl of MgSO4, 24.96 μl of Amplification Primer Mix, 72.0 μl of Platinum HiFi Taq polymerase (Invitrogen) and 3.6 μl of Ppiase; for sixteen tubes, 2905.92 μl of Amplification Mix, 160.0 μl of MgSO4, 33.28 μl of Amplification Primer Mix, 96.0 μl of Platinum HiFi Taq polymerase (Invitrogen) and 4.8 μl of Ppiase. After the annealing, the sample was centrifuged and as much of the supernatant as possible was removed. Added 160 μl of Amplification Mix to each well of the two 8-connected tubes, mixed well and left for 30 sec. The whole quantity of sstDNA and Amplification Mix was transferred to each tube of emulsion oil. Then set the TissueLyser to 15/sec for 5 min to make big bubbles. Prepared 1.5 plates of 96-well plates. One tube of emulsion oil was distributed over each set of 8 wells, then the plates were set into the 9700 thermocycler and the amplification program was started as follows: 94°C for 4 min (1 cycle); 94°C for 30 sec, 58°C for 60 sec and 68°C for 90 sec (40 cycles); 94°C for 30 sec and 58°C for 6 min (13 cycles); and 10°C forever.
3) Bead Recovery
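The amplification mix volumes listed for the emulsion PCR in step 2) above are straight multiples of the one-tube recipe, so they can be regenerated for any number of emulsion tubes. A minimal sketch of that scaling follows; the dictionary keys and function name are illustrative, and no pipetting overage is added because the text does not mention one.

```python
# One-tube recipe in microliters, as given in the emulsion PCR step.
PER_TUBE_UL = {
    "Amplification Mix": 181.62,
    "MgSO4": 10.0,
    "Amplification Primer Mix": 2.08,
    "Platinum HiFi Taq polymerase": 6.0,
    "Ppiase": 0.3,
}

def scale_mix(n_tubes, per_tube=PER_TUBE_UL):
    """Scale the one-tube recipe linearly; volumes rounded to 0.01 microliter."""
    if n_tubes < 1:
        raise ValueError("n_tubes must be at least 1")
    return {name: round(volume * n_tubes, 2) for name, volume in per_tube.items()}

if __name__ == "__main__":
    for n in (1, 4, 12, 16):
        mix = scale_mix(n)
        total = round(sum(mix.values()), 2)
        print(f"{n:>2} tube(s): {mix} (total {total} ul)")
```

Running the sketch for 4, 12 and 16 tubes reproduces the volumes stated in the text, which suggests the listed figures are internally consistent.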
Prepared buffers as follows: mixed 25 ml of DNA Bead Wash Buffer and 100 ml of EtOH; mixed 62.5 ml of Enhancing Fluid and 187.5 ml of SDW; and mixed 8 ml of 10x Annealing Buffer and 72 ml of SDW.
Prepared the syringe by screwing the 16 gauge blunt needle on to the end of a 10ml syringe, then assembled the Swinlock filter unit with the nylon filter.
Added 100 μl of isopropanol to each tube containing the emulsion of amplified material. Drew the emulsion-isopropanol mix from each of the wells (up to 32 wells) into the syringe. Again, added another 100 μl of isopropanol to each well of the 96-well plate and drew it into the syringe. Inverted the syringe and expelled all the air. Added 9 ml of isopropanol into the syringe and washed three times, then discarded the isopropanol, washed once with 6 ml of 1x DNA Bead Wash Buffer, washed once with 6 ml of Enhancing Fluid, then drew in 0.5-1 ml of Enhancing Fluid, removed the Swinlock filter unit and expelled the contents of the syringe into a 1.5 ml tube. If there were many beads on the filter membrane, washed with another 500 μl of Enhancing Fluid, expelled the contents of the syringe into the same tube, then vortexed, centrifuged for 10 sec, rotated 180° and centrifuged for 10 sec. After that, 100 μl of supernatant was left and the remainder was discarded.
For the DNA library bead enrichment, vortexed the tube of Enrichment Beads and allotted 80 μl to each of four 1.5 ml tubes. Added 1 ml of Enhancing Fluid to the tubes, which were vortexed well. Using the MPC, pelleted the Enrichment Beads with the magnet, then removed and discarded the supernatant; after that, added 400 μl of Enhancing Fluid to each of the tubes and vortexed them well. Added 400 μl of Enrichment Beads to 100 μl of DNA beads, vortexed for 2 sec and rotated 5 times. Added 500 μl of Enhancing Fluid and mixed gently, then placed in the MPC for 2 min and carefully removed all the supernatant to a 1.5 ml tube. After taking the tubes away from the MPC, added 1 ml of Enhancing Fluid, then placed in the MPC for 2 min. After that, removed all the supernatant and washed twice. After taking the tubes away from the MPC, added 700 μl of Melt solution, vortexed for 5 sec and placed in the MPC. Then, transferred all the supernatant to a 1.5 ml tube. Again, added 700 μl of Melt solution, vortexed for 5 sec and placed in the MPC. Then, transferred all the supernatant to the same 1.5 ml tube (total 1.4 ml), and the tube was vortexed, centrifuged for 10 sec, rotated 180° and centrifuged for 10 sec. After that, removed 1300 μl of the supernatant (100 μl remained) and added 1 ml of 1x Annealing Buffer, then vortexed, centrifuged for 10 sec, rotated 180° and centrifuged for 10 sec. Removed 1 ml of the supernatant and washed twice, then transferred the remaining enriched DNA beads (100 μl) to a 0.2 ml tube. Added 100 μl of 1x Annealing Buffer to the same 1.5 ml tube to collect the remaining enriched DNA beads and transferred it to the 0.2 ml tube, then vortexed, centrifuged for 10 sec, rotated 180° and centrifuged for 10 sec. After that, removed 185 μl of the supernatant (15 μl remained) and added 12 μl of Sequencing primer. Then set the 0.2 ml tube into the 9700 thermocycler to anneal the Sequencing primer.
After the annealing, added 100 μl of Annealing Buffer to the 0.2 ml tube, vortexed, centrifuged for 10 sec, rotated 180° and centrifuged for 10 sec. Subsequently, removed 100 μl of the supernatant, added 200 μl of Annealing Buffer, and the tube was vortexed, centrifuged for 10 sec, rotated 180° and centrifuged for 10 sec. Then, removed 100 μl of the supernatant.
4) Sequencing
Mixed the DNA beads with 18 μl of Control DNA beads in a 1.5 ml tube, then vortexed, centrifuged for 10 sec, rotated 180° and centrifuged for 10 sec. After that, 30 μl of supernatant was left in the tube and the remainder was discarded; the tube was then kept on ice.
Prepared the incubation mix in a 1.5 ml tube as follows: mixed 1080 μl of Bead Buffer 3, 70 μl of polymerase Co-factor and 140 μl of DNA polymerase. Then added 620 μl of incubation mix to the tube of the sample beads and rotated for 30 min.
For the preparation of the Packing Beads, washed the Packing Beads three times as follows: added 1 ml of Bead Buffer 1 to the tube of Packing Beads, vortexed and centrifuged at 10000 rpm for 5 min., then removed all the supernatant. After that, added 1 ml of Bead Buffer 1.
For the preparation of the Enzyme Beads, washed the Enzyme Beads three times as follows: added 1 ml of Bead Buffer 2 to the tube of Enzyme Beads, vortexed and placed in the MPC, then removed all the supernatant. After that, added 1 ml of Bead Buffer 2.
For the preparation of the 1st Layer, mixed 500 μl of Packing Beads and 500 μl of Enzyme Beads in a 2 ml tube. For the preparation of the 2nd Layer, mixed 460 μl of Enzyme Beads and 1400 μl of Bead Buffer 2 in a 2 ml tube. For the sequencing mix, added 250 μl of Bead Buffer 2 and 960 μl of 1st Layer to the tube of DNA beads (after rotation).
For the deposition of the First and Second Layer, placed the PicoTiterPlate, set the gasket, then applied 100 μl of Bead Buffer 2 per region and centrifuged at 2800 rpm with program 1; after that, removed Bead Buffer 2. Subsequently, applied 100 μl of 1st Layer per region, then centrifuged at 2800 rpm with program 2; after that, removed the supernatant. The same step as for the 1st Layer was conducted for the 2nd Layer. Then soaked the PicoTiterPlate in Bead Buffer 2 and set the plate to start the Prep Run. After confirming the complete removal of Bead Buffer 2 from the PicoTiterPlate, started sequencing.
The results are shown in Figure 4 and Table 1. In Fig. 4, the Y axis shows the number of reads and the X axis shows the read length (bp). Table 1 shows the top 20 mapped CAGE tags; CAGE tags were mapped against the Mm8 mouse genome. As the results in Fig. 4 and Table 1 show, the one-tube method was significantly effective in the application for 5' end sequencing tags.
Table 1: Top 20 of Mapped CAGE tags
Sequence: Sequence of CAGE tags
Mapped: How many different locations in the genome this CAGE tag mapped to
Location: Where the CAGE tag mapped to
Example 6: Sequencing DNA bound to solid surface.
The cDNA was obtained as in Examples 1 and 2, and the sample was processed until the first strand cDNA was obtained by using standard protocols obvious to a person skilled in the art, such as described in (Kodzius et al, Nat Methods. 2006 Mar;3(3):211-22). Subsequently, the nucleic acids are attached to a solid-phase matrix such as in the US patent applications US 20060012793, US 20060012784 and US 20060008824, and in instruments based on such technology. The foregoing examples are illustrative of the present invention, and are not to be construed as limiting thereof. The invention is described by the following claims, with equivalents of the claims to be included therein.
References
Velculescu V.E. et al., Science 270, 484-487 (1995)
Saha S. et al., Nat. Biotechnol. 20, 508-12 (2002)
Brenner S. et al., Nat. Biotechnol. 18, 630-634 (2000)
Brenner S. et al., Proc. Natl. Acad. Sci. USA 97, 1655-1670 (2000)
Hwang B.J. et al., Proc. Natl. Acad. Sci. USA 101, 1650-1655 (2004)
Hashimoto S. et al., Nat. Biotechnol. 22, 1146-1149 (2004)
Zhang Z. and Dietrich F.S., Nuc. Acids Res. 33, 2838-2851 (2005)
Wei C.L. et al., Proc. Natl. Acad. Sci. USA 101, 11701-11706 (2004)
Metzker M.L., Genome Res. 15, 1767-1776 (2005)
Kling J., Nature Biotechnology 23, 1333-1335 (2005)
Shendure J. et al., Nature Review Genetics 5, 335-344 (2004)
Carninci et al., Science. 2005 Sep 2;309(5740):1559-63
Gerhard et al., Genome Res. 2004 Oct;14(10B):2121-7
Imanishi et al., PLoS Biol. 2004 Jun;2(6):e162
Harbers and Carninci, Nat Methods. 2005 Jul;2(7):495-502
Carninci et al., Genome Res. 2003 Jun;13(6B):1273-89
Kodzius et al., Nat Methods. 2006 Mar;3(3):211-22
Hashimoto et al., Nat Biotechnol. 2004 Sep;22(9):1146-9
Suzuki and Sugano, Methods Mol Biol. 2003;221:73-91
Sambrook J. and Russell D.W., Molecular Cloning, A Laboratory Manual, Cold Spring Harbor Laboratory Press, New York, 2001
Carninci and Hayashizaki, Methods Enzymol. 1999;303:19-44
Edery et al., Mol Cell Biol. 1995 Jun;15(6):3363-71
Zhu et al., Biotechniques. 2001 Apr;30(4):892-7
Nomura et al., DNA Res. 1994;1(5):223-9
Okayama and Berg, Mol Cell Biol. 1982 Feb;2(2):161-70
Gustincich et al., Proc Natl Acad Sci U S A. 2004 Apr 6;101(14):5069-74
Gubler and Hoffman, Gene. 1983 Nov;25(2-3):263-9
Margulies et al., Nature. 2005 Sep 15;437(7057):376-80
Shibata Y. et al., Biotechniques. 2001 Jun;30(6):1250-4
US patent application 20030008290
US patent application 20030049653
US patent application 20060012793
US patent application 20060012784
US patent application 20060008824
US patents 6,352,828; 6,306,597; 6,280,935; 6,265,163; 5,695,934
PCT/JP03/07514

Claims

1. A method to capture nucleic acid ends, suitable for a limited amount of starting material and based on sequential addition of reagents in a compatible buffer, comprising the steps of:
(a) removing the phosphate groups at the 5'-end of non-capped nucleic acid molecules;
(b) removing the cap structure from 5'-complete RNA molecules and creating a free phosphate group at the 5'-end position of the previously capped RNA molecule; and
(c) adding a nucleic acid molecule to the phosphorylated 5'-end of the RNA molecule derived in step (b),
wherein all of the steps are completed in a single tube through sequential reagent addition.
2. A method claimed in claim 1, wherein the step (a) is performed by a phosphatase that can be inactivated by heat treatment, a change of the reaction buffer, or a combination thereof.
3. A method claimed in claim 2, wherein the phosphatase is an Antarctic Phosphatase.
4. A method claimed in claim 1, wherein the step (b) is performed by an enzyme that can be inactivated by heat treatment, a change of the reaction buffer, or a combination thereof.
5. A method claimed in claim 4, wherein the cap structure is removed by a reaction with the Tobacco Acid Pyrophosphatase.
6. A method claimed in claim 1, wherein a nucleic acid molecule is added to the phosphorylated 5'-end of the RNA molecule by an enzyme selected from the group consisting of the T4 RNA Ligase, T4 DNA Ligase, ThermoPhage, single-stranded DNA Ligase, and any mixture thereof.
7. A method claimed in claim 1, wherein the nucleic acid molecule added to the phosphorylated 5'-end is made of ribonucleotides, deoxyribonucleotides, modified nucleotides, or any mixture thereof, and wherein said nucleic acid molecule may carry information for the manipulation of the modified RNA or any DNA derived thereof.
8. A method claimed in claim 1 wherein the obtained modified RNA is subjected to reverse transcription to obtain cDNA.
9. A method claimed in claim 8 wherein the obtained cDNA is used to prepare a full-length cDNA library.
10. A method claimed in claim 8, wherein the obtained cDNA is used to isolate a sequence tag corresponding to the end of the RNA.
11. A method claimed in claim 10, wherein the obtained sequence tag is used for the preparation of nucleic acid concatemers.
12. A method claimed in claim 10 or 11, wherein the fragments are isolated at the 5' end of the RNA.
13. A method claimed in claim 10 or 11, wherein both 5' and 3' end tags are used to create 5'-3' end ditags.
14. A method claimed in any one of claims 1 to 7, wherein the obtained DNA is used for high throughput sequencing.
15. A method claimed in any one of claims 1 to 14, wherein the cDNA is used for detection of 5' ends.
16. A method claimed in any one of claims 1 to 14, wherein the derivatized nucleic acids or its complementary nucleic acids or their derivatives with/without substitute are bound to a matrix for further analysis.
17. A method claimed in claim 16, wherein the obtained nucleic acid-bound matrix is applied to any appropriate sequencer to determine its sequence.
18. A method claimed in claim 16, wherein the obtained nucleic acid corresponds to 5'-end RNA molecules obtained with a 5'-end molecule enrichment process.
19. A kit comprising the combination of all steps in any one of claims 1 to 18.
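
The selection logic of steps (a) to (c) of claim 1 can be illustrated with a small simulation. This is only a sketch under simplifying assumptions, not the claimed protocol itself: each RNA is reduced to the chemical state of its 5' end, every enzymatic step is assumed to go to completion, and all names (FivePrimeEnd, Rna, dephosphorylate, decap, ligate, single_tube, the linker string "LINKER") are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass, replace
from enum import Enum


class FivePrimeEnd(Enum):
    """Simplified chemical state of an RNA 5' end (illustrative model)."""
    CAP = "cap"
    PHOSPHATE = "phosphate"
    HYDROXYL = "hydroxyl"


@dataclass(frozen=True)
class Rna:
    name: str
    end: FivePrimeEnd
    linker: str = ""  # empty until a nucleic acid molecule is added in step (c)


def dephosphorylate(rna: Rna) -> Rna:
    """Step (a): a phosphatase removes 5' phosphates; capped ends are untouched."""
    if rna.end is FivePrimeEnd.PHOSPHATE:
        return replace(rna, end=FivePrimeEnd.HYDROXYL)
    return rna


def decap(rna: Rna) -> Rna:
    """Step (b): cap removal leaves a free 5' phosphate on previously capped RNA."""
    if rna.end is FivePrimeEnd.CAP:
        return replace(rna, end=FivePrimeEnd.PHOSPHATE)
    return rna


def ligate(rna: Rna, linker: str) -> Rna:
    """Step (c): a ligase adds the linker only to a phosphorylated 5' end."""
    if rna.end is FivePrimeEnd.PHOSPHATE and not rna.linker:
        return replace(rna, linker=linker)
    return rna


def single_tube(pool: list[Rna], linker: str) -> list[Rna]:
    """Apply steps (a), (b), (c) in order to every molecule in the pool."""
    processed = []
    for rna in pool:
        rna = dephosphorylate(rna)
        rna = decap(rna)
        processed.append(ligate(rna, linker))
    return processed


pool = [
    Rna("capped_mRNA", FivePrimeEnd.CAP),
    Rna("phosphorylated_fragment", FivePrimeEnd.PHOSPHATE),
    Rna("hydroxyl_fragment", FivePrimeEnd.HYDROXYL),
]
tagged = [rna.name for rna in single_tube(pool, "LINKER") if rna.linker]
print(tagged)  # ['capped_mRNA']
```

In this model the order of the steps carries the selectivity: because non-capped 5' phosphates are removed before the cap is converted into a phosphate, only molecules that were originally capped present a ligatable end in step (c).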
PCT/JP2007/058126 2006-04-07 2007-04-06 Method to isolate 5' ends of nucleic acid and its application WO2007117039A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2006-106770 2006-04-07
JP2006106770A JP2009072062A (en) 2006-04-07 2006-04-07 Method for isolating 5'-terminals of nucleic acid and its application

Publications (1)

Publication Number Publication Date
WO2007117039A1 (en) 2007-10-18

Family

ID=38198097

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2007/058126 WO2007117039A1 (en) 2006-04-07 2007-04-06 Method to isolate 5' ends of nucleic acid and its application

Country Status (2)

Country Link
JP (1) JP2009072062A (en)
WO (1) WO2007117039A1 (en)

Cited By (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2009268362A (en) * 2008-04-30 2009-11-19 Dnaform:Kk Modification of rna and method for preparing dna from rna
EP2283132A2 (en) * 2008-05-02 2011-02-16 Epicentre Technologies Corporation Selective 5' ligation tagging of rna
WO2013063308A1 (en) * 2011-10-25 2013-05-02 University Of Massachusetts An enzymatic method to enrich for capped rna, kits for performing same, and compositions derived therefrom
EP2684954A1 (en) 2012-07-10 2014-01-15 Lexogen GmbH 5´ protection dependent amplification
WO2014048185A1 (en) * 2012-09-29 2014-04-03 深圳华大基因科技服务有限公司 Method for enriching transcript from rna sample and use thereof
EP2347013B1 (en) * 2008-11-14 2014-06-18 Baxter International Inc. Method for the specific detection of low abundance rna species in a biological sample
US9422602B2 (en) 2012-08-15 2016-08-23 Bio-Rad Laboratories, Inc. Methods and compositions for determining nucleic acid degradation
US11286530B2 (en) 2010-05-18 2022-03-29 Natera, Inc. Methods for simultaneous amplification of target loci
US11306357B2 (en) 2010-05-18 2022-04-19 Natera, Inc. Methods for non-invasive prenatal ploidy calling
US11306359B2 (en) 2005-11-26 2022-04-19 Natera, Inc. System and method for cleaning noisy genetic data from target individuals using genetic data from genetically related individuals
US11312991B2 (en) 2016-01-27 2022-04-26 Kabushiki Kaisha Dnaform Method for decoding base sequence of nucleic acid corresponding to end region of RNA and method for analyzing DNA element
US11322224B2 (en) 2010-05-18 2022-05-03 Natera, Inc. Methods for non-invasive prenatal ploidy calling
US11319596B2 (en) 2014-04-21 2022-05-03 Natera, Inc. Detecting mutations and ploidy in chromosomal segments
US11326208B2 (en) 2010-05-18 2022-05-10 Natera, Inc. Methods for nested PCR amplification of cell-free DNA
US11332793B2 (en) 2010-05-18 2022-05-17 Natera, Inc. Methods for simultaneous amplification of target loci
US11332785B2 (en) 2010-05-18 2022-05-17 Natera, Inc. Methods for non-invasive prenatal ploidy calling
US11339429B2 (en) 2010-05-18 2022-05-24 Natera, Inc. Methods for non-invasive prenatal ploidy calling
US11390916B2 (en) 2014-04-21 2022-07-19 Natera, Inc. Methods for simultaneous amplification of target loci
US11408031B2 (en) 2010-05-18 2022-08-09 Natera, Inc. Methods for non-invasive prenatal paternity testing
US11479812B2 (en) 2015-05-11 2022-10-25 Natera, Inc. Methods and compositions for determining ploidy
US11485996B2 (en) 2016-10-04 2022-11-01 Natera, Inc. Methods for characterizing copy number variation using proximity-litigation sequencing
US11519028B2 (en) 2016-12-07 2022-12-06 Natera, Inc. Compositions and methods for identifying nucleic acid molecules
US11519035B2 (en) 2010-05-18 2022-12-06 Natera, Inc. Methods for simultaneous amplification of target loci
US11525159B2 (en) 2018-07-03 2022-12-13 Natera, Inc. Methods for detection of donor-derived cell-free DNA
US11939634B2 (en) 2010-05-18 2024-03-26 Natera, Inc. Methods for simultaneous amplification of target loci

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
HUE052213T2 (en) * 2009-11-06 2021-04-28 Univ Leland Stanford Junior Non-invasive diagnosis of graft rejection in organ transplant patients

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5891637A (en) * 1996-10-15 1999-04-06 Genentech, Inc. Construction of full length cDNA libraries
WO2000056913A1 (en) * 1999-03-19 2000-09-28 Genetics Institute, Inc. Primers-attached vector elongation (pave): a 5'-directed cdna cloning strategy
US20030008298A1 (en) * 1999-03-19 2003-01-09 Genetics Institute, Inc. Primers-attached vector elongation (PAVE): a 5'-directed CDNA cloning strategy
WO2004076628A2 (en) * 2003-02-24 2004-09-10 New England Biolabs, Inc. Overexpression, purification and characterization of a thermolabile phosphatase

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
CLEPET C ET AL: "Improved full-length cDNA production based on RNA tagging by T4 DNA ligase", NUCLEIC ACIDS RESEARCH, OXFORD UNIVERSITY PRESS, SURREY, GB, vol. 32, January 2004 (2004-01-01), pages e61 - e66, XP002271141, ISSN: 0305-1048 *
KOBORI H ET AL: "HEAT-LABILE ALKALINE PHOSPHATASE FROM ANTARCTIC BACTERIA RAPID 5' END-LABELING OF NUCLEIC-ACIDS", PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, vol. 81, no. 21, 1984, pages 6691 - 6695, XP002441906, ISSN: 0027-8424 *
MARUYAMA K ET AL: "OLIGO-CAPPING: A SIMPLE METHOD TO REPLACE THE STRUCTURE OF EUKARYOTIC MRNAS WITH OLIGORIBONUCLEOTIDES", GENE, ELSEVIER, AMSTERDAM, NL, vol. 138, 1994, pages 171 - 174, XP001008855, ISSN: 0378-1119 *
NILSEN I W ET AL: "THERMOLABILE ALKALINE PHOSPHATASE FROM NORTHERN SHRIMP (PANDALUS BOREALIS): PROTEIN AND CDNA SEQUENCE ANALYSES", COMPARATIVE BIOCHEMISTRY AND PHYSIOLOGY. B. COMPARATIVE BIOCHEMISTRY, PERGAMON PRESS, LONDON, GB, vol. 129, 2001, pages 853 - 861, XP001057502, ISSN: 0305-0491 *
RINA M ET AL: "Alkaline phosphatase from the Antarctic strain TAB5. Properties and psychrophilic adaptations", EUROPEAN JOURNAL OF BIOCHEMISTRY, BERLIN, DE, vol. 267, no. 4, February 2000 (2000-02-01), pages 1230 - 1238, XP002386959, ISSN: 0014-2956 *
SUZUKI Y ET AL: "CONSTRUCTION OF A FULL-LENGTH ENRICHED AND A 5'-END ENRICHED CDNA LIBRARY USING THE OLIGO-CAPPING METHOD", METHODS IN MOLECULAR BIOLOGY, HUMANA PRESS INC., CLIFTON, NJ, US, vol. 221, 2003, pages 73 - 91, XP008027714 *

Cited By (48)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11306359B2 (en) 2005-11-26 2022-04-19 Natera, Inc. System and method for cleaning noisy genetic data from target individuals using genetic data from genetically related individuals
US8163491B2 (en) 2007-08-17 2012-04-24 Epicentre Technologies Corporation Selective 5′ ligation tagging of RNA
US8309335B2 (en) 2007-08-17 2012-11-13 Epicentre Technologies Corporation Selective 5′ ligation tagging of RNA
US9963735B2 (en) 2007-08-17 2018-05-08 Epicentre Technologies Corporation Selective 5′ ligation tagging of RNA
JP2009268362A (en) * 2008-04-30 2009-11-19 Dnaform:Kk Modification of rna and method for preparing dna from rna
EP2283132A2 (en) * 2008-05-02 2011-02-16 Epicentre Technologies Corporation Selective 5' ligation tagging of rna
JP2011521625A (en) * 2008-05-02 2011-07-28 エピセンター テクノロジーズ コーポレーション Tagging by selective 5 'ligation to RNA
EP2283132A4 (en) * 2008-05-02 2013-02-20 Epict Technologies Corp Selective 5' ligation tagging of rna
EP2347013B1 (en) * 2008-11-14 2014-06-18 Baxter International Inc. Method for the specific detection of low abundance rna species in a biological sample
US11286530B2 (en) 2010-05-18 2022-03-29 Natera, Inc. Methods for simultaneous amplification of target loci
US11339429B2 (en) 2010-05-18 2022-05-24 Natera, Inc. Methods for non-invasive prenatal ploidy calling
US11525162B2 (en) 2010-05-18 2022-12-13 Natera, Inc. Methods for simultaneous amplification of target loci
US11746376B2 (en) 2010-05-18 2023-09-05 Natera, Inc. Methods for amplification of cell-free DNA using ligated adaptors and universal and inner target-specific primers for multiplexed nested PCR
US11482300B2 (en) 2010-05-18 2022-10-25 Natera, Inc. Methods for preparing a DNA fraction from a biological sample for analyzing genotypes of cell-free DNA
US11408031B2 (en) 2010-05-18 2022-08-09 Natera, Inc. Methods for non-invasive prenatal paternity testing
US11939634B2 (en) 2010-05-18 2024-03-26 Natera, Inc. Methods for simultaneous amplification of target loci
US11519035B2 (en) 2010-05-18 2022-12-06 Natera, Inc. Methods for simultaneous amplification of target loci
US11332785B2 (en) 2010-05-18 2022-05-17 Natera, Inc. Methods for non-invasive prenatal ploidy calling
US11326208B2 (en) 2010-05-18 2022-05-10 Natera, Inc. Methods for nested PCR amplification of cell-free DNA
US11306357B2 (en) 2010-05-18 2022-04-19 Natera, Inc. Methods for non-invasive prenatal ploidy calling
US11322224B2 (en) 2010-05-18 2022-05-03 Natera, Inc. Methods for non-invasive prenatal ploidy calling
US11312996B2 (en) 2010-05-18 2022-04-26 Natera, Inc. Methods for simultaneous amplification of target loci
US11332793B2 (en) 2010-05-18 2022-05-17 Natera, Inc. Methods for simultaneous amplification of target loci
WO2013063308A1 (en) * 2011-10-25 2013-05-02 University Of Massachusetts An enzymatic method to enrich for capped rna, kits for performing same, and compositions derived therefrom
KR20150028982A (en) * 2012-07-10 2015-03-17 렉소겐 게엠베하 5' protection dependent amplification
EP2684954A1 (en) 2012-07-10 2014-01-15 Lexogen GmbH 5´ protection dependent amplification
WO2014009413A2 (en) 2012-07-10 2014-01-16 Lexogen Gmbh 5´ protection dependent amplification
KR102119431B1 (en) 2012-07-10 2020-06-08 렉소겐 게엠베하 5' protection dependent amplification
US10538795B2 (en) 2012-07-10 2020-01-21 Lexogen Gmbh 5′ protection dependent amplification
WO2014009413A3 (en) * 2012-07-10 2014-06-05 Lexogen Gmbh 5´ protection dependent amplification
JP2015521857A (en) * 2012-07-10 2015-08-03 レクソジェン・ゲゼルシャフト・ミット・ベシュレンクテル・ハフツングLEXOGEN GmbH Amplification dependent on 5 'protection
US9422602B2 (en) 2012-08-15 2016-08-23 Bio-Rad Laboratories, Inc. Methods and compositions for determining nucleic acid degradation
WO2014048185A1 (en) * 2012-09-29 2014-04-03 深圳华大基因科技服务有限公司 Method for enriching transcript from rna sample and use thereof
US11371100B2 (en) 2014-04-21 2022-06-28 Natera, Inc. Detecting mutations and ploidy in chromosomal segments
US11530454B2 (en) 2014-04-21 2022-12-20 Natera, Inc. Detecting mutations and ploidy in chromosomal segments
US11319596B2 (en) 2014-04-21 2022-05-03 Natera, Inc. Detecting mutations and ploidy in chromosomal segments
US11408037B2 (en) 2014-04-21 2022-08-09 Natera, Inc. Detecting mutations and ploidy in chromosomal segments
US11414709B2 (en) 2014-04-21 2022-08-16 Natera, Inc. Detecting mutations and ploidy in chromosomal segments
US11486008B2 (en) 2014-04-21 2022-11-01 Natera, Inc. Detecting mutations and ploidy in chromosomal segments
US11319595B2 (en) 2014-04-21 2022-05-03 Natera, Inc. Detecting mutations and ploidy in chromosomal segments
US11390916B2 (en) 2014-04-21 2022-07-19 Natera, Inc. Methods for simultaneous amplification of target loci
US11479812B2 (en) 2015-05-11 2022-10-25 Natera, Inc. Methods and compositions for determining ploidy
US11946101B2 (en) 2015-05-11 2024-04-02 Natera, Inc. Methods and compositions for determining ploidy
US11312991B2 (en) 2016-01-27 2022-04-26 Kabushiki Kaisha Dnaform Method for decoding base sequence of nucleic acid corresponding to end region of RNA and method for analyzing DNA element
US11485996B2 (en) 2016-10-04 2022-11-01 Natera, Inc. Methods for characterizing copy number variation using proximity-litigation sequencing
US11519028B2 (en) 2016-12-07 2022-12-06 Natera, Inc. Compositions and methods for identifying nucleic acid molecules
US11530442B2 (en) 2016-12-07 2022-12-20 Natera, Inc. Compositions and methods for identifying nucleic acid molecules
US11525159B2 (en) 2018-07-03 2022-12-13 Natera, Inc. Methods for detection of donor-derived cell-free DNA

Also Published As

Publication number Publication date
JP2009072062A (en) 2009-04-09


Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application. Ref document number: 07741562; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase. Ref country code: DE
NENP Non-entry into the national phase. Ref country code: JP
122 Ep: PCT application non-entry in European phase. Ref document number: 07741562; Country of ref document: EP; Kind code of ref document: A1