WO2010096532A1 - Sequencing small quantities of nucleic acids - Google Patents

Sequencing small quantities of nucleic acids

Info

Publication number
WO2010096532A1
Authority
WO
WIPO (PCT)
Prior art keywords
nucleic acid
template
duplexes
sample
template nucleic
Prior art date
Application number
PCT/US2010/024547
Inventor
Fatih Ozsolak
Original Assignee
Helicos Biosciences Corporation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Helicos Biosciences Corporation
Publication of WO2010096532A1
Priority to US12/904,683 (published as US20110091883A1)
Priority to US14/158,618 (published as US20150307932A1)

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6844: Nucleic acid amplification reactions
    • C12Q1/686: Polymerase chain reaction [PCR]

Definitions

  • the invention generally relates to methods for sequencing small quantities of nucleic acids.
  • Sequencing-by-synthesis involves template-dependent addition of nucleotides to a template/primer duplex. Nucleotide addition is mediated by a polymerase enzyme and added nucleotides may be labeled in order to facilitate their detection. Single molecule sequencing has been used to obtain high-throughput sequence information on individual DNA or RNA. See, Braslavsky, Proc. Natl. Acad. Sci. USA 100: 3960-64 (2003). In such methods, all four Watson-Crick nucleotides may be added simultaneously, each with a different detectable label, or nucleotides may be added one at a time in a step-and-repeat manner, with incorporations imaged after each addition.
  • Although template nucleic acid is not always limiting, a number of applications start from small quantities of nucleic acid. For example, when bacteria that cannot be cultured (Rappe et al., Annu. Rev. Microbiol. 57:369-394, 2003) or cDNA libraries from a small number of cells (Schutze et al., Nat. Biotechnol. 16:737-742, 1998) are sequenced, the amount of template nucleic acid limits the number of sequences that may be determined.
  • Methods of the invention allow for very small quantities (e.g., nanogram, picogram, or femtogram amounts) of nucleic acids to be analyzed by sequencing methodologies.
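
To give a sense of scale for these quantities, the following minimal sketch converts a mass of double-stranded DNA into an approximate number of molecules. It is an illustration only and not part of the described methods: the function name `dsdna_copies` is hypothetical, and the figure of about 650 g/mol per base pair is a commonly used average, assumed here rather than taken from the text.

```python
AVOGADRO = 6.02214076e23  # molecules per mole

# Assumption: average molar mass of one base pair of double-stranded DNA.
AVG_BP_MASS_G_PER_MOL = 650.0


def dsdna_copies(mass_grams: float, length_bp: int) -> float:
    """Estimate how many double-stranded molecules of a given length make up a mass."""
    if mass_grams < 0 or length_bp <= 0:
        raise ValueError("mass must be non-negative and length must be positive")
    moles = mass_grams / (length_bp * AVG_BP_MASS_G_PER_MOL)
    return moles * AVOGADRO


if __name__ == "__main__":
    # Print rough copy numbers for 1 kb fragments at the three scales named in the text.
    for label, grams in (("1 ng", 1e-9), ("1 pg", 1e-12), ("1 fg", 1e-15)):
        print(f"{label} of 1 kb dsDNA is roughly {dsdna_copies(grams, 1000):.3g} molecules")
```

Under these assumptions a picogram of 1 kb fragments corresponds to somewhat under a million molecules, and a femtogram to under a thousand, which is why losses during tailing and hybridization matter at these scales.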
  • methods of the invention analyze nucleic acids obtained from only a single cell. Methods of the invention are accomplished by increasing availability of nucleic acids in a sample to undergo a sequencing reaction.
  • methods of the invention involve obtaining a sample including template nucleic acid, introducing a carrier molecule to the sample, attaching an oligonucleotide tail to the template nucleic acid in the presence of the carrier molecule, introducing the tailed template nucleic acid to primers to form template/primer duplexes, exposing the duplexes to at least one detectably labeled nucleotide in the presence of a polymerase under conditions that permit the polymerase to add nucleotides to the primer in a template-dependent manner, detecting a signal from the label of the incorporated labeled nucleotide, and sequentially repeating the exposing and detecting steps at least once.
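
The repeated expose-and-detect cycle in the steps above can be sketched as a toy simulation. This is a simplified illustration, not the patented procedure: the function name `sequence_by_synthesis`, the flow order `CTAG`, and the rule that at most one base is incorporated per exposure (as a reversible terminator would enforce) are all assumptions made for the example.

```python
# Watson-Crick pairing used to decide whether an offered nucleotide is incorporated.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def sequence_by_synthesis(template_read_order: str, flow_order: str = "CTAG") -> str:
    """Simulate step-and-repeat sequencing of one template/primer duplex.

    template_read_order lists the template bases in the order the polymerase
    meets them while extending the primer. Each exposure offers one labeled
    nucleotide; if it pairs with the next template base it is incorporated
    and its signal is recorded as one base of the read.
    """
    read = []
    position = 0
    # Each template base needs at most one full pass through the flow order.
    max_exposures = len(template_read_order) * len(flow_order)
    for exposure in range(max_exposures):
        if position == len(template_read_order):
            break  # template fully copied
        offered = flow_order[exposure % len(flow_order)]
        if COMPLEMENT[template_read_order[position]] == offered:
            read.append(offered)  # "detecting a signal" from the incorporated label
            position += 1
    return "".join(read)


if __name__ == "__main__":
    print(sequence_by_synthesis("GATTACA"))
```

The read that comes out is the complement of the template in the order it was copied, which is the information the detection step accumulates one cycle at a time.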
  • the carrier molecule stabilizes the tailing reaction, allowing for addition of higher concentrations of enzyme and dNTPs.
  • the oligonucleotide tail may be any oligonucleotide sequence.
  • the tail is a poly(A) tail.
  • The tail may be any length, such as at least about 5 nucleotides, at least about 10 nucleotides, at least about 20 nucleotides, at least about 50 nucleotides, at least about 70 nucleotides, or at least about 150 nucleotides.
  • the tail is 150 nucleotides.
  • the oligonucleotide tail may be attached to the template nucleic acids by any method known in the art.
  • the oligonucleotide tail may be ligated to the template nucleic acid.
  • the oligonucleotide tail is attached by a terminal transferase enzyme.
  • the carrier molecule may be any molecule that stabilizes the tailing reaction, allowing for addition of higher concentrations of enzyme and dNTPs.
  • Exemplary carrier molecules include RNA oligonucleotides and bead-bound oligonucleotides.
  • the duplexes are attached to a substrate, either directly or indirectly (e.g., through a polymerase molecule). In other embodiments, the duplexes are attached at single molecule resolution.
  • detectably labeled nucleotides are added to the primer in a template-dependent manner.
  • the detectably labeled nucleotide is an optically detectably labeled nucleotide, such as a fluorescently labeled nucleotide.
  • Exemplary fluorescent labels include Atto, cyanine, rhodamine, fluorescein, coumarin, BODIPY, Alexa, and conjugated multi-dyes.
  • the detectable label is a non-optically detectable label, detected, for example, using nanopores.
  • methods of the invention involve increasing the number of 3' ends of template nucleic acids in a sample.
  • Those methods of the invention involve obtaining a sample including template nucleic acid, increasing the number of 3' ends of template nucleic acid in the sample, attaching an oligonucleotide tail to the 3' ends of the template nucleic acids, introducing the tailed template nucleic acids to primers to form template/primer duplexes, exposing the duplexes to at least one detectably labeled nucleotide in the presence of a polymerase under conditions that permit the polymerase to add nucleotides to the primer in a template-dependent manner, detecting a signal from the label of the incorporated labeled nucleotide, and sequentially repeating the exposing and detecting steps at least once.
  • the number of template nucleic acids that may receive an oligonucleotide tail is increased.
  • the number of template nucleic acids that may form template primer duplexes is increased and thus the number of template nucleic acids available to undergo the subsequent sequencing reaction is increased.
  • any method known in the art to fragment or shear nucleic acids may be used.
  • at least one restriction enzyme is added to the sample to digest the template nucleic acids to increase the number of 3' ends available for the tailing reaction.
  • methods of the invention optimize the reaction conditions and hybridization conditions for sequencing of a small quantity of template nucleic acids.
  • the invention generally relates to methods for sequencing small quantities of nucleic acids.
  • methods of the invention involve obtaining a sample including template nucleic acid, introducing a carrier molecule to the sample, attaching an oligonucleotide tail to the template nucleic acid in the presence of the carrier molecule, introducing the tailed template nucleic acid to primers to form template/primer duplexes, exposing the duplexes to at least one detectably labeled nucleotide in the presence of a polymerase under conditions that permit the polymerase to add nucleotides to the primer in a template-dependent manner, detecting a signal from the label of the incorporated labeled nucleotide, and sequentially repeating the exposing and detecting steps at least once.
  • the carrier molecule acts to stabilize the tailing reaction, allowing use of higher concentrations of enzyme and dNTPs. By stabilizing the reaction, a greater number of template nucleic acid molecules will have an oligonucleotide attached.
  • because the oligonucleotide tail acts as a primer binding site, increasing the number of template nucleic acids that successfully receive an oligonucleotide tail increases the number of template/primer duplexes that can form.
  • Increasing the number of template/primer duplexes increases the number of template nucleic acids available to undergo the subsequent sequencing.
  • the carrier molecule may be any molecule that stabilizes the tailing reaction, allowing for addition of higher concentrations of enzyme and dNTPs.
  • the carrier molecule is an RNA oligonucleotide.
  • the RNA oligonucleotides may be added to the tailing reaction along with the template nucleic acids. Because the RNA oligonucleotides do not have a free 3' end, the RNA oligonucleotides do not receive an oligonucleotide tail, and thus do not form a duplex with the primers and do not undergo the sequencing reaction.
  • the carrier molecule could also be a solid support bound oligonucleotide, such as a bead bound oligonucleotide. Any bead known in the art may be used.
  • the beads are magnetic Dynabeads (Invitrogen).
  • Methods of attaching oligonucleotides to beads are known in the art, such as covalently attaching the oligonucleotides to the beads.
  • the bead bound oligonucleotides may be separated from the sample including the template nucleic acids, thus the bead bound oligonucleotides do not participate in the sequencing reaction.
  • Separating the bead bound oligonucleotides from the sample may be accomplished by any technique known in the art and will depend on the type of beads used. See for example, Sambrook et al. (Molecular Cloning - A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York, vol. 2, 1989), the content of which is incorporated by reference herein in its entirety. In other aspects, methods of the invention involve increasing the number of 3' ends of template nucleic acids in a sample.
  • Those methods of the invention involve obtaining a sample including template nucleic acids, increasing the number of 3' ends of template nucleic acid in the sample, attaching an oligonucleotide tail to the 3' ends of the template nucleic acids, introducing the tailed template nucleic acid to primers to form template/primer duplexes, exposing the duplexes to at least one detectably labeled nucleotide in the presence of a polymerase under conditions that permit the polymerase to add nucleotides to the primer in a template-dependent manner, detecting a signal from the label of the incorporated labeled nucleotide, and sequentially repeating the exposing and detecting steps at least once.
  • the number of template nucleic acids that may receive an oligonucleotide tail is increased.
  • the number of template nucleic acids that may form template primer duplexes is increased and thus the number of template nucleic acids available to undergo the subsequent sequencing reaction is increased.
  • template nucleic acids are able to be fragmented or sheared, using a variety of mechanical, chemical and/or enzymatic methods.
  • DNA may be randomly sheared via sonication, e.g. Covaris method, brief exposure to a DNase, or using a mixture of one or more restriction enzymes.
  • RNA may be fragmented by brief exposure to an RNase, heat plus magnesium, or by shearing. The RNA may be converted to cDNA before or after fragmentation.
  • the number of 3' ends of template nucleic acids in the sample is increased by using sequence-specific restriction enzymes.
  • Exemplary restriction enzymes include, but are not limited to: AflII; ApaLI; BglII; NcoI; NdeI; MluI; PacI; BamHI; EcoRI; Bsu36I; XbaI; and PvuII.
  • only a single restriction enzyme is added to the sample.
  • a combination of different restriction enzymes is added to the sample.
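
As a rough illustration of why digestion increases the number of 3' ends available for tailing, the sketch below counts recognition sites in a sequence and derives the number of 3' ends after digestion. It assumes a linear double-stranded molecule and complete digestion, so that n cuts give n + 1 fragments with two 3' ends each. The function names are hypothetical, and only two of the listed enzymes are included, with their standard recognition sequences (EcoRI: GAATTC; BamHI: GGATCC).

```python
import re

# Recognition sequences (5'->3') for two of the enzymes named in the text.
RECOGNITION_SITES = {"EcoRI": "GAATTC", "BamHI": "GGATCC"}


def count_sites(sequence: str, enzymes: list[str]) -> int:
    """Count recognition sites for the given enzymes on one strand of the sequence."""
    seq = sequence.upper()
    total = 0
    for enzyme in enzymes:
        site = RECOGNITION_SITES[enzyme]
        # A lookahead pattern counts every start position, including overlaps.
        total += len(re.findall(f"(?={site})", seq))
    return total


def three_prime_ends_after_digest(sequence: str, enzymes: list[str]) -> int:
    """Assume a linear duplex and complete digestion: n cuts -> n + 1 fragments."""
    fragments = count_sites(sequence, enzymes) + 1
    return 2 * fragments  # each double-stranded fragment has two 3' ends


if __name__ == "__main__":
    demo = "AAAGAATTCAAAGGATCCAAAGAATTCAAA"
    for enzymes in ([], ["EcoRI"], ["EcoRI", "BamHI"]):
        print(enzymes, three_prime_ends_after_digest(demo, enzymes))
```

Adding a second enzyme to the mix raises the count further, which matches the idea of using either a single enzyme or a combination.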
  • Oligonucleotide tailing is described for example in Steinman et al. (International patent application number PCT/US09/64001), the content of which is incorporated by reference herein in its entirety.
  • the oligonucleotide tails act as primer binding sites.
  • the primer binding site may be used to hybridize the template nucleic acid molecule to a sequencing primer, which may optionally be anchored to a substrate.
  • the primer binding sequence may be a unique sequence including at least 2 bases but likely contains a unique order of all 4 bases and is generally 20-50 bases in length.
  • One example of a specific sequence binding primer is: 5'-CAG GGC AGA GGA TGG ATG CAA GGA TAA GTG GA-3' (SEQ ID NO: 1).
  • the primer binding sequence is a homopolymer of a single base, e.g. polyA, generally 20 - 200 bases in length.
  • the oligonucleotide tail also may include a blocker, e.g., a chain terminating nucleotide, on the 3'-end.
  • the blocker prevents unintended sequence information from being obtained using the 3'-end of the primer binding site inadvertently as a second sequencing primer, particularly when using homopolymeric primer sequences.
  • the blocker may be any moiety that prevents a polymerase from adding bases during incubation with dNTPs.
  • An exemplary blocker is a nucleotide terminator that lacks a 3'-OH, i.e., a dideoxynucleotide (ddNTP).
  • nucleotide terminators are 2',3'-dideoxynucleotides, 3'-aminonucleotides, 3'-deoxynucleotides, 3'-azidonucleotides, acyclonucleotides, etc.
  • the blocker may have attached a detectable label, e.g. a fluorophore.
  • the label may be attached via a labile linkage, e.g., a disulfide, so that following hybridization of the bar coded template nucleic acid to the surface, the locations of the template nucleic acids may be identified by imaging.
  • the detectable label is removed before commencing with sequencing.
  • the cleaved product may or may not require further chemical modification to prevent undesirable side reactions, for example following cleavage of a disulfide by TCEP the produced reactive thiol is blocked with iodoacetamide.
  • Methods of the invention involve attaching the oligonucleotide tail to the template nucleic acid molecules.
  • the oligonucleotide tail is attached to the template nucleic acid molecule with an enzyme, such as terminal transferase.
  • the enzyme may be a ligase or a polymerase.
  • the ligase may be any enzyme capable of ligating an oligonucleotide (RNA or DNA) to the template nucleic acid molecule.
  • Suitable ligases include T4 DNA ligase and T4 RNA ligase (such ligases are available commercially, from New England Biolabs). Methods for using ligases are well known in the art.
  • the polymerase may be any enzyme capable of adding nucleotides to the 3' terminus of template nucleic acid molecules.
  • the polymerase may be, for example, yeast poly(A) polymerase, commercially available from USB.
  • the polymerase is used according to the manufacturer's instructions.
  • the enzyme is a terminal transferase, which is commercially available from New England Biolabs. The enzyme is used according to the manufacturer's instructions.
  • the ligation may be blunt ended or via use of complementary over hanging ends.
  • the ends of the template nucleic acids are repaired, trimmed (e.g. using an exonuclease), or filled (e.g., using a polymerase and dNTPs), to form blunt ends.
  • the ends may be treated with a polymerase and dATP to form a template independent addition to the 3'-end of the template nucleic acids, thus producing a single overhanging A. This single A is used to guide ligation of fragments with a single T overhanging from the 5'-end in a method referred to as T-A cloning.
  • the ends may be left as is, i.e., ragged ends.
  • double stranded oligonucleotides with complementary over hanging ends are used.
  • the A:T single base over hang method is used (see Steinman et al., International patent application number PCT/US09/64001).
  • a substrate has anchored a reverse complement to the primer binding sequence of the oligonucleotide, for example 5'-TC CAC TTA TCC TTG CAT CCA TCC TCT GCC CTG (SEQ ID NO: 2) or a polyT(50).
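
The relationship between the two listed sequences can be checked with a short script. This is a minimal sketch using only the sequences quoted in the text; the helper name `reverse_complement` is illustrative and the code handles only the four unmodified DNA bases.

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, ignoring spaces."""
    cleaned = seq.replace(" ", "").upper()
    return cleaned.translate(COMPLEMENT)[::-1]


# Sequences as written in the text (5'->3'), with the original triplet spacing.
SEQ_ID_NO_1 = "CAG GGC AGA GGA TGG ATG CAA GGA TAA GTG GA"
SEQ_ID_NO_2 = "TC CAC TTA TCC TTG CAT CCA TCC TCT GCC CTG"

if __name__ == "__main__":
    # The surface-anchored oligonucleotide should pair with the primer binding sequence.
    print(reverse_complement(SEQ_ID_NO_1) == SEQ_ID_NO_2.replace(" ", ""))
    # A poly(A) tail pairs with a poly(T) capture oligonucleotide of the same length.
    print(reverse_complement("A" * 50) == "T" * 50)
```

Both comparisons hold: SEQ ID NO: 2 is the exact reverse complement of SEQ ID NO: 1, and a 50-base poly(A) stretch is complementary to polyT(50).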
  • the sample is washed and the polymerase is incubated with one or two dNTPs complementary to the base(s) used in the lock sequence.
  • the fill and lock can also be performed in a single step process in which polymerase, TTP and one or two reversible terminators (complements of the lock bases) are mixed together and incubated.
  • the reversible terminators stop addition during this stage and can be made functional again (reversal of inhibitory mechanism) by treatments specific to the analogs used.
  • Some reversible terminators have functional blocks on the 3'-OH which need to be removed while others, for example Helicos BioSciences Virtual Terminators have inhibitors attached to the base via a disulfide which can be removed by treatment with TCEP.
  • the sequencing method is a single molecule sequencing by synthesis method.
  • Single molecule sequencing is shown for example in Lapidus et al. (U.S. patent number 7,169,560), Lapidus et al. (U.S. patent application number 2009/0191565), Quake et al. (U.S. patent number 6,818,395), Harris (U.S. patent number 7,282,337), Quake et al. (U.S. patent application number 2002/0164629), and Braslavsky, et al., PNAS (USA), 100: 3960-3964 (2003), the content of each of which is incorporated by reference herein in its entirety.
  • a single-stranded nucleic acid (e.g., DNA or cDNA) is hybridized to oligonucleotides attached to a surface of a flow cell.
  • the oligonucleotides may be covalently attached to the surface or various attachments other than covalent linking as known to those of ordinary skill in the art may be employed.
  • the attachment may be indirect, e.g., via a polymerase directly or indirectly attached to the surface.
  • the surface may be planar or otherwise, and/or may be porous or non-porous, or any other type of surface known to those of ordinary skill to be suitable for attachment.
  • the nucleic acid is then sequenced by imaging the polymerase-mediated addition of fluorescently-labeled nucleotides incorporated into the growing strand surface oligonucleotide, at single molecule resolution.
  • the nucleotides used in the sequencing reaction are not chain terminating nucleotides.
  • Nucleic acid templates include deoxyribonucleic acid (DNA) and/or ribonucleic acid (RNA). Nucleic acid templates can be synthetic or derived from naturally occurring sources. In one embodiment, nucleic acid template molecules are isolated from a biological sample containing a variety of other components, such as proteins, lipids and non-template nucleic acids. Nucleic acid template molecules can be obtained from any cellular material, obtained from an animal, plant, bacterium, fungus, or any other cellular organism. In certain embodiments, the nucleic acid templates are obtained from a single cell. Biological samples for use in the present invention include viral particles or preparations.
  • Nucleic acid template molecules can be obtained directly from an organism or from a biological sample obtained from an organism, e.g., from blood, urine, cerebrospinal fluid, seminal fluid, saliva, sputum, stool and tissue. Any tissue or body fluid specimen may be used as a source for nucleic acid for use in the invention. Nucleic acid template molecules can also be isolated from cultured cells, such as a primary cell culture or a cell line. The cells or tissues from which template nucleic acids are obtained can be infected with a virus or other intracellular pathogen. A sample can also be total RNA extracted from a biological specimen, a cDNA library, viral, or genomic DNA.
  • Nucleic acid obtained from biological samples typically is fragmented to produce suitable fragments for analysis.
  • nucleic acid from a biological sample is fragmented by sonication.
  • Nucleic acid template molecules can be obtained as described in U.S. Patent Application Publication Number US2002/0190663 A1, published Oct. 9, 2003.
  • nucleic acid can be extracted from a biological sample by a variety of techniques such as those described by Maniatis, et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor, N.Y., pp. 280-281 (1982).
  • individual nucleic acid template molecules can be from about 5 bases to about 20 kb.
  • Nucleic acid molecules may be single-stranded, double-stranded, or double-stranded with single-stranded regions (for example, stem- and loop-structures).
  • a biological sample as described herein may be homogenized or fractionated in the presence of a detergent or surfactant.
  • concentration of the detergent in the buffer may be about 0.05% to about 10.0%.
  • concentration of the detergent can be up to an amount where the detergent remains soluble in the solution. In a preferred embodiment, the concentration of the detergent is between 0.1% and about 2%.
  • the detergent particularly a mild one that is nondenaturing, can act to solubilize the sample.
  • Detergents may be ionic or nonionic.
  • examples of ionic detergents include deoxycholate, sodium dodecyl sulfate (SDS), N-lauroylsarcosine, and cetyltrimethylammoniumbromide (CTAB).
  • a zwitterionic reagent may also be used in the purification schemes of the present invention, such as CHAPS, zwitterion 3-14, and 3-[(3-cholamidopropyl)dimethylammonio]-1-propanesulfonate. It is contemplated also that urea may be added with or without another detergent or surfactant.
  • Lysis or homogenization solutions may further contain other agents, such as reducing agents.
  • reducing agents include dithiothreitol (DTT), β-mercaptoethanol, DTE, GSH, cysteine, cysteamine, tricarboxyethyl phosphine (TCEP), or salts of sulfurous acid.
  • Nucleotides useful in the invention include any nucleotide or nucleotide analog, whether naturally-occurring or synthetic.
  • preferred nucleotides include phosphate esters of deoxyadenosine, deoxycytidine, deoxyguanosine, deoxythymidine, adenosine, cytidine, guanosine, and uridine.
  • nucleotides useful in the invention comprise an adenine, cytosine, guanine, thymine base, a xanthine or hypoxanthine; 5-bromouracil, 2-aminopurine, deoxyinosine, or methylated cytosine, such as 5-methylcytosine, and N4-methoxydeoxycytosine.
  • bases of polynucleotide mimetics such as methylated nucleic acids, e.g., 2'-O-methRNA, peptide nucleic acids, modified peptide nucleic acids, locked nucleic acids and any other structural moiety that can act substantially like a nucleotide or base, for example, by exhibiting base-complementarity with one or more bases that occur in DNA or RNA and/or being capable of base-complementary incorporation, and includes chain-terminating analogs.
  • a nucleotide corresponds to a specific nucleotide species if they share base-complementarity with respect to at least one base.
  • Nucleotides for nucleic acid sequencing according to the invention preferably include a detectable label that is directly or indirectly detectable.
  • Preferred labels include optically-detectable labels, such as fluorescent labels.
  • fluorescent labels include, but are not limited to, 4-acetamido-4'-isothiocyanatostilbene-2,2'-disulfonic acid; acridine and derivatives: acridine, acridine isothiocyanate; 5-(2'-aminoethyl)aminonaphthalene-1-sulfonic acid (EDANS); 4-amino-N-[3-(vinylsulfonyl)phenyl]naphthalimide-3,5 disulfonate; N-(4-anilino-1-naphthyl)maleimide; anthranilamide; BODIPY; Brilliant Yellow; coumarin and derivatives: coumarin, 7-amino-4-methylcoumarin (AMC, Coumarin 120), 7
  • Nucleic acid polymerases generally useful in the invention include DNA polymerases, RNA polymerases, reverse transcriptases, and mutant or altered forms of any of the foregoing. DNA polymerases and their properties are described in detail in, among other places, DNA Replication 2nd edition, Kornberg and Baker, W. H. Freeman, New York, N. Y. (1991).
  • Known conventional DNA polymerases useful in the invention include, but are not limited to, Pyrococcus furiosus (Pfu) DNA polymerase (Lundberg et al., 1991, Gene, 108: 1, Stratagene), Pyrococcus woesei (Pwo) DNA polymerase (Hinnisdaels et al., 1996, Biotechniques, 20:186-8, Boehringer Mannheim), Thermus thermophilus (Tth) DNA polymerase (Myers and Gelfand 1991, Biochemistry 30:7661), Bacillus stearothermophilus DNA polymerase (Stenesh and McGowan, 1977, Biochim Biophys Acta 475:32), Thermococcus litoralis (Tli) DNA polymerase (also referred to as Vent™ DNA polymerase, Cariello et al., 1991, Nucleic Acids Res, 19: 4193, New England Biolabs), 9°Nm™ DNA polymerase (New England Biolabs), Stoffel fragment, ThermoSequenase® (Amersham Pharmacia Biotech UK), Therminator™ (New England Biolabs), Thermotoga maritima (Tma) DNA polymerase (Diaz and Sabino, 1998 Braz J. Med. Res, 31:1239), Thermus aquaticus (Taq) DNA polymerase (Chien et al., 1976, J.
  • Thermophilic DNA polymerases include, but are not limited to, ThermoSequenase®, 9°Nm™, Therminator™, Taq, Tne, Tma, Pfu, Tfl, Tth, Tli, Stoffel fragment, Vent™ and Deep Vent™ DNA polymerase, KOD DNA polymerase, Tgo, JDF-3, and mutants, variants and derivatives thereof.
  • a highly-preferred form of any polymerase is a 3' exonuclease-deficient mutant.
  • Reverse transcriptases useful in the invention include, but are not limited to, reverse transcriptases from HIV, HTLV-I, HTLV-II, FeLV, FIV, SIV, AMV, MMTV, MoMuLV and other retroviruses (see Levin, Cell 88:5-8 (1997); Verma, Biochim Biophys Acta. 473:1-38 (1977); Wu et al., CRC Crit. Rev Biochem. 3:289-347 (1975)).

Attachment

  • nucleic acid template molecules are attached to a substrate (also referred to herein as a surface) and subjected to analysis by single molecule sequencing as described herein. Nucleic acid template molecules are attached to the surface such that the template/primer duplexes are individually optically resolvable.
  • Substrates for use in the invention can be two- or three-dimensional and can comprise a planar surface (e.g., a glass slide) or can be shaped.
  • a substrate can include glass (e.g., controlled pore glass (CPG)), quartz, plastic (such as polystyrene (low cross-linked and high cross-linked polystyrene), polycarbonate, polypropylene and poly(methyl methacrylate)), acrylic copolymer, polyamide, silicon, metal (e.g., alkanethiolate-derivatized gold), cellulose, nylon, latex, dextran, gel matrix (e.g., silica gel), polyacrolein, or composites.
  • Suitable three-dimensional substrates include, for example, spheres, microparticles, beads, membranes, slides, plates, micromachined chips, tubes (e.g., capillary tubes), microwells, microfluidic devices, channels, filters, or any other structure suitable for anchoring a nucleic acid.
  • Substrates can include planar arrays or matrices capable of having regions that include populations of template nucleic acids or primers. Examples include nucleoside-derivatized CPG and polystyrene slides; derivatized magnetic slides; polystyrene grafted with polyethylene glycol, and the like.
  • Substrates are preferably coated to allow optimum optical processing and nucleic acid attachment. Substrates for use in the invention can also be treated to reduce background. Exemplary coatings include epoxides, and derivatized epoxides (e.g., with a binding molecule, such as an oligonucleotide or streptavidin).
  • Various methods can be used to anchor or immobilize the nucleic acid molecule to the surface of the substrate.
  • the immobilization can be achieved through direct or indirect bonding to the surface.
  • the bonding can be by covalent linkage. See, Joos et al., Analytical Biochemistry 247:96-101, 1997; Oroskar et al., Clin. Chem. 42:1547-1555, 1996; and Khandjian, Mol. Bio. Rep. 11:107-115, 1986.
  • a preferred attachment is direct amine bonding of a terminal nucleotide of the template or the 5' end of the primer to an epoxide integrated on the surface.
  • the bonding also can be through non-covalent linkage.
  • biotin-streptavidin (Taylor et al., J. Phys. D. Appl. Phys. 24:1443, 1991) and digoxigenin with anti-digoxigenin (Smith et al., Science 253:1122, 1992) are common tools for anchoring nucleic acids to surfaces and parallels.
  • the attachment can be achieved by anchoring a hydrophobic chain into a lipid monolayer or bilayer.
  • Other methods known in the art for attaching nucleic acid molecules to substrates also can be used.

Detection

  • exemplary detection methods include radioactive detection, optical absorbance detection, e.g., UV-visible absorbance detection, optical emission detection, e.g., fluorescence or chemiluminescence.
  • extended primers can be detected on a substrate by scanning all or portions of each substrate simultaneously or serially, depending on the scanning method used.
  • for fluorescence labeling, selected regions on a substrate may be serially scanned one-by-one or row-by-row using a fluorescence microscope apparatus, such as described in Fodor (U.S. Pat. No. 5,445,934) and Mathies et al. (U.S. Pat. No. 5,091,652).
  • Devices capable of sensing fluorescence from a single molecule include the scanning tunneling microscope (STM) and the atomic force microscope (AFM). Hybridization patterns may also be scanned using a CCD camera (e.g., Model TE/CCD512SF, Princeton Instruments, Trenton, N.J.) with suitable optics (Ploem, in Fluorescent and Luminescent Probes for Biological Activity, Mason, T. G. Ed., Academic Press, London, pp. 1-11 (1993)), such as described in Yershov et al., Proc. Natl. Acad. Sci. 93:4913 (1996), or may be imaged by TV monitoring.
  • For radioactive signals, a phosphorimager device can be used (Johnston et al., Electrophoresis, 13:566, 1990; Drmanac et al., Electrophoresis, 13:566, 1992; 1993).
  • Other commercial suppliers of imaging instruments include General Scanning Inc., (Watertown, Mass. on the World Wide Web at genscan.com), Genix Technologies (Waterloo, Ontario, Canada; on the World Wide Web at confocal.com), and Applied Precision Inc. Such detection methods are particularly useful to achieve simultaneous scanning of multiple attached template nucleic acids.
  • Optical setups include near-field scanning microscopy, far-field confocal microscopy, wide-field epi-illumination, light scattering, dark field microscopy, photoconversion, single and/or multiphoton excitation, spectral wavelength discrimination, fluorophor identification, evanescent wave illumination, and total internal reflection fluorescence (TIRF) microscopy.
  • certain methods involve detection of laser-activated fluorescence using a microscope equipped with a camera.
  • Suitable photon detection systems include, but are not limited to, photodiodes and intensified CCD cameras.
  • an intensified charge-coupled device (ICCD) camera can be used.
  • the use of an ICCD camera to image individual fluorescent dye molecules in a fluid near a surface provides numerous advantages. For example, with an ICCD optical setup, it is possible to acquire a sequence of images (movies) of fluorophores.
  • TIRF microscopy uses totally internally reflected excitation light and is well known in the art. See, e.g., the World Wide Web at nikon-instruments.jp/eng/page/products/tirf.aspx.
  • detection is carried out using evanescent wave illumination and total internal reflection fluorescence microscopy.
  • An evanescent light field can be set up at the surface, for example, to image fluorescently-labeled nucleic acid molecules.
  • the optical field does not end abruptly at the reflective interface, but its intensity falls off exponentially with distance.
  • This surface electromagnetic field is called the “evanescent wave”.
  • the thin evanescent optical field at the interface provides low background and facilitates the detection of single molecules with high signal-to-noise ratio at visible wavelengths.
  • the evanescent field also can image fluorescently-labeled nucleotides upon their incorporation into the attached template/primer complex in the presence of a polymerase. Total internal reflectance fluorescence microscopy is then used to visualize the attached template/primer duplex and/or the incorporated nucleotides with single molecule resolution.
  • Some embodiments of the invention use non-optical detection methods such as, for example, detection using nanopores (e.g., protein or solid state) through which molecules are individually passed so as to allow identification of the molecules by noting characteristics or changes in various properties or effects such as capacitance or blockage current flow (see, for example, Stoddart et al., Proc. Nat. Acad. Sci., 106:7702, 2009; Purnell and Schmidt, ACS Nano, 3:2533, 2009; Branton et al., Nature Biotechnology, 26:1146, 2008; Polonsky et al., U.S. Application 2008/0187915; Mitchell & Howorka, Angew. Chem. Int. Ed. 47:5565, 2008; Borsenberger et al., J. Am. Chem. Soc., 131:7530, 2009); or other suitable non-optical detection methods.
  • Alignment and/or compilation of sequence results obtained from the image stacks produced as generally described above utilizes look-up tables that take into account possible sequence changes (due, e.g., to errors, mutations, etc.). Essentially, sequencing results obtained as described herein are compared to a look-up table that contains all possible reference sequences plus 1 or 2 base errors.
  • RNA template capture involves attaching a poly(A) tail to a 3' end of an RNA molecule. Attaching may be by enzymatic methods, such as using E. coli poly(A) polymerase I or yeast poly(A) polymerase. Adding the oligonucleotide tail of poly(A) to the RNA molecule may also be accomplished by other methods described herein. The length of the poly(A) tail may be controlled by introducing 3'-deoxyATP (cordycepin triphosphate) to the poly-adenylation reaction shortly after the start of the tailing reaction.
  • the length of the poly(A) tail is controlled by the amount of time allowed to elapse between the start of the poly-adenylation reaction and the addition of the 3'-deoxyATP (cordycepin triphosphate).
  • the 3' end block prevents "downward" extension during sequencing when the template is hybridized to dT(50) flow cells.
  • RNA clean-up after the A-tailing and blocking reactions is performed by phenol/chloroform extraction and ethanol precipitation.
  • commercial column-based RNA clean-up kits can be used, depending on the RNA sample being processed. This clean-up step is non-essential for sequencing purposes.
  • After tailing, the tailed RNA is introduced to primers and template/primer duplexes are formed. The duplexes then undergo the sequencing reaction as described herein. Further description is provided in Kahvejian (U.S. patent application number 2008/0081330), the content of which is incorporated by reference herein in its entirety.
  • Methods of the invention involve attaching a poly(A) tail to an RNA molecule and introducing the tailed RNA molecule to a solid support having poly(T) primers, thereby forming template RNA/primer duplexes attached to the solid support.
  • a sequencing reaction is then performed on at least a portion of the RNA molecule to obtain a first read. The sequencing reaction is performed as described herein.
  • RNA template is removed by methods known in the art, such as exposing the duplex to hot water.
  • the complementary DNA generated during the copy step remains on the surface because it is extended from the covalently attached poly(T) oligonucleotide.
  • An oligonucleotide of poly(G) is then added to the 3' end of the cDNA. Adding the oligonucleotide tail of poly(G) may be accomplished by methods described herein, such as using a terminal transferase enzyme.
  • the poly(G) tail is then blocked using a ddGTP. Blocking is described herein.
  • a poly(C) primer is then hybridized to the poly(G) tail and a second sequencing reaction is performed. The second sequencing reaction is conducted as described herein.
  • the poly(G) tail is not necessary, because the first sequencing reaction can provide the sequence of the terminal portion of the cDNA, which information may be used to design a primer that will hybridize to the terminal portion of the cDNA for the second sequencing reaction. Further description is provided in Harris (U.S. patent application number 2009/0053705), Harris (U.S. patent number 7,282,337), and Harris (U.S. patent application number 2009/0197257), the content of each of which is incorporated by reference herein in its entirety.
  • Example 1 Tailing of nucleic acids in presence of carrier molecule
  • the starting material for this protocol is about 5 to 6 nanograms of Chromatin Immunoprecipitated (ChIP) DNA, although as low as 3 nanograms of ChIP DNA may be used.
  • the fragment size of the DNA is about 400 to about 500 bp.
  • the reaction described below involves a poly(A) tailing step and a 3'-dideoxy-blocking step.
  • the tailing reaction is conducted in the presence of an RNA carrier oligonucleotide.
  • the reaction produces poly(A) tailed DNA, however the protocol described herein may be used to produce any type of tailed DNA by merely adjusting the reagents used in the reaction, as will be known to one of skill in the art.
  • Prior to conducting the tailing reaction on the DNA, RNA contamination may be removed by treating samples with RNase.
  • a commercially available kit (Qiagen MinElute PCR Purification Kit, catalog number 28004) may be used for sample clean-up prior to the tailing reaction.
  • ChIP DNA quantity may be determined with the Quant-iT PicoGreen dsDNA Reagent Kit (Invitrogen, catalog number P11495).
  • RNA ribonucleotide carrier IDT
  • the following mix is prepared: 2 µl of Terminal Transferase 10X buffer; 2 µl of CoCl2; 11.8 µl of ChIP DNA and nuclease-free water. The total volume is about 15.8 µl.
  • the mix is heated at 95°C for 5 minutes in the thermocycler to denature the DNA. After heating, the mix is cooled on the pre-chilled aluminum block that has been kept in an ice and water slurry (about 0°C) to obtain single-stranded DNA. It is important to chill the sample as quickly as possible to prevent re-annealing of the denatured, single-stranded DNA.
  • the following mix is added to the denatured DNA from above: 1 µl of Terminal Transferase (20 U/µl); 2 µl of 50 µM dATP; 1 µl of 1 µM RNA oligonucleotide carrier; and 0.2 µl of BSA.
  • the volume of this mix is 4.2 µl, bringing the total volume of the reaction to 20 µl. Mix well.
  • the tubes containing the mixture are placed in the thermocycler and the following program is run: 37°C for 1 hour; 70°C for 10 minutes; and temperature is brought back down to 4°C.
  • a poly(A) tail will now have been added to the DNA.
  • the following blocking mixture is added to the denatured poly-adenylated mixture from above: 1 µl of Terminal Transferase 10X buffer; 1 µl of CoCl2; 1 µl of Terminal Transferase (20 U/µl); 0.5 µl of 200 µM ddATP; and 6.5 µl of nuclease-free water.
  • the volume of this mix is 10 µl, bringing the total volume of the reaction to 30 µl. Mix well.
  • the tubes containing the mixture are placed in the thermocycler and the following program is run: 37°C for 1 hour; 70°C for 20 minutes; and temperature is brought back down to 4°C. A 3' end block will now have been added to the poly-adenylated DNA.
  • 2 picomoles of control oligonucleotide is added to the heat-inactivated 30 µl terminal transferase reaction above.
  • the control oligonucleotide is added to the sample to minimize ChIP DNA loss during sample loading steps.
  • the control oligonucleotide does not contain a poly(A) tail, and therefore will not hybridize to the flow cell surface.
  • the sample is now ready to be hybridized to the flow cells for the sequencing reaction. No additional clean-up step is required.
  • Example 2 Direct tailing of nucleic acids
  • the starting material for this protocol is about 5 to 6 nanograms of Chromatin Immunoprecipitated (ChIP) DNA, although as low as 3 nanograms of ChIP DNA may be used.
  • the fragment size of the DNA is about 400 to about 500 bp.
  • the reaction described below involves a poly(A) tailing step and a 3'-dideoxy-blocking step.
  • the tailing reaction is conducted in the presence of an RNA carrier oligonucleotide.
  • the reaction produces poly(A) tailed DNA, however the protocol described herein may be used to produce any type of tailed DNA by merely adjusting the reagents used in the reaction, as will be known to one of skill in the art.
  • Prior to conducting the tailing reaction on the DNA, RNA contamination may be removed by treating samples with RNase.
  • a commercially available kit (Qiagen MinElute PCR Purification Kit, catalog number 28004) may be used for sample clean-up prior to the tailing reaction.
  • ChIP DNA quantity may be determined with the Quant-iT PicoGreen dsDNA Reagent Kit (Invitrogen, catalog number P11495).
  • the following mix is prepared: 2 µl of Terminal Transferase 10X buffer; 2 µl of CoCl2; 10.8 µl of ChIP DNA and nuclease-free water. The total volume is about 14.8 µl.
  • the mix is heated at 95°C for 5 minutes in the thermocycler to denature the DNA. After heating, the mix is cooled on the pre-chilled aluminum block that has been kept in an ice and water slurry (about 0°C) to obtain single-stranded DNA. It is important to chill the sample as quickly as possible to prevent re-annealing of the denatured, single-stranded DNA.
  • the following mix is added to the denatured DNA from above: 1 µl of Terminal Transferase (1:10 diluted, 2 U/µl); 4 µl of 10 µM dATP; and 0.2 µl of BSA.
  • the volume of this mix is 5.2 µl, bringing the total volume of the reaction to 20 µl. Mix well.
  • the tubes containing the mixture are placed in the thermocycler and the following program is run: 37°C for 1 hour; 70°C for 10 minutes; and temperature is brought back down to 4°C. A poly(A) tail will now have been added to the DNA.
  • the following blocking mixture is added to the denatured poly-adenylated mixture from above: 1 µl of Terminal Transferase 10X buffer; 1 µl of CoCl2; 1 µl of Terminal Transferase (1:10 diluted, 2 U/µl); 1 µl of Terminal Transferase (20 U/µl); 1 µl of 200 µM ddATP; and 6 µl of nuclease-free water.
  • the volume of this mix is 10 µl, bringing the total volume of the reaction to 30 µl. Mix well.
  • the tubes containing the mixture are placed in the thermocycler and the following program is run: 37°C for 1 hour; 70°C for 20 minutes; and temperature is brought back down to 4°C. A 3' end block will now have been added to the poly-adenylated DNA.
  • 2 picomoles of control oligonucleotide is added to the heat-inactivated 30 µl terminal transferase reaction above.
  • the control oligonucleotide is added to the sample to minimize ChIP DNA loss during sample loading steps.
  • the control oligonucleotide does not contain a poly(A) tail, and therefore will not hybridize to the flow cell surface.
  • the sample is now ready to be hybridized to the flow cells for the sequencing reaction. No additional clean-up step is required.
  • Example 3 Sequencing nucleic acids
  • the 7249 nucleotide genome of the bacteriophage M13mp18 was sequenced using single molecule methods of the invention.
  • Purified, single-stranded viral M13mp18 genomic DNA was obtained from New England Biolabs. Approximately 25 µg of M13 DNA was digested to an average fragment size of 40 bp with 0.1 U DNase I (New England Biolabs) for 10 minutes at 37°C. Digested DNA fragment sizes were estimated by running an aliquot of the digestion mixture on a precast denaturing (TBE-Urea) 10% polyacrylamide gel (Novagen) and staining with SYBR Gold (Invitrogen/Molecular Probes).
  • the DNase I-digested genomic DNA was filtered through a YM10 ultrafiltration spin column (Millipore) to remove small digestion products less than about 30 nt. Approximately 20 pmol of the filtered DNase I digest was then polyadenylated with terminal transferase according to known methods (Roychoudhury, R. and Wu, R. 1980, Terminal transferase-catalyzed addition of nucleotides to the 3' termini of DNA. Methods Enzymol. 65(1):43-62). The average dA tail length was 50 ± 5 nucleotides. Terminal transferase was then used to label the fragments with Cy3-dUTP.
  • Epoxide-coated glass slides were prepared for oligo attachment.
  • Epoxide-functionalized 40 mm diameter #1.5 glass cover slips (slides) were obtained from Erie Scientific (Salem, N.H.).
  • the slides were preconditioned by soaking in 3xSSC for 15 minutes at 37°C.
  • a 500 ⁇ M aliquot of 5' aminated poly(dT50) primer (polythymidine of 50 nucleotides in length with a 5' terminal amine) is incubated with each slide for 30 minutes at room temperature in a volume of 80 ml.
  • the resulting slides have primer attached by direct amine linkage to the epoxide.
  • slides are then treated with phosphate (1 M) for 4 hours at room temperature in order to passivate the surface.
  • Slides are then stored in polymerase rinse buffer (20 mM Tris, 100 mM NaCl, 0.001% Triton X-100, pH 8.0) until they are used for sequencing.
  • the slides are placed in a modified FCS2 flow cell (Bioptechs, Butler, Pa.) using a 50 µm thick gasket.
  • the flow cell is placed on a movable stage that is part of a high-efficiency fluorescence imaging system built around a Nikon TE-2000 inverted microscope equipped with a total internal reflection (TIR) objective.
  • the slide is then rinsed with HEPES buffer with 100 mM NaCl and equilibrated to a temperature of 50°C.
  • An aliquot of poly(dT50) template is placed in the flow cell and incubated on the slide for 15 minutes.
  • the flow cell is rinsed with 1xSSC/HEPES/0.1% SDS followed by HEPES/NaCl.
  • a passive vacuum apparatus is used to pull fluid across the flow cell.
  • the resulting slide contains M13 template/primer duplex.
  • the temperature of the flow cell is then reduced to 37°C for sequencing and the objective is brought into contact with the flow cell.
  • cytidine triphosphate, guanosine triphosphate, adenosine triphosphate, and uridine triphosphate, each having a cyanine-5 label (at the 7-deaza position for ATP and GTP and at the C5 position for CTP and UTP) (PerkinElmer), are stored separately in buffer containing 20 mM Tris-HCl, pH 8.8, 10 mM MgSO4, 10 mM (NH4)2SO4, 10 mM HCl, and 0.1% Triton X-100, and 100 U Klenow exo- polymerase (NEN). Sequencing proceeds as follows.
  • initial imaging is used to determine the positions of duplex on the epoxide surface.
  • the Cy3 label attached to the M13 templates is imaged by excitation using a laser tuned to 532 nm radiation (Verdi V-2 Laser, Coherent, Inc., Santa Clara, Calif.) in order to establish duplex position. For each slide only single fluorescent molecules imaged in this step are counted. Imaging of incorporated nucleotides as described below is accomplished by excitation of a cyanine-5 dye using a 635 nm radiation laser (Coherent). 5 µM Cy5CTP is placed into the flow cell and exposed to the slide for 2 minutes.
  • the slide is rinsed in 1xSSC/15 mM HEPES/0.1% SDS/pH 7.0 ("SSC/HEPES/SDS") (15 times in 60 µl volumes each), followed by 150 mM HEPES/150 mM NaCl/pH 7.0 ("HEPES/NaCl") (10 times at 60 µl volumes).
  • An oxygen scavenger containing 30% acetonitrile and scavenger buffer (134 µl HEPES/NaCl, 24 µl 100 mM Trolox in MES, pH 6.1, 10 µl DABCO in MES, pH 6.1, 8 µl 2 M glucose, 20 µl NaI (50 mM stock in water), and 4 µl glucose oxidase) is next added.
  • the slide is then imaged (500 frames) for 0.2 seconds using an Inova301K laser (Coherent) at 647 nm, followed by green imaging with a Verdi V-2 laser (Coherent) at 532 nm for 2 seconds to confirm duplex position. The positions having detectable fluorescence are recorded.
  • the flow cell is rinsed 5 times each with SSC/HEPES/SDS (60 µl) and HEPES/NaCl (60 µl).
  • the cyanine-5 label is cleaved off incorporated CTP by introduction into the flow cell of 50 mM TCEP for 5 minutes, after which the flow cell is rinsed 5 times each with SSC/HEPES/SDS (60 µl) and HEPES/NaCl (60 µl).
  • the remaining nucleotide is capped with 50 mM iodoacetamide for 5 minutes followed by rinsing 5 times each with SSC/HEPES/SDS (60 µl) and HEPES/NaCl (60 µl).
  • the scavenger is applied again in the manner described above, and the slide is again imaged to determine the effectiveness of the cleave/cap steps and to identify non-incorporated fluorescent objects.
  • the image stack data i.e., the single molecule sequences obtained from the various surface-bound duplex
  • the image data obtained can be compressed to collapse homopolymeric regions.
  • the sequence "TCAAAGC" is represented as "TCAGC" in the data tags used for alignment.
  • homopolymeric regions in the reference sequence are collapsed for alignment.
  • the alignment algorithm matches sequences obtained as described above with the actual M13 linear sequence. Placement of obtained sequence on M13 is based upon the best match between the obtained sequence and a portion of M13 of the same length, taking into consideration 0, 1, or 2 possible errors. All obtained 9-mers with 0 errors (meaning that they exactly match a 9-mer in the M13 reference sequence) are first aligned with M13. Then 10-, 11-, and 12-mers with 0 or 1 error are aligned. Finally, all 13-mers or greater with 0, 1, or 2 errors are aligned.
  • the image data obtained can be compressed to collapse homopolymeric regions as described above.
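
The alignment procedure described in Example 3 (homopolymer collapse followed by length-dependent error tolerance) can be illustrated with a minimal Python sketch. It is not the alignment software used for the M13 data. The function names are hypothetical, errors are assumed to be substitutions only, read length is assumed to be measured after collapsing, and reads shorter than 9 bases are assumed to be left unplaced.

```python
from itertools import groupby
from typing import Optional, Tuple


def collapse_homopolymers(seq: str) -> str:
    """Collapse each run of identical bases to one base, e.g. TCAAAGC -> TCAGC."""
    return "".join(base for base, _ in groupby(seq))


def allowed_errors(read_len: int) -> Optional[int]:
    """Error budget by read length, following the tiers in the text.

    Assumption: reads shorter than 9 bases are not placed at all.
    """
    if read_len < 9:
        return None
    if read_len == 9:
        return 0
    if read_len <= 12:
        return 1
    return 2


def mismatches(a: str, b: str) -> int:
    """Count substitution differences between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def align(read: str, reference: str) -> Optional[Tuple[int, int]]:
    """Return (position, errors) of the best placement, or None.

    Both read and reference are collapsed first; the position refers to
    the collapsed reference. Ties go to the leftmost placement.
    """
    read_c = collapse_homopolymers(read)
    ref_c = collapse_homopolymers(reference)
    budget = allowed_errors(len(read_c))
    if budget is None:
        return None
    best = None
    for pos in range(len(ref_c) - len(read_c) + 1):
        err = mismatches(read_c, ref_c[pos:pos + len(read_c)])
        if err <= budget and (best is None or err < best[1]):
            best = (pos, err)
    return best
```

A production aligner would index the reference (for example with the look-up tables mentioned above) rather than scan every window; the linear scan is used here only to keep the tiers easy to read.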

Abstract

The invention generally relates to methods for sequencing small quantities of nucleic acids. Certain embodiments of the invention involve attaching oligonucleotide tails to template nucleic acids in the presence of carrier molecules, which carrier molecules stabilize the tailing reaction and allow for addition of higher concentrations of enzyme and dNTPs, thereby increasing the number of template nucleic acids available for a sequencing reaction. Other embodiments increase the number of 3' ends of template nucleic acids in a sample, thereby increasing the number of template nucleic acids available for a sequencing reaction. Still other embodiments of the invention optimize reaction conditions of the sequencing reaction.

Description

SEQUENCING SMALL QUANTITIES OF NUCLEIC ACIDS
Related Application
The present application claims the benefit of and priority to U.S. provisional patent application serial number 61/153,548, filed February 18, 2009, the content of which is incorporated by reference herein in its entirety.
Field of the Invention
The invention generally relates to methods for sequencing small quantities of nucleic acids.
Background
Sequencing-by-synthesis involves template-dependent addition of nucleotides to a template/primer duplex. Nucleotide addition is mediated by a polymerase enzyme and added nucleotides may be labeled in order to facilitate their detection. Single molecule sequencing has been used to obtain high-throughput sequence information on individual DNA or RNA. See Braslavsky, Proc. Natl. Acad. Sci. USA 100: 3960-64 (2003). All four Watson-Crick nucleotides may be added simultaneously, each with a different detectable label, or nucleotides may be added one at a time in a step-and-repeat manner for imaging incorporations.
Although in most applications of this technology the amount of template nucleic acid is not limiting, a number of applications start from small quantities of nucleic acid. For example, when bacteria that cannot be cultured (Rappe et al., Annu. Rev. Microbiol. 57:369-394, 2003) or when cDNA libraries from a small number of cells (Schutze et al., Nat. Biotechnol. 16:737-742, 1998) are sequenced, template nucleic acid amounts limit the number of sequences that may be determined.
There is a need for methods that allow for analysis of samples that include only a small quantity of nucleic acid.
Summary
Methods of the invention allow for very small quantities (e.g., nanogram, picogram, or femtogram amounts) of nucleic acids to be analyzed by sequencing methodologies. In particular embodiments, methods of the invention analyze nucleic acids obtained from only a single cell. Methods of the invention are accomplished by increasing availability of nucleic acids in a sample to undergo a sequencing reaction.
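
To give a sense of scale for these quantities, the following Python sketch converts a mass of double-stranded DNA into an approximate number of template molecules. It is illustrative only: the function name is hypothetical, the figure of about 650 g/mol per base pair is a common approximation rather than a value from this document, and the 500 bp fragment length is an assumed input.

```python
AVOGADRO = 6.02214076e23   # molecules per mole
AVG_BP_MASS = 650.0        # g/mol per base pair of dsDNA (approximation)


def dsdna_molecules(mass_ng: float, fragment_bp: float) -> float:
    """Approximate number of dsDNA fragments in a given mass.

    mass_ng: total DNA mass in nanograms.
    fragment_bp: average fragment length in base pairs (assumed uniform).
    """
    moles = (mass_ng * 1e-9) / (fragment_bp * AVG_BP_MASS)
    return moles * AVOGADRO


if __name__ == "__main__":
    # Nanogram, picogram, and femtogram amounts of 500 bp fragments.
    for label, mass_ng in [("1 ng", 1.0), ("1 pg", 1e-3), ("1 fg", 1e-6)]:
        print(f"{label}: about {dsdna_molecules(mass_ng, 500):.3g} molecules")
```

Because the count scales linearly with mass, each thousand-fold drop in input removes three orders of magnitude of available templates, which is why the fraction of templates that successfully form duplexes matters at low input.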
In certain aspects, methods of the invention involve obtaining a sample including template nucleic acid, introducing a carrier molecule to the sample, attaching an oligonucleotide tail to the template nucleic acid in the presence of the carrier molecule, introducing the tailed template nucleic acid to primers to form template/primer duplexes, exposing the duplexes to at least one detectably labeled nucleotide in the presence of a polymerase under conditions that permit the polymerase to add nucleotides to the primer in a template-dependent manner, detecting a signal from the label of the incorporated labeled nucleotide, and sequentially repeating the exposing and detecting steps at least once. The carrier molecule stabilizes the tailing reaction, allowing for addition of higher concentrations of enzyme and dNTPs. By increasing the number of template nucleic acids that successfully receive an oligonucleotide tail, the number of template nucleic acids that may form template/primer duplexes is increased and thus the number of template nucleic acids available to undergo the subsequent sequencing reaction is increased.
The oligonucleotide tail may be any oligonucleotide sequence. In particular embodiments, the tail is a poly(A) tail. The tail may be any length, such as at least about 5 nucleotides, at least about 10 nucleotides, at least about 20 nucleotides, at least about 50 nucleotides, at least about 70 nucleotides, or at least about 150 nucleotides. In certain embodiments, the tail is 150 nucleotides.
The oligonucleotide tail may be attached to the template nucleic acids by any method known in the art. For example, the oligonucleotide tail may be ligated to the template nucleic acid. In certain embodiments, the oligonucleotide tail is attached by a terminal transferase enzyme.
The carrier molecule may be any molecule that stabilizes the tailing reaction, allowing for addition of higher concentrations of enzyme and dNTPs. Exemplary carrier molecules include RNA oligonucleotides and bead-bound oligonucleotides.
Once the template nucleic acids have had the oligonucleotide tail attached, the tailed template molecules are exposed to primers to form template/primer duplexes and the duplexes are then sequenced. In certain embodiments, the duplexes are attached to a substrate, either directly or indirectly (e.g., through a polymerase molecule). In other embodiments, the duplexes are attached at single molecule resolution.
During the sequencing reaction, detectably labeled nucleotides are added to the primer in a template-dependent manner. In certain embodiments, the detectably labeled nucleotide is an optically detectably labeled nucleotide, such as a fluorescently labeled nucleotide. Exemplary fluorescent labels include Atto, cyanine, rhodamine, fluorescein, coumarin, BODIPY, alexa, and conjugated multi-dyes. Alternatively, the detectably labeled nucleotide is detected non-optically, for example, using nanopores.
In other aspects, methods of the invention involve increasing the number of 3' ends of template nucleic acids in a sample. Those methods of the invention involve obtaining a sample including template nucleic acid, increasing the number of 3' ends of template nucleic acid in the sample, attaching an oligonucleotide tail to the 3' ends of the template nucleic acids, introducing the tailed template nucleic acids to primers to form template/primer duplexes, exposing the duplexes to at least one detectably labeled nucleotide in the presence of a polymerase under conditions that permit the polymerase to add nucleotides to the primer in a template-dependent manner, detecting a signal from the label of the incorporated labeled nucleotide, and sequentially repeating the exposing and detecting steps at least once. By increasing the number of 3' ends of template nucleic acids, the number of template nucleic acids that may receive an oligonucleotide tail is increased. By increasing the number of template nucleic acids that receive an oligonucleotide tail, the number of template nucleic acids that may form template/primer duplexes is increased and thus the number of template nucleic acids available to undergo the subsequent sequencing reaction is increased.
Any method known in the art to fragment or shear nucleic acids may be used. In particular embodiments, at least one restriction enzyme is added to the sample to digest the template nucleic acids to increase the number of 3' ends available for the tailing reaction.
In other aspects, methods of the invention optimize the reaction conditions and hybridization conditions for sequencing of a small quantity of template nucleic acids.
Detailed Description of the Invention
The invention generally relates to methods for sequencing small quantities of nucleic acids. In certain aspects, methods of the invention involve obtaining a sample including template nucleic acid, introducing a carrier molecule to the sample, attaching an oligonucleotide tail to the template nucleic acid in the presence of the carrier molecule, introducing the tailed template nucleic acid to primers to form template/primer duplexes, exposing the duplexes to at least one detectably labeled nucleotide in the presence of a polymerase under conditions that permit the polymerase to add nucleotides to the primer in a template-dependent manner, detecting a signal from the label of the incorporated labeled nucleotide, and sequentially repeating the exposing and detecting steps at least once.
The carrier molecule acts to stabilize the tailing reaction, allowing use of higher concentrations of enzyme and dNTPs. By stabilizing the reaction, a greater number of template nucleic acid molecules will have an oligonucleotide attached. By increasing the number of template nucleic acids that successfully receive an oligonucleotide tail, the number of template nucleic acids that may form template/primer duplexes is increased because the oligonucleotide tail acts as a primer binding site. Increasing the number of template/primer duplexes increases the number of template nucleic acids available to undergo the subsequent sequencing.
The carrier molecule may be any molecule that stabilizes the tailing reaction, allowing for addition of higher concentrations of enzyme and dNTPs. In certain embodiments, the carrier molecule is an RNA oligonucleotide. The RNA oligonucleotides may be added to the tailing reaction along with the template nucleic acids. Because the RNA oligonucleotides do not have a free 3' end, the RNA oligonucleotides do not receive an oligonucleotide tail, and thus do not form a duplex with the primers and do not undergo the sequencing reaction.
The carrier molecule could also be a solid support bound oligonucleotide, such as a bead-bound oligonucleotide. Any bead known in the art may be used. In certain embodiments, the beads are magnetic Dynabeads (Invitrogen). Methods of attaching oligonucleotides to beads are known in the art, such as covalently attaching the oligonucleotides to the beads. By having the oligonucleotide attached to a bead, the bead-bound oligonucleotides may be separated from the sample including the template nucleic acids, thus the bead-bound oligonucleotides do not participate in the sequencing reaction. Separating the bead-bound oligonucleotides from the sample may be accomplished by any technique known in the art and will depend on the type of beads used. See, for example, Sambrook et al. (Molecular Cloning - A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York, vol. 2, 1989), the content of which is incorporated by reference herein in its entirety.
In other aspects, methods of the invention involve increasing the number of 3' ends of template nucleic acids in a sample. Those methods of the invention involve obtaining a sample including template nucleic acids, increasing the number of 3' ends of template nucleic acid in the sample, attaching an oligonucleotide tail to the 3' ends of the template nucleic acids, introducing the tailed template nucleic acid to primers to form template/primer duplexes, exposing the duplexes to at least one detectably labeled nucleotide in the presence of a polymerase under conditions that permit the polymerase to add nucleotides to the primer in a template-dependent manner, detecting a signal from the label of the incorporated labeled nucleotide, and sequentially repeating the exposing and detecting steps at least once.
By increasing the number of 3' ends of template nucleic acids, the number of template nucleic acids that may receive an oligonucleotide tail is increased. By increasing the number of template nucleic acids that receive an oligonucleotide tail, the number of template nucleic acids that may form template/primer duplexes is increased and thus the number of template nucleic acids available to undergo the subsequent sequencing reaction is increased.
Numerous methods may be used to increase the number of 3' ends of template nucleic acids in the sample. For example, template nucleic acids may be fragmented or sheared using a variety of mechanical, chemical and/or enzymatic methods. DNA may be randomly sheared via sonication (e.g., the Covaris method), brief exposure to a DNase, or using a mixture of one or more restriction enzymes. RNA may be fragmented by brief exposure to an RNase, heat plus magnesium, or by shearing. The RNA may be converted to cDNA before or after fragmentation.
In a particular embodiment, the number of 3' ends of template nucleic acids in the sample is increased by using sequence-specific restriction enzymes. Exemplary restriction enzymes include, but are not limited to: AflII; ApaLI; BglII; NcoI; NdeI; MluI; PacI; BamHI; EcoRI; Bsu36I; XbaI; and PvuII. In certain embodiments, only a single restriction enzyme is added to the sample. In alternative embodiments, a combination of different restriction enzymes is added to the sample.
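
The effect of restriction digestion on the number of available 3' ends can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the template is treated as a single linear double-stranded molecule, digestion is assumed to go to completion, each recognition site is counted as one cut regardless of where within the site the enzyme cleaves, and the function and dictionary names are hypothetical. The recognition sequences shown are the standard ones for these four enzymes; the other enzymes listed above are omitted for brevity.

```python
import re

# Standard recognition sequences (all palindromic, so scanning one
# strand is enough to find every site on the duplex).
RECOGNITION_SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "XbaI": "TCTAGA",
    "PvuII": "CAGCTG",
}


def count_cuts(seq: str, enzymes) -> int:
    """Count recognition sites for the chosen enzymes in a linear sequence."""
    seq = seq.upper()
    total = 0
    for name in enzymes:
        site = RECOGNITION_SITES[name]
        # A lookahead lets overlapping occurrences be counted as well.
        total += len(re.findall(f"(?={site})", seq))
    return total


def three_prime_ends(seq: str, enzymes) -> int:
    """3' ends after complete digestion of one linear duplex.

    n cuts give n + 1 fragments, and each double-stranded fragment
    carries two 3' ends (one per strand).
    """
    return 2 * (count_cuts(seq, enzymes) + 1)
```

Under these assumptions, adding a second enzyme with sites in the template raises the number of ends that terminal transferase can tail, which is the rationale for using enzyme combinations.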
Oligonucleotide tailing
Oligonucleotide tailing is described for example in Steinman et al. (International patent application number PCT/US09/64001), the content of which is incorporated by reference herein in its entirety. The oligonucleotide tails act as a primer binding sites. The primer binding site may be used to hybridize the template nucleic acid molecule to a sequencing primer, which may optionally be anchored to a substrate. The primer binding sequence may be a unique sequence including at least 2 bases but likely contains a unique order of all 4 bases and is generally 20-50 bases in length. One example of a specific sequence binding primer is: 5'-CAG GGC AGA GGA TGG ATG CAA GGA TAA GTG GA-3' (SEQ ID NO: 1). In a particular embodiment, the primer binding sequence is a homopolymer of a single base, e.g. polyA, generally 20 - 200 bases in length.
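
As a small illustration of the homopolymer primer binding site, the Python sketch below measures the run of a single base at the 3' end of a template written 5' to 3' and checks it against the 20 to 200 base range mentioned above. The function names are hypothetical and the check is purely a string operation, not a model of hybridization; note that any identical bases in the template immediately upstream of the added tail are counted as part of the run.

```python
import re


def tail_length(template: str, base: str = "A") -> int:
    """Length of the homopolymer run at the 3' end of a 5'->3' sequence."""
    match = re.search(f"{re.escape(base.upper())}+$", template.upper())
    return len(match.group(0)) if match else 0


def has_primer_binding_tail(template: str, base: str = "A",
                            min_len: int = 20, max_len: int = 200) -> bool:
    """True if the 3' homopolymer run falls within [min_len, max_len].

    The default bounds follow the 'generally 20-200 bases' range stated
    in the text; they are not hard limits of the method.
    """
    return min_len <= tail_length(template, base) <= max_len
```

For example, a fragment ending in 50 adenines would pass the default check and could pair with a surface-bound poly(dT50) primer, whereas a fragment ending in only 5 adenines would not.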
The oligonucleotide tail also may include a blocker, e.g., a chain terminating nucleotide, on the 3'-end. The blocker prevents unintended sequence information from being obtained using the 3'-end of the primer binding site inadvertently as a second sequencing primer, particularly when using homopolymeric primer sequences. The blocker may be any moiety that prevents a polymerase from adding bases during incubation with dNTPs. An exemplary blocker is a nucleotide terminator that lacks a 3'-OH, i.e., a dideoxynucleotide (ddNTP). Common nucleotide terminators are 2',3'-dideoxynucleotides, 3'-aminonucleotides, 3'-deoxynucleotides, 3'-azidonucleotides, acyclonucleotides, etc. The blocker may have attached a detectable label, e.g. a fluorophore. The label may be attached via a labile linkage, e.g., a disulfide, so that following hybridization of the bar coded template nucleic acid to the surface, the locations of the template nucleic acids may be identified by imaging. Generally, the detectable label is removed before commencing with sequencing. Depending upon the linkage, the cleaved product may or may not require further chemical modification to prevent undesirable side reactions; for example, following cleavage of a disulfide by TCEP, the reactive thiol that is produced is blocked with iodoacetamide.
Methods of the invention involve attaching the oligonucleotide tail to the template nucleic acid molecules. In certain embodiments, the oligonucleotide tail is attached to the template nucleic acid molecule with an enzyme, such as terminal transferase. The enzyme may be a ligase or a polymerase. The ligase may be any enzyme capable of ligating an oligonucleotide (RNA or DNA) to the template nucleic acid molecule. Suitable ligases include T4 DNA ligase and T4 RNA ligase (such ligases are available commercially from New England Biolabs). Methods for using ligases are well known in the art. The polymerase may be any enzyme capable of adding nucleotides to the 3' terminus of template nucleic acid molecules. The polymerase may be, for example, yeast poly(A) polymerase, commercially available from USB. The polymerase is used according to the manufacturer's instructions. In a particular embodiment, the enzyme is a terminal transferase, which is commercially available from New England Biolabs. The enzyme is used according to the manufacturer's instructions.
The ligation may be blunt ended or via use of complementary overhanging ends. In certain embodiments, the ends of the template nucleic acids are repaired, trimmed (e.g., using an exonuclease), or filled (e.g., using a polymerase and dNTPs), to form blunt ends. Upon generating blunt ends, the ends may be treated with a polymerase and dATP to form a template-independent addition to the 3'-end of the template nucleic acids, thus producing a single overhanging A. This single A is used to guide ligation of fragments with a single T overhanging from the 5'-end in a method referred to as T-A cloning.
Alternatively, because the possible combinations of overhangs left by the restriction enzymes are known after a restriction digestion, the ends may be left as is, i.e., ragged ends. In certain embodiments, double stranded oligonucleotides with complementary overhanging ends are used. In a particular example, the A:T single base overhang method is used (see Steinman et al., International patent application number PCT/US09/64001).
In a particular embodiment, a substrate has anchored a reverse complement to the primer binding sequence of the oligonucleotide, for example 5'-TC CAC TTA TCC TTG CAT CCA TCC TCT GCC CTG (SEQ ID NO: 2) or a polyT(50). When homopolymeric sequences are used for the primer, it may be advantageous to perform a procedure known in the art as a "fill and lock". When polyA (20-70) on the template nucleic acids and polyT (50) on the surface hybridize there is a high likelihood that there will not be perfect alignment, so the hybrid is filled in by incubating the sample with polymerase and TTP. Following the fill step, the sample is washed and the polymerase is incubated with one or two dNTPs complementary to the base(s) used in the lock sequence. The fill and lock can also be performed in a single step process in which polymerase, TTP and one or two reversible terminators (complements of the lock bases) are mixed together and incubated. The reversible terminators stop addition during this stage and can be made functional again (reversal of inhibitory mechanism) by treatments specific to the analogs used. Some reversible terminators have functional blocks on the 3'-OH which need to be removed, while others, for example Helicos BioSciences Virtual Terminators, have inhibitors attached to the base via a disulfide which can be removed by treatment with TCEP.
Sequencing
In certain embodiments, the sequencing method is a single molecule sequencing by synthesis method. Single molecule sequencing is shown for example in Lapidus et al. (U.S. patent number 7,169,560), Lapidus et al. (U.S. patent application number 2009/0191565), Quake et al. (U.S. patent number 6,818,395), Harris (U.S. patent number 7,282,337), Quake et al. (U.S. patent application number 2002/0164629), and Braslavsky, et al., PNAS (USA), 100: 3960-3964 (2003), the content of each of which is incorporated by reference herein in its entirety.
Briefly, a single-stranded nucleic acid (e.g., DNA or cDNA) is hybridized to oligonucleotides attached to a surface of a flow cell. The oligonucleotides may be covalently attached to the surface or various attachments other than covalent linking as known to those of ordinary skill in the art may be employed. Moreover, the attachment may be indirect, e.g., via a polymerase directly or indirectly attached to the surface. The surface may be planar or otherwise, and/or may be porous or non-porous, or any other type of surface known to those of ordinary skill to be suitable for attachment. The nucleic acid is then sequenced by imaging the polymerase-mediated addition of fluorescently-labeled nucleotides incorporated into the growing strand surface oligonucleotide, at single molecule resolution. In certain embodiments, the nucleotides used in the sequencing reaction are not chain terminating nucleotides. The following sections discuss general considerations for nucleic acid sequencing, for example, polymerases useful in sequencing-by-synthesis, choice of surfaces, reaction conditions, signal detection and analysis.
Nucleic Acid Templates
Nucleic acid templates include deoxyribonucleic acid (DNA) and/or ribonucleic acid (RNA). Nucleic acid templates can be synthetic or derived from naturally occurring sources. In one embodiment, nucleic acid template molecules are isolated from a biological sample containing a variety of other components, such as proteins, lipids and non-template nucleic acids. Nucleic acid template molecules can be obtained from any cellular material, obtained from an animal, plant, bacterium, fungus, or any other cellular organism. In certain embodiments, the nucleic acid templates are obtained from a single cell. Biological samples for use in the present invention include viral particles or preparations. Nucleic acid template molecules can be obtained directly from an organism or from a biological sample obtained from an organism, e.g., from blood, urine, cerebrospinal fluid, seminal fluid, saliva, sputum, stool and tissue. Any tissue or body fluid specimen may be used as a source for nucleic acid for use in the invention. Nucleic acid template molecules can also be isolated from cultured cells, such as a primary cell culture or a cell line. The cells or tissues from which template nucleic acids are obtained can be infected with a virus or other intracellular pathogen. A sample can also be total RNA extracted from a biological specimen, a cDNA library, viral, or genomic DNA.
Nucleic acid obtained from biological samples typically is fragmented to produce suitable fragments for analysis. In one embodiment, nucleic acid from a biological sample is fragmented by sonication. Nucleic acid template molecules can be obtained as described in U.S. Patent Application Publication Number US2002/0190663 A1, published Oct. 9, 2003. Generally, nucleic acid can be extracted from a biological sample by a variety of techniques such as those described by Maniatis, et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor, N.Y., pp. 280-281 (1982). Generally, individual nucleic acid template molecules can be from about 5 bases to about 20 kb. Nucleic acid molecules may be single-stranded, double-stranded, or double-stranded with single-stranded regions (for example, stem- and loop-structures).
A biological sample as described herein may be homogenized or fractionated in the presence of a detergent or surfactant. The concentration of the detergent in the buffer may be about 0.05% to about 10.0%. The concentration of the detergent can be up to an amount where the detergent remains soluble in the solution. In a preferred embodiment, the concentration of the detergent is between 0.1% and about 2%. The detergent, particularly a mild one that is nondenaturing, can act to solubilize the sample. Detergents may be ionic or nonionic. Examples of nonionic detergents include triton, such as the Triton® X series (Triton® X-100 t-Oct-C6H4-(OCH2-CH2)xOH, x=9-10, Triton® X-100R, Triton® X-114 x=7-8), octyl glucoside, polyoxyethylene(9)dodecyl ether, digitonin, IGEPAL® CA630 octylphenyl polyethylene glycol, n-octyl-beta-D-glucopyranoside (betaOG), n-dodecyl-beta, Tween® 20 polyethylene glycol sorbitan monolaurate, Tween® 80 polyethylene glycol sorbitan monooleate, polidocanol, n-dodecyl beta-D-maltoside (DDM), NP-40 nonylphenyl polyethylene glycol, C12E8 (octaethylene glycol n-dodecyl monoether), hexaethyleneglycol mono-n-tetradecyl ether (C14EO6), octyl-beta-thioglucopyranoside (octyl thioglucoside, OTG), Emulgen, and polyoxyethylene 10 lauryl ether (C12E10). Examples of ionic detergents (anionic or cationic) include deoxycholate, sodium dodecyl sulfate (SDS), N-lauroylsarcosine, and cetyltrimethylammoniumbromide (CTAB). A zwitterionic reagent may also be used in the purification schemes of the present invention, such as Chaps, zwitterion 3-14, and 3-[(3-cholamidopropyl)dimethylammonio]-1-propanesulfonate. It is contemplated also that urea may be added with or without another detergent or surfactant.
Lysis or homogenization solutions may further contain other agents, such as reducing agents. Examples of such reducing agents include dithiothreitol (DTT), β-mercaptoethanol, DTE, GSH, cysteine, cysteamine, tricarboxyethyl phosphine (TCEP), or salts of sulfurous acid.
Nucleotides
Nucleotides useful in the invention include any nucleotide or nucleotide analog, whether naturally-occurring or synthetic. For example, preferred nucleotides include phosphate esters of deoxyadenosine, deoxycytidine, deoxyguanosine, deoxythymidine, adenosine, cytidine, guanosine, and uridine. Other nucleotides useful in the invention comprise an adenine, cytosine, guanine, thymine base, a xanthine or hypoxanthine; 5-bromouracil, 2-aminopurine, deoxyinosine, or methylated cytosine, such as 5-methylcytosine, and N4-methoxydeoxycytosine. Also included are bases of polynucleotide mimetics, such as methylated nucleic acids, e.g., 2'-O-methRNA, peptide nucleic acids, modified peptide nucleic acids, locked nucleic acids and any other structural moiety that can act substantially like a nucleotide or base, for example, by exhibiting base-complementarity with one or more bases that occur in DNA or RNA and/or being capable of base-complementary incorporation, and includes chain-terminating analogs. A nucleotide corresponds to a specific nucleotide species if they share base-complementarity with respect to at least one base.
Nucleotides for nucleic acid sequencing according to the invention preferably include a detectable label that is directly or indirectly detectable. Preferred labels include optically-detectable labels, such as fluorescent labels. Examples of fluorescent labels include, but are not limited to, 4-acetamido-4'-isothiocyanatostilbene-2,2'-disulfonic acid; acridine and derivatives: acridine, acridine isothiocyanate; 5-(2'-aminoethyl)aminonaphthalene-1-sulfonic acid (EDANS); 4-amino-N-[3-vinylsulfonyl)phenyl]naphthalimide-3,5 disulfonate; N-(4-anilino-1-naphthyl)maleimide; anthranilamide; BODIPY; Brilliant Yellow; coumarin and derivatives: coumarin, 7-amino-4-methylcoumarin (AMC, Coumarin 120), 7-amino-4-trifluoromethylcoumarin (Coumarin 151); cyanine dyes; cyanosine; 4',6-diamidino-2-phenylindole (DAPI); 5',5"-dibromopyrogallol-sulfonaphthalein (Bromopyrogallol Red); 7-diethylamino-3-(4'-isothiocyanatophenyl)-4-methylcoumarin; diethylenetriamine pentaacetate; 4,4'-diisothiocyanatodihydro-stilbene-2,2'-disulfonic acid; 4,4'-diisothiocyanatostilbene-2,2'-disulfonic acid; 5-[dimethylamino]naphthalene-1-sulfonyl chloride (DNS, dansylchloride); 4-dimethylaminophenylazophenyl-4'-isothiocyanate (DABITC); eosin and derivatives: eosin, eosin isothiocyanate; erythrosin and derivatives: erythrosin B, erythrosin isothiocyanate; ethidium; fluorescein and derivatives: 5-carboxyfluorescein (FAM), 5-(4,6-dichlorotriazin-2-yl)aminofluorescein (DTAF), 2',7'-dimethoxy-4'5'-dichloro-6-carboxyfluorescein, fluorescein, fluorescein isothiocyanate, QFITC (XRITC); fluorescamine; IR144; IR1446; Malachite Green isothiocyanate; 4-methylumbelliferone; ortho-cresolphthalein; nitrotyrosine; pararosaniline; Phenol Red; B-phycoerythrin; o-phthaldialdehyde; pyrene and derivatives: pyrene, pyrene butyrate, succinimidyl 1-pyrene butyrate; quantum dots; Reactive Red 4 (Cibacron™ Brilliant Red 3B-A); rhodamine and derivatives: 6-carboxy-X-rhodamine (ROX), 6-carboxyrhodamine (R6G), lissamine rhodamine B sulfonyl chloride, rhodamine (Rhod), rhodamine B, rhodamine 123, rhodamine X isothiocyanate, sulforhodamine B, sulforhodamine 101, sulfonyl chloride derivative of sulforhodamine 101 (Texas Red); N,N,N',N'-tetramethyl-6-carboxyrhodamine (TAMRA); tetramethyl rhodamine; tetramethyl rhodamine isothiocyanate (TRITC); riboflavin; rosolic acid; terbium chelate derivatives; Cy3; Cy5; Cy5.5; Cy7; IRD 700; IRD 800; La Jolla Blue; phthalo cyanine; and naphthalo cyanine. Preferred fluorescent labels are cyanine-3 and cyanine-5. Labels other than fluorescent labels are contemplated by the invention, including other optically-detectable labels.
Polymerases
Nucleic acid polymerases generally useful in the invention include DNA polymerases, RNA polymerases, reverse transcriptases, and mutant or altered forms of any of the foregoing. DNA polymerases and their properties are described in detail in, among other places, DNA Replication 2nd edition, Kornberg and Baker, W. H. Freeman, New York, N. Y. (1991). Known conventional DNA polymerases useful in the invention include, but are not limited to, Pyrococcus furiosus (Pfu) DNA polymerase (Lundberg et al., 1991, Gene, 108: 1, Stratagene), Pyrococcus woesei (Pwo) DNA polymerase (Hinnisdaels et al., 1996, Biotechniques, 20:186-8, Boehringer Mannheim), Thermus thermophilus (Tth) DNA polymerase (Myers and Gelfand 1991, Biochemistry 30:7661), Bacillus stearothermophilus DNA polymerase (Stenesh and McGowan, 1977, Biochim Biophys Acta 475:32), Thermococcus litoralis (Tli) DNA polymerase (also referred to as Vent™ DNA polymerase, Cariello et al., 1991, Nucleic Acids Res, 19: 4193, New England Biolabs), 9°Nm™ DNA polymerase (New England Biolabs), Stoffel fragment, ThermoSequenase® (Amersham Pharmacia Biotech UK), Therminator™ (New England Biolabs), Thermotoga maritima (Tma) DNA polymerase (Diaz and Sabino, 1998 Braz J. Med. Res, 31:1239), Thermus aquaticus (Taq) DNA polymerase (Chien et al., 1976, J. Bacteriol, 127: 1550), DNA polymerase, Pyrococcus kodakaraensis KOD DNA polymerase (Takagi et al., 1997, Appl. Environ. Microbiol. 63:4504), JDF-3 DNA polymerase (from Thermococcus sp. JDF-3, Patent application WO 0132887), Pyrococcus GB-D (PGB-D) DNA polymerase (also referred to as Deep Vent™ DNA polymerase, Juncosa-Ginesta et al., 1994, Biotechniques, 16:820, New England Biolabs), UlTma DNA polymerase (from thermophile Thermotoga maritima; Diaz and Sabino, 1998 Braz J. Med. Res, 31:1239; PE Applied Biosystems), Tgo DNA polymerase (from Thermococcus gorgonarius, Roche Molecular Biochemicals), E. coli DNA polymerase I (Lecomte and Doubleday, 1983, Nucleic Acids Res. 11:7505), T7 DNA polymerase (Nordstrom et al., 1981, J. Biol. Chem. 256:3112), and archaeal DP1/DP2 DNA polymerase II (Cann et al, 1998, Proc. Natl. Acad. Sci. USA 95:14250).
Both mesophilic polymerases and thermophilic polymerases are contemplated. Thermophilic DNA polymerases include, but are not limited to, ThermoSequenase®, 9°Nm™, Therminator™, Taq, Tne, Tma, Pfu, Tfl, Tth, Tli, Stoffel fragment, Vent™ and Deep Vent™ DNA polymerase, KOD DNA polymerase, Tgo, JDF-3, and mutants, variants and derivatives thereof. A highly-preferred form of any polymerase is a 3' exonuclease-deficient mutant.
Reverse transcriptases useful in the invention include, but are not limited to, reverse transcriptases from HIV, HTLV-I, HTLV-II, FeLV, FIV, SIV, AMV, MMTV, MoMuLV and other retroviruses (see Levin, Cell 88:5-8 (1997); Verma, Biochim Biophys Acta. 473:1-38 (1977); Wu et al., CRC Crit. Rev Biochem. 3:289-347 (1975)).
Attachment
In a preferred embodiment, nucleic acid template molecules are attached to a substrate (also referred to herein as a surface) and subjected to analysis by single molecule sequencing as described herein. Nucleic acid template molecules are attached to the surface such that the template/primer duplexes are individually optically resolvable. Substrates for use in the invention can be two- or three-dimensional and can comprise a planar surface (e.g., a glass slide) or can be shaped. A substrate can include glass (e.g., controlled pore glass (CPG)), quartz, plastic (such as polystyrene (low cross-linked and high cross-linked polystyrene), polycarbonate, polypropylene and poly(methyl methacrylate)), acrylic copolymer, polyamide, silicon, metal (e.g., alkanethiolate-derivatized gold), cellulose, nylon, latex, dextran, gel matrix (e.g., silica gel), polyacrolein, or composites.
Suitable three-dimensional substrates include, for example, spheres, microparticles, beads, membranes, slides, plates, micromachined chips, tubes (e.g., capillary tubes), microwells, microfluidic devices, channels, filters, or any other structure suitable for anchoring a nucleic acid. Substrates can include planar arrays or matrices capable of having regions that include populations of template nucleic acids or primers. Examples include nucleoside-derivatized CPG and polystyrene slides; derivatized magnetic slides; polystyrene grafted with polyethylene glycol, and the like.
Substrates are preferably coated to allow optimum optical processing and nucleic acid attachment. Substrates for use in the invention can also be treated to reduce background. Exemplary coatings include epoxides, and derivatized epoxides (e.g., with a binding molecule, such as an oligonucleotide or streptavidin).
Various methods can be used to anchor or immobilize the nucleic acid molecule to the surface of the substrate. The immobilization can be achieved through direct or indirect bonding to the surface. The bonding can be by covalent linkage. See, Joos et al., Analytical Biochemistry 247:96-101, 1997; Oroskar et al., Clin. Chem. 42:1547-1555, 1996; and Khandjian, Mol. Bio. Rep. 11:107-115, 1986. A preferred attachment is direct amine bonding of a terminal nucleotide of the template or the 5' end of the primer to an epoxide integrated on the surface. The bonding also can be through non-covalent linkage. For example, biotin-streptavidin (Taylor et al., J. Phys. D. Appl. Phys. 24:1443, 1991) and digoxigenin with anti-digoxigenin (Smith et al., Science 253:1122, 1992) are common tools for anchoring nucleic acids to surfaces and parallels. Alternatively, the attachment can be achieved by anchoring a hydrophobic chain into a lipid monolayer or bilayer. Other methods known in the art for attaching nucleic acid molecules to substrates also can be used.
Detection
Any detection method can be used that is suitable for the type of label employed. Thus, exemplary detection methods include radioactive detection, optical absorbance detection, e.g., UV-visible absorbance detection, optical emission detection, e.g., fluorescence or chemiluminescence. For example, extended primers can be detected on a substrate by scanning all or portions of each substrate simultaneously or serially, depending on the scanning method used. For fluorescence labeling, selected regions on a substrate may be serially scanned one-by-one or row-by-row using a fluorescence microscope apparatus, such as described in Fodor (U.S. Pat. No. 5,445,934) and Mathies et al. (U.S. Pat. No. 5,091,652). Devices capable of sensing fluorescence from a single molecule include the scanning tunneling microscope (STM) and the atomic force microscope (AFM). Hybridization patterns may also be scanned using a CCD camera (e.g., Model TE/CCD512SF, Princeton Instruments, Trenton, N.J.) with suitable optics (Ploem, in Fluorescent and Luminescent Probes for Biological Activity, Mason, T. G. Ed., Academic Press, London, pp. 1-11 (1993)), such as described in Yershov et al., Proc. Natl. Acad. Sci. 93:4913 (1996), or may be imaged by TV monitoring. For radioactive signals, a phosphorimager device can be used (Johnston et al., Electrophoresis, 13:566, 1990; Drmanac et al., Electrophoresis, 13:566, 1992; 1993). Other commercial suppliers of imaging instruments include General Scanning Inc. (Watertown, Mass.; on the World Wide Web at genscan.com), Genix Technologies (Waterloo, Ontario, Canada; on the World Wide Web at confocal.com), and Applied Precision Inc. Such detection methods are particularly useful to achieve simultaneous scanning of multiple attached template nucleic acids.
A number of approaches can be used to detect incorporation of fluorescently-labeled nucleotides into a single nucleic acid molecule. Optical setups include near-field scanning microscopy, far-field confocal microscopy, wide-field epi-illumination, light scattering, dark field microscopy, photoconversion, single and/or multiphoton excitation, spectral wavelength discrimination, fluorophore identification, evanescent wave illumination, and total internal reflection fluorescence (TIRF) microscopy. In general, certain methods involve detection of laser-activated fluorescence using a microscope equipped with a camera. Suitable photon detection systems include, but are not limited to, photodiodes and intensified CCD cameras. For example, an intensified charge-coupled device (ICCD) camera can be used. The use of an ICCD camera to image individual fluorescent dye molecules in a fluid near a surface provides numerous advantages. For example, with an ICCD optical setup, it is possible to acquire a sequence of images (movies) of fluorophores.
Some embodiments of the present invention use TIRF microscopy for imaging. TIRF microscopy uses totally internally reflected excitation light and is well known in the art. See, e.g., the World Wide Web at nikon-instruments.jp/eng/page/products/tirf.aspx. In certain embodiments, detection is carried out using evanescent wave illumination and total internal reflection fluorescence microscopy. An evanescent light field can be set up at the surface, for example, to image fluorescently-labeled nucleic acid molecules. When a laser beam is totally reflected at the interface between a liquid and a solid substrate (e.g., a glass), the excitation light beam penetrates only a short distance into the liquid. The optical field does not end abruptly at the reflective interface, but its intensity falls off exponentially with distance. This surface electromagnetic field, called the "evanescent wave", can selectively excite fluorescent molecules in the liquid near the interface. The thin evanescent optical field at the interface provides low background and facilitates the detection of single molecules with high signal-to-noise ratio at visible wavelengths.
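The exponential fall-off of the evanescent field can be made concrete with the standard textbook expressions for the critical angle and penetration depth at a solid/liquid interface. The sketch below is illustrative only: the wavelength (532 nm), refractive indices (1.52 for glass, 1.33 for an aqueous buffer) and incidence angle (70 degrees) are assumed values that do not come from this disclosure, and the function names are hypothetical.

```python
import math


def critical_angle_deg(n_solid: float, n_liquid: float) -> float:
    """Incidence angle above which light is totally internally reflected."""
    return math.degrees(math.asin(n_liquid / n_solid))


def penetration_depth_nm(wavelength_nm: float, n_solid: float,
                         n_liquid: float, angle_deg: float) -> float:
    """1/e intensity decay length d = lambda / (4*pi*sqrt(n1^2 sin^2(theta) - n2^2))."""
    theta = math.radians(angle_deg)
    term = (n_solid * math.sin(theta)) ** 2 - n_liquid ** 2
    if term <= 0:
        raise ValueError("angle is below the critical angle; no total internal reflection")
    return wavelength_nm / (4 * math.pi * math.sqrt(term))


def relative_intensity(z_nm: float, depth_nm: float) -> float:
    """Evanescent intensity at distance z from the interface, relative to z = 0."""
    return math.exp(-z_nm / depth_nm)


if __name__ == "__main__":
    # Assumed example values, not taken from the disclosure.
    depth = penetration_depth_nm(532.0, 1.52, 1.33, 70.0)
    print(f"critical angle: {critical_angle_deg(1.52, 1.33):.1f} degrees")
    print(f"penetration depth: {depth:.0f} nm")
    for z in (0, 100, 500):
        print(z, "nm ->", f"{relative_intensity(z, depth):.3f}")
```

With values of this kind the decay length is on the order of 100 nm or less, which is why fluorophores far from the surface contribute little background.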
The evanescent field also can image fluorescently-labeled nucleotides upon their incorporation into the attached template/primer complex in the presence of a polymerase. Total internal reflectance fluorescence microscopy is then used to visualize the attached template/primer duplex and/or the incorporated nucleotides with single molecule resolution.
Some embodiments of the invention use non-optical detection methods such as, for example, detection using nanopores (e.g., protein or solid state) through which molecules are individually passed so as to allow identification of the molecules by noting characteristics or changes in various properties or effects such as capacitance or blockage current flow (see, for example, Stoddart et al, Proc. Nat. Acad. Sci., 106:7702, 2009; Purnell and Schmidt, ACS Nano, 3:2533, 2009; Branton et al, Nature Biotechnology, 26:1146, 2008; Polonsky et al, U.S. Application 2008/0187915; Mitchell & Howorka, Angew. Chem. Int. Ed. 47:5565, 2008; Borsenberger et al, J. Am. Chem. Soc, 131, 7530, 2009) ; or other suitable non-optical detection methods.
Analysis
Alignment and/or compilation of sequence results obtained from the image stacks produced as generally described above utilizes look-up tables that take into account possible sequence changes (due, e.g., to errors, mutations, etc.). Essentially, sequencing results obtained as described herein are compared to a look-up type table that contains all possible reference sequences plus 1 or 2 base errors.
Direct RNA sequencing
RNA template capture involves attaching a poly(A) tail to a 3' end of an RNA molecule. Attaching may be by enzymatic methods, such as using E. coli poly(A) polymerase I or yeast poly(A) polymerase. Adding the oligonucleotide tail of poly(A) to the RNA molecule may also be accomplished by other methods described herein. The length of the poly(A) tail may be controlled by introducing 3' deoxyATP (cordycepin triphosphate) to the poly-adenylation reaction shortly after the start of the tailing reaction. Thus, the length of the poly(A) tail is controlled by the amount of time allowed to elapse between the start of the poly-adenylation reaction and the addition of the 3' deoxyATP (cordycepin triphosphate). The 3' end block prevents "downward" extension during sequencing when the tailed RNA is hybridized to dT(50) flow cells.
In certain embodiments, RNA clean-up after the A-tailing and blocking reaction is performed by phenol/chloroform extraction and ethanol precipitation. Alternatively, commercial column-based RNA clean-up kits can be used, depending on the RNA sample being processed. This clean-up step is non-essential for sequencing purposes.
After tailing, the tailed RNA molecules are introduced to primers and template/primer duplexes are formed. The duplexes then undergo the sequencing reaction as described herein. Further description is provided in Kahvejian (U.S. patent application number 2008/0081330), the content of which is incorporated by reference herein in its entirety.
On surface tailing
Other aspects of the invention provide a paired-end sequencing strategy for direct RNA sequencing. Methods of the invention involve attaching a poly(A) tail to an RNA molecule and introducing the tailed RNA molecule to a solid support having poly(T) primers, thereby forming template RNA/primer duplexes attached to the solid support. A sequencing reaction is then performed on at least a portion of the RNA molecule to obtain a first read. The sequencing reaction is performed as described herein.
After obtaining the first read, natural dNTPs are used to copy the remainder of the strand (known as "filling-up"), producing a cDNA-RNA hybrid along the length of the original template. The RNA template is removed by methods known in the art, such as exposing the duplex to hot water. The complementary DNA generated during the copy step remains on the surface because it is extended from the covalently attached poly(T) oligonucleotide. An oligonucleotide of poly(G) is then added to the 3' end of the cDNA. Adding the oligonucleotide tail of poly(G) may be accomplished by methods described herein, such as using a terminal transferase enzyme. The poly(G) tail is then blocked using a ddGTP. Blocking is described herein. A poly(C) primer is then hybridized to the poly(G) tail and a second sequencing reaction is performed. The second sequencing reaction is conducted as described herein. In certain embodiments, the poly(G) tail is not necessary, because the first sequencing reaction can provide the sequence of the terminal portion of the cDNA, which information may be used to design a primer that will hybridize to the terminal portion of the cDNA for the second sequencing reaction. Further description is provided in Harris (U.S. patent application number 2009/0053705), Harris (U.S. patent number 7,282,337), and Harris (U.S. patent application number 2009/0197257), the content of each of which is incorporated by reference herein in its entirety.
Incorporation by Reference
References and citations to other documents, such as patents, patent applications, patent publications, journals, books, papers, web contents, have been made throughout this disclosure. All such documents are hereby incorporated herein by reference in their entirety for all purposes.
Equivalents
The invention may be embodied in other specific forms without departing from the spirit or essential characteristics thereof. The foregoing embodiments are therefore to be considered in all respects illustrative rather than limiting on the invention described herein. Scope of the invention is thus indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein.
EXAMPLES
Example 1: Tailing of nucleic acids in presence of carrier molecule
The starting material for this protocol is about 5 to 6 nanograms of Chromatin Immunoprecipitated (ChIP) DNA, although as low as 3 nanograms of ChIP DNA may be used. The fragment size of the DNA is about 400 to about 500 bp. The reaction described below involves a poly(A) tailing step and a 3'-dideoxy-blocking step. The tailing reaction is conducted in the presence of an RNA carrier oligonucleotide. The reaction produces poly(A) tailed DNA, however the protocol described herein may be used to produce any type of tailed DNA by merely adjusting the reagents used in the reaction, as will be known to one of skill in the art.
Prior to conducting the tailing reaction on the DNA, RNA contamination may be removed by treating samples with RNase. A commercially available kit (Qiagen MinElute PCR Purification Kit, catalog number 28004) may be used for sample clean-up prior to the tailing reaction. ChIP DNA quantity may be determined with the QUANT-IT PicoGreen dsDNA Reagent Kit (Invitrogen, catalog number P11495).
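To give a sense of scale for the starting material, the mass and fragment size stated above can be converted into an approximate molar amount of fragments and of 3' ends. The sketch below is a rough estimate only. It assumes the commonly used approximation of about 650 g/mol per base pair of double-stranded DNA, which is not stated in this disclosure, and it uses a 450 bp fragment as the midpoint of the stated 400 to 500 bp range. Function names are hypothetical.

```python
AVOGADRO = 6.022e23
# Assumption: average molar mass of one base pair of dsDNA (common approximation).
G_PER_MOL_PER_BP = 650.0


def duplex_femtomoles(mass_ng: float, length_bp: float) -> float:
    """Approximate femtomoles of double-stranded fragments in a given mass."""
    grams = mass_ng * 1e-9
    moles = grams / (length_bp * G_PER_MOL_PER_BP)
    return moles * 1e15


def three_prime_ends_femtomoles(mass_ng: float, length_bp: float) -> float:
    """Each duplex contributes two strands, and each strand has one 3' end."""
    return 2.0 * duplex_femtomoles(mass_ng, length_bp)


def duplex_molecules(mass_ng: float, length_bp: float) -> float:
    """Approximate number of duplex molecules."""
    return duplex_femtomoles(mass_ng, length_bp) * 1e-15 * AVOGADRO


if __name__ == "__main__":
    for mass in (3.0, 5.0, 6.0):
        print(
            f"{mass} ng of 450 bp DNA: "
            f"{duplex_femtomoles(mass, 450):.1f} fmol duplexes, "
            f"{three_prime_ends_femtomoles(mass, 450):.1f} fmol 3' ends"
        )
```

Under these assumptions, 5 ng of 450 bp fragments corresponds to roughly 17 fmol of duplexes, i.e. on the order of 10^10 molecules, which illustrates why losses during handling matter at this input level.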
Materials
1. Terminal Transferase Kit (NEB, M0315)
2. dATP (Roche, 11277049001)
3. ddATP (Roche, 03732738001)
4. Control oligonucleotide (IDT/Operon)
Recommended Sequence:
5'-TCACTATTGTTGAGAACGTTGGCCTATAGTGAGTCGTATTACGCGCGGTGAC ACGGGAGATCTGAACTCGTACTCACG[ddT] (SEQ ID NO: 3)
5. RNA ribonucleotide carrier (IDT)
Recommended Sequence: 5'-AGAGUCCCAUCCUCACCAUCAUCACACUGGAAGACUGCAG (SEQ ID NO: 4)
6. Bovine serum Albumin (NEB B9001S)
7. Nuclease-free water
8. Pre-chilled aluminum block milled for 0.2 mL tubes
9. Thermocycler
10. P2, P20, and P200 pipettes
11. Ice bucket
12. QUANT-IT PicoGreen dsDNA Reagent Kit (Invitrogen, catalog number P11495) - optional
Methods
The following mix is prepared: 2μl of Terminal Transferase 10X buffer; 2μl of CoCl2; and 11.8μl of ChIP DNA and nuclease-free water. The total volume is about 15.8μl. The mix is heated at 95°C for 5 minutes in the thermocycler to denature the DNA. After heating, the mix is cooled on the pre-chilled aluminum block that has been kept in an ice and water slurry (about 0°C) to obtain single-stranded DNA. It is important to chill the sample as quickly as possible to prevent re-annealing of the denatured, single-stranded DNA.
On ice, the following mix is added to the denatured DNA from above: 1μl of Terminal Transferase (20U/μl); 2μl of 50μM dATP; 1μl of 1μM RNA oligonucleotide carrier; and 0.2μl of BSA. The volume of this mix is 4.2μl, bringing the total volume of the reaction to 20μl. Mix well. The tubes containing the mixture are placed in the thermocycler and the following program is run: 37°C for 1 hour; 70°C for 10 minutes; and temperature is brought back down to 4°C. A poly(A) tail will now have been added to the DNA.
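The volumes stated for the tailing reaction can be checked with simple dilution arithmetic. The following is a minimal sketch, not part of the described protocol; the function and variable names are hypothetical, and the only inputs are the component volumes and stock concentrations given in the text above.

```python
# Component volumes in microliters, as listed in the text.
denatured_mix = {
    "Terminal Transferase 10X buffer": 2.0,
    "CoCl2": 2.0,
    "ChIP DNA and nuclease-free water": 11.8,
}
tailing_mix = {
    "Terminal Transferase (20 U/ul)": 1.0,
    "dATP (50 uM stock)": 2.0,
    "RNA carrier (1 uM stock)": 1.0,
    "BSA": 0.2,
}


def total_volume_ul(*mixes):
    """Sum the volumes of every component across the given mixes."""
    return round(sum(v for mix in mixes for v in mix.values()), 3)


def final_concentration(stock_conc, stock_volume_ul, total_ul):
    """Dilution formula C1*V1 = C2*V2, solved for the final concentration C2."""
    return stock_conc * stock_volume_ul / total_ul


reaction_ul = total_volume_ul(denatured_mix, tailing_mix)
datp_final_uM = final_concentration(50.0, 2.0, reaction_ul)
carrier_final_nM = final_concentration(1000.0, 1.0, reaction_ul)

print(reaction_ul)        # 20.0
print(datp_final_uM)      # 5.0
print(carrier_final_nM)   # 50.0
```

Under these figures the 20μl reaction contains dATP at 5μM and RNA carrier at 50 nM.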
Denature the 20μl poly-adenylation reaction by heating the mixture to 95°C for 5 minutes in the thermocycler followed by rapid cooling in the pre-chilled aluminum block kept in an ice and water slurry (about 0°C). It is important to chill the sample as quickly as possible to prevent re-annealing of the denatured, single-stranded DNA.
The following blocking mixture is added to the denatured poly-adenylated mixture from above: 1μl of Terminal Transferase 10X buffer; 1μl of CoCl2; 1μl of Terminal Transferase (20U/μl); 0.5μl of 200μM ddATP; and 6.5μl of nuclease-free water. The volume of this mix is 10μl, bringing the total volume of the reaction to 30μl. Mix well.
The tubes containing the mixture are placed in the thermocycler and the following program is run: 37°C for 1 hour; 70°C for 20 minutes; and temperature is brought back down to 4°C. A 3' end block will now have been added to the poly-adenylated DNA.
2 picomoles of control oligonucleotide is added to the heat inactivated 30μl terminal transferase reaction above. The control oligonucleotide is added to the sample to minimize ChIP DNA loss during sample loading steps. The control oligonucleotide does not contain a poly(A) tail, and therefore will not hybridize to the flow cell surface. The sample is now ready to be hybridized to the flow cells for the sequencing reaction. No additional clean-up step is required.
Example 2: Direct tailing of nucleic acids
The starting material for this protocol is about 5 to 6 nanograms of Chromatin Immunoprecipitated (ChIP) DNA, although as little as 3 nanograms of ChIP DNA may be used. The fragment size of the DNA is about 400 to about 500 bp. The reaction described below involves a poly(A) tailing step and a 3'-dideoxy-blocking step. The tailing reaction is conducted in the presence of an RNA carrier oligonucleotide. The reaction produces poly(A) tailed DNA; however, the protocol described herein may be used to produce any type of tailed DNA by merely adjusting the reagents used in the reaction, as will be known to one of skill in the art.
Prior to conducting the tailing reaction on the DNA, RNA contamination may be removed by treating samples with RNase. A commercially available kit (Qiagen MinElute PCR Purification Kit, catalog number 28004) may be used for sample clean-up prior to the tailing reaction. ChIP DNA quantity may be determined with the QUANT-IT PicoGreen dsDNA Reagent Kit (Invitrogen, catalog number P11495).
Materials
1. Terminal Transferase Kit (NEB, M0315)
2. dATP (Roche, 11277049001)
3. ddATP (Roche, 03732738001)
4. Control oligonucleotide (IDT/Operon)
Recommended Sequence:
5'-TCACTATTGTTGAGAACGTTGGCCTATAGTGAGTCGTATTACGCGCGGTGACACGGGAGATCTGAACTCGTACTCACG[ddT] (SEQ ID NO: 3)
5. Bovine serum Albumin (NEB B9001S)
6. Nuclease-free water
7. Pre-chilled aluminum block milled for 0.2 mL tubes
8. Thermocycler
9. P-2, P20, and P200 pipettes
10. Ice bucket
11. QUANT-IT PicoGreen dsDNA Reagent Kit (Invitrogen, catalog number P11495) - optional
Methods
The following mix is prepared: 2μl of Terminal Transferase 10X buffer; 2μl of CoCl2; 10.8μl of ChIP DNA and Nuclease-free water. The total volume is about 14.8μl. The mix is heated at 95°C for 5 minutes in the thermocycler to denature the DNA. After heating, the mix is cooled on the pre-chilled aluminum block that has been kept in an ice and water slurry (about 0°C) to obtain single-stranded DNA. It is important to chill the sample as quickly as possible to prevent re-annealing of the denatured, single-stranded DNA.
On ice, the following mix is added to the denatured DNA from above: 1μl of Terminal Transferase (1:10 diluted, 2U/μl); 4μl of 10μM dATP; and 0.2μl of BSA. The volume of this mix is 5.2μl, bringing the total volume of the reaction to 20μl. Mix well.
The tubes containing the mixture are placed in the thermocycler and the following program is run: 37°C for 1 hour; 70°C for 10 minutes; and temperature is brought back down to 4°C. A poly(A) tail will now have been added to the DNA.
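To put the direct tailing reaction in molar terms, the amount of template 3' ends can be compared with the amount of dATP supplied. The following is a rough sketch only. It assumes an average mass of 660 g/mol per base pair of double-stranded DNA (a common textbook approximation, not a value from this document), takes 5 ng of 450 bp fragments as a representative input within the stated ranges, and ignores any other 3' ends that may be present in the tube. All names are hypothetical.

```python
AVG_BP_MASS_G_PER_MOL = 660.0  # assumed average mass of one dsDNA base pair


def dsdna_pmol(mass_ng, length_bp):
    """Convert a mass of dsDNA fragments to picomoles of duplex molecules."""
    # ng divided by g/mol gives nmol; multiply by 1000 to express as pmol.
    return mass_ng / (length_bp * AVG_BP_MASS_G_PER_MOL) * 1000.0


def three_prime_ends_pmol(mass_ng, length_bp):
    """Each denatured duplex yields two single strands, each with one 3' end."""
    return 2.0 * dsdna_pmol(mass_ng, length_bp)


def nucleotide_pmol(volume_ul, conc_uM):
    """uM multiplied by ul gives pmol."""
    return volume_ul * conc_uM


ends = three_prime_ends_pmol(mass_ng=5.0, length_bp=450)
datp = nucleotide_pmol(volume_ul=4.0, conc_uM=10.0)

print(round(ends, 4))       # 0.0337 pmol of 3' ends
print(datp)                 # 40.0 pmol of dATP
print(round(datp / ends))   # 1188 dATP molecules available per 3' end
```

The last figure is only the ratio of nucleotide supplied to template ends under the stated assumptions; it does not predict the tail length actually obtained.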
Denature the 20μl poly-adenylation reaction by heating the mixture to 95°C for 5 minutes in the thermocycler followed by rapid cooling in the pre-chilled aluminum block kept in an ice and water slurry (about 0°C). It is important to chill the sample as quickly as possible to prevent re-annealing of the denatured, single-stranded DNA.
The following blocking mixture is added to the denatured poly-adenylated mixture from above: 1μl of Terminal Transferase 10X buffer; 1μl of CoCl2; 1μl of Terminal Transferase (1:10 diluted, 2U/μl); 1μl of Terminal Transferase (20U/μl); 1μl of 200μM ddATP; and 6μl of nuclease-free water. The volume of this mix is 10μl, bringing the total volume of the reaction to 30μl. Mix well.
The tubes containing the mixture are placed in the thermocycler and the following program is run: 37°C for 1 hour; 70°C for 20 minutes; and temperature is brought back down to 4°C. A 3' end block will now have been added to the poly-adenylated DNA.
2 picomoles of control oligonucleotide is added to the heat inactivated 30μl terminal transferase reaction above. The control oligonucleotide is added to the sample to minimize ChIP DNA loss during sample loading steps. The control oligonucleotide does not contain a poly(A) tail, and therefore will not hybridize to the flow cell surface. The sample is now ready to be hybridized to the flow cells for the sequencing reaction. No additional clean-up step is required.
Example 3: Sequencing nucleic acids
The 7249 nucleotide genome of the bacteriophage M13mp18 was sequenced using single molecule methods of the invention. Purified, single-stranded viral M13mp18 genomic DNA was obtained from New England Biolabs. Approximately 25 μg of M13 DNA was digested to an average fragment size of 40 bp with 0.1 U DNase I (New England Biolabs) for 10 minutes at 37°C. Digested DNA fragment sizes were estimated by running an aliquot of the digestion mixture on a precast denaturing (TBE-Urea) 10% polyacrylamide gel (Novagen) and staining with SYBR Gold (Invitrogen/Molecular Probes). The DNase I-digested genomic DNA was filtered through a YM10 ultrafiltration spin column (Millipore) to remove small digestion products less than about 30 nt. Approximately 20 pmol of the filtered DNase I digest was then polyadenylated with terminal transferase according to known methods (Roychoudhury, R and Wu, R. 1980, Terminal transferase-catalyzed addition of nucleotides to the 3' termini of DNA. Methods Enzymol. 65(1):43-62.). The average dA tail length was 50+/-5 nucleotides. Terminal transferase was then used to label the fragments with Cy3-dUTP. Fragments were then terminated with dideoxyTTP (also added using terminal transferase). The resulting fragments were again filtered with a YM10 ultrafiltration spin column to remove free nucleotides and stored in ddH2O at -20°C.
Epoxide-coated glass slides were prepared for oligo attachment. Epoxide-functionalized 40 mm diameter #1.5 glass cover slips (slides) were obtained from Erie Scientific (Salem, N.H.). The slides were preconditioned by soaking in 3xSSC for 15 minutes at 37°C. Next, a 500 μM aliquot of 5' aminated poly(dT50) primer (polythymidine of 50 nucleotides in length with a 5' terminal amine) is incubated with each slide for 30 minutes at room temperature in a volume of 80 ml. The resulting slides have primer attached by direct amine linkage to the epoxide. The slides are then treated with phosphate (1 M) for 4 hours at room temperature in order to passivate the surface. Slides are then stored in polymerase rinse buffer (20 mM Tris, 100 mM NaCl, 0.001% Triton X-100, pH 8.0) until they are used for sequencing.
For sequencing, the slides are placed in a modified FCS2 flow cell (Bioptechs, Butler, Pa.) using a 50 um thick gasket. The flow cell is placed on a movable stage that is part of a high-efficiency fluorescence imaging system built around a Nikon TE-2000 inverted microscope equipped with a total internal reflection (TIR) objective. The slide is then rinsed with HEPES buffer with 100 mM NaCl and equilibrated to a temperature of 50°C. An aliquot of poly(dT50) template is placed in the flow cell and incubated on the slide for 15 minutes. After incubation, the flow cell is rinsed with 1xSSC/HEPES/0.1% SDS followed by HEPES/NaCl. A passive vacuum apparatus is used to pull fluid across the flow cell. The resulting slide contains M13 template/primer duplex. The temperature of the flow cell is then reduced to 37°C for sequencing and the objective is brought into contact with the flow cell.
For sequencing, cytosine triphosphate, guanidine triphosphate, adenine triphosphate, and uracil triphosphate, each having a cyanine-5 label (at the 7-deaza position for ATP and GTP and at the C5 position for CTP and UTP) (PerkinElmer), are stored separately in buffer containing 20 mM Tris-HCl, pH 8.8, 10 mM MgSO4, 10 mM (NH4)2SO4, 10 mM HCl, and 0.1% Triton X-100, and 100U Klenow exo- polymerase (NEN). Sequencing proceeds as follows.
First, initial imaging is used to determine the positions of duplex on the epoxide surface. The Cy3 label attached to the M13 templates is imaged by excitation using a laser tuned to 532 nm radiation (Verdi V-2 Laser, Coherent, Inc., Santa Clara, Calif.) in order to establish duplex position. For each slide only single fluorescent molecules imaged in this step are counted. Imaging of incorporated nucleotides as described below is accomplished by excitation of a cyanine-5 dye using a 635 nm radiation laser (Coherent). 5 uM Cy5CTP is placed into the flow cell and exposed to the slide for 2 minutes. After incubation, the slide is rinsed in 1xSSC/15 mM HEPES/0.1% SDS/pH 7.0 ("SSC/HEPES/SDS") (15 times in 60 ul volumes each), followed by 150 mM HEPES/150 mM NaCl/pH 7.0 ("HEPES/NaCl") (10 times at 60 ul volumes). An oxygen scavenger containing 30% acetonitrile and scavenger buffer (134 ul HEPES/NaCl, 24 ul 100 mM Trolox in MES, pH 6.1, 10 ul DABCO in MES, pH 6.1, 8 ul 2M glucose, 20 ul NaI (50 mM stock in water), and 4 ul glucose oxidase) is next added. The slide is then imaged (500 frames) for 0.2 seconds using an Inova301K laser (Coherent) at 647 nm, followed by green imaging with a Verdi V-2 laser (Coherent) at 532 nm for 2 seconds to confirm duplex position. The positions having detectable fluorescence are recorded. After imaging, the flow cell is rinsed 5 times each with SSC/HEPES/SDS (60 ul) and HEPES/NaCl (60 ul). Next, the cyanine-5 label is cleaved off incorporated CTP by introduction into the flow cell of 50 mM TCEP for 5 minutes, after which the flow cell is rinsed 5 times each with SSC/HEPES/SDS (60 ul) and HEPES/NaCl (60 ul). The remaining nucleotide is capped with 50 mM iodoacetamide for 5 minutes followed by rinsing 5 times each with SSC/HEPES/SDS (60 ul) and HEPES/NaCl (60 ul).
The scavenger is applied again in the manner described above, and the slide is again imaged to determine the effectiveness of the cleave/cap steps and to identify non-incorporated fluorescent objects.
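The scavenger buffer recipe given above can be tallied to obtain its total volume and the final concentration of each component whose stock concentration is stated. This is an illustrative sketch with hypothetical names. It covers the scavenger buffer alone, before any combination with acetonitrile, and it leaves out DABCO and glucose oxidase concentrations because the text gives no stock values for them.

```python
# (component, volume in ul, stock concentration in mM, or None when not stated)
SCAVENGER_BUFFER = [
    ("HEPES/NaCl", 134.0, None),
    ("Trolox in MES, pH 6.1", 24.0, 100.0),
    ("DABCO in MES, pH 6.1", 10.0, None),
    ("glucose", 8.0, 2000.0),   # 2 M stock
    ("NaI", 20.0, 50.0),
    ("glucose oxidase", 4.0, None),
]


def buffer_volume_ul(recipe):
    """Total volume of the recipe in microliters."""
    return sum(volume for _, volume, _ in recipe)


def final_concentrations_mM(recipe):
    """Final concentration (mM) of every component with a stated stock value."""
    total = buffer_volume_ul(recipe)
    return {
        name: stock * volume / total
        for name, volume, stock in recipe
        if stock is not None
    }


print(buffer_volume_ul(SCAVENGER_BUFFER))  # 200.0
print(final_concentrations_mM(SCAVENGER_BUFFER))
```

With the listed volumes the buffer totals 200 ul, giving 12 mM Trolox, 80 mM glucose, and 5 mM NaI.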
The procedure described above is then conducted with 100 nM Cy5dATP, followed by 100 nM Cy5dGTP, and finally 500 nM Cy5dUTP. The procedure (expose to nucleotide, polymerase, rinse, scavenger, image, rinse, cleave, rinse, cap, rinse, scavenger, final image) is repeated exactly as described for ATP, GTP, and UTP except that Cy5dUTP is incubated for 5 minutes instead of 2 minutes. Uridine is used instead of Thymidine due to the fact that the Cy5 label is incorporated at the position normally occupied by the methyl group in Thymidine triphosphate, thus turning the dTTP into dUTP. In all 64 cycles (C, A, G, U) are conducted as described in this and the preceding paragraph.
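The order of nucleotide additions described above can be written out as a simple schedule. The sketch below is illustrative only and its names are hypothetical. Concentrations and incubation times are those stated in the text. The number of C, A, G, U rounds is left as a parameter, because the sketch does not attempt to decide whether "64 cycles" counts rounds of four nucleotides or single nucleotide additions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NucleotideStep:
    base: str
    analog: str
    conc_nM: float
    incubation_min: int


# One round of additions, in the order given in the text.
ROUND = (
    NucleotideStep("C", "Cy5CTP", 5000.0, 2),   # 5 uM
    NucleotideStep("A", "Cy5dATP", 100.0, 2),
    NucleotideStep("G", "Cy5dGTP", 100.0, 2),
    NucleotideStep("U", "Cy5dUTP", 500.0, 5),   # longer incubation for U
)

# Sub-steps performed for every nucleotide addition, as listed in the text.
SUBSTEPS = (
    "expose to nucleotide and polymerase", "rinse", "scavenger", "image",
    "rinse", "cleave", "rinse", "cap", "rinse", "scavenger", "final image",
)


def schedule(n_rounds):
    """Yield (round number, NucleotideStep) for n_rounds of C, A, G, U."""
    for round_number in range(1, n_rounds + 1):
        for step in ROUND:
            yield round_number, step


for round_number, step in schedule(1):
    print(round_number, step.analog, step.conc_nM, step.incubation_min)
```

Keeping the schedule as data makes it straightforward to confirm that only the Cy5dUTP step uses the 5 minute incubation.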
Once the desired number of cycles is completed, the image stack data (i.e., the single molecule sequences obtained from the various surface-bound duplex) are aligned to the M13 reference sequence. The image data obtained can be compressed to collapse homopolymeric regions. Thus, the sequence "TCAAAGC" is represented as "TCAGC" in the data tags used for alignment. Similarly, homopolymeric regions in the reference sequence are collapsed for alignment.
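The homopolymer collapse described above amounts to replacing every run of identical bases with a single base. A minimal sketch follows; the function name is hypothetical, and the same function would be applied to both the reads and the reference.

```python
from itertools import groupby


def collapse_homopolymers(sequence):
    """Replace each run of identical consecutive bases with one base."""
    return "".join(base for base, _run in groupby(sequence))


print(collapse_homopolymers("TCAAAGC"))  # TCAGC
```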
The alignment algorithm matches sequences obtained as described above with the actual M13 linear sequence. Placement of obtained sequence on M13 is based upon the best match between the obtained sequence and a portion of M13 of the same length, taking into consideration 0, 1, or 2 possible errors. All obtained 9-mers with 0 errors (meaning that they exactly match a 9-mer in the M13 reference sequence) are first aligned with M13. Then 10-, 11-, and 12-mers with 0 or 1 error are aligned. Finally, all 13-mers or greater with 0, 1, or 2 errors are aligned.
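The length-dependent error allowance described above can be sketched as a simple placement routine. This is an illustration under stated assumptions and not the alignment software actually used: errors are counted as substitutions only (Hamming distance), reads shorter than 9 nucleotides are not placed, only one strand of a linear reference is searched, and ties are resolved in favor of the leftmost position. All names are hypothetical.

```python
def allowed_errors(read_length):
    """Error allowance by read length; None means the read is not placed."""
    if read_length < 9:
        return None   # assumption: reads shorter than 9 nt are skipped
    if read_length == 9:
        return 0
    if read_length <= 12:
        return 1
    return 2


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def place_read(read, reference):
    """Return (position, errors) of the best placement, or None if none fits."""
    budget = allowed_errors(len(read))
    if budget is None or len(read) > len(reference):
        return None
    best = None
    for pos in range(len(reference) - len(read) + 1):
        errors = hamming(read, reference[pos:pos + len(read)])
        if errors <= budget and (best is None or errors < best[1]):
            best = (pos, errors)
    return best


reference = "ACGTTGCAAGGCTTACGATCG"          # toy reference, not M13
print(place_read("TTGCAAGGC", reference))    # (3, 0): exact 9-mer
print(place_read("GCAATGCTTA", reference))   # (5, 1): 10-mer, one mismatch
print(place_read("TTGCAAGGA", reference))    # None: 9-mers must match exactly
```

Both the reads and the reference would be homopolymer-collapsed before placement, as described in the preceding paragraph.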
Once the desired number of cycles is completed, the image stack data (i.e., the single molecule sequences obtained from the various surface-bound duplex) are aligned to the M13 reference sequence and/or are aligned to the sequence initially obtained as described above. The image data obtained can be compressed to collapse homopolymeric regions as described above.

Claims

What is claimed is:
1. A method for sequencing nucleic acid from a sample, the method comprising: obtaining a sample comprising template nucleic acid; introducing a carrier molecule to the sample; attaching an oligonucleotide tail to the template nucleic acid in the presence of the carrier molecule; introducing the tailed template nucleic acid to primers to form template/primer duplexes; exposing the duplexes to at least one detectably labeled nucleotide in the presence of a polymerase under conditions that permit the polymerase to add nucleotides to the primer in a template-dependent manner; detecting a signal from the label of the incorporated labeled nucleotide; and sequentially repeating the exposing and detecting steps at least once.
2. The method according to claim 1, wherein the oligonucleotide tail is a poly(A) tail.
3. The method according to claim 1, wherein the oligonucleotide tail is ligated to the template nucleic acid.
4. The method according to claim 1, wherein the oligonucleotide tail is attached by a terminal transferase enzyme.
5. The method according to claim 1, wherein the oligonucleotide tail is about 150 nucleotides.
6. The method according to claim 1, wherein the carrier molecule is an RNA oligonucleotide.
7. The method according to claim 1, wherein the carrier molecule is a bead bound oligonucleotide.
8. The method according to claim 1, wherein the duplexes are attached to a substrate.
9. The method according to claim 8, wherein at least some of the duplexes are individually optically resolvable.
10. The method according to claim 8, wherein the duplexes are directly attached to the substrate.
11. The method according to claim 8, wherein the duplexes are indirectly attached to the substrate.
12. The method according to claim 1, wherein the detectable label is an optically detectable label.
13. The method according to claim 12, wherein the optically detectable label is a fluorescent label.
13. The method according to claim 1, wherein the detectable label is a non-optically detectable label.
14. The method according to claim 1, wherein the amount of nucleic acids in the sample is a nanogram amount or a picogram amount.
15. The method according to claim 1, wherein the sample comprises nucleic acid from a single cell.
16. A method for analyzing a nucleic acid in a sample, the method comprising: obtaining a sample comprising template nucleic acid; increasing the number of 3' ends of template nucleic acid in the sample; attaching an oligonucleotide tail to the 3' ends of the template nucleic acids; introducing the tailed template nucleic acid to primers to form template/primer duplexes; exposing the duplexes to at least one detectably labeled nucleotide in the presence of a polymerase under conditions that permit the polymerase to add nucleotides to the primer in a template-dependent manner; detecting a signal from the label of the incorporated labeled nucleotide; and sequentially repeating the exposing and detecting steps at least once.
17. The method according to claim 16, wherein the oligonucleotide tail is a poly(A) tail.
18. The method according to claim 16, wherein the oligonucleotide tail is ligated to the template nucleic acid.
19. The method according to claim 16, wherein the oligonucleotide tail is attached by a terminal transferase enzyme.
20. The method according to claim 16, wherein the duplexes are attached to a substrate.
21. The method according to claim 20, wherein at least some of the duplexes are individually optically resolvable.
22. The method according to claim 20, wherein the duplexes are directly attached to the substrate.
23. The method according to claim 20, wherein the duplexes are indirectly attached to the substrate.
24. The method according to claim 16, wherein the detectable label is an optically detectable label.
25. The method according to claim 24, wherein the optically detectable label is a fluorescent label.
26. The method according to claim 16, wherein the detectable label is a non-optically detectable label.
27. The method according to claim 16, wherein the amount of nucleic acids in the sample is a nanogram amount or a picogram amount.
28. The method according to claim 16, wherein the sample comprises nucleic acid from a single cell.
29. The method according to claim 16, wherein increasing comprises introducing the template nucleic acids to at least one restriction enzyme.
PCT/US2010/024547 2009-02-18 2010-02-18 Sequencing small quantities of nucleic acids WO2010096532A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US12/904,683 US20110091883A1 (en) 2009-02-18 2010-10-14 Methods for analyzing minute cellular nucleic acids
US14/158,618 US20150307932A1 (en) 2009-02-18 2014-01-17 Methods for Analyzing Minute Cellular Nucleic Acids

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US15354809P 2009-02-18 2009-02-18
US61/153,548 2009-02-18

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US12/904,683 Continuation-In-Part US20110091883A1 (en) 2009-02-18 2010-10-14 Methods for analyzing minute cellular nucleic acids

Publications (1)

Publication Number Publication Date
WO2010096532A1 (en) 2010-08-26

Family

ID=42634204

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2010/024547 WO2010096532A1 (en) 2009-02-18 2010-02-18 Sequencing small quantities of nucleic acids

Country Status (1)

Country Link
WO (1) WO2010096532A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023131939A1 (en) * 2022-01-05 2023-07-13 Yeda Research And Development Co. Ltd. Methods and kits for analyzing nucleosomes and plasma proteins

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5580970A (en) * 1989-12-01 1996-12-03 Amoco Corporation Detection of HPV transcripts
US20070231795A1 (en) * 2002-06-17 2007-10-04 Intel Corporation Methods and apparatus for nucleic acid sequencing by signal stretching and data integration
US20080076123A1 (en) * 2006-09-27 2008-03-27 Helicos Biosciences Corporation Polymerase variants for DNA sequencing

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
MICHELSON ET AL.: "Characterization of the homopolymer tailing reaction catalyzed by terminal deoxynucleotidyl transferase.", JBC, vol. 257, 1982, pages 14773 - 14782 *

Similar Documents

Publication Publication Date Title
EP2007908B1 (en) Methods for increasing accuracy of nucleic acid sequencing
US20110301042A1 (en) Methods of sample encoding for multiplex analysis of samples by single molecule sequencing
US20150159210A1 (en) Methods for Increasing Accuracy of Nucleic Acid Sequencing
US9163053B2 (en) Nucleotide analogs
US7767400B2 (en) Paired-end reads in sequencing by synthesis
US20070099212A1 (en) Consecutive base single molecule sequencing
US7767805B2 (en) Methods and compositions for sequencing a nucleic acid
US20080103058A1 (en) Molecules and methods for nucleic acid sequencing
US7994304B2 (en) Methods and compositions for sequencing a nucleic acid
US20090305248A1 (en) Methods for increasing accuracy of nucleic acid sequencing
US20070190546A1 (en) Methods and compositions for sequencing a nucleic acid
WO2009097626A2 (en) Paired-end reads in sequencing by synthesis
US20100203524A1 (en) Polymerases and methods of use thereof
US20080138804A1 (en) Buffer composition
US20090226906A1 (en) Methods and compositions for reducing nucleotide impurities
US20090226900A1 (en) Methods for Reducing Contaminants in Nucleic Acid Sequencing by Synthesis
WO2010096532A1 (en) Sequencing small quantities of nucleic acids
WO2009085328A1 (en) Molecules and methods for nucleic acid sequencing
JP2024502293A (en) Sequencing of non-denaturing inserts and identifiers

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 10744282

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 10744282

Country of ref document: EP

Kind code of ref document: A1