EP2203568A1 - Method and system for obtaining ordered, segmented sequence fragments along a nucleic acid molecule - Google Patents

Method and system for obtaining ordered, segmented sequence fragments along a nucleic acid molecule

Info

Publication number
EP2203568A1
EP2203568A1 EP08843207A EP08843207A EP2203568A1 EP 2203568 A1 EP2203568 A1 EP 2203568A1 EP 08843207 A EP08843207 A EP 08843207A EP 08843207 A EP08843207 A EP 08843207A EP 2203568 A1 EP2203568 A1 EP 2203568A1
Authority
EP
European Patent Office
Prior art keywords
nucleic acid
polymerase
acid molecule
acceptor
fret
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP08843207A
Other languages
German (de)
English (en)
Inventor
Susan Hardin
Mitsu Sreedhar Reddy
Tommie Lloyd Lincecum, Jr.
Anelia Kraltcheva
Uma Nagaswamy
Alok N. Bandekar
Hongyi Wang
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Life Technologies Corp
Original Assignee
Life Technologies Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Life Technologies Corp
Publication of EP2203568A1
Legal status: Withdrawn

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6869: Methods for sequencing
    • C12Q1/6874: Methods for sequencing involving nucleic acid arrays, e.g. sequencing by hybridisation

Definitions

  • TITLE A METHOD AND SYSTEM FOR OBTAINING ORDERED
  • nucleic acids. More particularly, provided herein are methods, systems and compositions suitable for real-time single molecule sequencing using ordered, segmented sequence fragments along a nucleic acid molecule.
  • DNA: deoxyribonucleic acid
  • first-generation methods require large quantities of the target DNA molecule to be sequenced using time and resource intensive processes.
  • Maxam-Gilbert sequencing involves the chemical cleavage of end-labeled fragments of DNA. The resulting fragments are then size separated by gel electrophoresis, and the sequence of the original end-labeled fragments is determined by analyzing the pattern of fragments produced by the gel. Read lengths using this approach are typically limited to approximately 500 nucleotides. Furthermore, such methods are lengthy, and frequently require amplification of the target DNA to obtain sufficient amounts of starting material.
  • DNA sequencing methodologies generally involve monitoring the activity of a sequencing enzyme, such as DNA polymerase, as it replicates a test DNA molecule by polymerizing monomeric subunits, such as dNTPs, to extend a primer into a newly synthesized DNA strand that complements the test molecule of interest.
  • the polymerization products are analyzed after the sequencing reaction has been terminated, thereby adding to the length of the process.
  • Sanger-dideoxy sequencing involves elongation of an end-labeled nucleotide primer with random incorporation of chain terminating dideoxy nucleotides in four separate DNA polymerase reactions.
  • the extension products must be size separated by gel electrophoresis and the nucleotide sequence may be determined from analyzing the pattern of fragments in the gel.
  • the use of four different fluorescently labeled dideoxynucleotides enables the sequencing reactions to be size separated in a single gel lane, facilitating automated sequence determination. Read lengths utilizing this approach are limited to approximately 1000 nucleotides, and the process can take a few hours to half a day to perform.
  • DNA strands thereby facilitating accurate assembly of contiguous extended nucleic acid sequences.
  • these methods readily facilitate high throughput sequencing in parallel, and ultimately allow the simultaneous sequencing of an entire genome rapidly and cheaply.
  • methods for sequencing at least a portion of a nucleic acid molecule in real time or near real time comprising the steps of displaying a nucleic acid molecule; manipulating the nucleic acid molecule to form one or more polymerase-accessible priming sites along the length of the nucleic acid molecule, wherein the one or more priming sites are separated from each other by a length of nucleotides sufficient to permit independent detection and resolution of sequencing activity occurring at each priming site by a detection system; contacting at least a portion of the nucleic acid molecule with a polymerase solution and one or more detectably labeled components under such conditions that extension occurs from at least one priming site; monitoring signals emitted during the extension reaction by at least one detectably labeled component; and analyzing the signals in real or near real time to determine the sequence of at least a portion of the nucleic acid molecule.
  • At least one of the detectably labeled components comprises a Förster resonance energy transfer (FRET) donor.
  • FRET: Förster resonance energy transfer
  • At least one of the detectably labeled components comprises a FRET acceptor.
  • At least one of the detectably labeled components comprises both a FRET donor and a FRET acceptor.
  • At least one of the detectably labeled components is a polymerase operably linked to a FRET donor.
  • the signals emitted during the extension reaction are a result of
  • the signals emitted during the extension reaction are signals resulting from FRET between a FRET donor and the FRET acceptor.
  • the signals emitted during the extension reaction are FRET signals resulting from energy transfer between at least one intercalated dye molecule and at least one nucleotide labeled with a FRET acceptor.
  • At least one of the detectably labeled components is an intercalating dye.
  • At least one detectably labeled component is a nucleotide operably linked to a FRET acceptor.
  • the FRET acceptor is attached to a portion of the nucleotide that is released upon incorporation of the nucleotide into a nascent nucleotide strand that is synthesized by the polymerase.
  • the FRET acceptor is attached to a portion of the nucleotide that becomes incorporated into a nascent nucleotide strand synthesized by the polymerase, and the sequencing method further comprises the step of removing the acceptor after incorporation.
  • removing the acceptor after incorporation comprises photobleaching the acceptor after incorporation or, alternatively, photocleaving the acceptor after incorporation.
  • displaying the single nucleic acid molecule comprises immobilizing the nucleic acid molecule by attachment to a substrate.
  • immobilizing a polynucleotide strand further comprises providing a substrate including a surface having a layer formulated to immobilize a polynucleotide strand or a plurality of polynucleotide strands in an elongated form.
  • each immobilized polynucleotide strand is attached to the substrate by at least one attachment site.
  • the immobilized polynucleotide strand is immobilized at a plurality of attachment sites situated along its length so that the strand is fixed to the substrate in an elongated form to minimize strand movement during subsequent processing steps.
  • displaying the single nucleic acid molecule comprises introducing the molecule into a nanostructure adapted to receive and display the molecule.
  • manipulating the nucleic acid molecule to form a plurality of polymerase-accessible priming sites further comprises annealing one or more oligonucleotide primers along the length of the nucleic acid molecule.
  • one or more oligonucleotide primers is a random primer.
  • one or more oligonucleotide primers is a site-specific primer.
  • manipulating the nucleic acid molecule to form a plurality of polymerase-accessible priming sites further comprises contacting the nucleic acid molecule with a nicking reagent adapted to form a plurality of polymerase-accessible nick sites along the length of the nucleic acid molecule.
  • manipulating the nucleic acid molecule to form a plurality of polymerase-accessible priming sites further comprises treating the DNA with chemical or enzymatic nicking agents.
  • the polymerase solution comprises at least one type of detectably labeled nucleotide.
  • the detectably labeled nucleotides are added separately from the polymerase solution.
  • the detectably labeled nucleotides are added prior to, or after, the addition of the polymerase solution.
  • the polymerase solution comprises at least two, three or four types of detectably labeled nucleotides.
  • the detectable label of at least one type of detectably labeled nucleotide is a chromophore, fluorophore or luminophore.
  • the detectable label of at least one type of detectably labeled nucleotide is selected from the group consisting of: ROX, Cy3, Cy5, xanthine dye, fluorescein, cyanine, rhodamine, coumarin, acridine, Texas Red dye,
  • the polymerase solution comprises a polymerase.
  • the polymerase is an RNA polymerase, DNA polymerase or reverse transcriptase.
  • the DNA polymerase is a Klenow fragment of DNA polymerase I, Phi29 DNA polymerase, B54 DNA polymerase, 9°N DNA polymerase, Vent DNA polymerase, Deep Vent DNA polymerase, E. coli DNA polymerase I, T7 DNA polymerase, T4 DNA polymerase, Thermus aquaticus DNA polymerase, or Thermococcus litoralis DNA polymerase.
  • the polymerase solution comprises at least one type of detectably labeled nucleotide comprising three, four or more phosphate groups.
  • At least one detectably labeled component comprises a detectably labeled nucleotide, wherein the detectable label is operably linked to a terminal phosphate in the polyphosphate chain of the detectably labeled nucleotide.
  • At least one detectably labeled component comprises a nucleotide operably linked to at least two separate detectable labels.
  • the polymerase solution further comprises a detectably labeled polymerase.
  • At least one of the detectably labeled components is a polymerase operably linked to a nanocrystal or other FRET donor.
  • the nucleic acid molecule comprises chromosomal DNA.
  • the nucleic acid molecule comprises an intact chromosome.
  • the sequencing method further comprises sequencing one or more additional nucleotide strands in parallel with sequencing a first nucleotide strand according to the methods disclosed herein.
  • the detectably labeled components comprise one type of detectably labeled nucleotide and a detectably labeled polymerase.
  • the detectably labeled components comprise a fluorescent moiety that non-specifically associates with the template nucleic acid molecule along the length of the molecule.
  • the fluorescent moiety is a FRET donor.
  • the fluorescent moiety is an intercalating dye.
  • the intercalating dye becomes absorbed into the polynucleotide strand and becomes fluorescently active upon absorption.
  • the polymerase-accessible priming sites are separated by a length of nucleotides sufficient to separate the polymerase-accessible sites by a distance sufficient to permit independent detection of the polymerases on the polymerase-accessible nick sites via a detection system.
  • Also provided herein are methods for sequencing at least a portion of a nucleic acid molecule in real time or near real time comprising the steps of immobilizing a nucleic acid molecule on a substrate; nicking the immobilized nucleic acid molecule to form one or more polymerase-accessible nick sites along the length of the strand; adding an intercalating dye and a polymerase solution, wherein the polymerase solution further comprises a polymerase and one or more detectably labeled nucleotides, under conditions such that an extension reaction is initiated at one or more polymerase-accessible nick sites along the length of the immobilized nucleic acid molecule; monitoring signals emitted during the extension reaction at one or more polymerase-accessible nick sites; and analyzing the signals in real or near real time to determine the sequence of at least some portion of the nucleic acid molecule.
  • the extension reaction extends the polymerase-accessible nick site by a plurality of nucleotides, or by at least 10, 20, 50, 100, 250, 500 or 1000 nucleotides.
  • In some embodiments, the extension reaction is monitored by a monitoring subsystem capable of visualizing extension activity along the strand at one or more polymerase-accessible nick sites.
  • the extension reaction is monitored through detection of FRET signals arising from energy transfer from at least one intercalated dye molecule and at least one detectable label of a detectably labeled nucleotide.
  • the sequencing method further comprises the step of converting the detected events into a sequence of identified nucleotides complementary to the non-nicked single strand at the nick site.
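The conversion step described above can be sketched roughly as follows. This is a minimal illustration, assuming each detected event has already been reduced to a time stamp and the identity of the acceptor dye that was seen. The dye-to-base assignment, the event format and the function names are hypothetical and are not taken from the disclosure.

```python
# Hypothetical assignment of acceptor dyes to bases (illustration only).
DYE_TO_BASE = {"Cy3": "A", "Cy5": "C", "ROX": "G", "fluorescein": "T"}

# Translation table used to complement a DNA string.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def call_bases(events):
    """Turn (time, dye) events into the synthesized strand, read 5'->3'.

    Events are sorted by time so that the order of incorporation
    defines the order of the called bases.
    """
    return "".join(DYE_TO_BASE[dye] for _, dye in sorted(events))


def template_strand(synthesized):
    """Return the template (non-nicked) strand, 5'->3', as the
    reverse complement of the synthesized strand."""
    return synthesized.translate(COMPLEMENT)[::-1]


if __name__ == "__main__":
    events = [(0.9, "Cy5"), (0.2, "Cy3"), (1.5, "ROX")]
    read = call_bases(events)
    print(read, template_strand(read))
```

The called read is the newly synthesized strand, so it is complementary to the non-nicked strand at the nick site; the second helper recovers that template strand if it is needed.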
  • the distance separating the nick sites is between about 1 Kb and about 250 Kb, between about 2 Kb and about 200 Kb, between about 3 Kb and about 100 Kb, between about 3 Kb and about 50 Kb, between about 3 Kb and about 10 Kb, between about 3 Kb and about 5 Kb, or between about 5 Kb and about 10 Kb.
  • a system for sequencing a nucleotide strand by obtaining ordered sequence fragments along a polynucleotide strand comprising a reaction chamber comprising a substrate on which at least one polynucleotide strand can be immobilized and nicked; a monitoring subsystem capable of detecting signals from extension activity occurring at the nick sites along the at least one polynucleotide strand; and an analyzing subsystem that converts the signals detected from extension activity into sequence information and then maps sequence fragments along the length of the at least one polynucleotide strand in such a manner that ordered sequence fragment information is obtained for nucleic acid identification and classification.
  • a system for sequencing DNA by obtaining ordered sequence fragments along a polynucleotide strand comprising a reaction chamber comprising a substrate on which at least one polynucleotide strand can be nicked and immobilized; a monitoring subsystem capable of detecting signals from extension activity occurring at the nick sites along the at least one polynucleotide strand; and an analyzing subsystem that converts the signals detected from extension activity into sequence information and then maps sequence fragments along the length of the at least one polynucleotide strand in such a manner that ordered sequence fragment information is obtained for nucleic acid identification and classification.
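One way to picture the mapping performed by the analyzing subsystem is the sketch below, which sorts sequence fragments by the estimated location of their nick site and reports the unsequenced gap between neighbors. The `Fragment` type, its fields and the use of kilobases as the position unit are assumptions made for illustration; the disclosure does not specify a data model.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    position_kb: float  # assumed: estimated nick-site location along the strand
    sequence: str       # bases read by extension from that nick site


def order_fragments(fragments):
    """Return (position_kb, sequence, gap_kb) tuples ordered along the strand.

    gap_kb is the distance from the end of the previous fragment to the
    start of this one; it is None for the first fragment.
    """
    ordered = sorted(fragments, key=lambda f: f.position_kb)
    result = []
    previous_end = None
    for frag in ordered:
        gap = None if previous_end is None else frag.position_kb - previous_end
        result.append((frag.position_kb, frag.sequence, gap))
        previous_end = frag.position_kb + len(frag.sequence) / 1000.0
    return result


if __name__ == "__main__":
    reads = [Fragment(12.0, "ACGT"), Fragment(3.0, "GG")]
    for row in order_fragments(reads):
        print(row)
```

Keeping the gap alongside each fragment preserves the long-range order information even though the sequence between nick sites is not read.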
  • Figure 1 depicts a visual characterization of one embodiment of the sequencing methods and systems of the present disclosure.
  • Figure 2 depicts fluorescent spectra of four intercalating dyes.
  • Figure 3A depicts SYBR Green I average intensity from a user-defined Region of Interest (ROI) containing a DNA fragment, relative to average background intensity.
  • Figure 3B depicts YOYO-1 average intensity within a user-defined ROI, relative to average background intensity.
  • Figure 4A depicts spectra of YOYO-1 and four fluorescent acceptors.
  • Figure 4B depicts spectra of a quantum dot (Qdot 525) and four fluorescent acceptors.
  • Figure 5 depicts images of background fluorescence from acceptor-labeled nucleotide on glass substrates coated with H — I — h and PEBN layers.
  • Figure 6 depicts images of DNA nicked with a site-specific nickase, incubated with acceptor-labeled nucleotide (dU-Cy5) and polymerase, mixed with the intercalating dye
  • Figure 7 depicts images of DNA nicked with a site-specific nickase, incubated with FRET acceptor-labeled nucleotide (dU-A1610) and polymerase, mixed with the intercalating dye YOYO-1 and immobilized on a PEBN-coated surface.
  • Figure 8 depicts images of DNA nicked with a site-specific nickase, incubated with acceptor-labeled nucleotide (dU-Cy5) and polymerase, mixed with the intercalating dye
  • Figure 9 depicts images of DNA nicked with a site-specific nickase, immobilized on a PEBN-coated surface, incubated with acceptor-labeled nucleotides and polymerase, and mixed with the intercalating dye YOYO-1.
  • Figure 10 depicts pictorially one exemplary embodiment of the sequencing compositions and methods disclosed herein, using a donor labeled polymerase, acceptor labeled nucleotides, and unlabeled surface-immobilized DNA template.
  • Figure 11 illustrates the detection of incorporation events that occur using the methodology of Figure 10, and depicts a comparison of various attributes of donor segments, made between different polymerases binding to immobilized duplex on a surface.
  • Figure 12 depicts a schematic for monitoring FRET emissions arising from incorporation of base-labeled nucleotides in real time.
  • Figure 13 depicts assessment of acceptor signal using SYBR Green I as the FRET donor.
  • Figure 14 depicts assessment of acceptor signal using SYBR Green I as the FRET donor in a single molecule assay.
  • Figure 15 depicts assessment of acceptor signal using SYBR Green I as the FRET donor.
  • Figure 16 depicts images of Lambda (λ) DNA incubated with a mixture containing DNase I, acceptor-labeled nucleotides and polymerase, then mixed with the intercalating dye YOYO-1, and immobilized on PEBN-coated surfaces.
  • Figure 17 depicts a real-time incorporation trace at right showing 20 ASN with base-labeled dNTP ("BL dNTP"); donor dipping is clearly detectable in the trace.
  • Figure 18 depicts the overlay of the segmented object after automatic segmentation and registration of the fluorescence image to identify FRET events.
  • Panel 18(A) depicts an average intensity image in the donor channel and the user defined box represents the region of interest (ROI) in the image.
  • Panel 18(B) depicts the overlay of the segmented object on top of the average intensity image in the donor channel.
  • Panel 18(C) depicts an average intensity image in the acceptor channel and the user defined box represents the ROI in the image.
  • Panel 18(D) depicts the overlay of the segmented object on top of the average intensity image in the acceptor channel.
  • Figure 19: Panel 19(A) depicts the overlay of the segmented object on top of the average intensity image in the donor channel.
  • Panel 19(B) depicts the overlay of the segmented object on top of the average intensity image in the acceptor channel.
  • Panel 19(C) depicts the overlay of the registered object with respect to the donor channel on top of the average intensity image in the acceptor channel.
  • Figure 20: The top-most graph of Panel 20(A) depicts the normalized intensity profile of the segmented object in the donor channel.
  • the middle graph of Panel 20(A) depicts the normalized intensity profile of the segmented object in the acceptor channel.
  • Panel 20(B) depicts co-localization of the detected points in the acceptor channel via Argon 488 nm or Red HeNe excitation, confirming the accuracy to the level of pixel registration of the automated analysis.
  • the term “a” or “an” means “at least one” or “one or more”.
  • the terms “comprising” (and any form or variant of comprising, such as “comprise” and “comprises”), “having” (and any form or variant of having, such as “have” and “has”), “including” (and any form or variant of including, such as “includes” and “include”), or “containing” (and any form or variant of containing, such as “contains” and “contain”) are inclusive or open-ended and do not exclude additional, unrecited additives, components, integers, elements or method steps.
  • Although the compositions and methods of this invention have been described in terms of preferred embodiments, these embodiments are in no way intended to limit the scope of the claims, and it will be apparent to those of skill in the art that variations may be applied to the compositions and/or methods and in the steps or in the sequence of steps of the methods described herein without departing from the concept, spirit and scope of the invention. More specifically, it will be apparent that certain agents which are both chemically and physiologically related may be substituted for the agents described herein while the same or similar results would be achieved. All such similar substitutes and modifications apparent to those skilled in the art are deemed to be within the spirit, scope and concept of the invention as defined by the appended claims.
  • the sequencing methods, systems and compositions of the present disclosure collectively provide rapid sequencing of a single polymeric molecule of interest, such as a nucleic acid, by monitoring of emitted signals. More specifically, the present disclosure provides a method for obtaining ordered, segmented sequence fragments along a polynucleotide strand.
  • the nucleic acid molecule to be sequenced is oriented, or otherwise displayed, in a spatially addressable format before or after being subjected to an extension (i.e., sequencing) reaction in situ. Either prior to or after display and/or extension, the nucleic acid molecule is also treated so as to introduce or form a plurality of polymerase-accessible priming sites along the length of the molecule, where adjacent priming sites on a strand are separated by a length of nucleotides sufficient to permit independent detection and resolution by a detection system.
  • Treatment of the nucleic acid molecule to introduce priming sites may be performed before, after or concurrently with the elongation/display step and/or the extension step; the order of steps is immaterial.
  • the nucleic acid molecule is contacted with a polymerase solution and with other components of the sequencing machinery under conditions such that the polymerase extends the nucleic acid strand from at least one priming site by polymerizing nucleotides onto a free 3' end of the nucleic acid molecule.
  • the polymerase and/or components of the sequence machinery are operably linked to, or otherwise associated with, detectable labels that emit signals as the sequencing reaction proceeds. These signals are detected and analyzed in real time or near real time to obtain sequence information for at least some portion of the nucleic acid molecule.
  • the nucleic acid molecule to be sequenced is DNA or RNA; however, in some cases it can also be a polymer comprising nucleotide analogs capable of polymerization by a polymerase.
  • the nucleic acid is displayed by maintaining it in a spatially addressable format, such that signals emitted from a specific and discrete point, portion, region or terminus of the nucleic acid molecule can be visualized, resolved, assigned to their site of origin on the nucleic acid molecule, and tracked over time.
  • Any suitable method may be used to display the nucleic acid molecule, including but not limited to fixation or immobilization of the molecule on a surface, suspending the nucleic acid molecule in a laminar flow stream, passing the nucleic acid molecule through a nanopore, confining the nucleic acid molecule within a waveguide or within a suitable nanostructure, e.g., a nanotube, nanowell, nanochannel or the like, adapted to receive and display the nucleic acid molecule, or using optical tweezers to hold and restrict the nucleic acid molecule during the detection step.
  • the nucleic acid molecule may be displayed by moving the molecule relative to a detection station, such that signals emitted from the molecule are tracked along the length of the molecule and assigned to their point of origin along the nucleic acid molecule.
  • the nucleic acid molecule is displayed by fixing or otherwise immobilizing the nucleic acid molecule on a two-dimensional surface on a substrate. Any suitable substrate may be used for immobilization of the nucleic acid molecule, including substrates that exhibit non-specific adherence to nucleic acid, or substrates to which nucleic acid molecules can be bound.
  • the nucleic acid molecule is immobilized by contacting it with a substrate including a surface having a layer formulated to immobilize a polynucleotide strand or a plurality of polynucleotide strands in an elongated form.
  • the nucleic acid molecule may be bound to the surface of the substrate via a plurality of attachment sites situated along its length, so that the strand is fixed to the substrate in an elongated form to minimize strand movement during subsequent processing steps.
  • individual polymeric molecules are displayed and elongated using a nanofluidic device comprising a nanochannel array, wherein the entire sample population is elongated and displayed in a spatially addressable format.
  • the use of nanofluidic devices for separation and isolation of test polymeric molecules bypasses the requirement for immobilization or attachment of sequencing components to a substrate and also enables the sequencing of intact chromosomes, thereby exponentially increasing the amount of sequencing information obtained from a single reaction and also enabling analysis of such "macro" structural features as methylation, inversions, indels and tandem repeats.
  • nanofluidic devices that permit the simultaneous observation of a high number of macromolecules in a multitude of channels can be employed.
  • Such devices increase the amount of sequence information obtainable from a single experiment and decrease the cost of sequencing of an entire genome. See, for example, U.S. Published App. No. 2004/0197843.
  • With semiconductor nanocrystals or analogs thereof operably linked to polymerase activity, polymer sequence data can be generated as labeled monomers are incorporated into a newly synthesized polymer strand by a polymerase, thus enabling the sequencing of polymers in real time.
  • the nanofluidic-based sequencing methods disclosed herein can be used to rapidly obtain both "raw" sequence at the single nucleic acid molecule level as well as validation of incoming sequence information via simultaneous priming at multiple points along the template strand.
  • Manipulation of the DNA includes without limitation any method of treatment that results in formation of one or more priming sites along the length of a nucleic acid molecule, while preserving the ability to obtain resolvable sequence information from the molecule and maintaining the structural integrity of the molecule, i.e., will not induce fragmentation, degradation, disruption or complete breakage of the nucleic acid molecule.
  • the manipulation is performed in such a manner as to ensure that independent sequence information can be determined from each priming site spaced along the length of the DNA strand.
  • the manipulation is refined to obtain optimal spacing of priming sites, such that the priming sites are separated from each other by a length of nucleotides sufficient to make their termini accessible to a polymerase for subsequent extension.
  • the separation length, or distance, between the priming sites can vary between about 1 Kb (kilobase) to about 250 Kb along the nucleic acid strand. In certain embodiments, the separation distance between adjacent priming sites is between about 3 Kb and about 5 Kb along the length of the nucleic acid molecule.
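To give a feel for why kilobase-scale spacing matters for independent detection, the sketch below converts a separation in base pairs into a physical distance and flags adjacent priming sites that fall closer together than an optical resolution limit. The rise of about 0.34 nm per base pair is the standard figure for B-form DNA. The 250 nm resolution limit, the `stretch` factor and the function names are illustrative assumptions, not values from the disclosure.

```python
NM_PER_BP = 0.34  # approximate rise per base pair of B-form DNA


def separation_nm(bp, stretch=1.0):
    """Physical distance spanned by `bp` base pairs.

    `stretch` is the assumed fraction of full contour length at which
    the molecule is displayed (1.0 means fully extended).
    """
    return bp * NM_PER_BP * stretch


def unresolved_pairs(positions_bp, resolution_nm=250.0, stretch=1.0):
    """Return adjacent site pairs that are closer than `resolution_nm`."""
    sites = sorted(positions_bp)
    return [
        (a, b)
        for a, b in zip(sites, sites[1:])
        if separation_nm(b - a, stretch) < resolution_nm
    ]


if __name__ == "__main__":
    print(separation_nm(3000))                      # about 1020 nm
    print(unresolved_pairs([0, 500, 4000, 9000]))   # the 500 bp pair is flagged
```

Under these assumptions a 3 Kb separation on a fully extended molecule corresponds to roughly a micrometer, comfortably above the assumed limit, whereas sites only a few hundred base pairs apart would overlap in the image.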
  • One method of introducing priming sites on a test nucleic acid molecule is by annealing the nucleic acid molecule with one or more primers.
  • the primers may either be random, or may be adapted to bind only to certain portions of the nucleic acid molecule.
  • Another typical method used to introduce priming sites into a nucleic acid molecule involves nicking of the molecule by chemical or enzymatic means, or with nicking reagents.
  • Suitable nicking reagents include without limitation any reagent capable of creating a nick in either strand of the polynucleotide, such as, for example, exonucleases, DNases, chemical reagents such as the glycogen product from Fermentas International, Inc., of Ontario, Canada, or any other chemical or biological system capable of introducing nick sites into one or both strands of a nucleic acid molecule.
  • the limited nicking reaction can be performed in solution prior to nucleic acid immobilization as described in Zasloff and Camerini-Otero, 1980, or the limited nicking reaction can be performed after nucleic acid immobilization to ensure that most if not all of the nick sites are polymerase-accessible, if an enzyme is used to nick the polynucleotide.
  • the frequency of extendable nicks is characterized by incorporating a base-labeled nucleotide at the nick sites in solution.
  • the resulting nucleic acid comprising nicked termini is then immobilized on a substrate and visualized using a single-molecule detection system by either direct excitation of the acceptor, or by detection of FRET between a donor dye used to stain the DNA (e.g., SYBR Green I, YOYO-1 or similar intercalating or groove-binding dye) and the incorporated acceptor, as described herein.
  • the nucleic acid molecule must be contacted with a polymerase solution and at least one detectably labeled component. Such contacting may be achieved by any suitable means, in any phase, in any order of addition of reagents, and under any suitable conditions that permit polymerization of nucleotides to ultimately occur.
  • the polymerase solution comprises a polymerase, or any agent that is capable of polymerizing monomeric subunits into polymers, and a suitable buffer.
  • other components of the sequencing machinery such as nucleotides or nucleotide analogs that can be polymerized into the extending strand by the polymerase, are included in the polymerase solution.
  • these additional components may be added to the extension reaction, or otherwise contacted with the nucleic acid template, at any time in the procedure.
  • any and all components of the sequence/extension reaction may be added to, or otherwise contacted with, the nucleic acid molecule to be sequenced in any order whatsoever that permits productive extension via incorporation of nucleotides into the extending strand.
  • the polymerase and/or other components of the sequencing machinery are operably linked to, or otherwise associated with, a detectable label.
  • the nucleotides of the polymerase solution are detectably labeled.
  • a high efficiency FRET event occurs via energy transfer to the acceptor from donor intercalated dye molecules located either 5' (i.e., upstream), or 3' (i.e., downstream), or both 5' and 3', of the nucleotide incorporation site.
  • the detectable signal is a FRET signal generated between a FRET donor moiety and a FRET acceptor moiety.
  • FRET (Förster resonance energy transfer) is the transfer of energy from a first, excited molecule (the FRET donor) to a second molecule (the FRET acceptor).
  • the process of energy transfer results in a reduction (quenching) of fluorescence intensity and excited state lifetime of the FRET donor, and can produce an increase in the emission intensity of the FRET acceptor.
  • FRET occurs only when two appropriately labeled molecules or moieties are sufficiently proximal to each other to transfer energy. Visualization of a FRET event can be achieved via detection of the FRET signal induced by energy transfer from the FRET donor dye moiety to the FRET acceptor moiety.
  • the FRET acceptor is attached to a nucleotide, and the FRET donor is operably linked to, or otherwise directly or indirectly associated with, any component of the sequencing machinery such as the polynucleotide backbone (e.g., phosphate groups), the polynucleotide bases, or the polymerase.
  • the FRET donor is constantly replenished as the sequencing reaction progresses, thus allowing extended read lengths to be obtained.
  • a replenishing donor is a dye that binds with high affinity in a sequence-independent fashion to the nucleic acid, including but not limited to intercalating dyes.
  • the replenishing donor can be a labeled DNA polymerase, which can be replenished by exchanging enzymes as the sequencing reaction progresses.
  • One embodiment of a system using a replenishing donor-labeled polymerase is shown in Figure 10, and involves the use of an unlabeled DNA template immobilized via attachment to a surface, with donor-labeled polymerase and gamma-labeled dNTPs.
  • the donor can be replenished by exchanging enzymes. Further, there is no concern of the duplex disassociating from the enzyme complex. Moreover, incorporation will only occur and be detected when a donor (enzyme) binds to the duplex. This is indicated by the detection of FRET signals between the donor-labeled polymerase and the acceptor-labeled gamma-dNTPs.
  • the use of less processive polymerases is beneficial to the experimental setup because it allows for more rapid exchange of the donor and the donor is less likely to photo-bleach. Experiments are being carried out to determine the most appropriate enzyme for this method.
  • mutant polymerases that have increased activity with gamma-modified dNTPs but exhibit decreased processivity may be used.
  • an ultra-stable donor such as a nanocrystal may be used in place of a replenishing donor.
  • One exemplary embodiment of this method includes a donor nanocrystal stably attached to the polymerase.
  • the FRET donor is an intercalating dye molecule or other fluorescent moiety that has a high affinity for polynucleotides and spontaneously intercalates itself between the bases of, or otherwise associates itself with, a nucleic acid molecule, producing increased fluorescence in the intercalated or associated state.
  • intercalated dye molecules are displaced from a direction 3' of the polymerase and absorbed into a 5' direction of the DNA, thereby constantly replenishing as the polymerization reaction proceeds.
  • When the acceptor-labeled nucleotide is positioned within the polymerase active site for incorporation into the newly synthesized DNA strand by the polymerase, it undergoes FRET with the donor, resulting in emission of a FRET signal that can be detected and characterized. As the acceptor attached to the incorporated nucleotide is removed, a second detectably labeled nucleotide will enter the active site and produce a second high efficiency FRET event.
  • As the polymerase extends the newly synthesized strand by successively adding labeled monomers to the free 3' end of the strand in a template-dependent fashion, the identity of each successive incoming monomer bound and incorporated by the polymerase will be identifiable by the emission spectrum of the FRET acceptor attached to that particular monomer. Accordingly, the base sequence of the newly synthesized strand can be identified by detection and characterization of the time sequence of FRET events, as described below.
  • the template DNA strand is treated to introduce a multiplicity of priming sites along the length of the strand in such a manner that the priming sites are optimally spaced apart to allow independent detection and resolution.
  • the limits of detection and resolution will depend on the capabilities of the particular detection system employed in the disclosed methods.
  • each sequencing complex along the strand provides not only sequence information about a region contained within the extended fragment, but also information about the placement of each sequence read relative to others obtained from the same strand. In other words, each sequence read along the strand is both discrete and ordered.
  • nucleotide polymerase enzymes in the polymerase solution recognize and bind to the priming sites and initiate extension at the priming site by polymerization of nucleotides and elongation from the priming site
  • detection of signals is performed. Real-time sequencing is achieved by monitoring emissions from the detectable labels attached to various components of the polymerase solution as the extension reaction proceeds.
  • the progress of the sequencing or extension reaction can also be tracked by detecting the dip in donor intensity that accompanies any FRET event involving energy transfer between the FRET donor and acceptor moieties.
  • the ability to detect a dip in donor intensity likely depends on a variety of conditions. Dips in donor intensities can be monitored using standard detection systems.
  • the number of donor fluorophores associated with the nucleic acid can be varied to maximize acceptor FRET.
  • The spacing between a donor fluorophore and an acceptor on the incorporating nucleotide should be smaller than the R0 (Förster distance) of the donor-acceptor pair so that high FRET results.
  • the FRET efficiency is greater than about 80%. If too few donor fluorophores interact with the nucleic acid, the donor fluorophores can be spaced too far apart to give the FRET signal-to-noise ratio needed for adequate FRET detection. However, too many intercalated donor fluorophores may result in signal quenching.
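The relationship between donor-acceptor spacing and FRET efficiency can be made concrete with the standard single-pair Förster relation, E = 1 / (1 + (r/R0)^6). The sketch below is illustrative only: it assumes a single donor and a single acceptor (the disclosure contemplates multiple intercalated donors, which this simple model does not capture), and the 6.0 nm R0 in the usage note is a made-up value rather than one taken from the disclosure.

```python
def fret_efficiency(r_nm: float, r0_nm: float) -> float:
    """Single-pair Forster efficiency E = 1 / (1 + (r / R0)^6)."""
    if r_nm < 0 or r0_nm <= 0:
        raise ValueError("r must be non-negative and R0 must be positive")
    return 1.0 / (1.0 + (r_nm / r0_nm) ** 6)


def max_distance_for_efficiency(e_target: float, r0_nm: float) -> float:
    """Largest separation r that still gives E >= e_target.

    Obtained by inverting the relation above: r = R0 * ((1 - E) / E)^(1/6).
    """
    if not 0.0 < e_target < 1.0:
        raise ValueError("e_target must lie strictly between 0 and 1")
    return r0_nm * ((1.0 - e_target) / e_target) ** (1.0 / 6.0)
```

Under this model a separation equal to R0 gives only 50% efficiency, so an efficiency above about 80% needs a separation below roughly 0.79 x R0; for a hypothetical pair with R0 = 6.0 nm, `max_distance_for_efficiency(0.8, 6.0)` is about 4.76 nm.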
  • polymerases used in the methods and systems of this disclosure are first analyzed with regard to donor duration and donor signal frequency over the collection time.
  • the donor signals are assigned as segments that are either excited (digital one) or dark (digital zero), depending on their intensities compared to the noise level.
  • the excited donor segments are denoted by a horizontal dark green bar and the dark regions are denoted by horizontal black bars (figure below).
  • the number of donor segments of the excited state is extracted for every donor in the field of view and attributes of these segments such as the duration, intensity and frequency are analyzed. A comparison of these attributes of donor segments, made between different polymerases binding to immobilized duplex on a surface, is shown in Figure 11.
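The donor-segment analysis described above can be sketched as a simple thresholding step followed by run extraction. This is a minimal illustration, not the analysis software used in the disclosure: the threshold rule (noise mean plus `k` standard deviations), the function names, and the frame time are all assumptions introduced for the example.

```python
from statistics import mean


def digitize_trace(intensities, noise_mean, noise_sd, k=3.0):
    """Assign each frame 1 (excited) or 0 (dark) relative to the noise level.

    Assumed rule: a frame is 'excited' if it exceeds noise_mean + k * noise_sd.
    """
    threshold = noise_mean + k * noise_sd
    return [1 if x > threshold else 0 for x in intensities]


def excited_segments(intensities, states, frame_s):
    """Collect runs of consecutive excited frames with duration and intensity."""
    segments = []
    start = None
    # The trailing 0 is a sentinel that closes a segment running to the end.
    for i, state in enumerate(list(states) + [0]):
        if state == 1 and start is None:
            start = i
        elif state == 0 and start is not None:
            segments.append({
                "start_s": start * frame_s,
                "duration_s": (i - start) * frame_s,
                "mean_intensity": mean(intensities[start:i]),
            })
            start = None
    return segments


def segment_frequency(segments, n_frames, frame_s):
    """Excited segments per second of collection time."""
    return len(segments) / (n_frames * frame_s)
```

For example, with a hypothetical trace `[10, 12, 95, 100, 98, 11, 9, 90, 92, 10]`, a noise level of 10 ± 2 and 0.1 s frames, the digitized states are `[0, 0, 1, 1, 1, 0, 0, 1, 1, 0]`, giving two excited segments (three frames and two frames long) over one second of collection.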
  • formation of priming sites and/or extension can be performed prior to display; in other embodiments, priming and/or extension is performed after the nucleic acid molecule is immobilized. In yet other embodiments, formation of priming sites and/or immobilization may precede extension. All of these permutations and combinations, as well as any others that preserve the spirit and scope of the invention, may be used according to the present disclosure.
  • Removal of the detectable label of a nucleotide following incorporation of the nucleotide into the newly synthesized DNA strand by the polymerase can be accomplished by any suitable means. Typically, removal is accomplished by enzymatic cleavage upon incorporation, as will occur when the detectable label comprising the FRET acceptor is attached to a portion of the nucleotide that is released during incorporation (e.g., a pyrophosphate group with or without an associated linker; a fluorophore) as a natural byproduct of polymerase activity. Such labels are commonly referred to as "non-persistent" acceptors.
  • the detectable label comprising the FRET acceptor is a "persistent" label, i.e., it remains attached to the portion of the nucleotide that is incorporated into the elongating nucleotide strand, and thus is also incorporated into the newly synthesized portion of the nucleotide strand.
  • the acceptor will have to be either photobleached or photocleaved after incorporation, or the acceptor will contribute to the signals emitted by the next incoming nucleotide until the persistent acceptor permanently photobleaches.
  • any suitable polymerase may be used that is capable of polymerizing monomeric subunits into polymers.
  • the polymerase is a nucleotide polymerase, i.e., a polymerase that can polymerize nucleotides such as DNA or RNA polymerases that polymerize DNA, RNA or mixed sequences, into extended nucleic acid polymers.
  • the nucleotide polymerase will elongate a pre-existing polynucleotide strand, typically a primer, by polymerizing nucleotides on to the 3' end of the strand.
  • polymerases that can be isolated from their host in sufficient amounts for purification and use and/or genetically engineered into other organisms for expression, isolation and purification in amounts sufficient for use, as well as mutants or variants of native polymerases having one or more amino acids replaced by amino acids amenable to attaching an atomic or molecular label that has a detectable property.
  • Exemplary polymerases include without limitation DNA polymerases, RNA polymerases and reverse transcriptases.
  • the polymerase is a DNA polymerase.
  • Suitable nucleotide polymerases that may be used to practice the methods disclosed herein include without limitation any naturally occurring nucleotide polymerases as well as mutated, truncated, modified, genetically engineered or fusion variants of such polymerases.
  • Known conventional naturally occurring DNA polymerases include without limitation bacterial DNA polymerases, eukaryotic DNA polymerases, archaeal DNA polymerases, viral DNA polymerases and phage DNA polymerases.
  • Suitable bacterial DNA polymerases include without limitation E. coli DNA polymerases I, II, III, IV and V, the Klenow fragment of E. coli DNA polymerase (including mutants thereof, such as mutants lacking exonuclease activity), Clostridium stercorarium (Cst) DNA polymerase, Clostridium thermocellum (Cth) DNA polymerase and Sulfolobus solfataricus (Sso) DNA polymerase.
  • Suitable eukaryotic DNA polymerases include without limitation the DNA polymerases ⁇ , ⁇ , ⁇ , ⁇ , ⁇ , ⁇ , ⁇ , ⁇ , and κ, as well as the Rev1 polymerase (terminal deoxycytidyl transferase) and terminal deoxynucleotidyl transferase (TdT).
  • Suitable viral DNA polymerases include without limitation T4 DNA polymerase, T7 DNA polymerase, Phi29 DNA polymerase (also referred to herein as Phi-29 polymerase) and mutated and/or engineered Phi29 DNA polymerases, including mutants lacking exonuclease activity.
  • Suitable archaeal DNA polymerases include without limitation the thermostable and/or thermophilic DNA polymerases such as, for example, Thermus aquaticus (Taq) DNA polymerase, Thermus filiformis (Tfi) DNA polymerase, Thermococcus zilligi (Tzi) DNA polymerase, Thermus thermophilus (Tth) DNA polymerase, Thermus flavus (Tfl) DNA polymerase, Pyrococcus woesei (Pwo) DNA polymerase, Pyrococcus furiosus (Pfu) DNA polymerase as well as Turbo Pfu DNA polymerase, Thermococcus litoralis (Tli) DNA polymerase or Vent DNA polymerase, Pyrococcus sp. GB-D polymerase ("Deep Vent" DNA polymerase, New England Biolabs), Thermotoga maritima (Tma) DNA polymerase, Bacillus stearothermophilus (Bst) DNA polymerase, Pyrococcus kodakaraensis (KOD) DNA polymerase, Pfx DNA polymerase, Thermococcus sp. JDF-3 (JDF-3) DNA polymerase, Thermococcus gorgonarius (Tgo) DNA polymerase, Thermococcus acidophilium DNA polymerase; Sulfolobus acidocaldarius DNA polymerase; Thermococcus sp.
  • RNA polymerases include, without limitation, T7, T3 and SP6 RNA polymerases.
  • Suitable reverse transcriptases include without limitation reverse transcriptases from HIV, HTLV-I, HTLV-II, FeLV, FIV, SIV, AMV, MMTV and MoMuLV, as well as the commercially available "Superscript" reverse transcriptases (Invitrogen) and telomerases.
  • the methods and systems disclosed herein may also be practiced using any subunits, mutated, modified, truncated, genetically engineered or fusion variants of naturally occurring polymerases (wherein the mutation involves the replacement of one or more or many amino acids with other amino acids, the insertion or deletion of one or more or many amino acids, or the conjugation of parts of one or more polymerases), non-naturally occurring polymerases, synthetic molecules or any molecular assembly that can polymerize a polymer having a pre-determined or specified or templated sequence of monomers.
  • polymerases that retain the desired levels of processivity when conjugated to a donor or acceptor fluorophore are preferred.
  • fidelity refers to the accuracy of nucleotide polymerization by a given template-dependent nucleotide polymerase.
  • the fidelity of a nucleotide polymerase is typically measured as the error rate, i.e., the frequency of incorporation of a nucleotide in a manner that violates the widely known Watson-Crick base pairing rules.
  • the accuracy or fidelity of DNA polymerization is influenced not only by the polymerase activity of a given enzyme, but also by the 3'-5' exonuclease activity of a DNA polymerase.
  • the fidelity or error rate of a DNA polymerase may be measured using assays known to the art. See, for example, Lundberg et al., 1991, Gene 108:1-6. By suitable selection and engineering of the nucleotide polymerase, the error rate of the single-molecule sequencing methods disclosed herein can be further improved.
  • the polymerase used in this sequencing technology is engineered to possess either a strong strand displacement activity or 5' to 3' exonuclease activity to remove the downstream strand, thereby facilitating DNA synthesis.
  • with a highly processive polymerase such as an engineered Phi29 polymerase, the downstream strand is displaced. However, because the 5' terminated strand cannot serve as a template in the absence of added primer, no secondary sequence information, which would confound sequence data analysis, will be detected from this site.
  • an unwinding agent such as a topoisomerase or a gyrase may be added to the extension reaction to facilitate optimal sequencing performance.
  • the immobilized DNA is typically linearized through attachment to the surface at various points along its length, and therefore consists of a series of closed DNA domains.
  • One option for circumventing the "closed" state of the DNA involves inclusion of a topoisomerase and/or a gyrase to modulate the number of DNA supercoils that may be introduced during the sequencing reaction (See, e.g., Champoux, 2001). The requirement for inclusion of such enzymes will reflect both sequence read length and the degree to which the DNA is immobilized onto the surface.
  • the extension reaction is supplemented by addition of an agent that reduces formation of secondary structures that hinder progress of the extension reaction.
  • formation of secondary structures in the displaced strand is undesirable because such structures bind dye molecules, which then exhibit increased fluorescence intensity, whereas dye molecules that are in solution, or that are intercalated into the displaced single strand, exhibit reduced fluorescence intensity.
  • the presence of secondary structure in the displaced strand results in inappropriate dye intercalation into the displaced strand, and consequently inappropriate detection of fluorescence.
  • additional dyes may be used to identify the dye, or dye combinations, that produce the highest quality sequence information.
  • agents that stabilize the displaced strand and prevent formation of secondary structure may be included within the extension reaction mixture.
  • a reagent that may optionally be added to the extension reaction mixture to prevent formation of unwanted secondary structure is single strand binding protein, also known as SSBP.
  • the number of donor fluorophores that associate with the DNA is optimized to identify a staining concentration that produces high FRET.
  • the spacing between a donor fluorophore and an acceptor on the incorporated nucleotide should be smaller than the R0 of the donor-acceptor pair so that high FRET results, with greater than 80% FRET being preferred. If too few fluorophores interact with the DNA, they will not be spaced closely enough to produce high FRET with the acceptor fluorophore. However, if too many donor fluorophores intercalate or bind the DNA, fluorophore quenching may occur.
  • Suitable depolymerizing agents for use in the disclosed methods and compositions include, without limitation, any depolymerizing agent that depolymerizes monomers in a step-wise fashion such as exonucleases in the case of DNA, RNA or mixed DNA/RNA polymers, proteases in the case of polypeptides and enzymes or enzyme systems that sequentially depolymerize polysaccharides.
  • the FRET donor is a dye molecule intercalated between the bases of the template nucleic acid molecule, or otherwise associated with the nucleic acid molecule.
  • Suitable intercalating dyes include, without limitation, any detectable moiety capable of inserting, interposing, or otherwise intercalating into single- or double-stranded polynucleotides.
  • the intercalating dye may be a fluorescent dye or may be a fluorescent dye conjugated to a molecule that is primarily an intercalator. Intercalating dyes are well known to the person of ordinary skill in the art.
  • intercalating dyes suitable for use in the disclosed methods and compositions include, without limitation, mono- and bis-intercalating dyes, phenanthridines and acridines, such as ethidium bromide, propidium iodide, hexidium iodide, dihydroethidium, ethidium homodimers, acridine orange, 9-amino-6-chloro-2-methoxyacridine; indoles and imidazoles such as DAPI, bisbenzimide dyes, Actinomycin D, Nissl stains, hydroxystilbamidine, SYBR Green I (also referred to herein simply as "SYBR Green"), SYBR Green II, SYBR GOLD, YO (Oxazole Yellow), TO (Thiazole Orange), PG (PicoGreen), dyes from ATTO-TEC GmbH of Siegen, Germany, intercalating dyes, BEBO, BETO and BOXTO, BO, BO-PRO, TO
  • the use of SYBR Green I and other intercalating dyes as replenishing FRET donors will increase donor lifetime and intensity, and more importantly will increase acceptor intensity.
  • Use of multiple donors at a dye-to-base pair ratio of approximately 1:5-7 results in the punctuation of DNA with dye molecules that can serve as donors for the growing DNA strand (See, e.g., Howell et al., 2002; Takatsu et al., 2004).
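As a rough worked example of what a dye-to-base-pair ratio in this range implies, the sketch below estimates how many donor dyes decorate a duplex of a given length and their average spacing. It is a back-of-the-envelope illustration only: the function name is hypothetical, dyes are assumed to be evenly distributed, and the 0.34 nm rise per base pair is the canonical B-DNA value, which ignores the lengthening of the helix that intercalation itself causes.

```python
def donor_dye_estimate(n_bp: int, bp_per_dye: float, rise_nm_per_bp: float = 0.34):
    """Return (approximate dye count, mean dye-to-dye spacing in nm).

    Assumes evenly spaced dyes and an unperturbed B-DNA rise per base pair.
    """
    if n_bp <= 0 or bp_per_dye <= 0:
        raise ValueError("n_bp and bp_per_dye must be positive")
    n_dyes = n_bp / bp_per_dye
    spacing_nm = bp_per_dye * rise_nm_per_bp
    return n_dyes, spacing_nm


# One dye per 5 bp and one dye per 7 bp on a 1,000 bp duplex
for bp_per_dye in (5, 7):
    dyes, spacing = donor_dye_estimate(1000, bp_per_dye)
    print(f"1:{bp_per_dye} -> ~{dyes:.0f} dyes, ~{spacing:.2f} nm apart")
```

Under these assumptions a 1,000 bp duplex carries on the order of 140 to 200 donor dyes spaced roughly 1.7 to 2.4 nm apart, so an incorporation site is never more than a few nanometres from a donor.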
  • nucleotide or nucleotide analogs or their variants, as used herein, refer to any compounds that can be polymerized and/or incorporated into a newly synthesized strand by a naturally occurring, genetically modified or engineered nucleotide polymerase.
  • Suitable nucleotides or other monomers for use in the methods and compositions include, without limitation, any monomer that can be step-wise polymerized and/or incorporated into an elongating nucleotide strand or other polynucleotide polymer by a polymerase or other polymerizing agent, including but not limited to ribonucleotides, deoxyribonucleotides, modified ribonucleotides, modified deoxyribonucleotides, ribonucleotide polyphosphates, deoxyribonucleotide polyphosphates, modified ribonucleotide polyphosphates, modified deoxyribonucleotide polyphosphates, peptide nucleotides, modified peptide nucleotides, and modified phosphate-sugar backbone nucleotides, and any analogs or variants of the foregoing, including analogs or variants having atomic and/or molecular labels attached thereto, or mixtures or combinations thereof.
  • any suitable monomers capable of polymerization by a naturally occurring, genetically engineered, or synthetic polymerase may be used, including, for example, amino acids (natural or synthetic) for protein or protein analog synthesis, and monosaccharides or polysaccharides for carbohydrate synthesis.
  • the labeled nucleotide monomer has three, four or more phosphates.
  • the nucleotide or nucleotide analogs comprise more than one detectable label moiety per nucleotide molecule.
  • the nucleotides or nucleotide analogs may comprise a persistent acceptor, a non-persistent acceptor, or both a persistent and non-persistent acceptor group conjugated to the same nucleotide molecule.
  • dual nucleotides are known in the art. See, e.g., U.S. Provisional App. No. 60/891,029, filed February 21, 2007 and U.S. App. No. 12/035,352, filed February 21, 2008, herein incorporated by reference in their entirety.
  • the nucleotide is conjugated or otherwise operably linked to a detectable label using suitable methods.
  • Any suitable methods for detectably labeling nucleotides may be employed including but not limited to those described in U.S. Patent Nos. 7,041,892, 7,052,839, 7,125,671 and 7,223,541; U.S. Pub. Nos. 2007/072196 and 2008/0091005; Sood et al., 2005, J. Am. Chem. Soc. 127:2394-2395; Arzumanov et al., 1996, J. Biol. Chem.
  • operably link refers to chemical fusion or bonding or association of sufficient stability to withstand conditions encountered in the method of nucleotide sequencing utilized, between a combination of different molecules or moieties, such as, but not limited to: between a linker and a nucleotide; between a linker and a dye moiety; and the like.
  • dye labels may be conjugated to the terminal phosphate of deoxyribonucleotide polyphosphates using a linker and/or spacer using suitable techniques.
  • Suitable linkers include, for example, any compound or moiety that can act as a molecular bridge to operably link two different molecules.
  • Exemplary linkers include, but are not limited to, chemical chains, chemical compounds (e.g., reagents), and the like.
  • the linkers may include, but are not limited to, homobifunctional linkers and heterobifunctional linkers.
  • heterobifunctional linkers contain one end having a first reactive functionality to specifically link to a first molecule, and an opposite end having a second reactive functionality to specifically link to a second molecule.
  • the linker may vary in length and composition for optimizing properties such as stability, length, FRET efficiency, resistance to certain chemicals and/or temperature parameters, and be of sufficient stereo-selectivity or size to operably link a detectable label to a nucleotide such that the resultant conjugate is useful in optimizing a polymerization reaction.
  • Linkers can be employed using standard chemical techniques and include, but are not limited to, amine linkers for attaching labels to nucleotides (see, for example, U.S. Pat. No.
  • a linker typically contains a primary or secondary amine for operably linking a label to a nucleotide; and a rigid hydrocarbon arm added to a nucleotide base (see, for example, Science 282:1020-21, 1998).
  • any detectable label that is suitable for attachment to the polymerase, the nucleic acid molecule and/or the nucleotides may be used, including but not limited to luminescent, photoluminescent, electroluminescent, bioluminescent, chemiluminescent, fluorescent and/or phosphorescent labels.
  • the label comprises a FRET donor and/or a FRET acceptor.
  • the FRET donor and/or the FRET acceptor is typically a fluorophore or fluorescent label; however the FRET donor and/or FRET acceptor may also be a luminophore, chemiluminophore, bioluminophore or other label, or a quencher that can participate in this reaction, as described below.
  • the FRET labels may be referred to as fluorophores or fluorescent labels for convenience, but this in no way is meant to exclude the possibility of using a quencher or limit the donor and/or acceptor only to fluorescent labels.
  • the detectable labels used in the disclosed methods and compositions may undergo other types of energy transfer with each other, including but not limited to luminescence resonance energy transfer, bioluminescence resonance energy transfer, chemiluminescence resonance energy transfer, and similar types of energy transfer not strictly following Förster theory, such as the nonoverlapping energy transfer when nonoverlapping acceptors are utilized. See, for example, Anal. Chem. 2005, 77: 1483-1487.
  • Suitable detectable labels for use in the disclosed methods and compositions include, without limitation, any atomic structure, molecule or other moiety amenable to attachment to a specific site in a polymerizing agent or dNTP, including but not limited to Europium shift agents, NMR active atoms or the like; fluorescent dyes such as Rhodol dyes, d-Rhodamine acceptor dyes including but not limited to dichloro[R110], dichloro[R6G], dichloro[TAMRA], dichloro[ROX] or the like, fluorescein donor dye including but not limited to fluorescein, 6-FAM, or the like; Acridine including but not limited to Acridine orange, Acridine yellow, Proflavin, or the like; aromatic hydrocarbon including but not limited to 2-Methylbenzoxazole, Ethyl p-dimethylaminobenzoate, Phenol, benzene, toluene, or the like; Arylmethine Dyes including but
  • miscellaneous dyes including but not limited to 4',6-Diamidino-2-phenylindole (DAPI), 7-Benzylamino-4-nitrobenz-2-oxa-1,3-diazole, dansyl glycine, Hoechst 33258, Lucifer yellow CH, Piroxicam, Quinine sulfate, Squarylium dye III, or the like; oligophenylenes including but not limited to 2,5-Diphenyloxazole (PPO), Biphenyl, POPOP, p-Quaterphenyl, p-Terphenyl, or the like; oxazines including but not limited to Cresyl violet perchlorate, Nile Blue, Nile Red, Ni
  • any molecule, nano-structure, or other chemical structure that is capable of chemical modification and includes a detectable property capable of being detected by a detection system may be used in the disclosed methods and systems.
  • detectable structures can include those presently known, those currently being designed, and those that will be prepared in the future.
  • the nucleotide comprises a releasable or non-persistent label that can be removed via suitable means prior to incorporation of the next nucleotide by the polymerase into the newly synthesized strand.
  • suitable non-persistent or releasable labels include detectable moieties operably linked to the base, sugar or alpha phosphate of a nucleotide or nucleotide analog.
  • the FRET acceptor label is attached to a nucleotide phosphate group that is cleaved and released upon incorporation of the underlying nucleotide into the primer strand, for example the β-phosphate, the γ-phosphate, or the terminal phosphate of the incoming nucleotide.
  • the signal from the label (or, for embodiments wherein the label is a FRET donor, the FRET signal between the FRET donor and the FRET acceptor moieties) ceases after the nucleotide is incorporated and the label (or FRET signal) diffuses away.
  • a detectable signal indicative of nucleotide incorporation is generated as each incoming nucleotide hybridizes to a complementary nucleotide in the target nucleic acid molecule and becomes incorporated into the newly synthesized strand.
  • the accompanying dip in donor intensity can also be detected to confirm the occurrence of FRET. While detection of donor dipping can be useful by providing independent corroboration of a FRET event, it can be dispensed with in embodiments where the acceptor signals are sufficiently intense and well-defined.
  • the nucleotide comprises a persistent label, which is not released upon incorporation of the nucleotide into the nascent nucleotide strand synthesized by the polymerase.
  • suitable persistent labels include without limitation any FRET acceptor moiety operably linked to the base, sugar, or internal phosphate, of the nucleotide or nucleotide analog, for example, the alpha phosphate.
  • Persistently-labeled nucleotides may be used when a stable signal is preferred and their use enables the reaction to be performed in advance of immobilization on the support for viewing in the detection system, which improves reaction efficiency.
  • the persistently-labeled nucleotide may be a dideoxynucleotide to ensure that a single nucleotide is incorporated at the reaction site.
  • Non-persistently-labeled nucleotides or nucleotides containing both a persistent and a non-persistent label are used when the detection of the non-persistent signal is preferred.
  • the use of non-persistently-labeled nucleotides or nucleotides containing both a persistent and a non-persistent label requires that the extension reaction be performed and detected in real-time or near real-time on the detection system to associate the non-persistent label with a particular nucleic acid strand.
  • new fluorophores are positioned as new donors when they insert into the newly synthesized, double- stranded DNA.
  • donor-acceptor pairs are typically selected such that there is overlap between the emission spectrum of the donor and excitation spectrum of the acceptor. Dyes and dye concentrations are chosen such that optimized donor emission and maximized acceptor intensities are obtained.
  • certain combinations of DNA-associated donor dyes produce higher intensity acceptor signals when paired with the spectrally-resolved acceptors used in other sequencing technologies based on determination of base identity (i.e., the donor fluorophores must be good FRET partners with the acceptors used to label the nucleotides), and these donor dyes may need to be present in particular ratios to maximize these effects.
  • Any suitable FRET donor:acceptor pair may be used in the disclosed methods and compositions, including but not limited to a fluorescein, cyanine, rhodamine, coumarin, acridine, Texas Red dye, BODIPY, Alexa Fluor, GFP, or a derivative or modification of any of the foregoing. See, for example, U.S. Published App. No. 2008/0091995.
  • excitation of the donor produces energy in its emission spectrum that is then picked up by the acceptor in its excitation spectrum, leading to the emission of light from the acceptor in its emission spectrum.
  • excitation of the donor sets off a chain reaction, leading to emission from the acceptor when the two are sufficiently close to each other.
  • the label operably linked or attached to the nucleotide may be a quencher. Quenchers are useful as acceptors in FRET applications, because they produce a signal through the reduction or quenching of fluorescence from the donor fluorophore.
  • quenchers have an absorption spectrum and large extinction coefficients; however, the quantum yield of a quencher is extremely low, such that the quencher emits little to no light upon excitation.
  • illumination of the donor fluorophore excites the donor, and if an appropriate acceptor is not close enough to the donor, the donor emits light. This light signal is reduced or abolished when FRET occurs between the donor and a quencher acceptor, resulting in little or no light emission from the quencher.
  • interaction or proximity between a donor and quencher-acceptor may be detected by the reduction or absence of donor light emission.
  • quenchers include the QSY dyes available from Molecular Probes (Eugene, OR).
  • One exemplary method involves the use of quenchers in conjunction with fluorescent labels.
  • certain nucleotides in the reaction mixture are labeled with a fluorescent label, while the remaining nucleotides are labeled with one or more quenchers.
  • each of the nucleotides in the reaction mixture is labeled with one or more quenchers.
  • Discrimination of the nucleotide bases is based on the wavelength and/or intensity of light emitted from the FRET acceptor, as well as the intensity of light emitted from the FRET donor. If no signal is detected from the FRET acceptor, a corresponding reduction in light emission from the FRET donor indicates incorporation of a nucleotide labeled with a quencher. The degree of intensity reduction may be used to distinguish between different quenchers.
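The discrimination rule described above (acceptor emission identifies fluorescently labeled bases; a drop in donor emission without acceptor signal identifies quencher-labeled bases) can be sketched as a small decision function. This is an illustrative sketch under stated assumptions: the label-to-base assignments, emission peaks, expected donor drops, and all thresholds below are hypothetical values, not values from the disclosure.

```python
def classify_event(acceptor_intensity, acceptor_peak_nm, donor_drop,
                   acceptor_peaks, quencher_drops,
                   min_acceptor=100.0, peak_tol_nm=15.0, drop_tol=0.1):
    """Return the called base, or None if the event matches no label.

    acceptor_peaks: base -> expected acceptor emission peak (nm)
    quencher_drops: base -> expected fractional reduction of donor emission
    """
    # Case 1: an acceptor signal is present, so match on emission wavelength.
    if acceptor_intensity >= min_acceptor and acceptor_peak_nm is not None:
        if not acceptor_peaks:
            return None
        base = min(acceptor_peaks,
                   key=lambda b: abs(acceptor_peaks[b] - acceptor_peak_nm))
        if abs(acceptor_peaks[base] - acceptor_peak_nm) <= peak_tol_nm:
            return base
        return None
    # Case 2: no acceptor signal, so match on the degree of donor quenching.
    if not quencher_drops:
        return None
    base = min(quencher_drops,
               key=lambda b: abs(quencher_drops[b] - donor_drop))
    if abs(quencher_drops[base] - donor_drop) <= drop_tol:
        return base
    return None


# Hypothetical labeling scheme: two fluorescent acceptors, two quenchers.
ACCEPTOR_PEAKS = {"A": 570.0, "C": 670.0}
QUENCHER_DROPS = {"G": 0.4, "T": 0.8}

print(classify_event(500.0, 668.0, 0.3, ACCEPTOR_PEAKS, QUENCHER_DROPS))
print(classify_event(0.0, None, 0.75, ACCEPTOR_PEAKS, QUENCHER_DROPS))
```

The second call has no acceptor signal, so the base is chosen from the quencher whose expected donor reduction is closest to the observed one.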
  • the intercalating dye and the detectable label of the nucleotide will be selected and/or designed to ensure that the presence of such labels does not unduly hinder the progress of the polymerization reaction as determined by speed, error rate, fidelity, processivity and average read length of the newly synthesized strand.
  • the sequencing reaction is initiated by the addition of a suitable polymerase and labeled nucleotides. Suitable temperatures and the addition of other components such as divalent metal ions can be determined and optimized based on the particular nucleotide polymerase and the target nucleic acid sequences. Illumination of the reaction site permits observation of the detectable signals, e.g., FRET signals, which indicate the nucleotide incorporation event.
  • The signals emitted by various components of the polymerase reaction mixture as the polymerase incorporates nucleotide(s) into an extending strand in a template-directed fashion can be detected by means of any suitable system capable of detecting and/or monitoring such signals.
  • the detection system will achieve these functions by first generating and transmitting an incident wavelength to the polynucleotides isolated within nanostructures, and then collecting and analyzing the emissions from the reactants.
  • a typical sequencing system comprises a detection subsystem capable of viewing a field on the substrate. The view field can be adjusted to view one or a plurality of elongated and immobilized nucleic acids.
  • the sequencing system also comprises a monitoring subsystem capable of detecting nucleotide incorporation events occurring at the nick sites of the nucleotide strand to be sequenced.
  • the system also comprises an analyzing subsystem that converts the detected events into sequencing information and then maps the sequence fragments along the length of the nucleic acid so that ordered sequence fragment information is obtained for nucleic acid identification and classification including partial or fragmentary sequence information.
  • detection systems suitable for use according to the present disclosure include without limitation the systems described in U.S. Published App. No. 2008/0241951 and 2008/0241938, herein incorporated by reference in their entirety.
  • a detection system of the present invention comprises at least two elements, namely an excitation source and a detector.
  • the excitation source generates and transmits incident radiation used to excite the reactants contained in the array.
  • the source of the incident light can be a laser, a laser diode, a light-emitting diode (LED), an ultraviolet light bulb, and/or a white light source.
  • more than one source can be employed simultaneously.
  • the use of multiple sources is particularly desirable in applications that employ multiple different reagent compounds having differing excitation spectra, because it allows detection of more than one fluorescent signal and thus simultaneous tracking of the interactions of more than one molecule or type of molecule.
  • Any suitable detection strategies can be employed to determine the identity of the nitrogenous base of the incoming nucleotides, depending on the nature of the labeling strategy that is employed.
  • Exemplary labeling and detection strategies include but are not limited to those disclosed in U.S. Patent Nos. 6,423,551 and 6,864,626; U.S. Pub. Nos. 2005/0003464, 2006/0176479, 2006/0177495, 2007/0109536, 2007/0111350, 2007/0116868, 2007/0250274 and 2008/08825. Detection of emissions during the polymerization reaction permits the discrimination of independent interactions between uniquely labeled moieties, reactants or subunits.
  • the label linked to the nucleotide undergoes a transition to an 'excited state' whereby it emits photons over a spectral range characterized by the identity of the emitting moiety.
  • the donor moiety must be sufficiently excited in order for FRET to occur.
  • Emissions may be detected using any suitable device.
  • detectors include but are not limited to microscopes, optical readers, high-efficiency photon detection systems, photodiodes (e.g., avalanche photodiodes (APD), APD arrays, etc.), cameras, charge-coupled devices (CCD), electron-multiplying charge-coupled devices (EMCCD), intensified charge-coupled devices (ICCD), photomultiplier tubes (PMT), a multi-anode PMT, and a microscope equipped with any of the foregoing detectors.
  • the subject arrays contain various alignment aids or keys to facilitate a proper spatial placement of each spatially addressable array location and the excitation sources, the photon detectors, or the optical transmission element as described below.
  • characteristic signals from different independently labeled nucleotides are simultaneously detected and resolved using a suitable detection method capable of discriminating between the respective labels.
  • the characteristic signals from each nucleotide are distinguished by resolving the characteristic spectral properties of the different labels. See, for example, Lakowicz, J.R., 2006, Principles of Fluorescence Spectroscopy, Third Edition.
  • Spectral detection may also optionally be combined and/or replaced by other detection methods capable of discriminating between chemically similar or different labels in parallel, including, but not limited to, polarization, lifetime, Raman, intensity, ratiometric, time-resolved anisotropy, fluorescence recovery after photobleaching (FRAP) and parallel multi-color imaging.
  • an image splitter such as, for example, a dichroic mirror, filter, grating, prism, etc.
  • multiple cameras or detectors may be used to view the sample through optical elements (such as, for example, dichroic mirrors, filters, gratings, prisms, etc.) of different wavelength specificity.
  • suitable methods to distinguish emission events include, but are not limited to, correlation/anti-correlation analysis, fluorescent lifetime measurements, anisotropy, time-resolved methods and polarization detection.
  • Suitable imaging methodologies that may be implemented for detection of emissions include, but are not limited to, confocal laser scanning microscopy, Total Internal Reflection (TIR), Total Internal Reflection Fluorescence (TIRF), near-field scanning microscopy, far-field confocal microscopy, wide-field epi-illumination, light scattering, dark field microscopy, photoconversion, wide field fluorescence, single and/or multi-photon excitation, spectral wavelength discrimination, evanescent wave illumination, scanning two-photon, scanning wide field two-photon, Nipkow spinning disc, multi-foci multi-photon, and/or other forms of microscopy.
  • the detection system may optionally include one or more optical transmission elements that serve to collect and/or direct the incident wavelength to the reactant array; to transmit and/or direct the signals emitted from the reactants to the photon detector; and/or to select and modify the optical properties of the incident wavelengths or the emitted wavelengths from the reactants.
  • suitable optical transmission elements and optical detection systems include but are not limited to diffraction gratings, arrayed waveguide gratings (AWG), optic fibers, optical switches, mirrors, lenses (including microlens and nanolens), and collimators.
  • Other examples include optical attenuators, polarization filters (e.g., dichroic filters), wavelength filters (low-pass, band-pass, or high-pass), wave-plates, and delay lines.
  • the detection system comprises optical transmission elements suitable for channeling light from one location to another in either an altered or unaltered state.
  • optical transmission devices include optical fibers, diffraction gratings, arrayed waveguide gratings (AWG), optical switches, mirrors (including dichroic mirrors), lenses (including microlens and nanolens), collimators, filters, prisms, and any other devices that guide the transmission of light through proper refractive indices and geometries.
  • the detection system comprises an optical train that directs signals from an organized array onto different locations of an array-based detector to simultaneously detect multiple different optical signals from each of multiple different locations.
  • the optical trains typically include optical gratings and/or wedge prisms to simultaneously direct and separate signals having differing spectral characteristics from each spatially addressable location in an array to different locations on an array-based detector, e.g., a CCD.
  • detection is performed using multifluorescence imaging wherein each of the different types of nucleotide is operably linked to a label with different spectral properties from the rest, thereby permitting the simultaneous detection of incorporation of all different nucleotide types.
  • each of the different types of nucleotide may be operably linked to a FRET acceptor fluorophore, wherein each fluorophore has been selected such that the overlap of the absorption and emission spectra between the different fluorophores, as well as the overlap between the absorption and emission maxima of the different fluorophores, is minimized.
  • Detection of different nucleotide labels is performed by observing two or more targets at the same time, wherein the emissions from each label are separated in the detection path.
  • Such separation is typically accomplished through use of suitable filters, including but not limited to band pass filters, image splitting prisms, band cutoff filters, wavelength dispersion prisms and dichroic mirrors, that can selectively detect specific emission wavelengths.
  • filters may optionally be used in combination with suitable diffraction gratings.
  • the detection system utilizes tunable excitation and/or tunable emission fluorescence imaging.
  • tunable excitation light from a light source passes through a tuning section and condenser prior to irradiating the sample.
  • tunable emissions from the sample are imaged onto a detector after passing through imaging optics and a tuning section. The user may control the tuning sections to optimize performance of the system.
  • a number of labeling and detection strategies are available for base discrimination using the FRET technique.
  • different fluorescent labels may be used for each type of nucleotide present in the extension reaction with discrimination between the different labels based on the wavelength and/or the intensity of the light emitted from the fluorescent label.
  • a second strategy involves the use of fluorescent labels and quenchers.
  • certain nucleotides in the reaction mixture are labeled with a fluorescent label, while the remaining nucleotides are labeled with one or more quenchers.
  • each of the nucleotides in the reaction mixture is labeled with one or more quenchers.
  • Discrimination of the nucleotide bases is based on the wavelength and/or intensity of light emitted from the FRET acceptor, as well as the intensity of light emitted from the FRET donor. If no signal is detected from the FRET acceptor, a corresponding reduction in light emission from the FRET donor indicates incorporation of a nucleotide labeled with a quencher. The degree of intensity reduction may be used to distinguish between different quenchers.
  • the signal from the detector is converted into a digital signal with an A-D converter and an image of the sample is reconstructed on a monitor.
  • the user can optionally select a composite image that combines the images derived at a number of different wavelengths into a single image.
  • the user can also specify that an artificial color system is to be used in which particular probes are artificially associated with specific colors. In an alternate artificial color system the user can designate specific colors for specific emission intensities.
  • Any combination of the above described labeling and detection strategies may be employed together in the same sequencing reaction. Depending on the number of distinguishable labels and quenchers used in any of the above strategies, the identities of one, two, three or four nucleotides may be determined in a single sequencing reaction.
  • Multiple sequencing reactions may then be run, rotating the identities of the nucleotides determined in each reaction, to determine the identities of the remaining nucleotides.
  • these reactions may be run at the same time, in parallel, to allow for complete sequencing in a reduced amount of time.
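The idea of rotating which nucleotides are identified in each reaction, then combining the results, can be sketched as a merge of partial reads. This is a simplified illustration that assumes each reaction yields a read already aligned to the same template positions, with `N` marking positions whose base that reaction could not identify; the function name and the example reads are hypothetical.

```python
def merge_reactions(reads):
    """Combine partial reads (same length, 'N' = undetermined) into one sequence.

    A position is called only when every reaction that determined it agrees;
    positions with no call or with conflicting calls stay 'N'.
    """
    if not reads:
        return ""
    length = len(reads[0])
    if any(len(r) != length for r in reads):
        raise ValueError("all reads must cover the same positions")
    merged = []
    for column in zip(*reads):
        calls = {base for base in column if base != "N"}
        merged.append(calls.pop() if len(calls) == 1 else "N")
    return "".join(merged)


# Reaction 1 identifies A and C; reaction 2 identifies G and T (hypothetical).
reaction_1 = "ACNNAC"
reaction_2 = "NNGTNN"
print(merge_reactions([reaction_1, reaction_2]))
```

Leaving conflicting positions as `N` rather than picking one call keeps disagreements between reactions visible for later review.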
  • the identities of the incorporated nucleotides may be determined rapidly, for example in real time or near real time, as extension of the primer strand occurs, through FRET interactions between an intercalating dye donor moiety and a FRET acceptor moiety attached to the incoming nucleotide as it is incorporated into the newly synthesized strand by the polymerase.
  • the raw data generated by the detector represents multiple time-dependent fluorescence data streams comprising wavelength and intensity information.
  • the data may be analyzed using suitable methods to correlate the particular spectral characteristics of the emissions with the identity of the incorporated base.
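As a first step in analyzing such time-dependent data streams, candidate incorporation events can be located as runs of frames in which the acceptor intensity exceeds a threshold. The sketch below is one simple way to do this and is not the analysis method of the disclosure; the threshold, the minimum pulse duration, and the example trace are assumed values.

```python
def find_pulses(trace, threshold, min_frames=2):
    """Return (start, end) frame indices (end exclusive) of runs >= threshold."""
    pulses = []
    start = None
    for i, value in enumerate(trace):
        if value >= threshold:
            if start is None:
                start = i          # a new run begins
        elif start is not None:
            if i - start >= min_frames:
                pulses.append((start, i))
            start = None           # run ended (kept or rejected as too short)
    # Handle a run that is still open at the end of the trace.
    if start is not None and len(trace) - start >= min_frames:
        pulses.append((start, len(trace)))
    return pulses


# Hypothetical acceptor-channel intensities, one value per frame.
trace = [5, 6, 40, 42, 41, 5, 4, 39, 6, 38, 40]
print(find_pulses(trace, threshold=30))  # -> [(2, 5), (9, 11)]
```

The single-frame spike at index 7 is rejected by the `min_frames` filter, which is a simple guard against noise being counted as an incorporation event.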
  • such analysis is performed by means of a suitable information processing and control system.
  • the information processing and control system comprises a computer or microprocessor attached to or incorporating a data storage unit containing data collected from the detection system.
  • the information processing and control system may maintain a database associating specific spectral emission characteristics with specific nucleotides.
  • the information processing and control system may record the emissions detected by the detector and may correlate those emissions with incorporation of a particular nucleotide.
  • the information processing and control system may also maintain a record of nucleotide incorporations that indicates the sequence of the template molecule.
  • the information processing and control system may also perform standard procedures known in the art, such as subtraction of background signals.
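The combination of a stored table of emission characteristics, background subtraction, and correlation of detected emissions with a nucleotide can be sketched as a nearest-match lookup. This is an illustrative sketch only: the four-channel reference profiles, the background values, and the minimum-signal cutoff are hypothetical, and a Euclidean nearest match is just one possible comparison.

```python
import math

# Hypothetical reference table: fraction of signal in each of four channels.
REFERENCE = {
    "A": (0.80, 0.10, 0.05, 0.05),
    "C": (0.10, 0.80, 0.05, 0.05),
    "G": (0.05, 0.05, 0.80, 0.10),
    "T": (0.05, 0.05, 0.10, 0.80),
}


def call_base(raw, background, reference=REFERENCE, min_signal=50.0):
    """Subtract background, normalize, and return the closest reference base."""
    corrected = [max(r - b, 0.0) for r, b in zip(raw, background)]
    total = sum(corrected)
    if total < min_signal:
        return None  # too little signal above background to make a call
    normalized = [c / total for c in corrected]
    return min(reference, key=lambda base: math.dist(normalized, reference[base]))


background = [100.0, 100.0, 100.0, 100.0]
print(call_base([520.0, 160.0, 130.0, 125.0], background))
print(call_base([105.0, 102.0, 101.0, 99.0], background))
```

Normalizing to channel fractions makes the comparison insensitive to overall brightness, so the same table can be used for events of different intensity.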
  • An exemplary information processing and control system may incorporate a computer comprising a bus for communicating information and a processor for processing information.
  • the processor is selected from the Pentium®, Celeron®, Itanium®, or Pentium Xeon® family of processors (Intel Corp., Santa Clara, Calif.). Alternatively, other processors may be used.
  • the computer may further comprise a random access memory (RAM) or other dynamic storage device, a read only memory (ROM) and/or other static storage and a data storage device such as a magnetic disk or optical disc and its corresponding drive.
  • the information processing and control system may also comprise other peripheral devices known in the art, such as a display device (e.g., cathode ray tube or Liquid Crystal Display), an alphanumeric input device (e.g., keyboard), a cursor control device (e.g., mouse, trackball, or cursor direction keys) and a communication device (e.g., modem, network interface card, or interface device used for coupling to Ethernet, token ring, or other types of networks).
  • the detection system may also be coupled to the bus.
  • Data from the detection unit may be processed by the processor and the data stored in the main memory.
  • Data on emission profiles for standard nucleotides may also be stored in main memory or in ROM.
  • the processor may compare the emission spectra from nucleotides in the polymerase reaction to identify the type of nucleotide precursor incorporated into the newly synthesized strand.
  • the processor may analyze the data from the detection system to determine the sequence of the template nucleic acid.
  • the data will typically be reported to a data analysis operation.
  • the data obtained by the detection system will typically be analyzed using a digital computer.
  • the computer will be appropriately programmed for receipt and storage of the data from the detection system, as well as for analysis and reporting of the data gathered.
  • custom designed software packages may be used to analyze the data obtained from the detection system.
  • data analysis may be performed using an information processing and control system and publicly available software packages.
  • available software for DNA sequence analysis includes the PRISM™ DNA Sequencing Analysis Software (Applied Biosystems, Foster City, Calif.), the Sequencher™ package (Gene Codes, Ann Arbor, Mich.), and a variety of software packages available through the National Biotechnology Information Facility at website www.nbif .org/links/1.4. J .php.
  • Data collection allows data to be assembled from partial information to obtain sequence information from multiple polymerase molecules in order to determine the overall sequence of the template or target molecule.
  • the method further comprises sequencing one or more additional nucleic acid molecules, for example a second nucleic acid, in parallel with sequencing the first nucleic acid.
  • the rate of nucleotide sequencing determination (based on a single read of a nucleic acid template) is equal to or greater than 10 nucleotides per second, typically equal to or greater than 100 nucleotides per second.
  • the sequencing error rate will be equal to or less than 1 in 100,000 bases.
  • the error rate of nucleotide sequence determination is equal to or less than 1 in 10 bases, 1 in 20 bases, 3 in 100 bases, 1 in 100 bases, 1 in 1000 bases, or 1 in 10,000 bases.
  • test DNA will comprise a complete and intact chromosome.
  • methods disclosed herein may be performed in a multiplex fashion (including in array format), such that additional nucleic acid molecules are sequenced in parallel with a first nucleic acid molecule.
  • primer(s) that direct sequencing complexes to particular areas along the DNA strand are used to specifically determine sequence at those sites.
  • the DNA is denatured and hybridized to at least one site-specific primer, and extension is initiated via addition of appropriate components, such as polymerase, nucleotides (especially base-labeled nucleotides that produce long duration signals) and reaction buffer.
  • the extension products are then displayed and visualized.
  • donor-labeled primers may be used to more specifically identify sequence at multiple sites of incorporation.
  • several differentially labeled primers can be used to produce 'multiplexed' sequence information at resolvable sites along the DNA strand.
  • the priming sites need not be resolvable by the detection system.
  • resolution can be dispensed with as long as the signals emitted from one primer-incorporated nucleotide on the strand do not affect the fluorescence of the other primer-incorporated nucleotides. Because approximately 700 bases span a pixel in current detection systems, many primer probes can be used to interrogate potential SNPs within a single pixel. Acceptor signal-to-noise ratios and FRET distance constraints define this upper limit.
  • No information site should interfere with data from any other information site (i.e., primer-incorporated acceptor-labeled FRET pairs are distributed along the strand at >10 nm separation, which is no closer than ~30 bp), whereas the incorporated acceptor is positioned close to the donor on the primer to produce a high FRET event (i.e., each primer-incorporated acceptor-labeled nucleotide is at or within the R0 of the donor-acceptor pair).
  • the donor and acceptor fluorophores must not be so close to each other that they are quenched.
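The spacing figures above can be checked with simple arithmetic. The sketch below assumes the textbook rise of about 0.34 nm per base pair for B-form DNA (an assumption not stated in the text, and one that elongated or stretched DNA may not satisfy exactly) and converts the 10 nm minimum separation into base pairs, then estimates how many non-interfering sites fit within the roughly 700 bases spanned by one pixel. It is a geometric bound only; as noted above, signal-to-noise also limits the practical number.

```python
import math

RISE_NM_PER_BP = 0.34  # assumed B-form DNA rise per base pair


def min_spacing_bp(min_separation_nm, rise_nm_per_bp=RISE_NM_PER_BP):
    """Smallest whole number of base pairs spanning at least the given distance."""
    return math.ceil(min_separation_nm / rise_nm_per_bp)


def max_sites_per_pixel(bases_per_pixel, spacing_bp):
    """Geometric upper bound on evenly spaced sites within one pixel."""
    return bases_per_pixel // spacing_bp


spacing = min_spacing_bp(10.0)
print(spacing)                            # -> 30
print(max_sites_per_pixel(700, spacing))  # -> 23
```

Under these assumptions, 10 nm corresponds to about 30 bp, consistent with the figure given above, and roughly 23 sites could be placed within a single pixel.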
  • the donor is typically present within 15 bases from the 3' end of the primer.
  • Primers are long enough to specifically hybridize to their target site (i.e., an 8 base primer is likely to be unique in a 65 Kb template, with the occurrence frequency being related to base composition and type of DNA being sequenced, coding versus non-coding regions; Hardin et al., U.S. Patent No. 6,083,695).
  • Longer primers increase hybridization specificity and reduce the need to highly purify specific genomes or regions thereof.
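The 8-base and 65 Kb figures follow from counting: there are 4^8 = 65,536 possible 8-base sequences, so a given 8-mer is expected about once in a random sequence of that length. The sketch below estimates the expected number of exact matching sites on one strand, assuming equal and independent base frequencies; real genomes deviate from this, as the text notes regarding base composition and coding versus non-coding regions. The function name is illustrative.

```python
def expected_sites(primer_len, template_len):
    """Expected exact matches of one primer on one strand of random sequence.

    Assumes the four bases are equally frequent and independent.
    """
    if primer_len <= 0:
        raise ValueError("primer_len must be positive")
    if primer_len > template_len:
        return 0.0
    windows = template_len - primer_len + 1
    return windows / 4 ** primer_len


print(4 ** 8)                               # -> 65536
print(round(expected_sites(8, 65_000), 3))
print(expected_sites(25, 3_000_000_000))
```

For an 8-mer in a 65 Kb template the expectation is close to one site, whereas a 25-mer is expected far less than once even in a template of three billion bases, which illustrates why longer primers reduce the need to purify specific genomes or regions.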
  • a preferred primer length is 25-50 bases with a minimal spacing of one base between each primer.
  • a polymerase that is not significantly affected by the presence of the donor fluorophore within or immediately 5' to its binding site on the primer/DNA template is used, and this enzyme is additionally deficient in 3' to 5' exonuclease activity and strand displacement activity.
  • the template-ordered extension products may optionally be ligated at a low concentration (to favor intramolecular ligation events) to create a covalently-closed linear DNA strand composed of annealed primers that are each extended by a single base.
  • each donor-acceptor pair is optimally spaced to produce a distinct high FRET event.
  • a further variation of this method produces donor-acceptor pairs that are well separated after first incorporating the persistently labeled nucleotide 3' of the primer, by adding natural nucleotides to complete synthesis to the 5' end of the next annealed primer, followed by performing the optional ligation reaction.
  • DNA strands of approximately 100 Kb can be viewed in one field of view with existing real-time sequencing systems, and the field of view can be moved to increase the length of examined DNA.
  • nick spacing of 3-5 Kb produces resolvable complexes and reduces the risk that a sequence read from one strand will encounter a nick on the opposite strand, thereby terminating extension.
  • the relative distance of visible markers can also be used to determine which DNA site is being determined.
  • the immobilized DNA can be stained with a dye that is not involved in producing the detectable FRET event.
  • the double- or single-stranded nature of the DNA must be taken into account when one needs information about the immobilized DNA strand.
  • the methods and systems of the present disclosure can be combined with reported techniques where integrated fluorescence intensity measurements coupled with quantile analysis provide an accurate measure for the amount of DNA (Li et al., 2007). Analogous to a whole genome shotgun sequencing strategy, the entire genome sequence can be determined according to the present disclosure by sequencing many individual copies of the same or overlapping DNA fragments.
  • donor energy transfer capabilities are continuously optimized throughout the extension reaction, because new donor fluorophores constantly intercalate into the nascent strand, effectively positioning a new donor at a distance that will produce a higher efficiency FRET event relative to the more upstream donor that may have photobleached or, as a result of nucleotide incorporation and enzyme translocation, become too distant from the acceptor-labeled dNTP bound at the enzymatic active site.
  • the acceptor signal is increased as compared to signal generated in other systems that use a single donor fluorophore.
  • the disclosed methods and systems involve the use of tracking software that searches along a donor intensity trajectory for acceptor signals (sequence information) originating from different regions along the same DNA strand, thereby permitting accurate placement of the relative locations of each independent sequence along the DNA strand and resulting in the simultaneous generation of multiple discrete and ordered sequence "reads" along the length of a single nucleic acid strand.
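The grouping and ordering step described above can be sketched as follows. This is a simplified illustration, not the tracking software of the disclosure: it assumes each detected acceptor event has already been reduced to a position along the strand (in pixels), a time, and a called base, and that events within a small distance of one another belong to the same extension site. The function name, the bin width, and the example events are hypothetical.

```python
def ordered_reads(events, bin_px=2.0):
    """Group events by position along the strand and order reads by position.

    events: iterable of (position_px, time_s, base)
    Returns a list of (site_position_px, read) sorted along the strand,
    where each read lists the bases at that site in order of detection time.
    """
    sites = []  # each entry: [anchor_position, [(time, base), ...]]
    for pos, t, base in sorted(events):
        if sites and pos - sites[-1][0] <= bin_px:
            sites[-1][1].append((t, base))      # same site as previous event
        else:
            sites.append([pos, [(t, base)]])    # start a new site
    return [(pos, "".join(b for _, b in sorted(calls))) for pos, calls in sites]


# Hypothetical events from two extension sites on one strand.
events = [
    (10.2, 0.5, "A"), (50.1, 0.7, "G"), (10.4, 1.1, "C"),
    (50.0, 1.6, "T"), (10.1, 1.9, "G"),
]
print(ordered_reads(events))  # -> [(10.1, 'ACG'), (50.0, 'GT')]
```

The output preserves both kinds of order the text refers to: the order of bases within each read (by time) and the order of the reads along the strand (by position).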
  • sequencing of long DNA strands will facilitate the identification of genomic rearrangements and improve the assembly accuracy of chromosomal sequences (e.g., correctly identifying independent HIV genomes; associating sequence reads with the correct maternal/paternal chromosome).
  • Sequence reads obtained according to the present disclosure can produce haplotype information and thus further facilitate accurate genome assembly. Production of haplotype information is especially important because it has been shown to have more power than individual nucleotide variation in the context of association studies and in predicting disease risks (Stephens, Schneider et al., 2001; HapMap Project).
  • the first diploid genome sequence of a single human demonstrates that maternal and paternal chromosomes are 99.5% similar when genetic variation due to insertion and deletion is taken into account (Levy, Sutton et al., 2007). The combination of longer read lengths and discrete, ordered reads will facilitate correct assembly of the maternal and paternal chromosome sequences.
  • the methods, compositions and systems disclosed herein for sequencing of long DNA strands are capable of facilitating the identification of genomic rearrangements within the strand, improving the assembly accuracy of chromosomal sequences (especially in regions sharing a great deal of similarity), and improving copy number variation determination (especially for longer repeats).
  • the first diploid genome sequence of a single human demonstrates that maternal and paternal chromosomes are 99.5% similar when genetic variation due to insertion and deletion is taken into account (Levy et al., 2007). Thus, it will be critical to carefully track sequence information associated with each chromosome.
  • Example 1 Assessment of Intercalating Dyes for use as FRET donors
  • Various intercalating dyes were tested for use as a donor in donor-based detection of acceptor signals. These dyes are advantageous in that many donors could be present and exchanged and/or replenished during the extension reaction, thus allowing for extended donor lifetime.
  • the dyes tested were SYBR Green I, YOYO-1, YO-PRO-1, SYBR Gold, and SYBR Green I with YOYO-1. Representative spectra are shown in Figure 2. Fluorescence intensities were observed using YOYO-1 or SYBR Green I with short primer/template duplexes and with linear genomic DNA.
  • PEBN-coated substrates were prepared using a modified version of the procedure disclosed by Braslavsky et al., 2003, PNAS Vol. 100, No. 7, pp. 3960-3964. Briefly, glass coverslips were treated overnight in alkaline base-bath, rinsed in distilled water and then cleaned with 2% Micro-90 for 60 minutes with sonication and heat, followed by boiling in RCA solution (H2O : 30% NH4OH : 30% H2O2 (6:4:1)) for 60 minutes (2 x 30 minutes). The cleaned glass coverslips were then immersed in 2 mg/ml polyallylamine for 10 minutes and rinsed five times in water, followed by an immersion in 2 mg/ml polyacrylic acid for 10 minutes and five rinses in water.
  • the polyallylamine and polyacrylic immersions were repeated one more time.
  • the coverslips were then rinsed in water and coated with a 5 mM EDC-Biotin amine solution in 10 mM MES buffer, pH 5.5 for 30 minutes.
  • the slides were then rinsed in MES buffer for 5 minutes, in water for 5 minutes and in a solution of 10 mM Tris, pH 8.0, 10 mM NaCl for 5 minutes.
  • the slides were coated with Neutravidin by incubating for 30 minutes in a solution comprising 1 mg/ml Neutravidin.
  • Figure 3 depicts average fluorescence intensities obtained from SYBR
  • the DNA was incubated with increasing concentrations of SYBR Green I (0.01X, 0.1X, 0.3X, 0.6X, 1X, 2X and 5X SYBR Green I in 1X KB buffer supplemented with 50 mM BME). Following addition of each successive concentration of SYBR Green I to the reaction chamber, the chamber and contents were irradiated with an Argon 488 nm laser at 500 µW power, and data were collected at 25 ms integration time.
  • FIG. 3A depicts the SYBR Green I average intensity in a given region of interest (ROI) relative to average background intensity at each given concentration of SYBR Green I.
  • SYBR Green I dye did not exhibit high background fluorescence in the absence of DNA even at higher dye concentrations.
  • increased fluorescence of SYBR Green I was observed in the presence of DNA, especially at higher dye concentrations.
  • An identical titration was carried out using the intercalator YOYO-1 in the place of SYBR Green I (Figure 3B).
  • FIG. 3B depicts the YOYO-1 average intensity in a given region of interest (ROI) relative to average background intensity at each given concentration of YOYO-1.
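The quantity plotted in these figures, average ROI intensity relative to average background intensity, can be computed as a simple ratio of region means. The sketch below is illustrative only: the image is a small hypothetical array of pixel intensities, and the ROI and background boxes are assumed to be rectangular and given as half-open row and column ranges.

```python
from statistics import mean


def roi_to_background(image, roi, background):
    """Ratio of mean ROI intensity to mean background intensity.

    image: 2D list of pixel intensities
    roi, background: (row_start, row_end, col_start, col_end), end exclusive
    """
    def region_mean(box):
        r0, r1, c0, c1 = box
        return mean(image[r][c] for r in range(r0, r1) for c in range(c0, c1))

    return region_mean(roi) / region_mean(background)


# Hypothetical 4 x 4 frame: bright dye-stained DNA in the center.
image = [
    [10, 10, 10, 10],
    [10, 50, 50, 10],
    [10, 50, 50, 10],
    [10, 10, 10, 10],
]
print(roi_to_background(image, roi=(1, 3, 1, 3), background=(0, 1, 0, 4)))  # -> 5.0
```

Repeating this calculation at each dye concentration would give one point per concentration of the kind of titration curve described above.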
  • the nicked product was then incubated with 6.8 nM acceptor-labeled nucleotide dU-A1610 in the presence of a mutant form of Klenow polymerase that comprises the mutation D424A and lacks exonuclease activity (hereinafter referred to as "Klenow(exo-) polymerase") at a final enzyme concentration of 0.476 nM at 37 °C for 40 minutes.
  • the nicked and labeled DNA was then incubated with the intercalator dye YOYO-1 and then immobilized via injection of the dye-DNA mixture at 10 pM concentration into a glass chamber formed of surfaces derivatized either with +-+-+ (Figure 5A) or PEBN (Figure 5B).
  • Acceptor intensity on the coated surfaces was visualized using fluorescence microscopy via direct excitation with a Yellow HeNe laser (for visualization of acceptor-labeled nucleotide) or an Argon 488 nm laser (for visualization of YOYO-1 labeled DNA) in the same field of view.
  • Example 4 Detection of incorporation of acceptor labeled nucleotides based on fluorescence overlap with intercalating dye donors
  • the nicked product was then incubated with 6.8 nM dU-Cy5 (Figure 6) or dU-A1610 (Figure 7) in the presence of Klenow(exo-) polymerase at 37 °C for 40 minutes, following which 10 pM of the labeled λ DNA was contacted with an imaging mix containing 300 nM YOYO-1 and 50 mM BME in 1X KB buffer.
  • the entire mixture was injected into a glass chamber formed by PEBN-coated glass surfaces as described in Example 1.
  • visualization of YOYO-I containing regions was achieved using an Argon 488nm laser, and visualization of acceptor-containing regions using a Red HeNe laser ( Figure 6).
  • FIG. 6 shows donor fluorescence; the second panel depicts fluorescence images of the labeled ⁇ DNA as seen in the acceptor channel due to fluorescence 'bleed' into said channel, gathered following excitation with an Argon 488nm laser. Fluorescent acceptors are visible as regions of increased fluorescence intensity as indicated by white arrows.
  • the third panel shows the same field of view imaged using Red HeNe excitation to visualize the location of incorporated acceptor labels.
  • the fourth panel shows the composite image generated via overlay of the second and third panels, confirming the location of incorporated acceptor.
  • Figure 8 depicts results of a study identical to that of Figure 6, except that the
  • DNA comprising incorporated acceptors was immobilized on +-+-+ surfaces instead of PEBN surfaces prior to visualization.
  • Example 5 Detection of incorporation of acceptor-labeled nucleotides into surface-immobilized DNA
  • λ DNA was nicked as described in Example 2, and then immobilized on a PEBN-coated surface as described in Example 1.
  • The nicked and immobilized DNA was then contacted with an extension reaction mix containing 300-900 nM YOYO-1, 6.8 nM dU-Cy5 and Klenow(exo-) in KB buffer.
  • The extension mix was replaced by buffer containing 25 mM Tris pH 7.6 and 50 mM BME. (In some experiments, the YOYO-1 dye was included in the Tris-BME buffer instead of in the extension mix.)
  • The reactions contained 200 nM primer/template duplex, 2.5 µM base-labeled NTP and 600 nM Klenow(exo-).
  • The reaction was initiated by addition of the enzyme.
  • Typical results are shown in Figure 12.
  • Panel 12(A) shows a schematic for the use of native (non-engineered) duplex comprising multiple intercalated SYBR Green I dye donors. In this test system, FRET occurs between the intercalated donors and the incorporated base labeled dNTPs.
  • Panel 12(B) depicts a graphed time series of fluorescence signals of both donor and acceptor groups detected using a fluorometer.
  • The X axis represents time in seconds; the Y axis represents fluorescence intensity in arbitrary units (AU).
  • Panel 12(C) depicts a bar graph of normalized FRET data from a series of individual incorporation experiments performed in a cuvette. The donor-acceptor pairs are specified on the X axis.
  • The Y axis on the right shows the normalized increase in acceptor signal due to FRET.
  • The data are normalized by applying the formula $(I_A^{\text{after}} - I_A^{\text{before}}) / I_D^{\text{start}}$, where $I_A^{\text{after}}$ and $I_A^{\text{before}}$ are the acceptor intensities after and before enzyme injection and $I_D^{\text{start}}$ is the donor intensity at the start.
  • The FRET efficiency and the normalized acceptor intensity obtained using SYBR Green I as the donor are higher than the corresponding values for the Alexa 488 donor samples.
  • Panel 12(D) depicts a bar graph of the fold increase in acceptor intensity obtained using SYBR Green I, relative to acceptor intensity obtained using Al488 as the donor.
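The normalization described for Panel 12(C) can be sketched in a few lines of Python. This is a minimal illustration, assuming that the change in acceptor intensity is computed first and then divided by the starting donor intensity; the function name and the example intensities are hypothetical and are not taken from the disclosure.

```python
def normalized_fret(acceptor_before, acceptor_after, donor_start):
    """Return (I_A after - I_A before) / I_D at start.

    All arguments are fluorescence intensities in arbitrary units (AU).
    """
    if donor_start <= 0:
        raise ValueError("donor intensity at start must be positive")
    return (acceptor_after - acceptor_before) / donor_start


# Hypothetical readings (AU), chosen only to illustrate the arithmetic.
example = normalized_fret(acceptor_before=200.0, acceptor_after=500.0, donor_start=1000.0)
print(example)
```

Dividing by the starting donor intensity makes runs with different amounts of donor comparable, which is presumably why the bar graph reports the normalized value rather than the raw acceptor increase.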
  • A biotinylated primer template duplex consisting either of a biotinylated derivative of the engineered duplex (which contains a single donor group, Al488, at the -7 position on the primer) or a biotinylated derivative of the native duplex (which does not contain any intrinsic donor, but into which SYBR Green I donor molecules have been intercalated via co-incubation of the native duplex with SYBR Green I at 0.1X concentration) was immobilized on a PEG surface via attachment of the biotin on the template strand of each duplex.
  • Each surface-immobilized duplex was subjected to an extension reaction in situ by injecting into the chamber an extension mixture containing 150 nM Klenow(exo-) polymerase, 0.5 µM of base-labeled dNTPs (dUTP-ROX or dUTP-Alexa610), 50 mM Tris pH 7.2, 2 mM MnSO4, 10 mM Na2SO4, 2 mM DTT, 0.1% Triton X-100 and 0.01% Tween-20.
  • The reactions were allowed to occur for 10 minutes, followed by a 1 M NaCl rinse. The samples were then rinsed either with buffer alone or with buffer supplemented with 0.1X SYBR Green I.
  • The samples were excited using an Argon 488 nm laser at 460 µW, and the data were collected at 300 ms integration time for 150 frames.
  • The emitted signals were detected by a Roper Scientific back-illuminated EMCCD camera (Cascade 1), with an inverted Nikon microscope (TE 2000U) and a 60x oil objective.
  • The emitted light was separated using dichroic (560 nm, 650 nm) and band pass filters (535/50 nm; 620/40 nm).
  • FRETAN software was used to obtain donor and acceptor traces and perform
  • The FRETAN software is an automated analysis program that identifies each of the spots in the donor channel (taking into consideration noise thresholds), subtracts the background fluorescence, identifies anti-correlated changes in the time courses of fluorescence at each acceptor wavelength to identify single-pair FRET events, and computes approximately 50 attributes associated with FRET.
  • For the FRETAN software, see U.S. Provisional App. No. 60/765,693, filed February 6, 2006, and U.S. Published App. No. 2007/0250274 A1, published October 25, 2007, herein incorporated by reference in their entirety.
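The idea of background subtraction followed by a search for anti-correlated donor and acceptor changes can be illustrated with a short Python sketch. This is not the FRETAN implementation, whose internals are not described here; it assumes a constant background per trace and uses the Pearson correlation of frame-to-frame changes as a stand-in test. All function names, the cutoff value and the example traces are hypothetical.

```python
from statistics import mean


def subtract_background(trace, background):
    """Subtract a constant background estimate (AU) from every frame."""
    return [value - background for value in trace]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat trace carries no correlation information
    return cov / (var_x * var_y) ** 0.5


def is_anticorrelated(donor, acceptor, cutoff=-0.5):
    """Flag a donor/acceptor pair whose frame-to-frame changes oppose each other.

    The cutoff of -0.5 is an arbitrary illustrative value.
    """
    d_steps = [b - a for a, b in zip(donor, donor[1:])]
    a_steps = [b - a for a, b in zip(acceptor, acceptor[1:])]
    return pearson(d_steps, a_steps) <= cutoff


# Hypothetical traces (AU): the donor dims while the acceptor brightens, then both recover.
donor = subtract_background([900, 910, 400, 390, 905, 900], background=100)
acceptor = subtract_background([150, 145, 600, 610, 155, 150], background=100)
print(is_anticorrelated(donor, acceptor))
```

Working on frame-to-frame differences rather than raw intensities keeps a slow drift that is common to both channels from masking the opposing jumps that mark an energy-transfer event.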
  • Panel 13(A) shows a schematic of the Alexa 488 FRET system.
  • Panel 13(B) shows an example trace of FRET between the donor Alexa 488 and an incorporated base labeled dUTP-ROX.
  • Panel 13(C) shows a schematic of the SYBR Green I donor FRET system.
  • Panel 13(D) shows an example trace of FRET between the intercalating dye SYBR Green I as the donor and an incorporated base labeled dUTP-ROX.
  • The acceptor signals detected with SYBR Green I as the donor are brighter than the signals detected with Alexa 488 as the donor.
  • Figure 14 shows single molecule FRET data comparing the FRET efficiency and acceptor intensities using Al488 or the intercalating dye SYBR Green as a donor. Scatter plots of acceptor intensity (on the Y axis) and FRET efficiency (on the X axis) for acceptor Alexa 610 and acceptor ROX are shown in Panels (A) and (B), respectively. The lighter grey circles indicate data points obtained using Alexa 488 as the donor, and the darker stars in both plots A and B indicate data points obtained using SYBR Green as the donor.
  • Panel (C) shows a schematic of Alexa488 and SYBR Green I-driven FRET.
  • Panel (D) shows a bar graph of Acceptor intensities driven by Alexa 488 or SYBR Green above two user-defined thresholds, i.e., 1500 AU or 2000 AU.
  • The darker bars (on the left) are acceptor signals above 1500 AU and the lighter grey bars (on the right) are acceptor signals above 2000 AU.
  • Only a very small percentage of acceptor intensities is higher than the user-defined cut-off thresholds (i.e., 1500 AU or 2000 AU) when Al488 is used as the donor, whereas acceptor intensities using SYBR Green I as the donor are consistently higher than both thresholds.
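The threshold comparison behind Panel (D) amounts to counting how many acceptor intensities exceed each user-defined cut-off. A minimal Python sketch follows; the function name and the intensity lists are hypothetical and do not reproduce the data in Figure 14.

```python
def fraction_above(intensities, threshold):
    """Return the fraction of acceptor intensities strictly above a threshold (AU)."""
    if not intensities:
        raise ValueError("no intensities supplied")
    return sum(1 for value in intensities if value > threshold) / len(intensities)


# Hypothetical acceptor intensities (AU) for two donors; not data from Figure 14.
donor_a = [800, 1200, 1600, 900, 1100]
donor_b = [2100, 2500, 1900, 2300, 2600]

for threshold in (1500, 2000):
    print(threshold, fraction_above(donor_a, threshold), fraction_above(donor_b, threshold))
```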
  • Panel 15(A) depicts a time series of acceptor intensity over a 30-minute period with SYBR Green present only 5' of the incorporated Alexa 610 acceptor.
  • Panel 15(B) depicts a time series of acceptor intensity over a 30-minute period with SYBR Green I present both 5' and 3' of the incorporated Alexa 610 acceptor.
  • Lambda (λ) DNA was randomly nicked via incubation with DNase I, an enzyme that introduces random single-strand nicks into the DNA backbone.
  • The nicked DNA was labeled in solution with dU-Cy5 or dU-Al594 by incubating 1.33 nM of DNA with 6.8 nM dU-Cy5 or dU-Al594 in 1X KB buffer in the presence of 0.002 units/µl DNase I and 457 nM Klenow(exo-) polymerase.
  • The reaction was incubated at 37°C for 40 min, and then stopped by adding DNase I stop solution (Promega).
  • 13 pM of the labeled DNA was added to imaging mix containing 1X KB, 50 mM BME and 300 nM YOYO-1.
  • The imaging mix containing labeled DNA was added to the dry surface of PEBN-coated glass chambers.
  • The bound DNA was then washed with 25 mM Tris pH 7.6 containing 50 mM BME to remove excess YOYO-1 and unincorporated nucleotides.
  • The bound DNA was further washed with an oxygen-scavenger-containing solution made of 25 mM Tris pH 7.6, 50 mM BME, 1 mg/ml glucose oxidase, 0.04 mg/ml catalase and 0.4% glucose.
  • Regions of FRET activity were imaged using an Argon 488 nm laser (for detection of YOYO-1-labeled DNA), a Red HeNe laser (for detection of Cy5), or a Yellow HeNe laser (for Al594 detection).
  • The images collected with the two lasers were overlaid using MetaMorph software to confirm the presence of incorporated acceptor label. The overlay results are shown in Figure 16.
  • Example 8 Use of the disclosed methods to screen HIV genotype within particular patients
  • Diagnostic SNPs within the HIV genome are determined.
  • Primers are constructed along each strand of the HIV genome (the HIV RNA genome may be either directly interrogated, or converted into dsDNA and then ssDNA prior to interrogation) and, preferably, the 3' end of each primer terminates at the site of a candidate SNP. Some primers may terminate at non-variant sites to serve as internal controls for sequencing reaction efficiency and accuracy. If dsDNA is present in the hybridization, snap cooling is preferred to promote primer-template duplex formation, rather than slow cooling which favors reannealing of the template strands.
  • The HIV population may be screened and a therapeutic regimen prescribed and modified, as needed.
  • This method can be used to determine the relationship of SNPs in any nucleic acid for any application, e.g., cancer or predicting predisposition to any genetically influenced or determined disease.
  • Example 9 Use of the disclosed methods to assign detected sequence variations to particular genotypes
  • sequence reads may produce the following four variants, each of which can be analyzed to determine a consensus read for this region of the HIV genome (see
  • Variant#1 ACTGTATACGTACGATGCTATGCATCGATTCGTAC
  • Variant#2 ACTGTATACGTACGGTGCTATGCATCGATTCGTAC
  • Variant#3 CATCGATTCGTACGTGCCTCGAGTTTCTG
  • Variant#4 CATCGATTCGTACGTGCCTCGAGCCTCTG
  • 'NN' in different combinations may be very different. For example, substituting 'A' with 'CC' in the above consensus may be a genotype resistant to a particular drug therapy, whereas 'A' with 'TT' may be effectively treated with the same therapy.
  • The methods and systems of the present disclosure address the central problem of how to align variants 1, 2, 3 and 4 because they provide important information about the relationship between these different short sequence reads, i.e., whether they occur on the same or different viral genomes.
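The alignment problem in Example 9 can be made concrete with a short Python sketch. It takes the four reads listed above (with whitespace removed), finds the suffix-prefix overlap between each upstream read and each downstream read, and shows that sequence overlap alone permits all four pairings. The helper names are hypothetical, and the sketch only illustrates the ambiguity; it is not part of the disclosed method.

```python
from itertools import product

# Reads copied from Example 9.
VARIANT_1 = "ACTGTATACGTACGATGCTATGCATCGATTCGTAC"
VARIANT_2 = "ACTGTATACGTACGGTGCTATGCATCGATTCGTAC"
VARIANT_3 = "CATCGATTCGTACGTGCCTCGAGTTTCTG"
VARIANT_4 = "CATCGATTCGTACGTGCCTCGAGCCTCTG"


def longest_overlap(left, right):
    """Longest suffix of `left` that is also a prefix of `right`."""
    for size in range(min(len(left), len(right)), 0, -1):
        if left.endswith(right[:size]):
            return right[:size]
    return ""


def merge(left, right):
    """Join two reads across their shared overlap."""
    return left + right[len(longest_overlap(left, right)):]


upstream = {"Variant#1": VARIANT_1, "Variant#2": VARIANT_2}
downstream = {"Variant#3": VARIANT_3, "Variant#4": VARIANT_4}

# Sequence alone allows every upstream read to pair with every downstream read.
candidates = {}
for (name_up, seq_up), (name_down, seq_down) in product(upstream.items(), downstream.items()):
    candidates[(name_up, name_down)] = merge(seq_up, seq_down)
    print(name_up, name_down, longest_overlap(seq_up, seq_down), candidates[(name_up, name_down)])
```

Because every pairing shares the same overlap, the short reads by themselves cannot say which upstream variant sits on the same viral genome as which downstream variant. That missing linkage is the information the text says the ordered reads provide.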
  • Example 10 Software analysis of fluorescence data gathered from Lambda
  • Data was processed as follows. A user-defined region of interest (ROI) in an average image of the Lambda (λ) DNA volume was segmented. Thresholding and spatial connectivity information were used to automatically segment the ROI in the average image of the Lambda (λ) DNA in the donor channel. Specifically, automatic thresholding followed by a largest-connected-component analysis was used to segment the Lambda (λ) DNA in the donor channel, as shown in Figure 18, Panels (A) and (B). Using the information about the spatial extent of the ROI in the donor channel, the Lambda (λ) DNA in the acceptor channel was segmented in a similar way, as shown in Figure 18, Panels (C) and (D).
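Thresholding followed by largest-connected-component analysis can be sketched in plain Python as follows. The text does not say which automatic thresholding rule was used, so this sketch assumes the mean pixel intensity as the threshold and 4-connectivity between pixels; the function names and the toy image are hypothetical.

```python
from collections import deque


def mean_threshold(image):
    """Illustrative automatic threshold: the mean pixel intensity of the image."""
    pixels = [value for row in image for value in row]
    return sum(pixels) / len(pixels)


def largest_component(image, threshold):
    """Return the largest 4-connected set of (row, col) pixels brighter than `threshold`."""
    rows, cols = len(image), len(image[0])
    seen = set()
    best = set()
    for r in range(rows):
        for c in range(cols):
            if (r, c) in seen or image[r][c] <= threshold:
                continue
            # Breadth-first flood fill from an unvisited bright pixel.
            component = set()
            queue = deque([(r, c)])
            seen.add((r, c))
            while queue:
                y, x = queue.popleft()
                component.add((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and (ny, nx) not in seen and image[ny][nx] > threshold):
                        seen.add((ny, nx))
                        queue.append((ny, nx))
            if len(component) > len(best):
                best = component
    return best


# Hypothetical 4x6 average image (AU): one elongated bright object plus an isolated bright pixel.
average_image = [
    [10, 10, 10, 10, 10, 90],
    [10, 80, 85, 90, 10, 10],
    [10, 10, 10, 95, 10, 10],
    [10, 10, 10, 10, 10, 10],
]
roi = largest_component(average_image, mean_threshold(average_image))
print(sorted(roi))
```

Keeping only the largest component discards isolated bright pixels, such as the one in the top-right corner of the toy image, so that the ROI follows the elongated DNA molecule.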
  • The segmented ROI of both channels was registered using standard image registration techniques, as shown in Figure 19.
  • Signals were extracted at every corresponding spatial location in the donor and acceptor channel ROI, and the normalized intensity of every spatially corresponding point in both channels was compared.
  • A criterion was defined as a function of donor intensity and acceptor intensity at a particular point to determine the eligibility of that spatial coordinate as an incorporated label in the Lambda (λ) DNA. Notably, a higher intensity was observed in the acceptor channel only at those points, as compared to the donor channel.
  • Figure 20 shows co-localization of the detected points in the acceptor channel via Argon 488 nm or Red HeNe excitation, confirming the accuracy, to the level of pixel registration, of the automated analysis.
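The point-wise comparison of normalized donor and acceptor intensities can be illustrated with the Python sketch below. The actual criterion is described only as a function of the two intensities, so this sketch assumes min-max normalization within each channel and flags points where the normalized acceptor intensity exceeds the normalized donor intensity by a margin; the function names, the margin of 0.2 and the example intensities are hypothetical.

```python
def min_max_normalize(values):
    """Scale intensities to the range 0..1 (one of several possible normalizations)."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]
    return [(value - low) / (high - low) for value in values]


def candidate_label_points(donor, acceptor, margin=0.2):
    """Return indices where normalized acceptor exceeds normalized donor by `margin`.

    `donor` and `acceptor` hold intensities at the same registered spatial locations.
    """
    if len(donor) != len(acceptor):
        raise ValueError("channels must be registered to the same set of points")
    d_norm = min_max_normalize(donor)
    a_norm = min_max_normalize(acceptor)
    return [i for i, (d, a) in enumerate(zip(d_norm, a_norm)) if a - d > margin]


# Hypothetical intensities (AU) along a registered ROI.
donor_roi = [500, 520, 510, 300, 515, 505]
acceptor_roi = [100, 110, 105, 400, 108, 102]
points = candidate_label_points(donor_roi, acceptor_roi)
print(points)
```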
  • compositions and methods disclosed and claimed herein can be made and executed without undue experimentation in light of the present disclosure. While the compositions and methods of this invention have been described in terms of preferred embodiments, these embodiments are in no way intended to limit the scope of the claims, and it will be apparent to those of skill in the art that variations may be applied to the compositions and/or methods and in the steps or in the sequence of steps of the methods described herein without departing from the concept, spirit and scope of the invention. More specifically, it will be apparent that certain agents which are both chemically and physiologically related may be substituted for the agents described herein while the same or similar results would be achieved. All such similar substitutes and modifications apparent to those skilled in the art are deemed to be within the spirit, scope and concept of the invention as defined by the appended claims.

Abstract

The invention relates to novel methods for sequencing ordered segments of a single nucleic acid molecule in real time by displaying a nucleic acid molecule in an elongated format, manipulating the molecule to form a plurality of priming sites accessible to a polymerase along the length of the nucleic acid molecule, initiating an extension reaction at one or more priming sites, monitoring signals emitted during the extension reaction, and analyzing the signals to determine the sequence of the molecule.
EP08843207A 2007-10-22 2008-10-22 Method and system for obtaining ordered, segmented sequence fragments along a nucleic acid molecule Withdrawn EP2203568A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US98180307P 2007-10-22 2007-10-22
PCT/US2008/080843 WO2009055508A1 (fr) 2007-10-22 2008-10-22 Method and system for obtaining ordered, segmented sequence fragments along a nucleic acid molecule

Publications (1)

Publication Number Publication Date
EP2203568A1 (fr) 2010-07-07

Family

ID=40260414

Family Applications (1)

Application Number Title Priority Date Filing Date
EP08843207A Withdrawn EP2203568A1 (fr) 2007-10-22 2008-10-22 Method and system for obtaining ordered, segmented sequence fragments along a nucleic acid molecule

Country Status (3)

Country Link
EP (1) EP2203568A1 (fr)
JP (1) JP2011505119A (fr)
WO (1) WO2009055508A1 (fr)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6083731B2 (ja) * 2012-09-11 2017-02-22 国立大学法人埼玉大学 FRET-type bioprobe and FRET measurement method

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7056661B2 (en) * 1999-05-19 2006-06-06 Cornell Research Foundation, Inc. Method for sequencing nucleic acid molecules
AU2001282881B2 (en) * 2000-07-07 2007-06-14 Visigen Biotechnologies, Inc. Real-time sequence determination
AU2002337653A1 (en) * 2001-07-25 2003-02-17 The Trustees Of Princeton University Nanochannel arrays and their preparation and use for high throughput macromolecular analysis
GB0324456D0 (en) * 2003-10-20 2003-11-19 Isis Innovation Parallel DNA sequencing methods
CA2616433A1 (fr) * 2005-07-28 2007-02-01 Helicos Biosciences Corporation Sequencage monomoleculaire de bases consecutives
US20070202521A1 (en) * 2006-02-14 2007-08-30 Applera Corporation Single Molecule DNA Sequencing Using Fret Based Dynamic Labeling

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See references of WO2009055508A1 *

Also Published As

Publication number Publication date
JP2011505119A (ja) 2011-02-24
WO2009055508A1 (fr) 2009-04-30

Similar Documents

Publication Publication Date Title
US9587275B2 (en) Single molecule sequencing with two distinct chemistry steps
US9200320B2 (en) Real-time sequencing methods and systems
CN100462433C (zh) Real-time sequence determination
US20110281740A1 (en) Methods for Real Time Single Molecule Sequencing
US20110165652A1 (en) Compositions, methods and systems for single molecule sequencing
US10954551B2 (en) Devices, systems, and methods for single molecule, real-time nucleic acid sequencing
US20070031875A1 (en) Signal pattern compositions and methods
US20160115473A1 (en) Multifunctional oligonucleotides
US20070196832A1 (en) Methods for mutation detection
EP2203568A1 (fr) Method and system for obtaining ordered, segmented sequence fragments along a nucleic acid molecule
US20230080657A1 (en) Methods for nucleic acid sequencing

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 20100422

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MT NL NO PL PT RO SE SI SK TR

AX Request for extension of the european patent

Extension state: AL BA MK RS

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 20100715

DAX Request for extension of the european patent (deleted)