WO2017117235A1 - Sequencing methods for double stranded nucleic acids - Google Patents
Sequencing methods for double stranded nucleic acids
- Publication number
- WO2017117235A1 (PCT/US2016/068899)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- nucleotide
- polymerase
- nucleic acid
- strand
- double stranded
Classifications
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6869—Methods for sequencing
Definitions
- the method can include steps of (a) providing a double stranded nucleic acid having a template strand and a second strand having a discontinuity, wherein the discontinuity includes a 3' hydroxyl at a nick or gap, the 3' hydroxyl being adjacent to a position of the second strand that pairs with a first position in the template strand; (b) forming a stabilized ternary complex including a polymerase, the double stranded nucleic acid and a cognate nucleotide that binds at the position of the second strand that pairs with the first position of the template strand; (c) examining the stabilized ternary complex, thereby acquiring a signal for identifying the nucleotide at the first position of the template strand; (d) after the examining of the stabilized ternary complex, covalently adding a nucleotide to the 3' hydroxyl at the position of the second strand that pairs with
- sequencing reactions involve the formation of a complex between a primer-template nucleic acid hybrid, polymerase and nucleotide triphosphate that results in polymerase catalyzed nucleotidyl transfer of the nucleotide to the primer.
- sequencing by synthesis (SBS)
- nucleotides are sequentially added to the growing primer.
- the nucleotides are modified with identifying tags so that the base type of the incorporated nucleotide can be detected as synthesis proceeds.
- detection of a physical signal produced during incorporation of nucleotides may be measured concurrently with the reaction.
- the present disclosure provides a method whereby strand extension can be controlled such that a predefined number of nucleotides are incorporated to the 3' end of the strand.
- Such control is advantageous when used in the context of an SBB technique, for example, to extend a strand by a predefined number of nucleotides per reaction cycle. This in turn can provide improved resolution of individual nucleotides detected during each examination step. For example, each cycle can be correlated with a defined number of nucleotides in a repeat region, homopolymer region or other difficult sequence region.
- reaction conditions can be employed that result in only a single base extension per SBB cycle.
- the present disclosure provides compositions, methods and systems for sequencing a double stranded nucleic acid using a polymerase based sequencing by binding reaction.
- the present disclosure will employ, unless otherwise indicated, conventional molecular biology and bio-sensor techniques, which are within the skill of the art.
- the present disclosure incorporates by reference the entirety of the disclosure of the US patent application no.
- "blocking moiety," when used in reference to a nucleotide, is a part of the nucleotide that inhibits or prevents the 3' position of the nucleotide from forming a covalent linkage to a second nucleotide during the incorporation step of a nucleic acid polymerization reaction.
- a "reversible" blocking moiety can be removed from the nucleotide or modified, allowing for nucleotide incorporation.
- Another exemplary discontinuity is a "gap" which is characterized by absence of one or more nucleotides within a strand in a double stranded nucleic acid. Due to the absence of nucleotide(s) a gap is further characterized by the upstream 3' hydroxyl not being adjacent to the downstream free phosphate.
- the second strand is discontinuous such that a 3' end occurs at a nick or gap in the strand, where free nucleotides may bind to form a ternary complex with polymerase and, optionally, be incorporated by a polymerase.
- the next template nucleotide pairs with the position that is immediately downstream of the 3' end of the second strand.
- the next template base determines the incoming nucleotide type to be incorporated as the next correct nucleotide at the position that is immediately downstream of the 3' end of the second strand.
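As a minimal sketch of the relationship just described, the following Python snippet derives the next correct nucleotide from the template base immediately downstream of the 3' end of the second strand. The function name `next_correct_nucleotide`, the 0-based indexing, and the convention that template index i pairs with position i of the second strand are illustrative assumptions, not part of the disclosure.

```python
# Watson-Crick pairing for DNA (template base -> base of the cognate nucleotide).
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def next_correct_nucleotide(template: str, paired_length: int) -> str:
    """Return the base of the next correct nucleotide.

    template      -- template strand, written so that index i pairs with
                     position i of the second strand (assumed convention)
    paired_length -- number of second-strand positions already filled
                     upstream of the 3' end at the nick or gap
    """
    if not 0 <= paired_length < len(template):
        raise ValueError("3' end lies outside the template")
    # The next template base is the first one not yet paired.
    next_template_base = template[paired_length]
    return COMPLEMENT[next_template_base]


print(next_correct_nucleotide("TACGGA", 2))  # template base C pairs with G
```

The lookup is deliberately limited to the four DNA bases; modified or RNA nucleotides would need a different table.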
- Mutant polymerases include polymerases wherein one or more amino acids are replaced with other amino acids (naturally or non-naturally occurring), and insertions or deletions of one or more amino acids.
- Modified polymerases include polymerases that contain an external label which can be used to detect or monitor the presence and interactions of the polymerase.
- intrinsic signals from the polymerase can be used to detect or monitor their presence and interactions.
- ternary complex refers to an intermolecular association between a polymerase, a double stranded nucleic acid and a nucleotide.
- the polymerase facilitates interaction between the nucleotide and a template strand of the primed nucleic acid.
- a cognate nucleotide can interact with the template strand via Watson-Crick hydrogen bonding.
- stabilized ternary complex means a ternary complex having promoted or prolonged existence or a ternary complex for which disruption has been inhibited. Generally, stabilization of the ternary complex prevents incorporation of the nucleotide component of the ternary complex into the primed nucleic acid component of the ternary complex.
- Also provided herein are methods for sequencing a double stranded nucleic acid using a sequencing by binding reaction comprising (a) contacting a first strand of the double stranded nucleic acid molecule comprising a nick site with a base-removing enzyme to create a base gap in the first strand of the double stranded nucleic acid molecule at the nick site; (b) contacting the double stranded nucleic acid molecule comprising the base gap in the first strand with a first reaction mixture comprising a polymerase and at least one unlabeled nucleotide molecule; (c) examining the interaction of the polymerase with the double stranded nucleic acid molecule in presence of the nucleotide molecule without chemical incorporation of the nucleotide molecule into the base gap of the first strand of the double stranded nucleic acid molecule; and (d) identifying the nucleotide added to the base gap of the first strand of the double stranded nucle
- a discontinuity in a double stranded nucleic acid provides a mechanism for controlling the step size during a sequencing run.
- the location and/or size of the discontinuity can be selected to limit the number of nucleotide positions interrogated per cycle of a sequencing run.
- the size or location of the discontinuity can be chosen to limit the number of nucleotide positions interrogated to be no more than 1, 2, 3, 4, 5, 10 or more per cycle.
- a single polymerase is bound to the 3' nucleotide that flanks the discontinuity (referred to as "the 3' end of the discontinuity") to form a stabilized ternary complex that is detected during an examination phase.
- the stabilized complex is not free to extend the discontinuous strand and move in the 3' direction along the template.
- the enzyme can move in the 3' direction along the template as the discontinuous strand is extended.
- the presence of a double stranded region downstream of the extended strand can be used to restrict the distance moved by the polymerase during the incorporation phase.
- the incorporation can be carried out under non-strand displacing conditions.
- the number of nucleotides added during the extension phase will be determined by the size of the gap between the 3' end of the discontinuous strand and the 5' nucleotide that flanks the gap (referred to as "the 5' end of the discontinuity").
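The dependence of step size on gap size can be illustrated with a short, hypothetical Python sketch. It assumes non-strand displacing conditions, 0-based template indices, and a function name (`gap_size`) chosen only for this example.

```python
def gap_size(three_prime_end: int, five_prime_end: int) -> int:
    """Number of unpaired template positions inside a discontinuity.

    three_prime_end -- template index paired with the 3' nucleotide that
                       flanks the discontinuity (the 3' end)
    five_prime_end  -- template index paired with the 5' nucleotide that
                       flanks the discontinuity (the 5' end)

    Under non-strand displacing conditions (assumed here), this is also the
    maximum number of nucleotides that can be added during extension.
    A nick has adjacent flanking nucleotides, so its gap size is 0.
    """
    if five_prime_end <= three_prime_end:
        raise ValueError("5' end must lie downstream of the 3' end")
    return five_prime_end - three_prime_end - 1


print(gap_size(9, 10))  # nick: 0 positions available
print(gap_size(9, 13))  # gap: 3 positions available
```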
- the first method involves creating double stranded, hemi-modified sequencing templates which contain a restriction endonuclease recognition and cleavage site. One of the strands at the cleavage site is modified so that it will not be cleaved (for example by substituting phosphorothioate for phosphodiester linkages between nucleotides).
- Restriction endonucleases that are able to nick opposite a hemi-phosphorothioate modified strand include Hinc II, Hind II, Ava I and Nci I.
- a second method for creating double stranded nucleic acid with a discontinuity uses a double stranded DNA template to be nicked (or gapped) at or near a position that is abasic (i.e. lacking a base attached to the template strand at the position) or occupied by an
- a third method for creating a double stranded nucleic acid with a discontinuity is to create double stranded sequencing templates that contain short regions of RNA.
- the region of RNA can be covalently attached to a region of DNA at the site to be nicked, then the RNA can be specifically cleaved using RNAseH, thus creating a 3' OH useful for a sequencing method set forth herein.
- the RNA region need not be covalently attached to the DNA region, for example, being formed by hybridizing a DNA oligo and RNA oligo on a template to result in a nick or gap between the two types of nucleic acid.
- Sequencing templates can be constructed that include a recognition site for a nicking endonuclease (NE) from which the sequencing reaction will be initiated.
- Many nicking endonucleases are commercially available. Examples include Nb.BsmI, which cleaves the bottom strand inside of its 6 base pair asymmetric recognition site; Nt.AlwI, a modified type II restriction endonuclease which cleaves the top strand 4 nucleotides 3' of its 5 base pair asymmetric recognition site; and Nb.BbvCI, which cleaves inside of its 7 base pair asymmetric recognition site.
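To make the geometry of a top-strand nick concrete, here is a hedged Python sketch that locates where an enzyme behaving like Nt.AlwI would nick a top strand: 4 nucleotides 3' of a 5 base pair recognition site, as stated above. The recognition sequence `GGATC` used as the default, the function name, and the example sequence are assumptions for illustration; the recognition sequence should be checked against the supplier's datasheet. The sketch scans only the strand as written and ignores sites on the opposite strand.

```python
def top_strand_nick_sites(top_strand: str,
                          recognition_site: str = "GGATC",  # assumed sequence
                          offset: int = 4) -> list:
    """Return nick positions on the top strand (written 5' to 3').

    A returned index i means the nick falls between top_strand[i - 1]
    (which carries the new 3' hydroxyl) and top_strand[i].
    """
    sites = []
    start = top_strand.find(recognition_site)
    while start != -1:
        nick = start + len(recognition_site) + offset
        # Keep only nicks that leave at least one nucleotide downstream.
        if nick < len(top_strand):
            sites.append(nick)
        start = top_strand.find(recognition_site, start + 1)
    return sites


# Hypothetical top strand: site at index 2, nick after the following 4 nt.
print(top_strand_nick_sites("AAGGATCTTTTCCC"))  # [11]
```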
- sequencing reactions can be individually conducted at the multiple steps described above for creating discontinuities, or others that can be contemplated.
- enzymes are described with specific activities used to create or facilitate the formation of discontinuities at specific sites in double stranded sequencing templates. It is to be understood that many naturally occurring enzymes have desired properties, but also that some properties can be created, enhanced, diminished or abolished by engineering of existing enzymes.
- the engineered enzymes may have additional or enhanced capabilities for the methods described above.
- a method set forth herein can include a step of creating a gap.
- base removing enzymes that would be useful in this context are enzymes with natural or engineered double stranded-targeting 5'-3' exonuclease activity that removes one or just a few nucleotides in a controlled manner.
- a useful enzyme in this context is a flap endonuclease, for example FEN1, which cleaves a non-hybridized single-stranded 5' end "flap" from a hybridized portion.
- Other useful enzymes include DNA polymerases with 5'-3' exonuclease activities.
- the 5'-3' exonuclease activity is typically contained within a distinct domain of the enzyme (which can be removed to eliminate the activity when desired, such as with Klenow fragment of E. coli DNA Pol I).
- DNA polymerases with 5'-3' exonuclease activity include E. coli DNA Pol I and Thermus aquaticus (Taq) DNA polymerase, which is often used for PCR.
- Taq DNA pol also contains flap endonuclease activity (which can cleave an unhybridized 5' end of the leading strand as polymerization proceeds).
- a gap is created by the 5'-3' exonuclease activity of a polymerase that has been modified to disable nucleotidyl transfer activity.
- the polymerase may have mutations that disable polymerization but preserve 5'-3' exonuclease activity.
- chemical modifications, non-catalytic metal ions or reagents that inhibit polymerase activity but allow 5'-3' exonuclease activity may be used to allow a polymerase to create a gap downstream of the 3' end.
- the ability of the polymerase to selectively bind the next correct nucleotide at the 3' end is preserved while its catalytic extension activity is disabled (e.g. via mutations, chemical modifications, inhibitors or specific metal ions).
- the ability to form a correct ternary complex and the 5'-3' exonuclease activity are retained, whereby the polymerase can be used to detect the presence of the next correct nucleotide.
- Polymerase-based methods for detecting nucleic acids of the present disclosure can include, for example, nucleic acid sequencing by binding reactions, wherein the polymerase undergoes conformational transitions between open and closed conformations during discrete steps of the reaction.
- the polymerase binds to a double stranded nucleic acid to form a binary complex, referred to, optionally, as the pre-insertion conformation.
- the polymerase binds to the site of a nick or base gap present on one strand, e.g., the first strand of a double stranded nucleic acid.
- divalent catalytic metal ions such as Mg2+ mediate a rapid chemical step involving nucleophilic displacement of a pyrophosphate (PPi) by the 3' hydroxyl termini of the first strand of the double stranded nucleic acid.
- the polymerase returns to an open state upon the release of PPi, the post-translocation step, and translocation initiates the next round of reaction.
- a closed-complex can form in the absence of divalent catalytic metal ions (e.g., Mg2+); it is proficient in chemical addition of the nucleotide in the presence of the divalent metal ions.
- Low or deficient levels of catalytic metal ions (e.g., Mg2+)
- the polymerase configuration and/or interaction with a nucleic acid may be detected or monitored during an examination step to identify the next base in the nucleic acid sequence.
- a nucleic acid sequencing reaction mixture can include any of a variety of reagents that are commonly present in polymerase based nucleic acid synthesis reactions.
- Reaction mixture reagents include, but are not limited to, enzymes (e.g., polymerase), nucleotides (e.g. dNTPs or NTPs), double stranded nucleic acids, salts, buffers, small molecules, co-factors, metals, and ions.
- the ions may be catalytic ions, non-covalent metal ions, or both.
- the reaction mixture can include salts such as NaCl, KCl, K-acetate, NH4-acetate, K-glutamate, NH4Cl, or NH4HSO4.
- An examination step of a method set forth herein can be used to detect a ternary complex between a polymerase, double stranded nucleic acid, and nucleotide.
- the ternary complex may be in a pre-chemistry conformation, wherein a nucleotide is sequestered but not incorporated.
- the closed-complex may be in a pre-translocation conformation, wherein a nucleotide is incorporated by formation of a phosphodiester bond with the 3' end of the first strand in the double stranded nucleic acid.
- the closed-complex may be formed in the absence of catalytic metal ions or deficient levels of catalytic metal ions, thereby physically sequestering the next correct nucleotide within the polymerase active site without chemical incorporation.
- the sequestered nucleotide may be a non-incorporable nucleotide.
- the closed-complex may be formed in the presence of catalytic metal ions, where the closed-complex comprises a nucleotide analog which is incorporated, but a PPi is not capable of release.
- the closed-complex is stabilized in a pre-translocation conformation.
- a pre-translocation conformation is stabilized by chemically cross-linking the polymerase.
- the closed-complex may be stabilized by external means.
- the closed-complex may be stabilized by allosteric binding of small molecules.
- a closed-complex may be stabilized by pyrophosphate analogs that bind close to the active site with high affinity, preventing translocation of the polymerase.
- DNA polymerases include, but are not limited to, bacterial DNA polymerases, eukaryotic DNA polymerases, archaeal DNA polymerases, viral DNA polymerases and phage DNA polymerases.
- Bacterial DNA polymerases include E. coli DNA polymerases I, II, III, IV and V, the Klenow fragment of E. coli DNA polymerase, Clostridium stercorarium (Cst) DNA polymerase, Clostridium thermocellum (Cth) DNA polymerase and Sulfolobus solfataricus (Sso) DNA polymerase.
- Eukaryotic DNA polymerases include DNA
- Viral DNA polymerases include T4 DNA polymerase, phi-29 DNA polymerase, GA-1, phi-29-like DNA polymerases, PZA DNA polymerase, phi-15 DNA polymerase, Cp1 DNA polymerase, Cp7 DNA polymerase, T7 DNA polymerase, and T4 polymerase.
- Archaeal DNA polymerases include thermostable and/or thermophilic DNA polymerases such as DNA polymerases isolated from Thermus aquaticus (Taq) DNA polymerase, Thermus filiformis (Tfi) DNA polymerase, Thermococcus zilligi (Tzi) DNA polymerase, Thermus thermophilus (Tth) DNA polymerase, Thermus flavus (Tfl) DNA polymerase, Pyrococcus woesei (Pwo) DNA polymerase, Pyrococcus furiosus (Pfu) DNA polymerase and Turbo Pfu DNA polymerase, Thermococcus litoralis (Tli) DNA polymerase, Pyrococcus sp.
- RNA polymerases include, but are not limited to, viral RNA polymerases such as T7 RNA polymerase, T3 polymerase, SP6 polymerase, and K11 polymerase; eukaryotic RNA polymerases such as RNA polymerase I, RNA polymerase II, RNA polymerase III, RNA polymerase IV, and RNA polymerase V; and archaeal RNA polymerase.
- Reverse transcriptases include, but are not limited to, HIV-1 reverse transcriptase from human immunodeficiency virus type 1 (PDB 1HMV), HIV-2 reverse transcriptase from human immunodeficiency virus type 2 (PDB 5UPJ), M-MLV reverse transcriptase from the Moloney murine leukemia virus, AMV reverse transcriptase from the avian myeloblastosis virus, and telomerase reverse transcriptase that maintains the telomeres of eukaryotic chromosomes.
- a polymerase based sequencing by binding reaction can involve steps of: (a) providing a double stranded nucleic acid, (b) providing the first strand of the double stranded nucleic acid with a polymerase and one or more types of nucleotides, wherein the nucleotides may or may not be complementary to the next base of the template strand of the double stranded nucleic acid, and (c) examining the interaction of the polymerase with the double stranded nucleic acid under conditions wherein either (i) chemical incorporation of a nucleotide to the first strand of the double stranded nucleic acid is disabled or severely inhibited in the pre-chemistry conformation, or (ii) a single nucleotide incorporation occurs at the 3' end of the first strand of the double stranded nucleic acid.
- the pre-translocation conformation may be stabilized to facilitate examination and/or prevent subsequent nucleotide incorporation.
- sequencing methods provided herein readily encompass a plurality of double stranded nucleic acids, wherein the plurality of nucleic acids may be clonally amplified copies of a single nucleic acid, or disparate nucleic acids, including combinations, such as populations of disparate nucleic acids that are clonally amplified.
- a method of the present disclosure can further include an incorporation step, the incorporation step comprising incorporating a single unlabeled nucleotide in the base gap of the first strand of the double stranded nucleic acid molecule.
- the method further comprises repeating the examination step and the incorporation step to sequence the template strand of the double stranded nucleic acid molecule.
- the examination step may be repeated one or more times prior to performing the incorporation step.
- the 3' hydroxyl group of the at least one unlabeled nucleotide molecule is modified to have a 3' terminator moiety.
- the 3' terminator moiety may be a reversible terminator.
- the 3' terminator moiety may be an irreversible terminator.
- the irreversible terminator of the at least one unlabeled nucleotide molecule is replaced or removed after an examination step, for example, to facilitate a subsequent incorporation step. It will be understood that in some embodiments termination is provided by a downstream double stranded region that flanks a discontinuity at which polymerase activity occurs. As such, a nucleotide used for incorporation need not include a terminator moiety.
- a polymerase can interact with a discontinuity in one strand of a double stranded nucleic acid molecule in the presence of at least one unlabeled nucleotide molecule to form a closed complex.
- the closed complex is a ternary closed-complex comprising the first strand of the double stranded nucleic acid molecule, the polymerase, and the unlabeled nucleotide, wherein the unlabeled nucleotide is complementary to the base on the template strand of the double stranded nucleic acid.
- the formation of a ternary closed-complex is favored over the formation of a binary complex between the first strand of the double stranded nucleic acid and the polymerase.
- the formation of the ternary closed-complex may be favored over the formation of the binary complex when the first reaction mixture comprises a high
- the formation of the ternary closed-complex may be favored over the formation of the binary complex when the first reaction mixture comprises a buffer having a high pH.
- a ternary complex stabilizing reaction mixture can include 1, 2, 3, or 4 types of unlabeled nucleotide molecules, for example, selected from dATP, dTTP, dCTP, and dGTP.
- a closed-complex is formed between a discontinuity in the first strand of a double stranded nucleic acid molecule, polymerase, and any one of the four unlabeled nucleotide molecules so that four types of closed complexes may be formed
- an examination step comprises measuring association kinetics of the formation of the closed-complex formed between the first strand of the double stranded nucleic acid molecule, the polymerase, and any one of the four unlabeled nucleotide molecules.
- the measured association kinetics is different depending on the identity of the unlabeled nucleotide molecule in the closed-complex.
- the polymerase has a different affinity for each of the four types of nucleotide molecules in each type of closed-complex.
- the polymerase may have a different dissociation constant for each of the four types of unlabeled nucleotide molecules in each type of closed-complex. Such differences can be measured in a method set forth herein in order to distinguish the binding of one type of nucleotide at a 3' end of a strand from another type of nucleotide that is used in the method.
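One way to picture how such differences could be used is a simple base caller that compares the binding signal measured for each of the four unlabeled nucleotides. The sketch below is hypothetical: the signal values, the `min_ratio` threshold, and the function name are assumptions made for illustration, and real examination data would need calibration against controls.

```python
# Watson-Crick pairing for DNA (nucleotide base -> template base it pairs with).
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def call_next_base(signals: dict, min_ratio: float = 2.0):
    """Pick the cognate nucleotide from per-nucleotide binding signals.

    signals   -- maps each examined nucleotide base to its measured signal
                 (arbitrary units, hypothetical data)
    min_ratio -- assumed threshold: the best signal must exceed the
                 runner-up by this factor, otherwise no call is made

    Returns (cognate nucleotide base, template base) or None if ambiguous.
    """
    ranked = sorted(signals.items(), key=lambda item: item[1], reverse=True)
    (best_nt, best), (_, runner_up) = ranked[0], ranked[1]
    if best <= 0:
        return None  # no binding signal at all
    if runner_up > 0 and best / runner_up < min_ratio:
        return None  # signals too similar to distinguish
    return best_nt, COMPLEMENT[best_nt]


print(call_next_base({"A": 0.08, "C": 0.11, "G": 0.95, "T": 0.10}))
```

Returning `None` for ambiguous cycles keeps a low-confidence call from being recorded as a base; a repeat examination could then be scheduled instead.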
- the absence of a catalytic metal ion during an examination step prevents the chemical incorporation of the nucleotide molecule at the discontinuity of the first strand of the double stranded nucleic acid molecule.
- the presence of the catalytic metal ion in the first reaction mixture induces chelation, and catalyzes the chemical incorporation of the nucleotide molecule in the discontinuity of the first strand of the double stranded nucleic acid molecule
- At least one unlabeled nucleotide molecule used in an examination step is a non-incorporable nucleotide.
- the first strand of the double stranded nucleic acid used during an examination step does not contain a free hydroxyl group at its 3' end.
- a method provided herein can include the addition of a polymerase inhibitor to the ternary complex, thereby preventing the chemical incorporation of the unlabeled nucleotide molecule at a discontinuity of a double stranded nucleic acid molecule during examination.
- the polymerase inhibitor is a pyrophosphate analog.
- the polymerase inhibitor is an allosteric inhibitor.
- the polymerase inhibitor is a reverse transcriptase inhibitor.
- the polymerase inhibitor is HIV-1 reverse transcriptase inhibitor.
- a double stranded nucleic acid that is detected or sequenced in a method set forth herein is immobilized to a surface.
- the surface may be a planar substrate, a microparticle, or a nanoparticle.
- the surface is an array and a plurality of double stranded nucleic acids are immobilized at separate features on the array.
- the methods set forth herein can be carried out in parallel for a plurality of target nucleic acids.
- the features can include "nanoballs" of amplified DNA fragments, bound to a surface.
- many features may be present on a surface, for example up to millions, tens of millions, or more.
- rolling circle amplification (RCA)
- Emulsion PCR on beads can also be used, for example as described in Dressman et al., Proc. Natl. Acad. Sci. USA 100:8817-8822 (2003), WO 05/010145, US Pat. App. Pub. No. 2005/0130173 A1 or US Pat. App. Pub. No. 2005/0064460 A1, each of which is incorporated herein by reference.
- a system or method of the present disclosure can use one or more of the reagents described in the above references for making and using nanoballs or other nucleic acid features.
- double stranded nucleic acids are provided as a plurality of clonally amplified double stranded nucleic acid molecules.
- the plurality of double stranded nucleic acid molecules are distinguishable from each other.
- An examination step can involve binding a polymerase to a discontinuity site of a double stranded nucleic acid in a reaction mixture comprising one or more nucleotides, and detecting or monitoring the interaction.
- a nucleotide is sequestered within the polymerase-double stranded nucleic acid complex to form a closed-complex, under conditions in which incorporation of the enclosed nucleotide by the polymerase is attenuated or inhibited.
- This closed-complex is in a stabilized or polymerase-trapped pre-chemistry conformation.
- a closed-complex allows for the incorporation of the enclosed nucleotide, but does not allow for the incorporation of a subsequent nucleotide.
- the identity of the next base is determined by detecting or monitoring the presence, formation, and/or dissociation of the ternary closed-complex.
- the identity of the next base may be determined without chemically incorporating the next correct nucleotide to the 3' end of the first strand of the double stranded nucleic acid.
- the identity of the next base is determined by detecting or monitoring the affinity of the polymerase to the first strand of the double stranded nucleic acid in the presence of added nucleotides.
- the affinity of the polymerase to the first strand of the double stranded nucleic acid in the presence of the next correct nucleotide may be used to determine the next correct base on the template strand of the double stranded nucleic acid.
- the affinity of the polymerase to a discontinuous double stranded nucleic acid in the presence of an incorrect nucleotide may be used to determine the next correct base to be incorporated on the 3' end of the discontinuity in the double stranded nucleic acid.
- An examination step may be controlled, in part, by providing reaction conditions to prevent chemical incorporation of a nucleotide, while allowing determination of the identity of the next base on the nucleic acid.
- reaction conditions may be referred to as examination reaction conditions.
- a ternary closed-complex is formed under examination conditions.
- a stabilized ternary closed-complex is formed under examination conditions.
- a stabilized ternary closed complex is in a pre-chemistry conformation.
- a stabilized ternary closed-complex is in a pre-translocation conformation, wherein the enclosed nucleotide has been incorporated, but the closed-complex does not allow for the incorporation of a subsequent nucleotide.
- the examination conditions accentuate the difference in affinity for polymerase to the first strand of the double stranded nucleic acid in the presence of different nucleotides.
- the examination conditions cause differential affinity of the polymerase to the first strand of the double stranded nucleic acid in the presence of different nucleotides.
- Examination typically includes detecting polymerase interaction with a double stranded nucleic acid and nucleotide. Detection may include optical, electrical, thermal, acoustic, chemical and mechanical means.
- examination is performed after a wash step, wherein the wash step removes any non-bound reagents from the region of observation.
- examination is performed during a wash step, such that as a further option, the dissociation kinetics of the polymerase-nucleic acid, or polymerase-nucleic acid-nucleotide complexes may be used to determine the identity of the next base.
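As an illustration of how dissociation kinetics recorded during a wash might be summarized, the following sketch estimates an apparent off-rate from a decaying signal. It assumes a single-exponential decay, S(t) = S0 * exp(-k_off * t), which is a modeling assumption introduced here and not stated in the source; the function name and the data are likewise hypothetical.

```python
import math


def estimate_koff(times, signals):
    """Estimate an apparent dissociation rate from a wash-step trace.

    Fits ln(signal) versus time by ordinary least squares and returns the
    negated slope, assuming single-exponential decay (an assumption).
    All signals must be positive.
    """
    logs = [math.log(s) for s in signals]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    return -sxy / sxx


# Hypothetical noise-free trace generated with k_off = 0.5 per second.
times = [0.0, 1.0, 2.0, 3.0]
trace = [math.exp(-0.5 * t) for t in times]
print(estimate_koff(times, trace))
```

Comparing the fitted rates obtained with different nucleotides is one hypothetical way such traces could feed into a next-base determination.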
- Multiple examination steps may be utilized in cases where multiple double stranded nucleic acids are being sequenced simultaneously in one sequencing reaction, wherein different nucleic acids react differently to the different examination reagents (e.g. in an array format).
- multiple examination steps may improve the accuracy of next base determination.
- the enclosed nucleotide is incorporated and a subsequent nucleotide incorporation is inhibited.
- the complex is stabilized or trapped in a pre-translocation conformation.
- the closed-complex can be stabilized during the examination step, allowing for controlled nucleotide incorporation.
- a stabilized closed-complex is a complex wherein incorporation of an enclosed nucleotide is attenuated, either transiently (e.g. to examine the complex and then incorporate the nucleotide) or permanently (e.g. for examination only) during an examination step.
- a stabilized closed-complex allows for the incorporation of the enclosed nucleotide, but does not allow for the incorporation of a subsequent nucleotide.
- the closed-complex is stabilized in order to detect or monitor any polymerase interaction with a double stranded nucleic acid in the presence of a nucleotide for identification of the next base in the double stranded nucleic acid.
- ternary closed-complex stabilization or ternary closed-complex release reaction conditions and/or methods may be combined.
- a polymerase inhibitor which stabilizes a ternary closed-complex may be present in the examination reaction with a catalytic ion which functions to release the closed-complex.
- the ternary closed-complex may be stabilized or released, depending on the polymerase inhibitor properties and concentration, the concentration of the catalytic metal ion, other reagents and/or conditions of the reaction mixture, and any combination thereof.
- a non-catalytic metal ion and a catalytic metal ion are both present in the reaction mixture, wherein one ion is present in a higher effective concentration than the other.
- the affinity of the polymerase e.g., Klenow fragment of E. coli DNA polymerase I, Bst
- each dNTPs e.g., dATP, dTTP, dCTP, dGTP
- examination of the ternary complex can involve measuring the binding affinities of polymerase-double stranded nucleic acids to dNTPs; wherein binding affinity is indicative of the next base in the template strand of the double stranded nucleic acid.
- a wash step can be used to remove the non-catalytic ion and unbound nucleotides, and a catalytic metal ion can then be added to the reaction to induce PPi cleavage and nucleotide incorporation.
- the reaction may be repeated until a desired read-length is obtained.
- a closed complex may be formed and/or stabilized by the addition of a polymerase inhibitor to the examination reaction mixture.
- Inhibitor molecules phosphonoacetate (phosphonoacetic acid), phosphonoformate (phosphonoformic acid, common name Foscarnet), Suramin, Aminoglycosides, INDOPY-1 and Tagetitoxin are non-limiting examples of uncompetitive or noncompetitive inhibitors of polymerase activity that can be used.
- the binding of the inhibitor molecule near the active site of the enzyme traps the polymerase in either a pre-translocation or post-translocation step of the nucleotide incorporation cycle, stabilizing the polymerase in its ternary closed-complex conformation before or after the incorporation of a nucleotide, and forcing the polymerase to be bound to the double stranded nucleic acid until the inhibitor molecules are no longer available in the reaction mixture due to removal, dilution or chelation.
- Another useful inhibitor molecule is the drug Efavirenz, which acts as an uncompetitive inhibitor of the HIV-1 reverse transcriptase.
- Other useful uncompetitive inhibitors include, for example, pyrophosphate analogs such as Foscarnet (phosphonoformate), phosphonoacetate or other pyrophosphate analogs.
- polymerase inhibitors found to be effective in inhibiting an HIV-1 reverse transcriptase polymerase can be employed to stabilize a ternary closed-complex.
- Useful HIV-1 reverse transcriptase inhibitors include, for example, nucleoside/nucleotide reverse transcriptase inhibitors (NRTI) and non-nucleoside reverse transcriptase inhibitors (NNRTI). Exemplary conditions for delivery and use of polymerase inhibitors in a sequencing by binding reaction are set forth in US patent application no. 14/805,381, which is incorporated herein by reference.
- the stabilization of a closed-complex using polymerase inhibitors is combined with additional reaction conditions which also function to stabilize a closed-complex, including, but not limited to, sequestering, removing, reducing, omitting, and/or chelating a catalytic metal ion, the presence of a modified polymerase in the closed-complex, a non-incorporable nucleotide in the closed-complex, and any combination thereof.
- a polymerase can be stabilized in ternary complex form via cross-linking residues of the polymerase.
- the cross-linking methods can be reversible or non-reversible. Exemplary methods for making and using engineered polymerases and/or crosslinking chemistry in a sequencing by binding method are set forth in US patent application no. 14/805,381, which is incorporated herein by reference.
- a closed-complex of an examination step comprises a nucleotide analog or modified nucleotide to facilitate stabilization of the closed-complex.
- a nucleotide analog comprises a nitrogenous base, five-carbon sugar, and phosphate group; wherein any component of the nucleotide may be modified and/or replaced.
- Nucleotide analogs may be non-incorporable nucleotides. Non-incorporable nucleotides may be modified to become incorporable at any point during the sequencing method. Exemplary methods for making and using non-incorporable nucleotide analogs in a sequencing by binding method are set forth in US patent application no. 14/805,381, which is incorporated herein by reference. Other useful analogs are set forth in U.S. Patent No. 8,071,755 which is incorporated herein by reference.
- the examination step comprises formation and/or stabilization of a ternary closed-complex comprising a polymerase, double stranded nucleic acid, and nucleotide. Characteristics of the formation, presence and/or release of the closed-complex are detected or monitored to identify the enclosed nucleotide and therefore the next base in the template strand of the double stranded nucleic acid.
- ternary closed-complex characteristics are dependent on the sequencing reaction components (e.g., polymerase, double stranded nucleic acid, nucleotide) and/or reaction mixture components and/or conditions.
- the examination step involves detecting or monitoring the interaction of a polymerase with a double stranded nucleic acid and nucleotide.
- the formation of a ternary closed-complex may be detected or monitored.
- the absence of formation of a ternary closed-complex is detected or monitored.
- the dissociation of a ternary closed-complex is detected or monitored.
- the incorporation step involves detecting or monitoring incorporation of a nucleotide. Alternatively, detection or monitoring can be carried out prior to
- no component of the sequencing reaction is detectably labeled.
- a method that detects unlabeled molecular complexes, such as the surface plasmon resonance techniques set forth below, can be used.
- detecting or monitoring the variation in affinity of a polymerase to a double stranded nucleic acid in the presence of correct and incorrect nucleotides may be used to determine the sequence of the nucleic acid.
- the affinity of a polymerase to a double stranded nucleic acid in the presence of different nucleotides, including modified or labeled nucleotides can be monitored as the on-rate or off-rate of the polymerase-nucleic acid interaction in the presence of the various nucleotides.
- Exemplary methods for detecting kinetics of nucleotide binding in a sequencing by binding method are set forth in US patent application no. 14/805,381, which is incorporated herein by reference.
- the interaction between the polymerase and the nucleic acid is detected or monitored via a detectable label attached to the polymerase.
- the label may be detected by methods including, but not limited to, optical, electrical, thermal, mass, size, charge, vibration, and pressure.
- the label may be magnetic, fluorescent or charged.
- fluorescence anisotropy may be used to determine the stable binding of a polymerase to a nucleic acid in a closed-complex.
- Exemplary polymerase labels and techniques for their detection in a sequencing by binding method are set forth in US patent application no. 14/805,381, which is incorporated herein by reference.
- a conformationally sensitive dye may be attached close to the active site of a polymerase, wherein a change in conformation, or a change in polar environment due to the formation of a ternary closed-complex is reflected as a change in fluorescence or absorbance properties of the dye.
- polymerases that form a ternary complex with a double stranded nucleic acid and nucleotide can be distinguished from polymerases that do not form a ternary complex based on differences in fluorescence or absorbance signals from a conformationally sensitive dye.
- the identity of the next correct nucleotide for a double stranded nucleic acid can be determined from the signals obtained from a
- one or more nucleotides are attached to a label that is detected in an examination step of a method set forth herein.
- Different labels on different types of nucleotides may be distinguishable by means of their differences in fluorescence, Raman spectrum, charge, mass, refractive index, luminescence, length, or any other measurable property.
- the labeled nucleotides may be enclosed in a closed-complex with the polymerase and the double stranded nucleic acid, and the identity of the nucleotide identified from the attached label.
- Exemplary nucleotide labels and techniques for their detection in a sequencing by binding method are set forth in US patent application no. 14/805,381, which is incorporated herein by reference.
- the polymerase dissociates from the polymerization initiation site after nucleotide incorporation.
- the polymerase is retained at the polymerization initiation site after incorporation.
- the polymerase may be trapped at the 3' end of the first strand of the double stranded nucleic acid after the incorporation reaction in the pre-translocation state, post-translocation state, an intermediate state thereof, or a binary complex state.
- the incorporation reaction may be facilitated by an incorporation reaction mixture.
- the incorporation reaction mixture comprises a different composition of nucleotides than the examination reaction.
- the examination reaction comprises one type of nucleotide and the incorporation reaction comprises another type of nucleotide.
- the examination reaction comprises one type of nucleotide and the incorporation reaction comprises four types of nucleotides, or vice versa.
- the examination reaction mixture is altered or replaced by the incorporation reaction mixture.
- Nucleotides present in an incorporation reaction mixture which are not sequestered in a ternary closed-complex may cause multiple nucleotide insertions.
- a wash step is optionally employed prior to the chemical incorporation step to ensure only the nucleotide sequestered within a trapped ternary closed-complex is available for incorporation during the incorporation step.
- the trapped polymerase complex may be a ternary closed-complex, a stabilized closed-complex or another complex involving the polymerase, double stranded nucleic acid and next correct nucleotide.
- the nucleotide enclosed within the ternary closed-complex of the examination step is incorporated during the examination step, but the ternary closed-complex does not allow for the incorporation of a subsequent nucleotide; optionally, the ternary closed-complex is released during an incorporation step, allowing for another nucleotide (of the same or different type) to become incorporated.
- the incorporation step comprises replacing a nucleotide from the examination step (e.g., the nucleotide is an incorrect nucleotide) and incorporating another nucleotide to the 3' end of the discontinuity of the double stranded nucleic acid.
- the incorporation step comprises releasing a nucleotide from within a ternary closed-complex (e.g., the nucleotide is a modified nucleotide or nucleotide analog) and incorporating a nucleotide of a different kind to the 3' end of the discontinuity of the double stranded nucleic acid.
- the released nucleotide is removed and replaced with an incorporation reaction mixture comprising a next correct nucleotide.
- Suitable reaction conditions for incorporation may involve replacing an examination reaction mixture with an incorporation reaction mixture.
- nucleotides present in the examination reaction mixture are replaced with one or more nucleotides in the incorporation reaction mixture.
- the polymerase present during the examination step is replaced during the incorporation step.
- the polymerase present during the examination step is modified during the incorporation step, for example, being modified from a stabilized ternary complex form in the examination step to a destabilized ternary complex form in the incorporation step.
- the one or more nucleotides present during the examination step are modified during the incorporation step.
- the reaction mixture and/or reaction conditions present during the examination step may be altered by any means during the incorporation step.
- These means include, but are not limited to, removing reagents that stabilized a ternary complex, chelating reagents that stabilized a ternary complex, diluting reagents that stabilized a ternary complex, adding reagents that destabilize a ternary complex, altering reaction conditions such as conductivity or pH to destabilize a ternary complex, and any combination thereof.
- the reagents in the reaction mixture, including any combination of polymerase, double stranded nucleic acid, and nucleotide, may each be modified during the examination step and/or incorporation step.
- An advantage of some embodiments of the methods set forth herein is that the next correct base is detected before the incorporation step, allowing the incorporation step to not require labeled reagents and/or monitoring.
- a nucleotide does not contain an attached detectable label.
- the examination step of the sequencing reaction may be repeated 1, 2, 3, 4 or more times prior to the incorporation step.
- the examination and incorporation steps may be repeated until the desired sequence of the double stranded nucleic acid is obtained.
- Performing the polymerase-based methods of the present disclosure at a discontinuity in a double stranded nucleic acid enhances sequencing accuracy for nucleic acid regions comprising homopolymer repeats because the length of the extension, which would be unknown or ambiguous for a homopolymer of unknown length, can be determined from a predefined gap size in the nucleic acid being sequenced.
- the block to extension is reversible.
- an exonuclease enzyme or the exonuclease activity of a polymerase can be used to translate the 5' end of the discontinuity downstream for subsequent cycles of sequencing.
- the present disclosure provides a method for identifying nucleotides in a nucleic acid that includes the steps of (a) providing a double stranded nucleic acid having a template strand and a second strand having a nick, wherein the nick includes a 3' hydroxyl adjacent to a position of the second strand that pairs with a first position in the template strand; (b) forming a stabilized ternary complex including a polymerase, the double stranded nucleic acid and a cognate nucleotide that binds at the position of the second strand that pairs with the first position of the template strand, wherein the stabilized ternary complex is formed under strand displacing conditions; (c) examining the stabilized ternary complex, thereby acquiring a signal for identifying the nucleotide at the first position of the template strand; (d) after the examining of the stabilized ternary complex, (i) contacting the double stranded nucleic acid with an
- the steps (b) through (d) can be repeated, thereby determining a nucleotide sequence of the template strand from the acquired signals.
- Also provided is a method for identifying nucleotides in a nucleic acid that includes the steps of (a) providing a double stranded nucleic acid having a template strand and a second strand having a gap, wherein the gap includes a 3' hydroxyl adjacent to a position of the second strand that pairs with a first position in the template strand; (b) forming a stabilized ternary complex including a polymerase, the double stranded nucleic acid and a cognate nucleotide that binds at the position of the second strand that pairs with the first position of the template strand, wherein the stabilized ternary complex is formed under non-strand displacing conditions; (c) examining the stabilized ternary complex, thereby acquiring a signal for identifying the nucleotide at the first position of the template strand; (d) after the examining of the stabilized ternary complex, covalently adding a nucleotide to the 3' hydroxyl at the
- the method can further include steps of (e) forming a stabilized ternary complex including a polymerase, the double stranded nucleic acid and a cognate nucleotide that binds at the position of the second strand that pairs with the first position of the template strand, wherein the stabilized ternary complex is formed under strand displacing conditions; (f) examining the stabilized ternary complex, thereby acquiring a signal for identifying the nucleotide at the first position of the template strand; (g) after the examining of the stabilized ternary complex, (i) contacting the double stranded nucleic acid with a nuclease to convert the nick to a gap, and (ii) covalently adding a nucleotide to the 3' hydroxyl at the position of the second strand that pairs with the first position of the template strand, thereby translating the nick downstream and identifying the nucleotide at the first position of the template strand from the acquired signals.
- the steps (e) through (g) can be repeated, thereby determining a nucleotide sequence of the template strand from the acquired signals.
- a single nucleotide is covalently added to the 3' hydroxyl in step (d)(ii) by contacting the double stranded nucleic acid with a polymerase, endonuclease and nucleotide that binds at the position that pairs with the first position.
- the polymerase and endonuclease are simultaneously present during the reaction.
- the endonuclease is a flap endonuclease such as FEN1.
- the polymerase can be Polymerase beta or polymerase delta.
- Exemplary incorporation conditions for adding a single nucleotide at a nick using FEN1 (flap endonuclease 1) and polymerase beta are set forth in Liu et al, J. Biol. Chem. 280:3665-3674 (2005), which is incorporated herein by reference.
- a method of the present disclosure can also employ polymerase delta3 in combination with FEN1 to add a single nucleotide at a nick during an incorporation step, for example, using conditions set forth in Lin et al, DNA Repair (Amst) 2013 November, 12(11) (doi: 10.1016/j.dnarep.2013.08.008), which is incorporated herein by reference.
- a nucleic acid detection or sequencing method of the present disclosure involves a plurality of double stranded nucleic acids, polymerases and/or nucleotides, wherein a plurality of ternary closed-complexes is detected or monitored.
- double stranded nucleic acids, or other reaction components can be attached to features of an array such as those exemplified previously herein.
- Clonally amplified double stranded nucleic acids may be sequenced together wherein the clones of a common template are localized in close proximity to allow for enhanced detecting or monitoring of the population of clones.
- the formation of a ternary closed-complex ensures the synchronicity of base extension across a plurality of clonally amplified double stranded nucleic acids at one or more features of an array.
- the synchronicity of base extension allows for the addition of only one base per sequencing cycle.
- the methods of the present disclosure can be performed on a platform, where any component of the nucleic acid polymerization reaction is localized to a surface.
- the double stranded nucleic acid is attached to a planar substrate, a microparticle, or a nanoparticle.
- Such surfaces can be features of an array, for example, in a multiplex embodiment.
- all reaction components are freely suspended in the reaction mixture.
- the reaction conditions may be reset, recharged, or modified as appropriate, in preparation for the optional incorporation step or an additional examination step.
- all components of the examination step, excluding the double stranded nucleic acid being sequenced, are removed or washed away, returning the system to the pre-examination condition.
- partial components of the examination step are removed.
- additional components are added to the examination step.
- this document discloses apparatus and methods for base-cycling, wherein the identity of a sequence of bases on a double stranded nucleic acid is determined by manipulating reagents using methods such as fluidic pumping, electrical manipulation, magnetic manipulation, mechanical manipulation, thermal manipulation and optical manipulation.
- one or more types of nucleotides are sequentially added to and removed from the sequencing reaction.
- 1, 2, 3, 4, or more types of nucleotides are added to and removed from the reaction mixture.
- one type of nucleotide is added to the sequencing reaction, removed, and replaced by another type of nucleotide.
- a nucleotide type present during the examination step is different from a nucleotide type present during the incorporation step.
- a nucleotide type present during one examination step is different from a nucleotide type present during a sequential examination step (i.e. the sequential examination step is performed prior to an incorporation step).
- nucleotides are added one type at a time, with the polymerase, to a reaction condition which favors ternary closed-complex formation.
- the polymerase binds only to the double stranded nucleic acid if the next correct nucleotide is present.
- a wash step after every nucleotide addition removes excess polymerases and nucleotides not involved in a ternary closed-complex. If the nucleotides are added one at a time, in a known order, the next base on the template strand of the double stranded nucleic acid is determined by the formation of a ternary closed-complex when the added nucleotide is the next correct nucleotide.
- nucleotides e.g. dATP, dGTP, dCTP, dTTP
- dATP dGTP
- dCTP dCTP
- dTTP dTTP
- nucleotides types e.g., dATP, dTTP, dCTP, dGTP
- nucleotide types are tethered to 1, 2, 3, 4, or more different polymerases; wherein each nucleotide type is tethered to a different polymerase and each polymerase has a different label or a feature from the other polymerases to enable its identification.
- all tethered nucleotide types are added together to a sequencing reaction mixture forming a ternary closed-complex comprising a tethered nucleotide-polymerase; the ternary closed-complex is monitored to identify the polymerase, thereby identifying the next correct nucleotide to which the polymerase is tethered.
- the tethering may occur at the gamma phosphate of the nucleotide through a multi-phosphate group and a linker molecule.
- Such gamma-phosphate linking methods are standard in the art, where a fluorophore is attached to the gamma phosphate linker.
- the labels include, but are not limited to, optical, electrical, thermal, colorimetric, mass, or any other detectable feature.
- different nucleotide types are identified by distinguishable labels.
- the distinguishable labels are attached to the gamma phosphate position of each nucleotide.
- the sequencing reaction mixture comprises a catalytic metal ion.
- the catalytic metal ion is available to react with a polymerase at any point in the sequencing reaction in a transient manner.
- the catalytic metal ion is available for a brief period of time, allowing for a single nucleotide complementary to the next base in the template strand of the double stranded nucleic acid to be incorporated to the 3' end of the primer during an incorporation step.
- no other nucleotides, for example, the nucleotides complementary to the bases downstream of the next base in the template strand of the double stranded nucleic acid, are incorporated.
- the catalytic metal ion catalyzes the incorporation of the closed-complex next correct nucleotide, and as the catalytic metal ion is released from the active site, it is sequestered by a second chelating or caging agent, disabling the metal ion from catalyzing a subsequent incorporation.
- the localized release of the catalytic metal ion from its chelating or caged complex is ensured by using a localized uncaging or un-chelating scheme, such as an evanescent wave illumination or a structured illumination. Controlled release of the catalytic metal ions may occur, for example, by thermal means.
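The one-type-at-a-time examination described in the list above (nucleotides delivered in a known order, with a wash after each delivery, and the next base called when a ternary closed-complex forms) can be sketched as a toy model. The function name, the flow order, and the base-pairing lookup are illustrative assumptions, not part of the disclosure, and the wash step is noted but not modeled.

```python
# Illustrative toy model; names and flow order are assumptions, not the disclosed protocol.
PAIRS_WITH = {"A": "T", "T": "A", "C": "G", "G": "C"}  # template base -> cognate base
FLOW_ORDER = ["dATP", "dGTP", "dCTP", "dTTP"]  # assumed known delivery order


def examine_by_flows(template_base, flow_order=FLOW_ORDER):
    """Deliver one nucleotide type at a time and stop at the first binding signal.

    Returns the nucleotide that formed a ternary closed-complex and the number
    of deliveries that were needed to find it.
    """
    for flows_used, nucleotide in enumerate(flow_order, start=1):
        base = nucleotide[1]  # "dATP" -> "A"
        complex_formed = PAIRS_WITH[template_base] == base
        # A wash after each delivery would remove unbound polymerase and
        # nucleotide; that step is not modeled here.
        if complex_formed:
            return nucleotide, flows_used
    return None, len(flow_order)


if __name__ == "__main__":
    # Template base C pairs with G, so the second delivery (dGTP) gives the signal.
    print(examine_by_flows("C"))
```

In this sketch the identity of the next template base follows from which delivery produced the signal, so no labeled nucleotide is required.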
Landscapes
- Life Sciences & Earth Sciences (AREA)
- Chemical & Material Sciences (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Organic Chemistry (AREA)
- Zoology (AREA)
- Wood Science & Technology (AREA)
- Health & Medical Sciences (AREA)
- Engineering & Computer Science (AREA)
- Microbiology (AREA)
- Immunology (AREA)
- Biotechnology (AREA)
- Molecular Biology (AREA)
- Biophysics (AREA)
- Analytical Chemistry (AREA)
- Physics & Mathematics (AREA)
- Biochemistry (AREA)
- Bioinformatics & Cheminformatics (AREA)
- General Engineering & Computer Science (AREA)
- General Health & Medical Sciences (AREA)
- Genetics & Genomics (AREA)
- Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
Abstract
A method for sequencing a nucleic acid including (a) providing a double stranded nucleic acid having a template strand and a second strand having a discontinuity, wherein the discontinuity includes a 3' hydroxyl at a nick or gap, the 3' hydroxyl being adjacent to a position of the second strand that pairs with a first position in the template strand; (b) forming a stabilized ternary complex including a polymerase, the double stranded nucleic acid and a cognate nucleotide that binds at the position of the second strand that pairs with a first position in the template strand; (c) examining the stabilized ternary complex, thereby acquiring a signal for identifying the nucleotide at the first position of the template strand; (d) covalently adding a nucleotide to the 3' hydroxyl, thereby translating the discontinuity downstream; and (e) repeating steps (b) through (d).
Description
SEQUENCING METHODS FOR DOUBLE STRANDED NUCLEIC ACIDS
BACKGROUND
The determination of nucleic acid sequence information is an important part of biological and medical research. The sequence information may be helpful for identifying gene associations with diseases and phenotypes, identifying potential drug targets, and understanding the mechanisms of disease development. Sequence information is an important part of personalized medicine, where it can be used for the diagnosis, treatment, or prevention of disease in a subject.
BRIEF SUMMARY
Provided herein is a method for sequencing a nucleic acid. The method can include steps of (a) providing a double stranded nucleic acid having a template strand and a second strand having a discontinuity, wherein the discontinuity includes a 3' hydroxyl at a nick or gap, the 3' hydroxyl being adjacent to a position of the second strand that pairs with a first position in the template strand; (b) forming a stabilized ternary complex including a polymerase, the double stranded nucleic acid and a cognate nucleotide that binds at the position of the second strand that pairs with the first position of the template strand; (c) examining the stabilized ternary complex, thereby acquiring a signal for identifying the nucleotide at the first position of the template strand; (d) after the examining of the stabilized ternary complex, covalently adding a nucleotide to the 3' hydroxyl at the position of the second strand that pairs with the first position of the template strand, thereby translating the discontinuity downstream; and (e) repeating steps (b) through (d), thereby determining a nucleotide sequence of the template strand from the acquired signals.
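Steps (b) through (e) of the method above can be illustrated with a minimal simulation. This is a sketch under stated assumptions: the template is written as a string indexed in the direction of polymerase travel, the examination signal is idealized as always reporting the cognate nucleotide, and the names `run_cycles` and `PAIRS_WITH` are hypothetical.

```python
# Toy simulation of the examine / incorporate / translate cycle; all names are illustrative.
PAIRS_WITH = {"A": "T", "T": "A", "C": "G", "G": "C"}  # template base -> cognate base


def run_cycles(template, nick_position, cycles):
    """Simulate repeated cycles starting at the template position facing the 3' hydroxyl.

    `template` is indexed in the direction of polymerase travel (an assumption
    made for readability). Returns the template bases inferred from the
    acquired signals and the final position of the discontinuity.
    """
    signals = []
    position = nick_position
    for _ in range(cycles):
        if position >= len(template):
            break  # no template left to examine
        # Steps (b) and (c): the cognate nucleotide forms the stabilized ternary
        # complex, and examining it yields a signal that names that nucleotide.
        cognate = PAIRS_WITH[template[position]]
        signals.append(cognate)
        # Step (d): one nucleotide is added to the 3' hydroxyl, so the
        # discontinuity moves one position downstream.
        position += 1
    # Step (e): the template sequence follows from the acquired signals.
    template_read = "".join(PAIRS_WITH[nucleotide] for nucleotide in signals)
    return template_read, position


if __name__ == "__main__":
    print(run_cycles("GATTACA", nick_position=2, cycles=3))
```

Each pass through the loop advances the discontinuity by exactly one position, which is why the two adjacent T bases in the example are reported as two separate calls.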
Also provided is a method for sequencing a double stranded nucleic acid molecule, the method comprising (a) contacting a first strand of the double stranded nucleic acid molecule comprising a nick site with a base-removing enzyme to create a base gap in the first strand of the double stranded nucleic acid molecule at the nick site; (b) contacting the double stranded nucleic acid molecule comprising the base gap in the first strand with a first reaction mixture comprising a polymerase and at least one unlabeled nucleotide molecule; (c) detecting or monitoring the interaction of the polymerase with the double stranded nucleic acid molecule in presence of the unlabeled nucleotide molecule without chemical incorporation of the nucleotide molecule into the base gap of the first strand of the double stranded nucleic acid molecule; and (d) identifying the nucleotide added to the base gap of the first strand of the double stranded nucleic acid molecule based on the detected or monitored interaction of the polymerase with the double stranded nucleic acid molecule.
DETAILED DESCRIPTION
Typically, sequencing reactions involve the formation of a complex between a primer-template nucleic acid hybrid, polymerase and nucleotide triphosphate that results in polymerase catalyzed nucleotidyl transfer of the nucleotide to the primer. Such methods are known as sequencing by synthesis (SBS) and are carried out in a cyclic process whereby nucleotides are sequentially added to the growing primer. Typically, the nucleotides are modified with identifying tags so that the base type of the incorporated nucleotide can be detected as synthesis proceeds. In real-time embodiments of SBS, detection of a physical signal produced during incorporation of nucleotides may be measured concurrently with the reaction. Real-time procedures often suffer from inaccurate reads of regions containing highly repetitive sequences and homopolymeric regions. In an alternative embodiment, SBS proceeds via cycles of fluid delivery and detection (e.g. via a series of stop and proceed steps), wherein controlled reaction conditions and/or reagents reversibly stop and start the reaction at a given time during synthesis. For example, each cycle can be carried out using nucleotide(s) having a reversible terminator moiety. This moiety prevents more than one nucleotide from being added to the end of a growing primer until the terminator is removed or modified by a deblocking treatment. This in turn allows each nucleotide addition to be detected at a discrete time during the reaction cycle such that repeat regions and homopolymers can be readily resolved. However, reversibly terminated nucleotides can be costly to manufacture, disruptive to polymerase activity due to their non-natural composition and susceptible to sequencing artifacts when termination is incomplete or when reversion to extendible form is incomplete.
An alternative sequencing technique is sequencing by binding (SBB). The SBB reaction involves examination of a ternary complex that forms between a primer-template nucleic acid hybrid, polymerase and nucleotide triphosphate. Unlike SBS, the examination step of SBB acquires signal that is used to determine nucleic acid base identity without the need for nucleotide incorporation. The examination step may be controlled so that nucleotide incorporation is attenuated or inhibited. In such embodiments, a separate incorporation step may be performed, for example to complete a sequencing cycle and return the extended primer to a state that is ready for a subsequent cycle of examination and incorporation. The separate incorporation step may be accomplished without the need for detecting or monitoring, as the base has been identified during the previous examination step. The nucleotides used during the incorporation step can have reversible terminator moieties in order to control the length of extension between examination steps. However, many of the disadvantages of using reversibly terminated nucleotides in SBS are also undesirable for SBB.
The present disclosure provides a method whereby strand extension can be controlled such that a predefined number of nucleotides are incorporated to the 3' end of the strand. Such control is advantageous when used in the context of an SBB technique, for example, to extend a strand by a predefined number of nucleotides per reaction cycle. This in turn can provide improved resolution of individual nucleotides detected during each examination step. For example, each cycle can be correlated with a defined number of nucleotides in a repeat region, homopolymer region or other difficult sequence region. In particular embodiments, reaction conditions can be employed that result in only a single base extension per SBB cycle. For example, control of the number of nucleotides incorporated per cycle can be provided by a nucleic acid substrate having a structure that prevents extension. More specifically, an examination step can be carried out on a double stranded nucleic acid having a discontinuity in one of the strands, such as a nick or gap. Once signal has been acquired for determining the identity of the nucleotide at the nick or gap, an incorporation step can then be carried out under conditions where the incorporating polymerase is incapable of displacing the strand that is downstream of the newly added nucleotide. Thus, the present disclosure provides methods of single-base extension that do not require nucleotides with reversible terminators.
Blocking of primer extension via a downstream double stranded region can be reversed to allow subsequent examination and incorporation cycles. Thus, a sequencing method that employs a discontinuity in a nucleic acid substrate to control primer extension can be repeated in a cyclical fashion by carrying out one or more steps to translate the discontinuity downstream after each nucleotide is detected. For example, a nick or gap can be translated downstream using a polymerase having a 5' to 3' exonuclease activity or an exonuclease such as a flap endonuclease (e.g. FEN1), as described in further detail below.
In particular embodiments, discontinuity translation is achieved using a combination of polymerase beta and FEN1 or a combination of polymerase delta and FEN1.
A variety of complications that arise in sequencing methods can be overcome using methods described herein whereby a polymerase acts on a double stranded sequencing template that includes a gap, nick or other discontinuity. The discontinuity effectively provides a primer having a 3' end where appropriate polymerase and nucleotide binding can occur and, optionally, nucleotide incorporation can occur. The discontinuous, double stranded sequencing template is a simple structure that many DNA polymerases utilize naturally, and it presents no difficulties in obtaining legitimate binding and extension complexes.
High-throughput, cost-effective nucleic acid sequencing has potential to usher in a new era of research and personalized medicine. Several commercial sequencing platforms are available, and although they sequence on the scale of entire genomes, they are still prohibitively expensive for mass-market genetic analysis. If sequencing costs are significantly reduced, it will be possible to analyze genetic variation in detail between a larger number of species and individuals, providing a basis for personalized medicine, as well as for identifying links between genotypes and phenotypes. In addition to lower reagent and labor costs, goals for sequencing technologies include expanding throughput and improving accuracy.
The present disclosure provides compositions, methods and systems for sequencing a double stranded nucleic acid using a polymerase based sequencing by binding reaction. The present disclosure will employ, unless otherwise indicated, conventional molecular biology and bio-sensor techniques, which are within the skill of the art. The present disclosure incorporates by reference the entirety of the disclosure of US patent application no. 14/805,381.
Terms used herein will be understood to take on their ordinary meaning in the relevant art unless specified otherwise. Several terms used herein and their meanings are set forth below.
As referred to herein, the term "blocking moiety," when used to reference a nucleotide, is a part of the nucleotide that inhibits or prevents the 3' position of the nucleotide from forming a covalent linkage to a second nucleotide during the incorporation step of a nucleic acid polymerization reaction. A "reversible" blocking moiety can be removed from the nucleotide or modified, allowing for nucleotide incorporation.
As used herein, the term "catalytic metal ion" refers to a metal ion that facilitates phosphodiester bond formation between the 3'-OH of a nucleic acid (e.g., a primer) and the phosphate of an incoming nucleotide by a polymerase. A "divalent catalytic metal cation" is a catalytic metal ion having a valence of two. Catalytic metal ions can be present at the concentrations necessary to stabilize formation of a complex between a polymerase, a nucleotide, and a primed template nucleic acid; such concentrations are referred to as non-catalytic concentrations of a metal ion. Catalytic concentrations of a metal ion refer to the amount of a metal ion sufficient for polymerases to catalyze the reaction between the 3'-OH group of a nucleic acid (e.g., a primer) and the phosphate group of an incoming nucleotide.
As used herein, the term "closed complex" can be used to refer to a ternary closed complex between a first strand of a double stranded nucleic acid molecule, a polymerase, and a non-covalently bound nucleotide, wherein the nucleotide is complementary to a next template base on the template strand of the double stranded nucleic acid. Preferably, the nucleotide will be unlabeled. However, a ternary complex can be formed using a labeled nucleotide, if desired. In particular conditions set forth herein, the formation of a ternary closed-complex is favored over the formation of a binary complex between the first strand of the double stranded nucleic acid and the polymerase.
The terms "cycle" or "round," when used in reference to a sequencing run, refer to the portion of a sequencing run that is repeated to indicate the presence of a nucleotide. Typically, a cycle or round includes several steps such as steps for delivery of reagents, washing away unreacted reagents and detection of signals indicative of changes occurring in response to added reagents. For example, a sequencing by binding cycle can include steps of forming a ternary complex between polymerase, nucleotide and double stranded nucleic acid, examining the ternary complex, and extending the double stranded nucleic acid by addition of one or more nucleotides. In particular embodiments, the SBB cycle can further include a wash step (e.g. to remove unbound components prior to the examining step) or a discontinuity translation step.
As used herein, the term "discontinuity," when used in reference to a nucleic acid strand, refers to a break in the covalent connectivity of the sugar phosphate backbone of the strand. A discontinuity will typically be flanked by an upstream nucleotide having a free 3' hydroxyl moiety and flanked by a downstream nucleotide to which it is not covalently attached. The downstream nucleotide can optionally have a free 5' phosphate or free 5' hydroxyl. An exemplary discontinuity is a "nick" which is characterized by absence of a phosphodiester bond between adjacent nucleotides of one strand in a double stranded nucleic acid molecule. Another exemplary discontinuity is a "gap" which is characterized by absence of one or more nucleotides within a strand in a double stranded nucleic acid. Due to the absence of nucleotide(s), a gap is further characterized by the upstream 3' hydroxyl not being adjacent to the downstream free phosphate.
As used herein, the term "double stranded nucleic acid" can be used to refer to a polynucleotide that may include double stranded DNA, RNA, or any combination thereof that can be acted upon by a polymerizing enzyme during nucleic acid synthesis. RNA may include coding RNA, e.g. a messenger RNA (mRNA). Optionally, the RNA is a non-coding RNA. Optionally, the non-coding RNA is a transfer RNA (tRNA), ribosomal RNA (rRNA), snoRNA, microRNA, siRNA, snRNA, exRNA, piRNA or long ncRNA. A double stranded nucleic acid may possess a nick or a gap. A nucleic acid may represent a single, plural or clonally amplified population of nucleic acid molecules. A first strand of a double stranded nucleic acid can provide a template that is to be detected in a method set forth herein, for example, via a sequencing reaction. According to convention, a polymerase is understood to move along a template strand in the 3' to 5' direction, for example during a sequencing reaction. The second strand of the double stranded nucleic acid runs antiparallel to the template strand in the 5' to 3' direction. As such, the polymerase interacts with the 3' end of the second strand. Optionally, the second strand is discontinuous such that a 3' end occurs at a nick or gap in the strand, where free nucleotides may bind to form a ternary complex with polymerase and, optionally, be incorporated by a polymerase. The next template nucleotide pairs with the position that is immediately downstream of the 3' end of the second strand. The next template base determines the incoming nucleotide type to be incorporated as the next correct nucleotide at the position that is immediately downstream of the 3' end of the second strand.
As used herein, the term "exogenous," when used in reference to a moiety of a molecule, means a chemical moiety that is not present in a natural analog of the molecule. For example, an exogenous label of a nucleotide is a label that is not present on a naturally occurring nucleotide. Similarly, an exogenous label that is present on a polymerase is not found on the polymerase in its native milieu. While a native nucleotide or polymerase may have a characteristic limited fluorescence profile, the native nucleotide or polymerase does not include an exogenous label moiety. Conversely, a dATP (2'-deoxyadenosine-5'-triphosphate) molecule modified to include a chemical linker and fluorescent moiety attached to the gamma phosphate would be said to include an exogenous label because the attached chemical components are not ordinarily a part of the nucleotide.
As used herein, the term "extension," when used in reference to a nucleic acid, refers to a process of adding at least one nucleotide to the 3' end of the nucleic acid. A nucleotide that is added to a nucleic acid by extension is said to be incorporated into the nucleic acid. Accordingly, the term "incorporating" can be used to refer to the process of joining a nucleotide to the 3' end of a nucleic acid by formation of a phosphodiester bond.
As used herein, the term "next correct nucleotide" refers to the nucleotide type that will bind and/or incorporate at the 3' end of a primer to complement a base in a template strand to which the primer is hybridized. The base in the template strand is referred to as the "next template nucleotide" and is immediately 5' of the base in the template that is hybridized to the 3' end of the primer. The next correct nucleotide can be referred to as the "cognate" of the next template nucleotide and vice versa. Cognate nucleotides that interact with each other in a ternary complex or in a double stranded nucleic acid are said to "pair" with each other. A nucleotide having a base that is not complementary to the next template base is referred to as an "incorrect" (or "non-cognate") nucleotide.
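The definition above can be illustrated with a minimal sketch. It assumes DNA bases only and an idealized representation in which the template is written 3' to 5' so that position i of the template pairs with position i of the primer written 5' to 3'. The function names and the example sequences are hypothetical and are not part of the disclosure.

```python
from typing import Tuple

# Watson-Crick complements for the four DNA bases (illustrative assumption:
# no modified or RNA bases are considered).
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def next_correct_nucleotide(template_3to5: str, primer_5to3: str) -> Tuple[str, str]:
    """Return (next template nucleotide, next correct nucleotide).

    The next template nucleotide is the template base immediately 5' of the
    base that is hybridized to the 3' end of the primer. With the template
    written 3'->5', that is simply the base at index len(primer).
    """
    position = len(primer_5to3)
    if position >= len(template_3to5):
        raise ValueError("the primer already covers the whole template")
    next_template_base = template_3to5[position]
    return next_template_base, COMPLEMENT[next_template_base]


def is_cognate(nucleotide: str, template_3to5: str, primer_5to3: str) -> bool:
    """True if the nucleotide is the cognate of the next template nucleotide."""
    _, cognate = next_correct_nucleotide(template_3to5, primer_5to3)
    return nucleotide == cognate
```

For a hypothetical template 3'-TACGG-5' with primer 5'-AT-3', the next template nucleotide is C and the next correct (cognate) nucleotide is G; any other nucleotide type is "incorrect" in the sense defined above.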
As used herein, the term "non-catalytic metal ion" refers to a cation that, when in the presence of a polymerase enzyme, does not facilitate phosphodiester bond formation needed for chemical incorporation of a nucleotide into a primer. A non-catalytic metal ion may interact with a polymerase, for example, via competitive binding compared to catalytic metal ions. A "divalent non-catalytic metal ion" is a non-catalytic metal ion having a valence of two. Examples of divalent non-catalytic metal ions include, but are not limited to, Ca2+, Zn2+, Co2+, Ni2+, and Sr2+. The trivalent Eu3+ and Tb3+ ions are non-catalytic metal ions having a valence of three.
As used herein, the term "nucleotide" can optionally refer to a ribonucleotide, deoxyribonucleotide, nucleoside or modified nucleotide. The term can refer to any monomer component that can be or is incorporated into a double stranded nucleic acid as part of the sequencing process. A nucleotide comprises a nitrogenous base, a five-carbon sugar (ribose or deoxyribose), and at least one phosphate group. Optionally, a nucleotide has an exogenous moiety such as a detectable label. However, in alternative embodiments the nucleotide can lack any exogenous label. Optionally, the 3' position of the nucleotide is modified with a moiety, such as a 3' reversible or irreversible terminator. However, in alternative embodiments the nucleotide can lack any terminator moiety, instead being extendible at a reactive 3' hydroxyl moiety. A nucleotide may have an adenine, cytosine, guanine, thymine, or uracil base. Optionally, a nucleotide has an inosine, xanthine, hypoxanthine, isocytosine, isoguanine, nitropyrrole (including 3-nitropyrrole) or nitroindole (including 5-nitroindole) base. Nucleotides may include, but are not limited to, ATP, UTP, CTP, GTP, ADP, UDP, CDP, GDP, AMP, UMP, CMP, GMP, dATP, dTTP, dCTP, dGTP, dADP, dTDP, dCDP, dGDP, dAMP, dTMP, dCMP, and dGMP. Nucleotides may also contain terminating inhibitors of DNA polymerase, dideoxynucleotides or 2',3'-dideoxynucleotides, which are abbreviated as ddNTPs (ddGTP, ddATP, ddTTP and ddCTP).
As used herein, the term "polymerase" can be used to refer to a nucleic acid synthesizing enzyme, including but not limited to, DNA polymerase, RNA polymerase, reverse transcriptase, primase and transferase. Typically, the polymerase comprises one or more active sites at which nucleotide binding and/or catalysis of nucleotide polymerization may occur. The polymerase may catalyze the polymerization of nucleotides to the 3' end of the first strand of the double stranded nucleic acid molecule. For example, a polymerase catalyzes the addition of a next correct nucleotide to the 3' OH group of the first strand of the double stranded nucleic acid molecule via a phosphodiester bond, thereby covalently incorporating the nucleotide to the first strand of the double stranded nucleic acid molecule. Optionally, a polymerase need not be capable of nucleotide incorporation under one or more conditions used in a method set forth herein.
Polymerases may include naturally-occurring polymerases and any modified variations thereof, including, but not limited to, mutants, recombinants, fusions, genetic modifications, chemical modifications, synthetics, and analogs. Naturally-occurring polymerases and modified variations thereof are not limited to polymerases which retain the ability to catalyze a polymerization reaction at the 3' end of a primer. Rather, a polymerase that retains the ability to discriminate properly base-paired nucleotides when forming a ternary complex can be useful whether or not the polymerase is capable of catalyzing incorporation of the properly paired nucleotide into the 3' end of the primer. Optionally, the naturally-occurring and/or modified variations thereof retain the ability to catalyze a polymerization reaction. Mutant polymerases include polymerases wherein one or more amino acids are replaced with other amino acids (naturally or non-naturally occurring), and insertions or deletions of one or more amino acids. Modified polymerases include polymerases that contain an external label which can be used to detect or monitor the presence and interactions of the polymerase. Optionally, intrinsic signals from the polymerase can be used to detect or monitor their presence and interactions. Some modified polymerases, or naturally occurring polymerases under specific reaction conditions, may incorporate only single nucleotides, and may remain bound to the first strand of the double stranded nucleic acid molecule after the incorporation of the single nucleotide.
Other useful polymerases include, for example, fusion proteins comprising at least two portions linked to each other. For example, a polymerase can comprise a peptide that can catalyze the polymerization of nucleotides into a nucleic acid strand linked to a second peptide, such as, a reporter enzyme or a processivity-modifying domain. One exemplary option of such a polymerase is T7 DNA polymerase, which comprises a nucleic acid polymerizing domain and a thioredoxin binding domain, wherein thioredoxin binding enhances the processivity of the polymerase. Absent the thioredoxin binding, T7 DNA polymerase is a distributive polymerase with processivity of only one to a few bases.
Although DNA polymerases differ in detail, they have a similar overall shape that has been analogized to that of a hand with specific regions referred to as the fingers, the palm, and the thumb; and a similar overall structural transition (analogized to opening and closing of the hand), comprising the movement of the thumb and/or finger domains, during the synthesis of nucleic acids.
As used herein, the term "sequencing run" refers to a repetitive process of physical or chemical steps that is carried out to obtain signals indicative of a sequence of nucleotides in a nucleic acid. The process can be carried out until signals from the process can no longer distinguish nucleotides of the target with a desired level of certainty. In some embodiments, completion can occur earlier, for example, once a desired amount of sequence information has been obtained. In some embodiments, a sequencing run is terminated when signals are no longer obtained from one or more target nucleic acid molecules from which signal acquisition was initiated.
As used herein, the term "strand displacing" refers to a condition or activity that results in separation of two nucleic acid strands downstream of a polymerase or other nucleic acid enzyme. For example, some polymerases have a strand displacing activity such that they can extend a primer through a downstream double stranded region of a template. Exemplary strand displacing polymerases include, but are not limited to, phi29 polymerase and Bst polymerase. Other polymerases, such as T4 and T7 polymerases, lack the ability to displace downstream double stranded regions and are thus non-strand displacing polymerases. Strand displacement can occur without degradation of the displaced strands, thus being distinct from exonuclease activity.
As used herein, the term "ternary complex" refers to an intermolecular association between a polymerase, a double stranded nucleic acid and a nucleotide. Typically, the polymerase facilitates interaction between the nucleotide and a template strand of the primed nucleic acid. A cognate nucleotide can interact with the template strand via Watson-Crick hydrogen bonding. The term "stabilized ternary complex" means a ternary complex having promoted or prolonged existence or a ternary complex for which disruption has been inhibited. Generally, stabilization of the ternary complex prevents incorporation of the nucleotide component of the ternary complex into the primed nucleic acid component of the ternary complex.
As used herein, the term "translation," when used in reference to a discontinuity in a nucleic acid strand, refers to the result whereby a discontinuity in one part of the strand is replaced with a discontinuity in another part of the strand. In particular embodiments, the discontinuity in a strand is translated to a downstream position of the same strand. The discontinuity can be a nick or gap that is replaced with a nick or a gap. For example, nick translation can result when a nick in one strand of a double stranded nucleic acid is closed and a new nick or a gap is created at another location in the strand (i.e. the free 3' hydroxyl is moved from one nucleotide position to another). Gap translation can result when a gap in one strand of a double stranded nucleic acid is closed and a new gap or a nick is created at another location in the strand.
As used herein, the term "unlabeled" refers to a molecular species free of added or exogenous label(s) or tag(s). Accordingly, unlabeled nucleotides will include neither an exogenous fluorescent label nor an exogenous Raman scattering tag. A native nucleotide is another example of an unlabeled molecular species. An unlabeled molecular species can exclude one or more of the labels set forth herein or otherwise known in the art relevant to nucleic acid sequencing or analytical biochemistry.
The embodiments set forth below and recited in the claims can be understood in view of the above definitions.
The present disclosure provides a method for sequencing a nucleic acid. The method can include steps of (a) providing a double stranded nucleic acid having a template strand and a second strand having a discontinuity, wherein the discontinuity includes a 3' hydroxyl at a nick or gap, the 3' hydroxyl being adjacent to a position of the second strand that pairs with a first position in the template strand; (b) forming a stabilized ternary complex including a polymerase, the double stranded nucleic acid and a cognate nucleotide that binds at the position of the second strand that pairs with the first position of the template strand; (c) examining the stabilized ternary complex, thereby acquiring a signal for identifying the nucleotide at the first position of the template strand; (d) after the examining of the stabilized ternary complex, covalently adding a nucleotide to the 3' hydroxyl at the position of the second strand that pairs with the first position of the template strand, thereby translating the discontinuity downstream; and (e) repeating steps (b) through (d), thereby determining a nucleotide sequence of the template strand from the acquired signals.
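Steps (b) through (e) can be pictured with a toy simulation. This is only a sketch under strong simplifying assumptions: the substrate is error-free, the "signal" acquired in step (c) is taken to be the identity of the cognate nucleotide, and each incorporation is assumed to be accompanied by removal of the downstream 5' nucleotide so that the discontinuity moves exactly one position per cycle. The function name, the indexing scheme and the example template are hypothetical.

```python
# Watson-Crick complements for the four DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def sequence_by_discontinuity_translation(template_3to5, nick_index, cycles):
    """Simulate repeated examination/incorporation cycles.

    template_3to5: template strand written 3'->5'.
    nick_index: number of second-strand nucleotides upstream of the
        discontinuity; template_3to5[nick_index] is the "first position"
        that pairs with the position adjacent to the free 3' hydroxyl.
    cycles: maximum number of cycles to run.
    Returns (template bases identified, final nick_index).
    """
    called = []
    for _ in range(cycles):
        if nick_index >= len(template_3to5):
            break  # no template positions remain downstream of the 3' end
        # Steps (b) and (c): the cognate nucleotide forms the stabilized
        # ternary complex; its identity stands in for the acquired signal.
        cognate = COMPLEMENT[template_3to5[nick_index]]
        # Identify the template base from the cognate that bound.
        called.append(COMPLEMENT[cognate])
        # Step (d): one nucleotide is added to the 3' hydroxyl, which moves
        # the discontinuity one position downstream.
        nick_index += 1
    return "".join(called), nick_index
```

With the hypothetical template 3'-TACGGA-5' and a discontinuity after two second-strand nucleotides, three cycles identify template positions C, G and G and leave the discontinuity three positions further downstream.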
Also provided herein are methods for sequencing a double stranded nucleic acid using a sequencing by binding reaction comprising (a) contacting a first strand of the double stranded nucleic acid molecule comprising a nick site with a base-removing enzyme to create a base gap in the first strand of the double stranded nucleic acid molecule at the nick site; (b) contacting the double stranded nucleic acid molecule comprising the base gap in the first strand with a first reaction mixture comprising a polymerase and at least one unlabeled nucleotide molecule; (c) examining the interaction of the polymerase with the double stranded nucleic acid molecule in presence of the nucleotide molecule without chemical incorporation of the nucleotide molecule into the base gap of the first strand of the double stranded nucleic acid molecule; and (d) identifying the nucleotide added to the base gap of the first strand of the double stranded nucleic acid molecule based on the detected or monitored interaction of the polymerase with the double stranded nucleic acid molecule. Optionally, the method further comprises contacting the double stranded nucleic acid molecule with a nicking enzyme to create the nick site in the first strand of the double stranded nucleic acid molecule. Optionally, the double stranded nucleic acid molecule comprises a flap.
The present disclosure also provides methods for sequencing a double stranded nucleic acid using a sequencing by binding reaction comprising a) providing a double stranded nucleic acid molecule with a nick site in the first strand of the double stranded nucleic acid molecule; b) contacting the first strand of the double stranded nucleic acid molecule comprising the nick site with a base-removing enzyme to create a base gap in the first strand of the double stranded nucleic acid molecule at the nick site; c) contacting the double stranded nucleic acid molecule comprising the base gap in the first strand with a first reaction mixture comprising a polymerase and at least one unlabeled nucleotide molecule; d) detecting or monitoring the interaction of the polymerase with the double stranded nucleic acid molecule in presence of the nucleotide molecule without chemical incorporation of the nucleotide molecule into the base gap of the first strand of the double stranded nucleic acid molecule; and e) identifying the nucleotide added to the base gap of the first strand of the double stranded nucleic acid molecule based on the detected or monitored interaction of step d).
A discontinuity in a double stranded nucleic acid, such as a nick or gap, provides a mechanism for controlling the step size during a sequencing run. Specifically, the location and/or size of the discontinuity can be selected to limit the number of nucleotide positions interrogated per cycle of a sequencing run. For example, the size or location of the discontinuity can be chosen to limit the number of nucleotide positions interrogated to be no more than 1, 2, 3, 4, 5, 10 or more per cycle. In particular embodiments, a single polymerase is bound to the 3' nucleotide that flanks the discontinuity (referred to as "the 3' end of the discontinuity") to form a stabilized ternary complex that is detected during an examination phase. The stabilized complex is not free to extend the discontinuous strand and move downstream along the template. However, during a subsequent incorporation phase, the enzyme can move downstream along the template (i.e. in the 5' to 3' direction of the discontinuous strand) as the discontinuous strand is extended. The presence of a double stranded region downstream of the extended strand can be used to restrict the distance moved by the polymerase during the incorporation phase. For example, the incorporation can be carried out under non-strand displacing conditions. As such, the number of nucleotides added during the extension phase will be determined by the size of the gap between the 3' end of the discontinuous strand and the 5' nucleotide that flanks the gap (referred to as "the 5' end of the discontinuity").
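The relationship between gap size and extension length can be written as simple index arithmetic. The sketch below assumes a hypothetical zero-based numbering of positions along the discontinuous strand. The strand-displacing branch is an added illustration of the contrast case (an upper bound set by the template length), not something stated in the text.

```python
def nucleotides_added(three_prime_index, five_prime_index, template_length,
                      strand_displacing=False):
    """Number of nucleotides that can be added in one incorporation phase.

    three_prime_index: position of the nucleotide bearing the free 3' hydroxyl
        (the 3' end of the discontinuity).
    five_prime_index: position of the downstream nucleotide that flanks the
        discontinuity (the 5' end of the discontinuity).
    template_length: number of template positions available for pairing.
    strand_displacing: if True, the downstream strand does not block
        extension (illustrative assumption, upper bound only).
    """
    if not 0 <= three_prime_index < five_prime_index <= template_length - 1:
        raise ValueError("indices must satisfy 0 <= 3' end < 5' end < template length")
    if strand_displacing:
        # Extension could in principle continue to the end of the template.
        return template_length - three_prime_index - 1
    # Non-strand displacing conditions: extension stops at the downstream
    # strand, so only the unpaired positions in the gap are filled.
    return five_prime_index - three_prime_index - 1
```

Under this numbering a nick (adjacent indices) allows no extension without strand displacement or removal of downstream nucleotides, a one-nucleotide gap allows exactly one incorporation, and so on.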
Several methods can be used to create specifically located, single stranded nicks in double stranded templates for use in a method set forth herein. A few examples will be described here that are among those useful for DNA sequencing. The first method involves creating double stranded, hemi-modified sequencing templates which contain a restriction endonuclease recognition and cleavage site. One of the strands at the cleavage site is modified so that it will not be cleaved (for example by substituting phosphorothioate for phosphodiester linkages between nucleotides). When the restriction enzyme that is sensitive to such a hemi-phosphorothioate is used to cleave the DNA, only the non-modified strand is cleaved, thus creating a 3' OH from which primer extension can be initiated. Restriction endonucleases that are able to nick opposite a hemi-phosphorothioate modified strand include Hinc II, Hind II, Ava I and Nci I.
A second method for creating double stranded nucleic acid with a discontinuity uses a double stranded DNA template to be nicked (or gapped) at or near a position that is abasic (i.e. lacking a base attached to the template strand at the position) or occupied by an "abnormal" base (e.g. deoxyuracil, deoxyinosine, or 8-oxoguanine). Removal or cleavage at or near the nucleotide containing the abnormal or missing base creates a nick or gap. An abnormal base can be incorporated in numerous ways, and several endonucleases and other repair enzymes have activity that generates single stranded gaps or nicks at such structures. One example is the use of deoxyinosine in the strand to be cleaved, and E. coli Endonuclease V. Deoxyinosine may be included in a primer used for amplifying or generating double stranded sequencing templates; it is efficiently recognized by many DNA polymerases, and substitutes for deoxyguanosine by base pairing with deoxycytosine. After construction of the deoxyinosine containing template, specific cleavage 3' of the deoxyinosine nucleotide (but not removal) is accomplished with Endonuclease V, creating a 3' OH which can be recognized, bound in a ternary complex, extended, and/or otherwise utilized in sequencing determinations.
A third method for creating a double stranded nucleic acid with a discontinuity is to create double stranded sequencing templates that contain short regions of RNA. For example, the region of RNA can be covalently attached to a region of DNA at the site to be nicked, then the RNA can be specifically cleaved using RNAseH, thus creating a 3' OH useful for a sequencing method set forth herein. Alternatively, the RNA region need not be covalently attached to the DNA region, for example, being formed by hybridizing a DNA oligo and RNA oligo on a template to result in a nick or gap between the two types of nucleic acid. RNA segments can be included in single stranded or double stranded regions of template construction or amplification primers. RNAseH activity is present in enzymes such as E. coli RNAseH, and reverse transcriptases such as MMLV-RT or AMV-RT.
A fourth method for introducing a discontinuity uses restriction endonucleases that are specifically engineered to bind to their recognition sequence as usual, but to cleave only one strand (specifically the top or bottom strand relative to the recognition sequence). Such "nicking endonucleases" or NEs are typically engineered from restriction endonucleases that have a heterodimeric structure, and typically have an asymmetric recognition sequence. Each subunit of the dimer identifies and cleaves one of the DNA strands at the recognition site. The NEs are created by modifying one of the dimer polypeptides so that it still binds to the recognition sequence, but is unable to cut it. Thus, only one strand is cleaved, creating a nick and 3' OH that can interact with a polymerase. Sequencing templates can be constructed that include a recognition site for a NE from which the sequencing reaction will be initiated. Many nicking endonucleases are commercially available. Examples include Nb.BsmI, which cleaves the bottom strand inside of its 6 base pair asymmetric recognition site; Nt.AlwI, a modified type II restriction endonuclease which cleaves the top strand 4 nucleotides 3' of its 5 base pair asymmetric recognition site; and Nb.BbvCI, which cleaves inside of its 7 base pair asymmetric recognition site. There also exist natural endonucleases with nicking activity, such as Nt.BstNBI, which cleaves the top strand 4 nucleotides 3' of its 5 base pair recognition site. Other natural enzymes with nicking activity, such as Nt.CviPII, may be less useful since in this example the recognition sequence is very short, and therefore not as specific as the engineered and other natural NEs currently available. Some homing endonucleases which have very large degenerate recognition sites (up to >35 base pairs) may also exhibit single strand nicking activity.
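As a rough illustration of how a top-strand nicking site of the kind described above could be located when designing a template, the following sketch scans the top strand for a recognition sequence and reports where the nick would fall a fixed number of nucleotides 3' of the site. The function name is hypothetical, only the top strand is searched, and the recognition sequence used in the example (GAGTC for Nt.BstNBI) is not given in the text and is supplied here as an assumption.

```python
def top_strand_nick_positions(top_strand_5to3, recognition_site, offset_3prime):
    """Return cut positions for a top-strand nicking endonuclease.

    top_strand_5to3: top strand sequence written 5'->3'.
    recognition_site: recognition sequence as read on the top strand.
    offset_3prime: number of nucleotides between the 3' edge of the
        recognition site and the nick (e.g. 4 for an enzyme that cleaves
        4 nucleotides 3' of its site).
    A cut position p means the nick lies between nucleotides p - 1 and p,
    so top_strand_5to3[:p] ends with the nucleotide bearing the new 3' OH.
    """
    positions = []
    start = top_strand_5to3.find(recognition_site)
    while start != -1:
        cut = start + len(recognition_site) + offset_3prime
        # Keep only cuts that leave at least one downstream nucleotide,
        # since a nick requires nucleotides on both sides.
        if cut < len(top_strand_5to3):
            positions.append(cut)
        start = top_strand_5to3.find(recognition_site, start + 1)
    return positions
```

For the hypothetical top strand 5'-AAGAGTCTTTTGG-3' with an assumed GAGTC site and an offset of 4, the nick falls after the eleventh nucleotide, leaving a 3' OH on the final T of AAGAGTCTTTT.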
In each of the methods described above for creating discontinuities, or others that can be contemplated, it will often be desirable to create double stranded sequencing templates that contain a single discontinuity site from which sequencing can be initiated. In other methods which can be contemplated, it may be desirable to create templates in which multiple discontinuity sites can be activated (sequentially or in parallel). In particular embodiments, sequencing reactions can be individually conducted at the multiple discontinuity sites in a controlled fashion.
In the methods described above for creating discontinuities, or others that can be contemplated, enzymes are described with specific activities used to create or facilitate the formation of discontinuities at specific sites in double stranded sequencing templates. It is to be understood that many naturally occurring enzymes have desired properties, but also that some properties can be created, enhanced, diminished or abolished by engineering of existing enzymes. The engineered enzymes may have additional or enhanced capabilities for the methods described above.
In some embodiments, a method set forth herein can include a step of creating a gap. Examples of base removing enzymes that would be useful in this context are enzymes with natural or engineered double stranded-targeting 5'-3' exonuclease activity that removes one or just a few nucleotides in a controlled manner. A useful enzyme in this context is a flap endonuclease, for example FEN1, which cleaves a non-hybridized single-stranded 5' end "flap" from a hybridized portion. Other useful enzymes include DNA polymerases with 5'-3' exonuclease activities. The 5'-3' exonuclease activity is typically contained within a distinct domain of the enzyme (which can be removed to eliminate the activity when desired, such as with Klenow fragment of E. coli DNA Pol I). Examples of DNA polymerases with 5'-3' exonuclease activity include E. coli DNA Pol I, and Thermus aquaticus (Taq) DNA polymerase often used for PCR. Taq DNA pol also contains flap endonuclease activity (which can cleave an unhybridized 5' end of the leading strand as polymerization proceeds). Base removing enzymes and methods of their use are described in, for example, Guo et al., Molecular and Cellular Biology, 28(13):4310-4319 (2008); and Vallur and Maizels, PLoS One, 5(1):e8908 (2010); which are incorporated by reference herein in their entireties.
In some embodiments, a gap is created by the 5'-3' exonuclease activity of a polymerase that has been modified to disable nucleotidyl transfer activity. For example, the polymerase may have mutations that disable polymerization but preserve 5'-3' exonuclease activity. Alternatively or additionally, chemical modifications, non-catalytic metal ions or reagents that inhibit polymerase activity, but allow 5'-3' exonuclease activity, may be used to allow a polymerase to create a gap downstream of the 3' end. In some embodiments, the ability of polymerases to selectively bind the next correct nucleotide at the 3' end is preserved while its catalytic extension activity is disabled (e.g. via mutations, chemical modifications, inhibitors or specific metal ions). In such embodiments, the ability to form a correct ternary complex and the 5'-3' exonuclease activity are retained, whereby the polymerase can be used to detect the presence of the next correct nucleotide. Formation of a stabilized ternary complex leads to the non-processive trapping of the polymerase at the 3' end of the primer and the 5'-3' exonuclease activity digests one or more bases downstream resulting in a gapped substrate. As such, a modified polymerase can be used for examination and discontinuity translation in a method set forth herein.
Polymerase-based methods for detecting nucleic acids of the present disclosure can include, for example, nucleic acid sequencing by binding reactions, wherein the polymerase undergoes conformational transitions between open and closed conformations during discrete steps of the reaction. In one step, the polymerase binds to a double stranded nucleic acid to form a binary complex, referred to, optionally, as the pre-insertion conformation. Optionally, the polymerase binds to the site of a nick or base gap present on one strand, e.g., the first strand of a double stranded nucleic acid. In a subsequent step, an incoming nucleotide triphosphate (NTP) is bound and the polymerase fingers close, forming a pre-chemistry conformation comprising a polymerase, double stranded nucleic acid and nucleotide; wherein the bound nucleotide has not been incorporated. This step may be followed by a chemical step wherein a phosphodiester bond is formed with concomitant pyrophosphate cleavage from the nucleotide (nucleotide incorporation). The polymerase and double stranded nucleic acid with newly incorporated nucleotide produce a post-chemistry, pre-translocation conformation. As both the pre-chemistry conformation and the pre-translocation conformation comprise a polymerase, double stranded nucleic acid and the nucleotide, wherein the polymerase is in a closed state, either conformation may be referred to herein as a closed-complex. In the closed pre-insertion state, divalent catalytic metal ions, such as Mg2+, mediate a rapid chemical step involving nucleophilic displacement of a pyrophosphate (PPi) by the 3' hydroxyl terminus of the first strand of the double stranded nucleic acid. The polymerase returns to an open state upon the release of PPi, the post-translocation step, and translocation initiates the next round of reaction. While a closed-complex can form in the absence of divalent catalytic metal ions (e.g., Mg2+), it is proficient in chemical addition of nucleotide in the presence of the divalent metal ions. Low or deficient levels of catalytic metal ions, such as Mg2+, tend to lead to non-covalent (physical) sequestration of the next correct nucleotide in a tight closed-complex. This closed-complex may be referred to as a stabilized or trapped closed-complex. In any reaction step described above, the polymerase configuration and/or interaction with a nucleic acid may be detected or monitored during an examination step to identify the next base in the nucleic acid sequence.
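The conformational cycle described above can be pictured as a small state machine. The following Python sketch is illustrative only: the state names, the `next_state` function and its two boolean inputs are assumptions introduced here for exposition, and the model ignores kinetics, mismatched nucleotides and every stabilization mechanism other than the absence of a catalytic metal ion.

```python
from enum import Enum


class State(Enum):
    """Hypothetical labels for the conformations named in the text."""
    BINARY_OPEN = "binary complex (open, pre-insertion)"
    PRE_CHEMISTRY = "closed-complex, pre-chemistry"
    PRE_TRANSLOCATION = "closed-complex, pre-translocation"
    POST_TRANSLOCATION = "open, post-translocation"


def next_state(state, cognate_nucleotide_bound, catalytic_ion_available):
    """Advance the simplified cycle by one step.

    Assumption: without a catalytic metal ion (e.g., Mg2+) the complex stays
    trapped in the pre-chemistry conformation, as in an examination step.
    """
    if state is State.BINARY_OPEN:
        # Fingers close only when the next correct nucleotide is bound.
        return State.PRE_CHEMISTRY if cognate_nucleotide_bound else state
    if state is State.PRE_CHEMISTRY:
        # The chemical step (phosphodiester bond formation) needs the ion.
        return State.PRE_TRANSLOCATION if catalytic_ion_available else state
    if state is State.PRE_TRANSLOCATION:
        # PPi release returns the polymerase to an open state.
        return State.POST_TRANSLOCATION
    # Translocation initiates the next round of reaction.
    return State.BINARY_OPEN


def run(steps, cognate_nucleotide_bound, catalytic_ion_available):
    """Return the list of states visited, starting from the binary complex."""
    state = State.BINARY_OPEN
    trace = [state]
    for _ in range(steps):
        state = next_state(state, cognate_nucleotide_bound, catalytic_ion_available)
        trace.append(state)
    return trace
```

Calling `run(3, True, False)` leaves the model in the pre-chemistry state, which corresponds to the stabilized or trapped closed-complex, whereas `run(3, True, True)` proceeds through incorporation to the post-translocation state.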
A nucleic acid sequencing reaction mixture can include any of a variety of reagents that are commonly present in polymerase based nucleic acid synthesis reactions. Reaction mixture reagents include, but are not limited to, enzymes (e.g., polymerase), nucleotides (e.g. dNTPs or NTPs), double stranded nucleic acids, salts, buffers, small molecules, co-factors, metals, and ions. The ions may be catalytic ions, non-covalent metal ions, or both. The reaction mixture can include salts such as NaCl, KCl, K-acetate, NH4-acetate, K-glutamate, NH4Cl, or NH4HSO4. The reaction mixture can include a source of ions, such as Mg2+ or Mn2+, Mg-acetate, Co2+ or Ba2+. The reaction mixture can include Ca2+, Zn2+, Cu2+, Co2+, FeSO4, or Ni2+. The buffer can include Tris, Tricine, HEPES, MOPS, ACES, MES, phosphate-based buffers, and acetate-based buffers. The reaction mixture can include chelating agents such as EDTA and EGTA, and the like. Optionally, the reaction mixture includes cross-linking reagents.
An examination step of a method set forth herein can be used to detect a ternary complex between a polymerase, double stranded nucleic acid, and nucleotide. The ternary complex may be in a pre-chemistry conformation, wherein a nucleotide is sequestered but not incorporated. The closed-complex may be in a pre-translocation conformation, wherein a nucleotide is incorporated by formation of a phosphodiester bond with the 3' end of the first strand in the double stranded nucleic acid. The closed-complex may be formed in the absence of catalytic metal ions or deficient levels of catalytic metal ions, thereby physically sequestering the next correct nucleotide within the polymerase active site without chemical incorporation. Optionally, the sequestered nucleotide may be a non-incorporable nucleotide. The closed-complex may be formed in the presence of catalytic metal ions, where the closed- complex comprises a nucleotide analog which is incorporated, but a PPi is not capable of release. Optionally, the closed-complex is stabilized in a pre-translocation conformation.
Optionally, a pre-translocation conformation is stabilized by chemically cross-linking the polymerase. Optionally, the closed-complex may be stabilized by external means. Optionally, the closed-complex may be stabilized by allosteric binding of small molecules. Optionally, a closed-complex may be stabilized by pyrophosphate analogs that bind close to the active site with high affinity, preventing translocation of the polymerase.
A stabilized closed-complex can be formed by effectively trapping a polymerase at the 3' end of a discontinuity in one strand of a double stranded nucleic acid by one or a combination of means, including but not limited to, crosslinking the thumb and finger domains in the closed conformation, binding of an allosteric inhibitor that prevents return of the polymerase to an open conformation, binding of pyrophosphate analogs that trap the polymerase in the pre-translocation step, use of a polymerase mutant that is attenuated or inhibited in nucleotidyl transfer activity, or addition of non-catalytic divalent metal ions such as Ca2+ and Sr2+ as substitutes for a catalytic metal ion. As such, the polymerase may be trapped at the discontinuity site even after the incorporation of a nucleotide. Therefore, the polymerase may be trapped in the pre-chemistry conformation, pre-translocation step, post-translocation step or any intermediate step thereof.
Any of a variety of polymerases can be used in a method set forth herein. DNA polymerases include, but are not limited to, bacterial DNA polymerases, eukaryotic DNA polymerases, archaeal DNA polymerases, viral DNA polymerases and phage DNA polymerases. Bacterial DNA polymerases include E. coli DNA polymerases I, II, III, IV and V, the Klenow fragment of E. coli DNA polymerase, Clostridium stercorarium (Cst) DNA polymerase, Clostridium thermocellum (Cth) DNA polymerase and Sulfolobus solfataricus (Sso) DNA polymerase. Eukaryotic DNA polymerases include DNA polymerases α, β, γ, δ, ζ, η, λ, σ, μ, ι, κ, as well as the Rev1 polymerase (terminal deoxycytidyl transferase) and terminal deoxynucleotidyl transferase (TdT). Viral DNA polymerases include T4 DNA polymerase, phi-29 DNA polymerase, GA-1, phi-29-like DNA polymerases, PZA DNA polymerase, phi-15 DNA polymerase, Cp1 DNA polymerase, Cp7 DNA polymerase, and T7 DNA polymerase. Archaeal DNA polymerases include thermostable and/or thermophilic DNA polymerases such as Thermus aquaticus (Taq) DNA polymerase, Thermus filiformis (Tfi) DNA polymerase, Thermococcus zilligi (Tzi) DNA polymerase, Thermus thermophilus (Tth) DNA polymerase, Thermus flavus (Tfl) DNA polymerase, Pyrococcus woesei (Pwo) DNA polymerase, Pyrococcus furiosus (Pfu) DNA polymerase and Turbo Pfu DNA polymerase, Thermococcus litoralis (Tli) DNA polymerase, Pyrococcus sp. GB-D polymerase, Thermotoga maritima (Tma) DNA polymerase, Bacillus stearothermophilus (Bst) DNA polymerase, Pyrococcus kodakaraensis (KOD) DNA polymerase, Pfx DNA polymerase, Thermococcus sp. JDF-3 (JDF-3) DNA polymerase, Thermococcus gorgonarius (Tgo) DNA polymerase, Thermococcus acidophilium DNA polymerase; Sulfolobus acidocaldarius DNA polymerase; Thermococcus sp. 9° N-7 DNA polymerase; Pyrodictium occultum DNA polymerase; Methanococcus voltae DNA polymerase; Methanococcus thermoautotrophicum DNA polymerase; Methanococcus jannaschii DNA polymerase; Desulfurococcus strain TOK DNA polymerase (D. Tok Pol); Pyrococcus abyssi DNA polymerase; Pyrococcus horikoshii DNA polymerase; Pyrococcus islandicum DNA polymerase; Thermococcus fumicolans DNA polymerase; Aeropyrum pernix DNA polymerase; and the heterodimeric DNA polymerase DP1/DP2.
RNA polymerases include, but are not limited to, viral RNA polymerases such as T7 RNA polymerase, T3 polymerase, SP6 polymerase, and K11 polymerase; eukaryotic RNA polymerases such as RNA polymerase I, RNA polymerase II, RNA polymerase III, RNA polymerase IV, and RNA polymerase V; and archaeal RNA polymerase.
Reverse transcriptases include, but are not limited to, HIV-1 reverse transcriptase from human immunodeficiency virus type 1 (PDB 1HMV), HIV-2 reverse transcriptase from human immunodeficiency virus type 2 (PDB 5UPJ), M-MLV reverse transcriptase from the Moloney murine leukemia virus, AMV reverse transcriptase from the avian myeloblastosis virus, and telomerase reverse transcriptase that maintains the telomeres of eukaryotic chromosomes.
A nucleotide analog may bind transiently to a polymerase-double stranded nucleic acid complex, but may, in some embodiments, be not incorporable (or substantially non-incorporable) in a nucleic acid polymerization reaction. Such nucleotides are particularly useful during an examination step. Alternatively, a nucleotide analog may bind to a polymerase-double stranded nucleic acid complex, and become incorporated (or substantially incorporated) in a nucleic acid polymerization reaction. Such incorporable nucleotides are useful during an incorporation step. However, such nucleotides can also be used during an examination phase when conditions are used to stabilize the nucleotide in a ternary complex that is not extended during examination. Nucleotide analogs may or may not have a structure similar to that of a native nucleotide comprising a nitrogenous base, five-carbon sugar, and phosphate group. Modified nucleotides may have modifications, such as moieties, which replace and/or modify any of the components of a native nucleotide. The non-incorporable nucleotides can be alpha-phosphate modified nucleotides, alpha-beta nucleotide analogs, beta-phosphate modified nucleotides, beta-gamma nucleotide analogs, gamma-phosphate modified nucleotides, caged nucleotides, or ddNTPs. Some examples of nucleotide analogs are described in U.S. Patent No. 8,071,755, which is incorporated by reference herein in its entirety.
As described herein, a polymerase based, sequencing by binding reaction can involve steps of: a) providing a double stranded nucleic acid, b) providing the first strand of the double stranded nucleic acid with a polymerase and one or more types of nucleotides, wherein the nucleotides may or may not be complementary to the next base of the template strand of the double stranded nucleic acid, and c) examining the interaction of the polymerase with the double stranded nucleic acid under conditions wherein either (i) chemical incorporation of a nucleotide to the first strand of the double stranded nucleic acid is disabled or severely inhibited in the pre-chemistry conformation, or (ii) a single nucleotide incorporation occurs at the 3' end of the first strand of the double stranded nucleic acid. Optionally, wherein the pre-chemistry conformation is stabilized prior to nucleotide incorporation, a separate incorporation step may follow the examination step to incorporate a single nucleotide to the 3' end of the first strand of the double stranded nucleic acid.
Optionally, wherein a single nucleotide incorporation occurs, the pre-translocation conformation may be stabilized to facilitate examination and/or prevent subsequent nucleotide incorporation.
Whereas individual double stranded nucleic acid molecules may be described for ease of exposition, the sequencing methods provided herein readily encompass a plurality of double stranded nucleic acids, wherein the plurality of nucleic acids may be clonally amplified copies of a single nucleic acid, or disparate nucleic acids, including combinations, such as populations of disparate nucleic acids that are clonally amplified.
Optionally, a method of the present disclosure can further include an incorporation step, the incorporation step comprising incorporating a single unlabeled nucleotide in the base gap of the first strand of the double stranded nucleic acid molecule. Optionally, the method further comprises repeating the examination step and the incorporation step to sequence the template strand of the double stranded nucleic acid molecule. Optionally, the examination step may be repeated one or more times prior to performing the incorporation step. Optionally, prior to incorporating the single unlabeled nucleotide in the base gap of the first strand of the double stranded nucleic acid molecule, the first reaction mixture is replaced with a second reaction mixture comprising a polymerase and 1, 2, 3 or 4 types of unlabeled nucleotide molecules selected from dATP, dTTP, dCTP and dGTP.
Optionally, an unlabeled nucleotide used in a method herein will include a 3' hydroxyl group. Optionally, at least one unlabeled nucleotide molecule includes a free 3' hydroxyl group. Optionally, the 3' hydroxyl group of the at least one unlabeled nucleotide molecule is modified to have a 3' terminator moiety. The 3' terminator moiety may be a reversible terminator. The 3' terminator moiety may be an irreversible terminator. Optionally, the irreversible terminator of the at least one unlabeled nucleotide molecule is replaced or removed after an examination step, for example, to facilitate a subsequent incorporation step. It will be understood that in some embodiments termination is provided by a downstream double stranded region that flanks a discontinuity at which polymerase activity occurs. As such, a nucleotide used for incorporation need not include a terminator moiety.
A polymerase can interact with a discontinuity in one strand of a double stranded nucleic acid molecule in the presence of at least one unlabeled nucleotide molecule to form a closed complex. Optionally, the closed complex is a ternary closed-complex comprising the first strand of the double stranded nucleic acid molecule, the polymerase, and the unlabeled nucleotide, wherein the unlabeled nucleotide is complementary to the base on the template strand of the double stranded nucleic acid. Optionally, the formation of a ternary closed-complex is favored over the formation of a binary complex between the first strand of the double stranded nucleic acid and the polymerase.
Optionally, the formation of the ternary closed-complex may be favored over the formation of the binary complex when the first reaction mixture comprises a high concentration of salt. The formation of the ternary closed-complex may be favored over the formation of the binary complex when the first reaction mixture comprises a buffer having a high pH.
A ternary complex stabilizing reaction mixture can include 1, 2, 3, or 4 types of unlabeled nucleotide molecules, for example, selected from dATP, dTTP, dCTP, and dGTP. Optionally, a closed-complex is formed between a discontinuity in the first strand of a double stranded nucleic acid molecule, polymerase, and any one of the four unlabeled nucleotide molecules so that four types of closed complexes may be formed. Optionally, an examination step comprises measuring association kinetics of the formation of the closed-complex formed between the first strand of the double stranded nucleic acid molecule, the polymerase, and any one of the four unlabeled nucleotide molecules. Optionally, the measured association kinetics are different depending on the identity of the unlabeled nucleotide molecule in the closed-complex.
Optionally, the polymerase has a different affinity for each of the four types of nucleotide molecules in each type of closed-complex. Additionally, the polymerase may have a different dissociation constant for each of the four types of unlabeled nucleotide molecules in each type of closed-complex. Such differences can be measured in a method set forth herein in order to distinguish the binding of one type of nucleotide at a 3' end of a strand from another type of nucleotide that is used in the method.
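As a rough illustration of how differing dissociation constants could translate into distinguishable signals, the Python sketch below uses a simple single-site equilibrium binding model, in which the fraction of complex bound is [N] / ([N] + Kd). Both the model and the Kd values in `HYPOTHETICAL_KD_UM` are assumptions made here for illustration; they are not measurements and are not taken from the present disclosure.

```python
# Placeholder dissociation constants in micromolar. These numbers are
# hypothetical and serve only to make the example runnable.
HYPOTHETICAL_KD_UM = {"dATP": 2.0, "dTTP": 5.0, "dCTP": 1.0, "dGTP": 8.0}


def fraction_bound(nucleotide_conc_um, kd_um):
    """Equilibrium occupancy for a single-site binding model (assumed)."""
    return nucleotide_conc_um / (nucleotide_conc_um + kd_um)


def rank_by_occupancy(nucleotide_conc_um, kd_table=HYPOTHETICAL_KD_UM):
    """Order nucleotide types from highest to lowest predicted occupancy."""
    return sorted(
        kd_table,
        key=lambda name: fraction_bound(nucleotide_conc_um, kd_table[name]),
        reverse=True,
    )
```

Under this model a nucleotide supplied at a concentration equal to its Kd gives half-maximal occupancy, so nucleotide types with different Kd values give different signals at the same concentration.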
The absence of a catalytic metal ion during an examination step prevents the chemical incorporation of the nucleotide molecule at the discontinuity of the first strand of the double stranded nucleic acid molecule. Optionally, the presence of the catalytic metal ion in the first reaction mixture induces chelation, and catalyzes the chemical incorporation of the nucleotide molecule in the discontinuity of the first strand of the double stranded nucleic acid molecule.
Optionally, at least one unlabeled nucleotide molecule used in an examination step is a non-incorporable nucleotide. Optionally, the first strand of the double stranded nucleic acid used during an examination step does not contain a free hydroxyl group at its 3' end.
A method provided herein can include the addition of a polymerase inhibitor to the ternary complex, thereby preventing the chemical incorporation of the unlabeled nucleotide molecule at a discontinuity of a double stranded nucleic acid molecule during examination. Optionally, the polymerase inhibitor is a pyrophosphate analog. Optionally, the polymerase inhibitor is an allosteric inhibitor. Optionally, the polymerase inhibitor is a reverse transcriptase inhibitor. Optionally, the polymerase inhibitor is an HIV-1 reverse transcriptase inhibitor. Optionally, the HIV-1 reverse transcriptase inhibitor may be a (4/6-halogen/MeO/EtO-substituted benzo[d]thiazol-2-yl)thiazolidin-4-one. Optionally, the polymerase inhibitor is an HIV-2 reverse transcriptase inhibitor.
Optionally, a double stranded nucleic acid that is detected or sequenced in a method set forth herein is immobilized to a surface. The surface may be a planar substrate, a microparticle, or a nanoparticle. Optionally the surface is an array and a plurality of double stranded nucleic acids are immobilized at separate features on the array. As such, the methods set forth herein can be carried out in parallel for a plurality of target nucleic acids. For example, the features can include "nanoballs" of amplified DNA fragments, bound to a surface. Depending on the target application, many features may be present on a surface, for example up to millions, tens of millions, or more. DNA nanoballs can be made using methods and compositions as described, for example, in U.S. Pat. No. 7,910,354; or US Pat. App. Publ. Nos. 2009/0264299 A1, 2009/0011943 A1, 2009/0005252 A1, 2009/0155781 A1, or 2009/0118488 A1; or Drmanac et al., 2010, Science 327(5961): 78-81; each of which is incorporated herein by reference.
Nanoballs are one type of nucleic acid amplification product that can be used to form a feature on an array. Other useful amplification products include those produced by solid-phase amplification techniques. For example, amplification can be carried out using bridge amplification to form nucleic acid clusters on a surface. Useful bridge amplification methods are described, for example, in U.S. Pat. Nos. 5,641,658 or 7,115,400; or US Pat. App. Pub. Nos. 2002/0055100 A1, 2004/0096853 A1, 2004/0002090 A1, 2007/0128624 A1, or 2008/0009420 A1, each of which is incorporated herein by reference. Another useful method for amplifying nucleic acids on a surface is rolling circle amplification (RCA), for example, as described in Lizardi et al., Nat. Genet. 19:225-232 (1998) and US Pat. App. Pub. No. 2007/0099208 A1, each of which is incorporated herein by reference. Emulsion PCR on beads can also be used, for example as described in Dressman et al., Proc. Natl. Acad. Sci. USA 100:8817-8822 (2003), WO 05/010145, US Pat. App. Pub. No. 2005/0130173 A1 or US Pat. App. Pub. No. 2005/0064460 A1, each of which is incorporated herein by reference. A system or method of the present disclosure can use one or more of the reagents described in the above references for making and using nanoballs or other nucleic acid features.
Optionally, double stranded nucleic acids are provided as a plurality of clonally amplified double stranded nucleic acid molecules. Optionally, the plurality of double stranded nucleic acid molecules are distinguishable from each other.
An examination step can involve binding a polymerase to a discontinuity site of a double stranded nucleic acid in a reaction mixture comprising one or more nucleotides, and detecting or monitoring the interaction. Optionally, a nucleotide is sequestered within the polymerase-double stranded nucleic acid complex to form a closed-complex, under conditions in which incorporation of the enclosed nucleotide by the polymerase is attenuated or inhibited. This closed-complex is in a stabilized or polymerase-trapped pre-chemistry conformation. Optionally, a closed-complex allows for the incorporation of the enclosed nucleotide, but does not allow for the incorporation of a subsequent nucleotide. This closed-complex is in a stabilized or trapped pre-translocation conformation. Optionally, the polymerase is trapped at the polymerization site in its closed-complex by one or a combination of means, including but not limited to, crosslinking of the polymerase domains, crosslinking of the polymerase to the nucleic acid, allosteric inhibition by small molecules, uncompetitive inhibitors, competitive inhibitors, non-competitive inhibitors, and denaturation; wherein the formation of the trapped closed-complex provides information about the identity of the next base on the template strand of the double stranded nucleic acid.
Optionally, the identity of the next base is determined by detecting or monitoring the presence, formation, and/or dissociation of the ternary closed-complex. The identity of the next base may be determined without chemically incorporating the next correct nucleotide to the 3' end of the first strand of the double stranded nucleic acid. Optionally, the identity of the next base is determined by detecting or monitoring the affinity of the polymerase to the first strand of the double stranded nucleic acid in the presence of added nucleotides.
Optionally, the affinity of the polymerase to the first strand of the double stranded nucleic acid in the presence of the next correct nucleotide may be used to determine the next correct base on the template strand of the double stranded nucleic acid. Optionally, the affinity of the polymerase to a discontinuous double stranded nucleic acid in the presence of an incorrect nucleotide may be used to determine the next correct base to be incorporated on the 3' end of the discontinuity in the double stranded nucleic acid.
An examination step may be controlled, in part, by providing reaction conditions to prevent chemical incorporation of a nucleotide, while allowing determination of the identity of the next base on the nucleic acid. Such reaction conditions may be referred to as examination reaction conditions. Optionally, a ternary closed-complex is formed under examination conditions. Optionally, a stabilized ternary closed-complex is formed under examination conditions. Optionally, a stabilized ternary closed complex is in a pre-chemistry conformation. Optionally, a stabilized ternary closed-complex is in a pre-translocation conformation, wherein the enclosed nucleotide has been incorporated, but the closed-complex does not allow for the incorporation of a subsequent nucleotide. Optionally, the examination conditions accentuate the difference in affinity for polymerase to the first strand of the double stranded nucleic acid in the presence of different nucleotides. Optionally, the examination conditions cause differential affinity of the polymerase to the first strand of the double stranded nucleic acid in the presence of different nucleotides.
Examination typically includes detecting polymerase interaction with a double stranded nucleic acid and nucleotide. Detection may include optical, electrical, thermal, acoustic, chemical and mechanical means. Optionally, examination is performed after a wash step, wherein the wash step removes any non-bound reagents from the region of observation. Optionally, examination is performed during a wash step, such that as a further option, the dissociation kinetics of the polymerase-nucleic acid, or polymerase-nucleic acid-nucleotide complexes may be used to determine the identity of the next base. Optionally, examination is performed during the course of addition of the examination mixture, such that as a further option, the association kinetics of the polymerase to the nucleic acid may be used to determine the identity of the next base on the template strand of the double stranded nucleic acid. Optionally, examination involves distinguishing ternary closed-complexes from binary complexes of polymerase and nucleic acid. Optionally, examination is performed under equilibrium conditions where the affinities measured are equilibrium affinities. In exemplary options, multiple examination steps, comprising different or similar examination reagents, may be performed sequentially to ascertain the identity of the next template base. Multiple examination steps may be utilized in cases where multiple double stranded nucleic acids are being sequenced simultaneously in one sequencing reaction, wherein different nucleic acids react differently to the different examination reagents (e.g. in an array format). Optionally, multiple examination steps may improve the accuracy of next base determination.
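To make the use of association and dissociation kinetics more concrete, the Python sketch below assumes simple first-order behavior: signal rising as A(1 - exp(-k_obs t)) while the examination mixture is added, and decaying as A exp(-k_off t) during a wash. This functional form, the function names and the two-point estimator are assumptions introduced here for illustration; real traces may require fitting many points and more elaborate models.

```python
import math


def association_signal(t_s, amplitude, k_obs_per_s):
    """Assumed first-order rise of binding signal during reagent addition."""
    return amplitude * (1.0 - math.exp(-k_obs_per_s * t_s))


def dissociation_signal(t_s, amplitude, k_off_per_s):
    """Assumed first-order decay of binding signal during a wash step."""
    return amplitude * math.exp(-k_off_per_s * t_s)


def estimate_k_off(t1_s, signal1, t2_s, signal2):
    """Estimate k_off from two points on an assumed exponential decay.

    For S(t) = A * exp(-k_off * t), ln(S1 / S2) = k_off * (t2 - t1).
    """
    return math.log(signal1 / signal2) / (t2_s - t1_s)
```

In such a sketch, a complex containing the next correct nucleotide would be expected to show a smaller estimated k_off (slower signal loss during the wash) than a binary complex or a complex with an incorrect nucleotide, and that difference is what would be used to identify the base.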
Provided herein are methods for the formation and/or stabilization of a ternary closed-complex comprising a polymerase bound to a discontinuity in a double stranded nucleic acid and a nucleotide enclosed within the polymerase-nucleic acid complex, under examination reaction mixture conditions. Examination reaction conditions may inhibit or attenuate nucleotide incorporation. Optionally, incorporation of the enclosed nucleotide is inhibited and the complex is stabilized or trapped in a pre-chemistry conformation.
Optionally, the enclosed nucleotide is incorporated and a subsequent nucleotide incorporation is inhibited. Optionally, the complex is stabilized or trapped in a pre-translocation conformation. For the sequencing reactions provided herein, the closed-complex can be stabilized during the examination step, allowing for controlled nucleotide incorporation. Optionally, a stabilized closed-complex is a complex wherein incorporation of an enclosed nucleotide is attenuated, either transiently (e.g. to examine the complex and then incorporate the nucleotide) or permanently (e.g. for examination only) during an examination step. Optionally, a stabilized closed-complex allows for the incorporation of the enclosed nucleotide, but does not allow for the incorporation of a subsequent nucleotide. Optionally, the closed-complex is stabilized in order to detect or monitor any polymerase interaction with a double stranded nucleic acid in the presence of a nucleotide for identification of the next base in the double stranded nucleic acid.
Optionally, the ternary closed-complex is transiently formed during the examination step of the sequencing methods provided herein. Optionally, the ternary closed-complex is stabilized during the examination step. The stabilized closed-complex may, in some conditions, not allow for the incorporation of a nucleotide in a polymerization reaction during the examination step. For example, this includes incorporation of the enclosed nucleotide and/or incorporation of a subsequent nucleotide after the enclosed nucleotide. Reaction conditions that may modulate the stability of a ternary closed-complex include, but are not limited to, the availability of catalytic metal ions, suboptimal or inhibitory metal ions, ionic strength, pH, temperature, polymerase inhibitors, cross-linking reagents, and any combination thereof. Reaction reagents which may modulate the stability of a ternary closed-complex include, but are not limited to, non-incorporable nucleotides, incorrect nucleotides, nucleotide analogs, modified polymerases, template nucleic acids with non-extendible polymerization initiation sites, and any combination thereof.
Optionally, a ternary closed-complex is released from its trapped or stabilized conformation, which may allow for nucleotide incorporation to the 3' end of the discontinuity in a strand of the double stranded nucleic acid, for example, in a subsequent incorporation step. The ternary closed-complex can be destabilized and/or released by modulating the composition of the reaction conditions. In addition, the ternary closed-complex can be destabilized by electrical, magnetic, and/or mechanical means. Mechanical means include mechanical agitation, for example, by using ultrasound agitation. The mechanical vibration destabilizes the ternary closed- complex, as well as suppresses binding of the polymerase to the DNA. Thus, although a wash step can be used to replace the examination reaction mixture with an incorporation mixture, alternatively or additionally, mechanical agitation may be used to remove the polymerase from the template strand of the double stranded nucleic acid. Removal of the polymerase can be used to facilitate cycling through successive incorporation steps at a single nucleotide addition per step.
Any combination of ternary closed-complex stabilization or ternary closed-complex release reaction conditions and/or methods may be combined. For example, a polymerase inhibitor which stabilizes a ternary closed-complex may be present in the examination reaction with a catalytic ion which functions to release the closed-complex. In the aforementioned example, the ternary closed-complex may be stabilized or released, depending on the polymerase inhibitor properties and concentration, the concentration of the catalytic metal ion, other reagents and/or conditions of the reaction mixture, and any combination thereof.
Optionally, the ternary closed-complex is stabilized under reaction conditions where covalent attachment of a nucleotide to the 3' end of a discontinuity in a nucleic acid molecule is attenuated or inhibited. Optionally, the ternary closed-complex is in a pre-chemistry conformation. Optionally, the closed-ternary complex is in a pre-translocation conformation.
The formation of this ternary closed-complex can be initiated and/or stabilized by modulating the availability of a catalytic metal ion that permits ternary closed-complex release and/or chemical incorporation of a nucleotide to the 3' end of the discontinuity. Exemplary catalytic metal ions include, but are not limited to, magnesium, manganese, cobalt, and barium.
Catalytic ions may be in any formulation, for example, salts such as MgCl2, Mg(C2H3O2)2, and MnCl2. Exemplary conditions for delivery and use of catalytic ions in a sequencing by binding reaction are set forth in US patent application no. 14/805,381, which is incorporated herein by reference.
A ternary closed-complex may be formed and/or stabilized by sequestering, removing, reducing, omitting, and/or chelating a catalytic metal ion during an examination step so that closed-complex release and/or chemical incorporation does not occur. Chelation can be carried out using any condition which renders the catalytic metal ion unavailable for nucleotide incorporation, including the presence of EDTA and/or EGTA. A reduction can be achieved by diluting the concentration of a catalytic metal ion in the reaction mixture. For example, the reaction mixture can be diluted or replaced with a solution comprising a non-catalytic ion, which permits closed-complex formation, but inhibits nucleotide incorporation. Non-catalytic ions include, but are not limited to, calcium, strontium, scandium, titanium, vanadium, chromium, iron, cobalt, nickel, copper, zinc, gallium, germanium, arsenic, selenium, and rhodium. Optionally, Ca2+ is provided in an examination reaction to facilitate closed-complex formation. Optionally, Sr2+ is provided in an examination reaction to facilitate closed-complex formation. Exemplary conditions for delivery and use of non-catalytic ions in a sequencing by binding reaction are set forth in US patent application no. 14/805,381, which is incorporated herein by reference.
Optionally, a non-catalytic metal ion and a catalytic metal ion are both present in the reaction mixture, wherein one ion is present in a higher effective concentration than the other.
Optionally, the affinity of the polymerase (e.g., Klenow fragment of E. coli DNA polymerase I, Bst) for each dNTP (e.g., dATP, dTTP, dCTP, dGTP) in the presence of a non-catalytic ion such as Sr2+ is different. Therefore, examination of the ternary complex can involve measuring the binding affinities of polymerase-double stranded nucleic acids to dNTPs, wherein binding affinity is indicative of the next base in the template strand of the double stranded nucleic acid. After examination, a wash step can be used to remove the non-catalytic ion and unbound nucleotides, and a catalytic metal ion can then be added to the reaction to induce PPi cleavage and nucleotide incorporation. In sequencing embodiments, the reaction may be repeated until a desired read-length is obtained.
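As a rough illustration of the affinity-based identification described above, the following Python sketch treats the dNTP giving the strongest binding signal as the cognate nucleotide and reports the complementary template base. The function name, the signal values and the min_ratio threshold are hypothetical and are not taken from this disclosure; the sketch also assumes that the cognate nucleotide is the one with the highest measured binding signal.

```python
# Hypothetical sketch: call the next template base from per-dNTP binding signals.
# Maps each dNTP to the template base it pairs with.
COMPLEMENT = {"dATP": "T", "dTTP": "A", "dCTP": "G", "dGTP": "C"}


def call_next_base(signals, min_ratio=2.0):
    """Return (cognate dNTP, template base), or None if the call is ambiguous.

    signals maps each dNTP name to a measured binding signal (arbitrary units).
    min_ratio is an assumed threshold: the strongest signal must exceed the
    runner-up by at least this factor for a call to be made.
    """
    ranked = sorted(signals.items(), key=lambda item: item[1], reverse=True)
    (best, best_value), (_, second_value) = ranked[0], ranked[1]
    if second_value > 0 and best_value / second_value < min_ratio:
        return None  # signals too close to distinguish
    return best, COMPLEMENT[best]


if __name__ == "__main__":
    example = {"dATP": 0.1, "dTTP": 0.2, "dCTP": 1.5, "dGTP": 0.3}
    print(call_next_base(example))
```

In practice the threshold would depend on the polymerase, the non-catalytic ion and the detection method used.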
A closed complex may be formed and/or stabilized by the addition of a polymerase inhibitor to the examination reaction mixture. Non-limiting examples of uncompetitive or noncompetitive inhibitors of polymerase activity that can be used include phosphonoacetate (phosphonoacetic acid), phosphonoformate (phosphonoformic acid, common name Foscarnet), suramin, aminoglycosides, INDOPY-1 and tagetitoxin. The binding of the inhibitor molecule near the active site of the enzyme traps the polymerase in either a pre-translocation or post-translocation step of the nucleotide incorporation cycle, stabilizing the polymerase in its ternary closed-complex conformation before or after the incorporation of a nucleotide, and forcing the polymerase to remain bound to the double stranded nucleic acid until the inhibitor molecules are made unavailable in the reaction mixture by removal, dilution or chelation. Another useful inhibitor molecule is the drug Efavirenz, which acts as an uncompetitive inhibitor of the HIV-1 reverse transcriptase. Other useful uncompetitive inhibitors include, for example, pyrophosphate analogs such as Foscarnet (phosphonoformate), phosphonoacetate or other pyrophosphate analogs.
Alternatively or additionally, polymerase inhibitors found to be effective in inhibiting an HIV-1 reverse transcriptase polymerase can be employed to stabilize a ternary closed-complex. Useful HIV-1 reverse transcriptase inhibitors include, for example, nucleoside/nucleotide reverse transcriptase inhibitors (NRTI) and non-nucleoside reverse transcriptase inhibitors (NNRTI). Exemplary conditions for delivery and use of polymerase inhibitors in a sequencing by binding reaction are set forth in US patent application no. 14/805,381, which is incorporated herein by reference.
Optionally, the stabilization of a closed-complex using polymerase inhibitors is combined with additional reaction conditions which also function to stabilize a closed-complex, including, but not limited to, sequestering, removing, reducing, omitting, and/or chelating a catalytic metal ion, the presence of a modified polymerase in the closed-complex, a non-incorporable nucleotide in the closed-complex, and any combination thereof.
Optionally, an engineered polymerase is used. The polymerase can be engineered to be stabilized in the ternary complex conformation compared to a native version of the polymerase. For example, a polymerase can be engineered to include cysteines that are positioned so that when a closed-complex is formed, the cysteines are in close proximity to form at least one disulfide bond to trap the polymerase in the closed conformation. A reducing agent such as 2-mercaptoethanol (BME), cysteine-HCl, dithiothreitol (DTT), tris(2-carboxyethyl)phosphine (TCEP), or any combination thereof may be used to reduce the disulfide bond and release the polymerase. Alternatively or additionally, a polymerase can be stabilized in ternary complex form via cross-linking residues of the polymerase. The cross-linking methods can be reversible or non-reversible. Exemplary methods for making and using engineered polymerases and/or crosslinking chemistry in a sequencing by binding method are set forth in US patent application no. 14/805,381, which is incorporated herein by reference.
Optionally, a closed-complex of an examination step comprises a nucleotide analog or modified nucleotide to facilitate stabilization of the closed-complex. Optionally, a nucleotide analog comprises a nitrogenous base, five-carbon sugar, and phosphate group; wherein any component of the nucleotide may be modified and/or replaced. Nucleotide analogs may be non-incorporable nucleotides. Non-incorporable nucleotides may be modified to become incorporable at any point during the sequencing method. Exemplary methods for making and using non-incorporable nucleotide analogs in a sequencing by binding method are set forth in US patent application no. 14/805,381, which is incorporated herein by reference. Other useful analogs are set forth in U.S. Patent No. 8,071,755 which is incorporated herein by reference.
In an exemplary sequencing reaction, the examination step comprises formation and/or stabilization of a ternary closed-complex comprising a polymerase, double stranded nucleic acid, and nucleotide. Characteristics of the formation, presence and/or release of the closed-complex are detected or monitored to identify the enclosed nucleotide and therefore the next base in the template strand of the double stranded nucleic acid. Optionally, ternary closed-complex characteristics are dependent on the sequencing reaction components (e.g., polymerase, double stranded nucleic acid, nucleotide) and/or reaction mixture components and/or conditions.
In particular embodiments, the examination step involves detecting or monitoring the interaction of a polymerase with a double stranded nucleic acid and nucleotide. The formation of a ternary closed-complex may be detected or monitored. Optionally, the absence of formation of a ternary closed-complex is determined based on the results of a detection or monitoring step. Optionally, the dissociation of a ternary closed-complex is detected or monitored. Optionally, the incorporation step involves detecting or monitoring incorporation of a nucleotide. Alternatively, detection or monitoring can be carried out prior to incorporation. As such, a nucleic acid detection or sequencing method of the present disclosure may be carried out without detecting signals due to incorporation of a nucleotide and without detecting signals from a nucleotide after it has been incorporated.
Optionally, any process of the examination and/or incorporation step may be detected or monitored. Optionally, a nucleotide does not have a detectable label. For example, a polymerase can be labeled instead. Optionally, a nucleotide has a detectable label. A polymerase can have a detectable label in a method set forth herein, whether or not the nucleotide in a ternary complex with the polymerase is labeled. Optionally, a detectable label present on a nucleotide or polymerase is removable under conditions that are not disruptive of other reaction components that are to be maintained to complete a method set forth herein.
Alternatively, no component of the sequencing reaction is detectably labeled. For example, a method that detects unlabeled molecular complexes, such as the surface plasmon resonance techniques set forth below, can be used.
Optionally, detecting or monitoring the variation in affinity of a polymerase to a double stranded nucleic acid in the presence of correct and incorrect nucleotides, under conditions that may or may not allow the incorporation of the nucleotide, may be used to determine the sequence of the nucleic acid. The affinity of a polymerase to a double stranded nucleic acid in the presence of different nucleotides, including modified or labeled nucleotides, can be monitored as the on-rate or off-rate of the polymerase-nucleic acid interaction in the presence of the various nucleotides. Exemplary methods for detecting kinetics of nucleotide binding in a sequencing by binding method are set forth in US patent application no. 14/805,381, which is incorporated herein by reference.
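The relationship between the monitored rates and affinity can be sketched as follows, assuming a simple one-step binding model in which the equilibrium dissociation constant is the off-rate divided by the on-rate. The model, the function names and the rate values are illustrative assumptions, not measurements or methods from this disclosure.

```python
# Hypothetical sketch: rank nucleotides by polymerase-nucleic acid affinity
# computed from measured on-rates and off-rates (one-step binding model).


def dissociation_constant(k_on, k_off):
    """Return K_d = k_off / k_on (k_on in 1/(M*s), k_off in 1/s, K_d in M)."""
    if k_on <= 0:
        raise ValueError("k_on must be positive")
    return k_off / k_on


def rank_by_affinity(rates):
    """Sort nucleotide names from tightest binding (lowest K_d) to weakest.

    rates maps each nucleotide name to a (k_on, k_off) tuple.
    """
    return sorted(rates, key=lambda name: dissociation_constant(*rates[name]))


if __name__ == "__main__":
    # Made-up rate constants, for illustration only.
    measured = {
        "dATP": (1e5, 1.0),
        "dCTP": (1e6, 0.01),
        "dGTP": (1e5, 0.5),
        "dTTP": (1e5, 2.0),
    }
    print(rank_by_affinity(measured))
```

Under these assumptions the nucleotide at the head of the ranking would be taken as the candidate next correct nucleotide.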
Optionally, the interaction between the polymerase and the nucleic acid is detected or monitored via a detectable label attached to the polymerase. The label may be detected by methods including, but not limited to, optical, electrical, thermal, mass, size, charge, vibration, and pressure. The label may be magnetic, fluorescent or charged. For external and internal label schemes, fluorescence anisotropy may be used to determine the stable binding of a polymerase to a nucleic acid in a closed-complex. Exemplary polymerase labels and techniques for their detection in a sequencing by binding method are set forth in US patent application no. 14/805,381, which is incorporated herein by reference.
Optionally, a conformationally sensitive dye may be attached close to the active site of a polymerase, wherein a change in conformation, or a change in polar environment due to the formation of a ternary closed-complex, is reflected as a change in fluorescence or absorbance properties of the dye. Accordingly, polymerases that form a ternary complex with a double stranded nucleic acid and nucleotide can be distinguished from polymerases that do not form a ternary complex based on differences in fluorescence or absorbance signals from a conformationally sensitive dye. As such, the identity of the next correct nucleotide for a double stranded nucleic acid can be determined from the signals obtained from a conformationally sensitive dye on a polymerase that forms a ternary complex with the nucleic acid and next correct nucleotide.
Optionally, one or more nucleotides are attached to a label that is detected in an examination step of a method set forth herein. Different labels on different types of nucleotides may be distinguishable by means of their differences in fluorescence, Raman spectrum, charge, mass, refractive index, luminescence, length, or any other measurable property. Under suitable reaction conditions, the labeled nucleotides may be enclosed in a closed-complex with the polymerase and the double stranded nucleic acid, and the identity of the nucleotide identified from the attached label. Exemplary nucleotide labels and techniques for their detection in a sequencing by binding method are set forth in US patent application no. 14/805,381, which is incorporated herein by reference.
Optionally, the interaction between a polymerase, double stranded nucleic acid and nucleotide is detected or monitored without the use of a label. The interaction may be detected or monitored by detecting the change in refractive index, charge detection, Raman scattering detection, ellipsometry detection, pH detection, size detection, mass detection, surface plasmon resonance, guided mode resonance, nanopore optical interferometry, whispering gallery mode resonance, nanoparticle scattering, photonic crystal, quartz crystal microbalance, bio-layer interferometry, vibrational detection, pressure detection and other label-free detection schemes that detect the added mass or refractive index due to polymerase binding in a ternary closed-complex with a double stranded nucleic acid. Exemplary methods for label-free detection of ternary complexes in a sequencing by binding method are set forth in US patent application no. 14/805,381, which is incorporated herein by reference.
Optionally, the sequencing methods described herein involve an incorporation step. The incorporation step involves chemically incorporating one or more nucleotides to the 3' end of the first strand of the double stranded nucleic acid. In exemplary options, a single nucleotide is incorporated at the 3' end of a discontinuity in a double stranded nucleic acid. Optionally, multiple nucleotides are incorporated at the 3' end of the discontinuity.
Optionally, multiple nucleotides of the same kind are incorporated at the 3' end of the discontinuity. Optionally, multiple nucleotides of different kinds are incorporated at the 3' end of the discontinuity. Optionally, the polymerase dissociates from the polymerization initiation site after nucleotide incorporation. Optionally, the polymerase is retained at the polymerization initiation site after incorporation. The polymerase may be trapped at the 3' end of the first strand of the double stranded nucleic acid after the incorporation reaction in the pre-translocation state, post-translocation state, an intermediate state thereof, or a binary complex state. The incorporation reaction may be facilitated by an incorporation reaction mixture. Optionally, the incorporation reaction mixture comprises a different composition of nucleotides than the examination reaction. For example, the examination reaction comprises one type of nucleotide and the incorporation reaction comprises another type of nucleotide. In another example, the examination reaction comprises one type of nucleotide and the incorporation reaction comprises four types of nucleotides, or vice versa. Optionally, the examination reaction mixture is altered or replaced by the incorporation reaction mixture.
Nucleotides present in an incorporation reaction mixture which are not sequestered in a ternary closed-complex may cause multiple nucleotide insertions. Thus, a wash step is optionally employed prior to the chemical incorporation step to ensure only the nucleotide sequestered within a trapped ternary closed-complex is available for incorporation during the incorporation step. The trapped polymerase complex may be a ternary closed-complex, a stabilized closed-complex or another complex involving the polymerase, double stranded nucleic acid and next correct nucleotide.
Optionally, the nucleotide enclosed within the ternary closed-complex of the examination step is subsequently incorporated at the 3' end of the discontinuity of the double stranded nucleic acid during the incorporation step. For example, a stabilized ternary closed-complex of the examination step comprises an incorporated next correct nucleotide.
Optionally, the nucleotide enclosed within the ternary closed-complex of the examination step is incorporated during the examination step, but the ternary closed-complex does not allow for the incorporation of a subsequent nucleotide; optionally, the ternary closed-complex is released during an incorporation step, allowing for another nucleotide (of the same or different type) to become incorporated.
Optionally, the incorporation step comprises replacing a nucleotide from the examination step (e.g., the nucleotide is an incorrect nucleotide) and incorporating another nucleotide to the 3' end of the discontinuity of the double stranded nucleic acid. Optionally, the incorporation step comprises releasing a nucleotide from within a ternary closed-complex (e.g., the nucleotide is a modified nucleotide or nucleotide analog) and incorporating a nucleotide of a different kind to the 3' end of the discontinuity of the double stranded nucleic acid. Optionally, the released nucleotide is removed and replaced with an incorporation reaction mixture comprising a next correct nucleotide.
Suitable reaction conditions for incorporation may involve replacing an examination reaction mixture with an incorporation reaction mixture. Optionally, nucleotides present in the examination reaction mixture are replaced with one or more nucleotides in the incorporation reaction mixture. Optionally, the polymerase present during the examination step is replaced during the incorporation step. Optionally, the polymerase present during the examination step is modified during the incorporation step, for example, being modified from a stabilized ternary complex form in the examination step to a destabilized ternary complex form in the incorporation step. Optionally, the one or more nucleotides present during the examination step are modified during the incorporation step. The reaction mixture and/or reaction conditions present during the examination step may be altered by any means during the incorporation step. These means include, but are not limited to, removing reagents that stabilized a ternary complex, chelating reagents that stabilized a ternary complex, diluting reagents that stabilized a ternary complex, adding reagents that destabilize a ternary complex, altering reaction conditions such as conductivity or pH to destabilize a ternary complex, and any combination thereof. The reagents in the reaction mixture, including any combination of polymerase, double stranded nucleic acid, and nucleotide, may each be modified during the examination step and/or incorporation step.
An advantage of some embodiments of the methods set forth herein is that the next correct base is detected before the incorporation step, allowing the incorporation step to not require labeled reagents and/or monitoring. Optionally, a nucleotide does not contain an attached detectable label. The examination step of the sequencing reaction may be repeated 1, 2, 3, 4 or more times prior to the incorporation step. The examination and incorporation steps may be repeated until the desired sequence of the double stranded nucleic acid is obtained.
A further advantage of some embodiments of the methods set forth herein is the ability to control incorporation of a discrete number of nucleotides per sequencing cycle. More specifically, the methods can employ polymerase-based manipulations at the 3' end of a discontinuity in a double stranded nucleic acid, such that the presence of downstream double stranded regions (i.e. at the 5' end of the discontinuity) provides a block to extension. For example, the distance between the ends of the discontinuity (i.e. gap size) limits the number of nucleotides that a non-strand displacing polymerase can add to the 3' end of the discontinuity. Performing the polymerase-based methods of the present disclosure at a discontinuity in a double stranded nucleic acid enhances sequencing accuracy for nucleic acid regions comprising homopolymer repeats because the length of the extension, which would be unknown or ambiguous for a homopolymer of unknown length, can be determined from a predefined gap size in the nucleic acid being sequenced. Moreover, the block to extension is reversible. For example, an exonuclease enzyme or the exonuclease activity of a polymerase can be used to translate the 5' end of the discontinuity downstream for subsequent cycles of sequencing.
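The effect of gap size on extension through a homopolymer can be illustrated with a simplified model. In the Python sketch below, a non-strand displacing polymerase is supplied with a single nucleotide type and extends until it meets a template base that does not pair with that nucleotide or until it reaches the downstream end of the gap, whichever comes first. The function name and the example sequences are hypothetical, and the model ignores misincorporation and enzyme kinetics.

```python
# Hypothetical sketch: how gap size caps extension through a homopolymer run.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def extension_length(template_ahead, gap_size, nucleotide):
    """Count the bases added by a non-strand displacing polymerase.

    template_ahead: template bases opposite the discontinuity, listed in the
        order the polymerase would copy them.
    gap_size: number of unpaired template positions before the downstream
        double stranded region blocks extension.
    nucleotide: the single base type supplied ("A", "C", "G" or "T").
    """
    added = 0
    for template_base in template_ahead[:gap_size]:
        if PAIRS[template_base] != nucleotide:
            break  # the next template base does not pair with the supplied type
        added += 1
    return added


if __name__ == "__main__":
    run = "AAAAAC"  # five-base homopolymer followed by a different base
    print(extension_length(run, 1, "T"))  # one-base gap: one base added
    print(extension_length(run, 6, "T"))  # wide gap: the whole run is copied
```

With a one-base gap the extension length is fixed at one regardless of the homopolymer length, which is the property the paragraph above relies on.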
Accordingly, the present disclosure provides a method for identifying nucleotides in a nucleic acid that includes the steps of (a) providing a double stranded nucleic acid having a template strand and a second strand having a nick, wherein the nick includes a 3' hydroxyl adjacent to a position of the second strand that pairs with a first position in the template strand; (b) forming a stabilized ternary complex including a polymerase, the double stranded nucleic acid and a cognate nucleotide that binds at the position of the second strand that pairs with the first position of the template strand, wherein the stabilized ternary complex is formed under strand displacing conditions; (c) examining the stabilized ternary complex, thereby acquiring a signal for identifying the nucleotide at the first position of the template strand; (d) after the examining of the stabilized ternary complex, (i) contacting the double stranded nucleic acid with a nuclease to convert the nick to a base gap, and (ii) covalently adding a nucleotide to the 3' hydroxyl at the position of the second strand that pairs with the first position of the template strand, thereby translating the nick downstream and identifying the nucleotide at the first position of the template strand from the acquired signals.
Optionally, the steps (b) through (d) can be repeated, thereby determining a nucleotide sequence of the template strand from the acquired signals.
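The repetition of steps (b) through (d) can be sketched as a toy simulation in which each cycle identifies one template base and moves the nick one position downstream. The Python below is only a bookkeeping model under stated assumptions: the function name and index representation are hypothetical, the nuclease is assumed to remove exactly one downstream base per cycle, and examination is assumed to be error-free.

```python
# Hypothetical sketch: repeated examine / nuclease / extend cycles at a nick.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def nick_translation_read(template, nick_pos, cycles):
    """Return (template bases identified, final nick position).

    template: template-strand bases in the order the polymerase reads them.
    nick_pos: index of the template position paired just downstream of the
        3' hydroxyl at the start of the run.
    """
    three_prime = nick_pos  # next template index to be paired by extension
    five_prime = nick_pos   # first index still covered by the downstream strand
    calls = []
    for _ in range(cycles):
        if three_prime >= len(template):
            break  # reached the end of the template
        # Steps (b)-(c): the cognate nucleotide held in the stabilized ternary
        # complex reveals the template base at the examined position.
        cognate = COMPLEMENT[template[three_prime]]
        calls.append(COMPLEMENT[cognate])
        # Step (d)(i): the nuclease removes one downstream base (nick -> gap).
        five_prime += 1
        # Step (d)(ii): one nucleotide is added to the 3' hydroxyl (gap -> nick).
        three_prime += 1
        assert three_prime == five_prime  # the discontinuity is a nick again
    return "".join(calls), three_prime


if __name__ == "__main__":
    print(nick_translation_read("GATTACA", 2, 3))
```

Each pass through the loop corresponds to one sequencing cycle, so the read length equals the number of cycles completed before the template ends.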
Also provided is a method for identifying nucleotides in a nucleic acid that includes the steps of (a) providing a double stranded nucleic acid having a template strand and a second strand having a gap, wherein the gap includes a 3' hydroxyl adjacent to a position of the second strand that pairs with a first position in the template strand; (b) forming a stabilized ternary complex including a polymerase, the double stranded nucleic acid and a cognate nucleotide that binds at the position of the second strand that pairs with the first position of the template strand, wherein the stabilized ternary complex is formed under non-strand displacing conditions; (c) examining the stabilized ternary complex, thereby acquiring a signal for identifying the nucleotide at the first position of the template strand; (d) after the examining of the stabilized ternary complex, covalently adding a nucleotide to the 3' hydroxyl at the position of the second strand that pairs with the first position of the template strand, thereby translating the gap to a downstream nick and identifying the nucleotide at the first position of the template strand from the acquired signals. Optionally, the method can further include steps of (e) forming a stabilized ternary complex including a polymerase, the double stranded nucleic acid and a cognate nucleotide that binds at the position of the second strand that pairs with the first position of the template strand, wherein the stabilized ternary complex is formed under strand displacing conditions; (f) examining the stabilized ternary complex, thereby acquiring a signal for identifying the nucleotide at the first position of the template strand; (g) after the examining of the stabilized ternary complex, (i) contacting the double stranded nucleic acid with a nuclease to convert the nick to a gap, and (ii) covalently adding a nucleotide to the 3' hydroxyl at the position of the second strand that pairs with the first position of the template strand, thereby translating the nick downstream and identifying the nucleotide at the first position of the template strand from the acquired signals.
Optionally, the steps (e) through (g) can be repeated, thereby determining a nucleotide sequence of the template strand from the acquired signals.
In particular embodiments, a single nucleotide is covalently added to the 3' hydroxyl in step (d)(ii) by contacting the double stranded nucleic acid with a polymerase, endonuclease and nucleotide that binds at the position that pairs with the first position. Optionally, the polymerase and endonuclease are simultaneously present during the reaction.
In particular embodiments, the endonuclease is a flap endonuclease such as FEN1. Alternatively or additionally, the polymerase can be polymerase beta or polymerase delta.
Exemplary incorporation conditions for adding a single nucleotide at a nick using FEN1 (flap endonuclease 1) and polymerase beta are set forth in Liu et al, J. Biol. Chem. 280:3665-3674 (2005), which is incorporated herein by reference. A method of the present disclosure can also employ polymerase delta3 in combination with FEN1 to add a single nucleotide at a nick during an incorporation step, for example, using conditions set forth in Lin et al, DNA Repair (Amst) 2013 November, 12(11) (doi: 10.1016/j.dnarep.2013.08.008), which is incorporated herein by reference.
Optionally, a nucleic acid detection or sequencing method of the present disclosure involves a plurality of double stranded nucleic acids, polymerases and/or nucleotides, wherein a plurality of ternary closed-complexes is detected or monitored. For example, double stranded nucleic acids, or other reaction components, can be attached to features of an array such as those exemplified previously herein. Clonally amplified double stranded nucleic acids may be sequenced together wherein the clones of a common template are localized in close proximity to allow for enhanced detecting or monitoring of the population of clones. Optionally, the formation of a ternary closed-complex ensures the synchronicity of base extension across a plurality of clonally amplified double stranded nucleic acids at one or more features of an array. The synchronicity of base extension allows for the addition of only one base per sequencing cycle.
The methods of the present disclosure can be performed on a platform, where any component of the nucleic acid polymerization reaction is localized to a surface. Optionally, the double stranded nucleic acid is attached to a planar substrate, a microparticle, or a nanoparticle. Such surfaces can be features of an array, for example, in a multiplex embodiment. Optionally, all reaction components are freely suspended in the reaction mixture.
Following an examination step, where the next base has been identified via formation of a ternary closed-complex, the reaction conditions may be reset, recharged, or modified as appropriate, in preparation for the optional incorporation step or an additional examination step. Optionally, all components of the examination step, excluding the double stranded nucleic acid being sequenced, are removed or washed away, returning the system to the pre-examination condition. Optionally, partial components of the examination step are removed. Optionally, additional components are added to the examination step.
Disclosed herein, in part, are reagent cycling sequencing methods, wherein sequencing reagents are introduced, one after another, for every cycle of examination and/or incorporation. Optionally, the sequencing reaction mixture comprises a polymerase, a double stranded nucleic acid and at least one type of nucleotide. Optionally, the nucleotide and/or polymerase are introduced cyclically to the sequencing reaction mixture. In exemplary embodiments, the sequencing reaction mixture comprises a plurality of polymerases, double stranded nucleic acids, and nucleotides (e.g. in an array format). Optionally, a plurality of nucleotides and/or a plurality of polymerases are introduced cyclically to the sequencing reaction mixture. Optionally, reagent cycling involves immobilizing a double stranded nucleic acid to a platform, washing away the current reaction mixture, and adding a new reaction mixture to the double stranded nucleic acid.
Furthermore, this document discloses apparatus and methods for base-cycling, wherein the identity of a sequence of bases on a double stranded nucleic acid is determined by manipulating reagents using methods such as fluidic pumping, electrical manipulation, magnetic manipulation, mechanical manipulation, thermal manipulation and optical manipulation.
Optionally, one or more types of nucleotides are sequentially added to and removed from the sequencing reaction. Optionally, 1, 2, 3, 4, or more types of nucleotides are added to and removed from the reaction mixture. For example, one type of nucleotide is added to the sequencing reaction, removed, and replaced by another type of nucleotide. Optionally, a nucleotide type present during the examination step is different from a nucleotide type present during the incorporation step. Optionally, a nucleotide type present during one examination step is different from a nucleotide type present during a sequential examination step (i.e. the sequential examination step is performed prior to an incorporation step).
Optionally, 1, 2, 3, 4 or more types of nucleotide are present in the examination reaction mixture and 1, 2, 3, 4, or more types of nucleotides are present in the incorporation reaction mixture.
Optionally, a polymerase is cyclically added to and removed from a sequencing reaction. Optionally, one or more different types of polymerases are cyclically added to and removed from the sequencing reaction. Optionally, a polymerase type present during the examination step is different from a polymerase type present during the incorporation step. For example, a labeled polymerase can be used in an examination step whereas the labels are not present on the polymerase used in the incorporation step. Optionally, a polymerase type present during one examination step is different from a polymerase type present during a sequential examination step (i.e. the sequential examination step is performed prior to an incorporation step).
Optionally, reaction conditions such as the presence of reagents, pH, temperature, and ionic strength are varied throughout the sequencing reaction. Optionally, a metal is cyclically added to and removed from the sequencing reaction. For example, a catalytic metal ion is absent during an examination step and present during an incorporation step. In another example, a polymerase inhibitor (e.g. non-catalytic metal ion) is present during an examination step and absent during an incorporation step. Optionally, reaction components that are consumed during the sequencing reaction are supplemented with the addition of new components at any point during the sequencing reaction.
Optionally, nucleotides are added one type at a time, with the polymerase, to a reaction condition which favors ternary closed-complex formation. The polymerase binds to the double stranded nucleic acid only if the next correct nucleotide is present. A wash step after every nucleotide addition removes excess polymerases and nucleotides not involved in a ternary closed-complex. If the nucleotides are added one at a time, in a known order, the next base on the template strand of the double stranded nucleic acid is determined by the formation of a ternary closed-complex when the added nucleotide is the next correct nucleotide.
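The one-type-at-a-time scheme described above can be sketched as a loop over a fixed nucleotide order. In the Python below, complex formation is modeled as a simple base-pairing check; the cycling order, the function names and the choice to restart the order after each incorporation are assumptions made for illustration only.

```python
# Hypothetical sketch: identify template bases by adding one nucleotide type
# at a time, in a known order, and noting which addition forms a complex.
CYCLE_ORDER = ("dATP", "dGTP", "dCTP", "dTTP")  # assumed order of addition
PAIRS_WITH = {"dATP": "T", "dGTP": "C", "dCTP": "G", "dTTP": "A"}


def complex_forms(template_base, nucleotide):
    """Model: a ternary closed-complex forms only with the cognate nucleotide."""
    return PAIRS_WITH[nucleotide] == template_base


def sequence_by_cycling(template):
    """Return (template bases called, number of nucleotide additions used)."""
    calls = []
    additions = 0
    for template_base in template:
        for nucleotide in CYCLE_ORDER:
            additions += 1
            if complex_forms(template_base, nucleotide):
                # Complex detected: the added type is the next correct
                # nucleotide, so the template base is its pairing partner.
                calls.append(PAIRS_WITH[nucleotide])
                break
            # No complex: wash away polymerase and nucleotide, try the next type.
    return "".join(calls), additions


if __name__ == "__main__":
    print(sequence_by_cycling("TGC"))
```

At most four additions are needed per template position in this model, since one of the four nucleotide types is always the next correct nucleotide.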
Alternatively, 2, 3, 4 or more types of nucleotides (e.g. dATP, dGTP, dCTP, dTTP) can be present together at the same time in a reaction condition which favors ternary closed-complex formation, wherein one type of nucleotide is a next correct nucleotide.
Optionally, 1, 2, 3, 4, or more nucleotide types (e.g., dATP, dTTP, dCTP, dGTP) are tethered to 1, 2, 3, 4, or more different polymerases; wherein each nucleotide type is tethered to a different polymerase and each polymerase has a different label or a feature from the other polymerases to enable its identification. In this embodiment, all tethered nucleotide types are added together to a sequencing reaction mixture forming a ternary closed-complex comprising a tethered nucleotide-polymerase; the ternary closed-complex is monitored to identify the polymerase, thereby identifying the next correct nucleotide to which the polymerase is tethered. The tethering may occur at the gamma phosphate of the nucleotide through a multi-phosphate group and a linker molecule. Such gamma-phosphate linking methods are standard in the art, where a fluorophore is attached to the gamma phosphate linker. The labels include, but are not limited to, optical, electrical, thermal, colorimetric, mass, or any other detectable feature. Optionally, different nucleotide types are identified by distinguishable labels. Optionally, the distinguishable labels are attached to the gamma phosphate position of each nucleotide.
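Decoding in the tethered embodiment amounts to a lookup from the detected polymerase label to the nucleotide type tethered to that polymerase. The Python sketch below shows one way this could be expressed; the label names and the rule that a call is made only when exactly one label is detected are hypothetical choices, not details from this disclosure.

```python
# Hypothetical sketch: map a detected polymerase label to its tethered nucleotide.
# Label names are placeholders; any four distinguishable labels could be used.
LABEL_TO_NUCLEOTIDE = {
    "label_1": "dATP",
    "label_2": "dTTP",
    "label_3": "dCTP",
    "label_4": "dGTP",
}


def identify_next_nucleotide(detected_labels):
    """Return the tethered nucleotide for a feature, or None if ambiguous.

    detected_labels: iterable of polymerase labels observed at one feature
    after unbound tethered nucleotide-polymerases have been washed away.
    """
    known = {label for label in detected_labels if label in LABEL_TO_NUCLEOTIDE}
    if len(known) != 1:
        return None  # no label, or more than one label, was detected
    return LABEL_TO_NUCLEOTIDE[known.pop()]


if __name__ == "__main__":
    print(identify_next_nucleotide(["label_3"]))
```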
Optionally, the sequencing reaction mixture comprises a catalytic metal ion.
Optionally, the catalytic metal ion is available to react with a polymerase at any point in the sequencing reaction in a transient manner. To ensure robust sequencing, the catalytic metal ion is available for a brief period of time, allowing for a single nucleotide complementary to the next base in the template strand of the double stranded nucleic acid to be incorporated to the 3' end of the primer during an incorporation step. In this instance, no other nucleotides, for example, the nucleotides complementary to the bases downstream of the next base in the template strand of the double stranded nucleic acid, are incorporated. Optionally, the catalytic metal ion magnesium is present as a photocaged complex (e.g., DM-Nitrophen) in the sequencing reaction mixture such that localized UV illumination releases the magnesium, making it available to the polymerase for nucleotide incorporation. Furthermore, the sequencing reaction mixture may contain EDTA, wherein the magnesium is released from the polymerase active site after catalytic nucleotide incorporation and captured by the EDTA in the sequencing reaction mixture, thereby rendering magnesium incapable of catalyzing a subsequent nucleotide incorporation.
This disclosure provides, in part, for a catalytic metal ion to be present in a sequencing reaction in a chelated or caged form from which it can be released by a trigger. For example, the catalytic metal ion catalyzes the incorporation of the closed-complex next correct nucleotide, and as the catalytic metal ion is released from the active site, it is sequestered by a second chelating or caging agent, disabling the metal ion from catalyzing a subsequent incorporation. The localized release of the catalytic metal ion from its chelating or caged complex is ensured by using a localized uncaging or un-chelating scheme, such as an evanescent wave illumination or a structured illumination. Controlled release of the catalytic metal ions may occur, for example, by thermal means. Catalytic metal ions may be released from their photocaged complex locally near the double stranded nucleic acid by confined optical fields, for instance by evanescent illumination such as waveguides or total internal reflection microscopy. Controlled release of the catalytic metal ions may also occur, for example, by altering the pH of the solution in the vicinity of the double stranded nucleic acid. Metal binding by chelating agents such as EDTA and EGTA is pH dependent. At a pH below 5, divalent cations Mg2+ and Mn2+ are not effectively chelated by EDTA. A method to controllably manipulate the pH near the double stranded nucleic acid allows the controlled release of a catalytic metal ion from a chelating agent. Optionally, the local pH change is induced by applying a voltage to the surface to which the nucleic acid is attached. The pH method offers an advantage in that the metal goes back to its chelated form when the pH is reverted back to the chelating range.
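The pH dependence of chelation described above can be illustrated with the standard conditional formation constant, K' = α(Y4-) × Kf, where α(Y4-) is the fraction of uncomplexed EDTA in its fully deprotonated form. The sketch below models only Mg2+ and assumes approximate textbook constants (EDTA pKa values and log Kf = 8.79 for Mg-EDTA) and a 1 mM excess of free EDTA; none of these values come from this disclosure, and other metal ions have different constants.

```python
# Approximate textbook constants; assumptions for illustration only.
EDTA_PKA = (0.0, 1.5, 2.00, 2.69, 6.13, 10.37)  # six stepwise pKa values of EDTA
LOG_KF_MG = 8.79                                # log formation constant, Mg-EDTA


def alpha_y4(ph):
    """Fraction of uncomplexed EDTA present as the fully deprotonated Y4- form."""
    h = 10.0 ** (-ph)
    n = len(EDTA_PKA)
    terms = [h ** n]            # fully protonated species
    k_product = 1.0
    for i, pka in enumerate(EDTA_PKA, start=1):
        k_product *= 10.0 ** (-pka)
        terms.append(k_product * h ** (n - i))
    return terms[-1] / sum(terms)  # last term corresponds to Y4-


def conditional_kf(ph, log_kf=LOG_KF_MG):
    """Conditional (effective) formation constant at the given pH."""
    return alpha_y4(ph) * 10.0 ** log_kf


def fraction_chelated(ph, free_edta_molar=1e-3, log_kf=LOG_KF_MG):
    """Fraction of metal bound, assuming EDTA is in large excess over the metal."""
    k = conditional_kf(ph, log_kf) * free_edta_molar
    return k / (1.0 + k)
```

Under these assumptions, `fraction_chelated(8.0)` is close to 1 while `fraction_chelated(4.0)` is close to 0, which is the reversible release-and-recapture behavior the pH method relies on.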
Optionally, where a catalytic metal ion is strongly bound to the active site of the polymerase, the polymerase is removed from the template strand of the double stranded nucleic acid after a single nucleotide incorporation. The removal of polymerase may be accomplished by the use of a highly distributive polymerase, which falls off the 3' end of the strand being synthesized after the addition of every nucleotide. The unbound polymerase may further be subjected to an electric or magnetic field to remove it from the vicinity of the nucleic acid molecules. Any metal ions bound to the polymerase may be sequestered by chelating agents present in the sequencing reaction mixture, or by molecules which compete with the metal ions for binding to the active site of the polymerase without disturbing the formation of the closed-complex. The forces which remove or move the polymerase away from the double stranded nucleic acid (e.g., electric field, magnetic field, chelating agent) may be terminated, allowing the polymerase to approach the double stranded nucleic acid for another round of sequencing (i.e., examination and incorporation). The next round of sequencing, in a non-limiting example, comprises forming a closed-complex, removing unbound polymerase from the vicinity of the double stranded nucleic acid and/or closed-complex, controlling the release of a catalytic metal ion to incorporate a single nucleotide sequestered within the closed-complex, removing the polymerase, which dissociates from the double stranded nucleic acid after the single incorporation, from the vicinity of the double stranded nucleic acid, sequestering any free catalytic metal ions through the use of chelating agents or competitive binders, and allowing the polymerase to approach the double stranded nucleic acid to perform the next cycle of sequencing.
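The cycle described above (examination without catalytic metal, triggered release of a metal ion, a single incorporation, then sequestration of the metal) can be sketched as a toy state machine. This is a minimal illustration under simplifying assumptions: the class and function names are hypothetical, binding is treated as error-free, and no kinetics are modeled.

```python
from dataclasses import dataclass, field

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


@dataclass
class ToyReaction:
    """Toy model of one primed template; not a kinetic simulation."""
    template: str            # template bases in the order they are read
    position: int = 0        # index of the next template base to examine
    free_metal: int = 0      # catalytic metal ions currently available
    calls: list = field(default_factory=list)

    def examine(self):
        """Form the closed-complex without catalytic metal and record the cognate base."""
        assert self.free_metal == 0, "examination assumes no free catalytic metal"
        cognate = COMPLEMENT[self.template[self.position]]
        self.calls.append(cognate)
        return cognate

    def release_metal(self):
        """A trigger (e.g. light, heat, or pH) frees a single catalytic metal ion."""
        self.free_metal = 1

    def incorporate(self):
        """Add one nucleotide if metal is available; the metal is then sequestered."""
        if self.free_metal == 0:
            return False
        self.position += 1
        self.free_metal = 0  # captured by the second chelating agent
        return True


def run_cycles(template):
    """Run examination and single-base incorporation until the template is read."""
    rxn = ToyReaction(template)
    while rxn.position < len(template):
        rxn.examine()
        rxn.release_metal()
        rxn.incorporate()
        # A second attempt fails because the metal ion has been sequestered.
        assert not rxn.incorporate()
    return "".join(rxn.calls)
```

For example, `run_cycles("ACGT")` returns `"TGCA"`, one call per cycle, and each cycle advances the primer by exactly one position.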
All publications, patents, and patent applications mentioned in this specification are herein incorporated by reference to the same extent as if each individual publication, patent, or patent application was specifically and individually indicated to be incorporated by reference.
Claims
1. A method for sequencing a nucleic acid, comprising
(a) providing a double stranded nucleic acid comprising a template strand and a second strand comprising a discontinuity, wherein the discontinuity comprises a 3' hydroxyl at a nick or gap, the 3' hydroxyl being adjacent to a position of the second strand that pairs with a first position in the template strand;
(b) forming a stabilized ternary complex comprising a polymerase, the double stranded nucleic acid and a cognate nucleotide that binds at the position of the second strand that pairs with the first position of the template strand;
(c) examining the stabilized ternary complex, thereby acquiring a signal for identifying the nucleotide at the first position of the template strand;
(d) after the examining of the stabilized ternary complex, covalently adding a nucleotide to the 3' hydroxyl at the position of the second strand that pairs with the first position of the template strand, thereby translating the discontinuity downstream; and
(e) repeating steps (b) through (d), thereby determining a nucleotide sequence of the template strand from the acquired signals.
2. The method of claim 1, wherein the discontinuity comprises a 3' hydroxyl at a nick.
3. The method of claim 2, wherein the stabilized ternary complex is formed under strand displacing conditions.
4. The method of claim 2, further comprising contacting the double stranded nucleic acid with a nuclease to convert the nick to a gap.
5. The method of claim 2, wherein the stabilized ternary complex is formed under non-strand displacing conditions.
6. The method of claim 5, wherein the stabilized ternary complex is formed using a polymerase having 5'-3' exonuclease activity, thereby converting the nick to a gap.
7. The method of claim 1, wherein the discontinuity comprises a 3' hydroxyl at a gap.
8. The method of claim 7, wherein the stabilized ternary complex is formed under non-strand displacing conditions.
9. The method of claim 1, wherein the covalently adding comprises contacting the double stranded nucleic acid with a polymerase, endonuclease and nucleotide that binds at the position that pairs with the first position.
10. The method of claim 9, wherein the endonuclease comprises a flap endonuclease.
11. The method of claim 9, wherein the polymerase comprises Pol Beta or Pol Delta.
12. The method of claim 1, wherein the cognate nucleotide of the stabilized ternary complex lacks a label that produces the signal.
13. The method of claim 12, wherein the polymerase of the stabilized ternary complex comprises a label that produces the signal.
14. The method of claim 1, wherein the nucleotide that is added in step (d) comprises an extendable 3' hydroxyl moiety.
15. The method of claim 1, wherein the nucleotide that is added in step (d) lacks an exogenous label moiety.
16. The method of claim 1, further comprising a step of removing the stabilized ternary complex from the double stranded nucleic acid between steps (c) and (d).
17. The method of claim 1, wherein step (d) comprises, after the examining of the stabilized ternary complex, modifying the stabilized ternary complex to add the cognate nucleotide to the 3' hydroxyl at the position of the second strand that pairs with the first position of the template strand, thereby translating the discontinuity downstream.
18. The method of claim 1, wherein steps (b) and (c) are repeated one or more times, using different types of nucleotides each time, prior to performing step (d).
19. The method of claim 1, wherein the stabilized ternary complex is formed and examined in the absence of a catalytic metal ion, thereby preventing covalent addition of the cognate nucleotide to the 3' hydroxyl.
20. The method of claim 19, further comprising contacting the stabilized ternary complex with the catalytic metal ion between steps (c) and (d).
21. The method of claim 1, wherein the double stranded nucleic acid is immobilized to a surface.
22. The method of claim 21, wherein the method is carried out on a plurality of distinguishable double stranded nucleic acid molecules attached to the surface as an array.
23. The method of claim 1, wherein step (d) translates the discontinuity only one position downstream.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201562273186P (US62/273,186) | 2015-12-30 | 2015-12-30 | |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2017117235A1 (en) | 2017-07-06 |
Family
ID=57838518
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2016/068899 WO2017117235A1 (en) | 2015-12-30 | 2016-12-28 | Sequencing methods for double stranded nucleic acids |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2017117235A1 (en) |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2009145820A2 (en) * | 2008-03-31 | 2009-12-03 | Pacific Biosciences Of California, Inc. | Generation of modified polymerases for improved accuracy in single molecule sequencing |
2016-12-28: WO PCT/US2016/068899 (WO2017117235A1), active, Application Filing
Non-Patent Citations (7)
Title |
---|
DRESSMAN ET AL., PROC. NATL. ACAD. SCI. USA, vol. 100, 2003, pages 8817 - 8822 |
DRMANAC ET AL., SCIENCE, vol. 327, no. 5961, 2010, pages 78 - 81 |
GUO ET AL., MOLECULAR AND CELLULAR BIOLOGY, vol. 28, no. 13, 2008, pages 4310 - 4319 |
LIN ET AL., DNA REPAIR (AMST), vol. 12, no. 11, November 2013 (2013-11-01) |
LIU ET AL., J. BIOL. CHEM., vol. 280, 2005, pages 3665 - 3674 |
LIZARDI ET AL., NAT. GENET., vol. 19, 1998, pages 225 - 232 |
VALLUR; MAIZELS, PLOS ONE, vol. 5, no. 1, 2010, pages E8908 |
Cited By (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11203778B2 (en) | 2016-04-29 | 2021-12-21 | Omniome, Inc. | Sequencing method employing ternary complex destabilization to identify cognate nucleotides |
US10294514B2 (en) | 2016-04-29 | 2019-05-21 | Omniome, Inc. | Sequencing method employing ternary complex destabilization to identify cognate nucleotides |
US10633692B2 (en) | 2016-04-29 | 2020-04-28 | Omniome, Inc. | Sequencing method employing ternary complex destabilization to identify cognate nucleotides |
US10428378B2 (en) | 2016-08-15 | 2019-10-01 | Omniome, Inc. | Sequencing method for rapid identification and processing of cognate nucleotide pairs |
US11203779B2 (en) | 2016-08-15 | 2021-12-21 | Omniome, Inc. | Sequencing method for rapid identification and processing of cognate nucleotide pairs |
WO2018034780A1 (en) * | 2016-08-15 | 2018-02-22 | Omniome, Inc. | Sequencing method for rapid identification and processing of cognate nucleotide pairs |
US11248254B2 (en) | 2016-12-30 | 2022-02-15 | Omniome, Inc. | Method and system employing distinguishable polymerases for detecting ternary complexes and identifying cognate nucleotides |
US11584963B2 (en) | 2018-02-16 | 2023-02-21 | Ultima Genomics, Inc. | Methods for sequencing with single frequency detection |
WO2020154512A1 (en) * | 2019-01-23 | 2020-07-30 | Emory University | Methods of Identifying Adenosine-to-Inosine Edited RNA |
US10768173B1 (en) | 2019-09-06 | 2020-09-08 | Element Biosciences, Inc. | Multivalent binding composition for nucleic acid analysis |
US12117438B2 (en) | 2019-09-06 | 2024-10-15 | Element Biosciences, Inc. | Multivalent binding composition for nucleic acid analysis |
US11287422B2 (en) | 2019-09-23 | 2022-03-29 | Element Biosciences, Inc. | Multivalent binding composition for nucleic acid analysis |
WO2023126457A1 (en) * | 2021-12-29 | 2023-07-06 | Illumina Cambridge Ltd. | Methods of nucleic acid sequencing using surface-bound primers |
WO2023220300A1 (en) | 2022-05-11 | 2023-11-16 | 10X Genomics, Inc. | Compositions and methods for in situ sequencing |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2017117235A1 (en) | Sequencing methods for double stranded nucleic acids | |
EP3562962B1 (en) | Method and system employing distinguishable polymerases for detecting ternary complexes and identifying cognate nucleotides | |
US20220162690A1 (en) | Method and system for sequencing nucleic acids | |
US20240035082A1 (en) | Nucleic acid sequencing methods and systems | |
CA3050695C (en) | Process for cognate nucleotide detection in a nucleic acid sequencing workflow | |
AU2015402762B2 (en) | Nucleic acid sequencing methods and systems | |
US8795961B2 (en) | Preparations, compositions, and methods for nucleic acid sequencing | |
JP2007530051A (en) | Ligation and amplification reactions to determine target molecules | |
WO2017177017A1 (en) | Methods of quantifying target nucleic acids and identifying sequence variants | |
JP2021525078A (en) | Increased signal vs. noise in nucleic acid sequencing | |
US11180794B2 (en) | Methods and compositions for capping nucleic acids | |
JP2021512648A (en) | Compositions and Techniques for Nucleic Acid Primer Extension | |
WO2022125977A1 (en) | Methods for duplex repair | |
CN114599795A (en) | Methods and compositions for capping nucleic acids | |
AU2021463390A1 (en) | Method for analyzing sequence of target polynucleotide |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 16828866; Country of ref document: EP; Kind code of ref document: A1 |
| NENP | Non-entry into the national phase | Ref country code: DE |
| 122 | Ep: pct application non-entry in european phase | Ref document number: 16828866; Country of ref document: EP; Kind code of ref document: A1 |