WO2006073504A2 - Wobble sequencing - Google Patents

Info

Publication number
WO2006073504A2
Authority
WO
WIPO (PCT)
Prior art keywords
sequencing
primer
anchor
bases
sequence
Prior art date
Application number
PCT/US2005/027695
Other languages
French (fr)
Other versions
WO2006073504A3 (en)
WO2006073504A8 (en)
Inventor
George Church
Gregory Porreca
Jay Shendure
Original Assignee
President And Fellows Of Harvard College
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by President And Fellows Of Harvard College
Publication of WO2006073504A2
Priority to US11/670,588 (US20070207482A1)
Publication of WO2006073504A3
Publication of WO2006073504A8

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6869: Methods for sequencing

Definitions

  • the present invention relates to novel methods and compositions for DNA sequencing.
  • the methods described herein are useful for sequencing homopolymeric regions of DNA.
  • a second problem with the FISSEQ approach is that the set of polymerases typically utilized in such reactions do not efficiently incorporate nucleotides due to the high density of modified nucleotides. For that reason, a large fraction of unlabeled nucleotides are introduced, thus reducing the overall density of modification and extending read-lengths. This results in less labeled nucleotide and, accordingly, less signal. Accordingly, the present invention is directed to novel methods of sequencing that circumvent these problems and provides advantages over methods of sequencing known in the art.
  • the present invention provides novel sequencing methods designed to circumvent problems associated with sequencing-by-synthesis methods known in the art.
  • the methods described herein are based on sequencing by polymerase-extension, they differ from FISSEQ and pyrosequencing in that base-additions are not "progressive." Instead, after a given single-base-extension (SBE), the sequencing primer is stripped from the bead-immobilized templates and a new primer is hybridized. Thus to get beyond the first base, each sequencing primer in the set "reaches" out to a defined position in the unknown unique sequence of the template (e.g., to the fourth base or the fifth base).
  • a sequencing primer, from 5' to 3', thus consists of an "anchor sequence" that is complementary to the constant sequence on the template, and a defined number of additional bases (e.g., universal, degenerate and/or natural bases), that will hybridize to the unknown sequence regardless of what it is. If, for example, there are three fixed universal bases, then the sequencing primer is positioned to sequence the fourth base via SBE with labeled nucleotides. After a single-base-extension and data acquisition, extended and unextended primers are stripped (e.g., with heat) and a new primer is annealed that has a different number of universal bases, thus querying a different base-position within the unknown sequence. Thus in this simplest iteration of the scheme, one only needs a set of N primers to achieve a read-length of N.
  • the present invention provides many advantages over sequencing methods known in the art.
  • the methods described herein 1) provide a quick solution to the problem of sequencing homopolymers; 2) enable manual mistakes and biochemical inefficiencies to be non-cumulative; 3) greatly expedite the technology development for longer reads (i.e. don't have to cycle out to test a method for improving read-lengths); 4) provide better signals than are obtained by the FISSEQ system currently used in the art (i.e., in which a desire for signal has to be balanced against a desire to minimize the fraction of extended templates with cleaved linker as it inhibits the polymerase); and 5) greatly increase the choice and amounts of enzyme (polymerase or ligase) due to the lack of a requirement to take extensions to completion.
  • Figure 1 depicts primer information.
  • the first column of numbers indicates the cycle number assigned to a given query.
  • the second and third columns indicate the sequencing primer used, and the fourth column indicates the conditions of hybridization.
  • the fifth column indicates the base(s) used to extend, and the 6th column indicates the templates expected to add.
  • the remaining columns indicate the best-fit slope coefficient for adders and non-adders, and finally the ratio of these values.
  • TR = Texas Red.
  • Figure 2 depicts an extension with 37C.8N.CG, sequencing bases 10,11,12 on T4. Blue indicates bases that were sequenced; yellow indicates bases attempted and failed; uncolored indicates bases that were not attempted.
  • Figure 3 depicts sequencing on emulsion beads.
  • Figure 4 depicts primer information for primers that extended either T2, T3 or T4.
  • Figure 5 depicts bases that were sequenced. Blue indicates bases that were sequenced; yellow indicates bases attempted and failed; uncolored indicates bases that were not attempted.
  • Figure 6 depicts sequencing on emulsion beads.
  • Figure 7 is a schematic depicting query of tag positions (-5) by mismatch ligation.
  • Figures 8A and 8B are schematics depicting unique tags and queries that will ligate.
  • Figures 9A and 9B are schematics of the method of the present invention.
  • Figure 10 is a four-color depiction of four possible base calls.
  • Figure 11 is a graph showing variation in accuracy over each of 26 cycles of non-progressive sequencing.
  • DNA sequences of numerous features are obtained in parallel by cycles of hybridization of sequencing primers that contain universal, degenerate, and/or specific bases at positions of unknown sequence, followed by single-base-extension with polymerase and nucleotide.
  • As polymerases generally only extend from terminally-matched nucleotides, when an extension occurs, the identity of the bases complementary to specific bases present at the 3' terminus of a given sequencing primer is revealed.
  • use of modified nucleotides with different fluorescent labels reveals the identity of the incorporated nucleotide.
  • a given sequencing primer is designed with a known number of universal or degenerate nucleotides, and a known number of specific nucleotides, one knows the specific position within the unknown template that one is sequencing.
  • the methods of the invention include the use of "degenerate bases" which are intended to include, but are not limited to, primer mixes that contain all possible sequences at unknown positions.
  • the methods of the invention also include the use of universal bases at some or all of the primer positions.
  • Universal bases are intended to include, but are not limited to, synthetic nucleotide analogs that ideally pair with equal affinities to each of the natural nucleotides, and are readily accepted as substrates by natural enzymes. Examples of universal bases include 5-nitroindole, 3-nitropyrrole, deoxyinosine, and the like.
  • the methods of the invention further include the use of natural bases, wherein sequencing primer oligonucleotides are synthesized with fully degenerate positions, such that all possible sequencing primers (or some random subset of all possibilities) are present during hybridization.
  • overall efficiency could be improved by enzyme engineering for greater permissiveness with respect to mismatches (e.g., the M1/M4 variants of Taq) or alterations to the primer design strategy.
  • methods of the invention are directed to fixing the terminal two bases of a given sequencing primer, but allowing the remainder of bases at "universal" positions to be synthesized with fully-degenerate natural bases.
  • Non-terminator FISSEQ yields approximately 0.50 bases-per-cycle (assuming no homopolymer resolution and thus counting multi-base runs as single extensions). By this consideration, achieving an identical read-length would require approximately 2.67 times as many cycles in the 2 bp-matched-wobble-sequencing system.
  • a typical primer-name below is "37C.2N.CA".
  • the anchor sequence is a trimmed version of the original FISSEQ primer for the T1..T5 template.
  • the "37C" indicates the extent to which it has been trimmed (i.e., 37C is the Tm of the anchor sequence if it were a stand-alone primer).
  • the "2N" indicates that the anchor-sequence is followed by two full "wobble" or degenerate bases, and the CA indicates the fixed two terminal bases. This primer would extend to the 5th base, thus sequencing 3 bases (bases 3, 4 and 5) on 1/16th of the templates of a random library.
  • primers with even numbers of "wobble" or degenerate bases and terminal bases that match at least one of the five T1..T5 templates were focused on to ensure extension at every cycle. For a given "reach-length," this was approximately 1/4th of the primers that would be required in a real sequencing experiment involving sequencing of genomic fragments. However, this estimate is slightly conservative in that one could do multiples of three for the number of "wobble" or degenerate bases, rather than multiples of two. Some optional redundancy was built in. For example, 37C.2N.XX sequences bases 3, 4 and 5, and 37C.4N.XX sequences bases 5, 6 and 7. Thus, base 5 was sequenced twice (as were base 7, base 9, etc.).
  • Figure 1 depicts results from top-layered, 1 μm beads with loaded T1..T5 templates. These are primers that would be required in a full sequencing experiment on unknown sequence. Primers were ordered to sequence through to the 11th base on all five templates (37C.0N.XX through 37C.8N.XX). Only one primer was ordered for 37C.10N.XX through 37C.18N.XX.
  • an embodiment of the invention referred to as "Wobble Ligation."
  • Several of the principles are identical or similar to Wobble Extension as previously described herein. These principles are distinguishable from FISSEQ and other sequencing methods, such as that described in Macevicz US Patent No. 5,750,341.
  • a single primer is hybridized and extended; degenerate bases within the oligonucleotide primer are included to 'reach' a specific distance into the unknown sequence.
  • a single primer is hybridized that is universal (the 'anchor' primer) and sits such that either its 5' or 3' end is immediately adjacent to the unknown sequence.
  • the position to be queried is encoded in a pool of degenerate nonamers (9-mer) that are ligated to the anchor primer.
  • anchor primers having one or several degenerate positions at the terminus to be ligated to can serve as substrates for ligation and so can be used to position the query even further into the unknown sequence.
  • the assays are always identical, in that the full pool of possible nonamers is being ligated to the anchor primer. What changes between the assays (and determines whether one is sequencing base 4 or base 7 in a particular cycle, for example), is the correlations between specific positions in the degenerate nonamer and fluorescent labels at its end.
  • Figure 7 depicts, for example, the querying of position (-4) relative to the anchor primer.
  • Such error establishes an upper limit on the accuracy of any sequencing method which operates on material that is the product of the amplification.
  • template is diluted to the point where 1 template molecule and 1 bead will be trapped in an emulsion compartment, and PCR will proceed from this single molecule resulting in many copies bound to the bead.
  • An error arising early during the amplification will result in a bead having either a homogenous population of amplicons bearing the error, or a heterogenous population of amplicons, some bearing the error and some not. In either case, the accuracy of the sequence derived from such a bead will be low.
  • emulsion PCR will be started with multiple copies of a given template molecule in a compartment. Then, PCR will initiate from each copy independently, and the product bound to the bead in that compartment will be largely homogenous and error-free, even if errors arise early during amplification from 1 of the copies of the template.
  • the first is to clone the template desired to be sequenced into a plasmid, transform into bacteria or yeast, and perform emulsion PCR not with naked single-copy template DNA, but rather with individual cells, each of which includes multiple copies of the template. During PCR the cells will rupture and amplification will proceed from each copy of the plasmid present. Since multiple copies of the template were present, and since each was copied independently by the host cell's low-error replication machinery, the probability of obtaining a PCR-based error in a preponderance of amplicons is very low.
  • the second approach uses linear rolling circle amplification to prepare template molecules which are linear concatemers of independent copies of the original template. PCR then initiates from each site on the concatemer independently.
  • the important constraint (regardless of the method used to get multiple copies of a template into an emulsion compartment or otherwise to initiate a spatially-clustered exponential amplification) is that the initial copies made of the original template are independent of each other and so the probability of two such copies bearing the same error is very low.
  • the original template (a circular molecule) is iterated over many times, such that all copies are copies of the original template (unlike PCR, which makes copies of copies).
  • Embodiments of the present invention are directed to methods to determine, with single-base resolution, the length of the unique region of a library molecule.
  • a paired-tag genomic library is constructed where each library molecule is comprised of a unique region flanked by common primer sites.
  • the type IIs restriction enzyme MmeI is used. MmeI cuts either 17bp or 18bp from its recognition sequence, and in the embodiment described here thus produces inserts of 17bp or 18bp at a ratio of about 50:50 with little to no sequence-dependence. Knowing the exact length of each insert is advantageous since sequencing methods described herein include the step of reading a certain number of bases from each side of the 17-18bp tag. In order to generate a contiguous sequence from such reads, knowing the exact length of the insert would be beneficial.
  • a ligation-query scheme which relies on the specificity of the ligase reaction catalyzed by ampligase or some other ligase capable of yielding sufficient base pairing specificity to first 'walk' across the insert with fully degenerate nonamers, and then query the identity of a base in the opposing universal primer sequence.
  • An 'anchor' primer complementary to sequence in universal primer A can first be hybridized, followed by degenerate nonamer ligation to span the unique insert, and finally the length of the insert can be queried with a pair of fluorescently-labeled query primers, where each possible length (17 or 18) is coded by a different fluorophore, as depicted in Figures 8A and 8B.
  • This embodiment can be carried out in the 5'->3' direction by using a degenerate nonamer population that is phosphorylated at the 5' end (such that that end will ligate to the anchor primer), and the fluorophore resides on its 3' end.
  • a kit including endonuclease 8 and UDG is commercially available from New England Biolabs under the tradename USER.
  • a schematic of a sample UDG reaction is provided in the figure below.
  • AP endonuclease (variable specificity)
  • Certain polymerase- and ligase-driven cyclic sequencing methods are termed "progressive," in that they interrogate the sequencing template by incorporating onto the end of a growing polynucleotide chain, digesting from the end of the template, or ligating to a growing oligonucleotide primer.
  • the non-progressive cycling method of the present invention reduces, or in certain embodiments, eliminates, the adverse effects of amplicon dephasing in existing sequencing by synthesis methods (both polymerase- and ligase- driven) by removing the sequencing primer periodically (as often as after each base-position is interrogated).
  • enzymatic and chemical inefficiencies and other errors do not accumulate as the sequencing run proceeds. Rather, each cycle is independent of previous inefficiencies or misincorporations (assuming the primer is removed after each sequencing cycle).
  • the non-progressive cycling method of the present invention has the added advantage of allowing one to know, with reasonable certainty, which position in the template is being interrogated.
  • the primer can be removed in a number of ways.
  • Heat can be used to melt the primer off the template.
  • Alkali can be used to chemically denature the primer from the template.
  • Numerous other chemical denaturants can be used, which include: methanol, ethanol, isopropanol, n-propanol, allyl alcohol, sec-butyl alcohol, tert-butyl alcohol, isobutyl alcohol, n-butyl alcohol, tert-amyl alcohol, ethylene glycol, glycerol, dithioglycerol, propylene glycol, cyclohexyl alcohol, benzyl alcohol, inositol, phenol, p-methoxyphenol, aniline, pyridine, purine, 1,4-dioxane, gamma-butyrolactone, 3 -amino triazole, formamide, N-ethyl formamide, N-N- dimethylform
  • Chemically-labile linkages such as phosphorothioate with heavy-metal ion cleavage treatment as described in M. Mag, S. Luking, J. W. Engels, Nucleic Acids Res., 19:1437 (April 11, 1991) can be included in the primer to allow it to be fragmented into many pieces, each of which has a Tm low enough to cause the primer:query complex to denature from the template.
  • Primers can be made enzymatically-labile by the inclusion of ribonucleotides or ribonucleotide stretches (susceptible to cleavage by RNase H or alkali) or the inclusion of deoxyuridines (subject to cleavage by a mixture of uracil DNA glycosylase and endonuclease VIII) or abasic sites (subject to cleavage by endonuclease VIII).
  • the primer can also be removed enzymatically by the use of a suitable exonuclease.
  • the following steps were carried out cyclically to interrogate each base of the template sequentially.
  • An 'anchor primer' was hybridized complementary to common library sequence.
  • a pool of fluorescently-labeled 'query primers' specific to one tag-position was then ligated to the template. Imaging was then used to determine which primer pool ligated to which bead.
  • the anchor:query primer complex was then stripped. The process was then repeated.
  • Query primers used were nonamers which were degenerate at all positions except the query position. At the query position, only one base was present for a given fluorophore.
  • the pool of probes used to query position five was composed of the following four label-subpools:
  • Anchor primers were hybridized in a flowcell (100 uM primer in 6x SSPE) for 5 minutes at 56C, then cooled to 42C and held for 2 minutes. Excess primer was then washed out at room temperature with Wash 1E (10 mM Tris-HCl pH 7.5, 50 mM KCl, 2 mM EDTA pH 8.0, 0.01% Triton X-100) for 2 minutes.
  • Query primers were ligated in the flowcell (8 uM query primer mix (2 uM each subpool), 6000 U T4 DNA ligase (NEB), 1x T4 DNA ligase buffer (NEB)) at 35C and held for 30 minutes. At the end of the reaction, excess query primer was washed out at room temperature with Wash 1E for 5 minutes.
  • the cycles consist of the following four steps: (a) hybridization of one of four anchor primers, (b) ligation of fluorescent, degenerate nonamers, (c) four-color imaging on an epifluorescence microscope, (d) stripping of the anchor primer:nonamer complexes prior to beginning the next cycle.
  • the anchor primers are each designed to be complementary to universal sequence immediately 5' or 3' to one of the two tags.
  • A1, A2, A3 and A4 indicate the four locations to which anchor primers are targeted relative to the amplicon. Arrows indicate the direction sequenced into the tag from each anchor primer. From anchor primers A1 and A3, 7 bases are sequenced into each tag, and from anchor primers A2 and A4, 6 bases are sequenced into each tag. Thus, 13 bp per tag are obtained, and 26 bp per amplicon, with 4 to 5 bp gaps within each tag sequence.
  • each cycle involves performing a ligation reaction with T4 DNA ligase and a fully degenerate population of nonamers.
  • the nonamer molecules are individually labeled with one of four fluorophores (e.g., Texas Red, Cy5, Cy3, FITC).
  • the nonamers are structured differently. Specifically, a single position within each nonamer is correlated with the identity of the fluorophore with which it is labeled.
  • the fluorophore molecule is attached at the opposite end of the nonamer relative to the end targeted to the ligation junction.
  • the anchor primer is hybridized such that its 3' end is adjacent to the genomic tag. To query a position five bases into the tag sequence, the four-color population of nonamers is used.
  • Figure 11 shows data from a single cycle of non-progressive sequencing by ligation, and in particular is the sequencing data from position (-1) of the proximal tag of a complex E. coli derived library.
  • Figure 11 shows variation in accuracy over each of 26 cycles of non-progressive sequencing by ligation in a single experiment resequencing an E. coli genome. The cumulative distribution of raw error is shown as a function of rank-ordered quality, with each of the 26 sequencing-by-ligation cycles in a single sequencing experiment treated as an independent data-set.
  • the x-axis indicates percentile bins of beads, sorted on the basis of a confidence metric.
  • the y-axis (log scale) indicates the raw base-calling accuracy of each cumulative bin.
  • Pritchard CE, Southern EM, "Effects of base mismatches on joining of short oligodeoxynucleotides by DNA ligases," Nucleic Acids Res., 1997 Sept. 1; 25(17):3403-3407.
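
The primer naming convention described in the definitions above (for example "37C.2N.CA") can be unpacked mechanically. The following Python sketch is offered only as an illustration: the function name `parse_primer_name`, the regular expression, and the dictionary keys are assumptions made for this example and do not appear in the application. It assumes the name encodes the anchor Tm, the number of wobble (degenerate) bases, and the fixed terminal bases, and that the primer reveals the positions paired with the fixed bases plus the one extended base.

```python
import re

def parse_primer_name(name):
    """Split a name like '37C.2N.CA' into anchor Tm, wobble count and fixed 3' bases."""
    match = re.fullmatch(r"(\d+)C\.(\d+)N\.([ACGTX]+)", name)
    if match is None:
        raise ValueError(f"unrecognized primer name: {name!r}")
    tm_celsius = int(match[1])      # Tm of the trimmed anchor, in degrees C
    n_wobble = int(match[2])        # number of wobble/degenerate bases after the anchor
    terminal = match[3]             # fixed terminal bases ('X' = unspecified)
    first = n_wobble + 1                    # position paired with the first fixed base
    last = n_wobble + len(terminal) + 1     # position of the extended (labeled) base
    return {
        "anchor_tm_c": tm_celsius,
        "n_wobble": n_wobble,
        "terminal": terminal,
        "bases_sequenced": list(range(first, last + 1)),
    }

info = parse_primer_name("37C.2N.CA")
print(info["bases_sequenced"])  # [3, 4, 5]
```

Under these assumptions, "37C.2N.CA" maps to bases 3, 4 and 5, in agreement with the description of that primer above.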

Landscapes

  • Life Sciences & Earth Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Organic Chemistry (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Microbiology (AREA)
  • Immunology (AREA)
  • Biotechnology (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Analytical Chemistry (AREA)
  • Physics & Mathematics (AREA)
  • Biochemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

Novel methods and compositions for DNA sequencing are provided. The methods described herein are useful for sequencing homopolymeric regions of DNA. The methods also prevent the accumulation of mistakes and inefficiencies in the sequencing reaction.

Description

PATENT ATTORNEY DOCKET NO. 10498-00091
WOBBLE SEQUENCING
STATEMENT OF GOVERNMENT INTERESTS
This invention was made with Government support under Award Numbers 1P50 HG003170, awarded by the Centers of Excellence in Genomic Science (CEGS); and DE-FG02-02ER63445, awarded by Genomes to Life (GTL). The Government has certain rights in the invention.
FIELD OF THE INVENTION
The present invention relates to novel methods and compositions for DNA sequencing. The methods described herein are useful for sequencing homopolymeric regions of DNA.
BACKGROUND OF THE INVENTION
Current state-of-the-art in sequencing-by-synthesis relies on a single sequencing primer, with a known sequence, followed by cyclic additions of a single nucleotide species at each cycle and detection of incorporation events (e.g., C-A-G-T-C-A-G-T...) via fluorescence or light. Examples of these methods include fluorescent in situ sequencing (FISSEQ) and pyrosequencing. A major problem for both of these approaches is that it is very difficult to decode consecutive runs of the same base in the unknown sequence (i.e., homopolymeric runs), and it is difficult to distinguish single from multiple incorporation events. As approximately 44% of nucleotides are part of a homopolymeric run, this is a major consideration. Most efforts to circumvent this problem involve the development of reversibly terminating nucleotides, which cause a variety of difficulties.
A second problem with the FISSEQ approach is that the set of polymerases typically utilized in such reactions do not efficiently incorporate nucleotides due to the high density of modified nucleotides. For that reason, a large fraction of unlabeled nucleotides are introduced, thus reducing the overall density of modification and extending read-lengths. This results in less labeled nucleotide and, accordingly, less signal. Accordingly, the present invention is directed to novel methods of sequencing that circumvent these problems and provides advantages over methods of sequencing known in the art.
SUMMARY
The present invention provides novel sequencing methods designed to circumvent problems associated with sequencing-by-synthesis methods known in the art. Although the methods described herein are based on sequencing by polymerase-extension, they differ from FISSEQ and pyrosequencing in that base-additions are not "progressive." Instead, after a given single-base-extension (SBE), the sequencing primer is stripped from the bead-immobilized templates and a new primer is hybridized. Thus to get beyond the first base, each sequencing primer in the set "reaches" out to a defined position in the unknown unique sequence of the template (e.g., to the fourth base or the fifth base). A sequencing primer, from 5' to 3', thus consists of an "anchor sequence" that is complementary to the constant sequence on the template, and a defined number of additional bases (e.g., universal, degenerate and/or natural bases), that will hybridize to the unknown sequence regardless of what it is. If, for example, there are three fixed universal bases, then the sequencing primer is positioned to sequence the fourth base via SBE with labeled nucleotides. After a single-base-extension and data acquisition, extended and unextended primers are stripped (e.g., with heat) and a new primer is annealed that has a different number of universal bases, thus querying a different base-position within the unknown sequence. Thus in this simplest iteration of the scheme, one only needs a set of N primers to achieve a read-length of N.
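
By way of illustration only, the simplest iteration of the scheme summarized above may be sketched in Python. The anchor string, the placeholder symbol `U` for a universal base, and the function name `build_primer_set` are hypothetical and chosen solely for this example. The sketch builds a set of N primers, each consisting of the anchor followed by k universal bases (k = 0 to N-1), and reports which base of the unknown sequence each primer would query by single-base-extension.

```python
# Hypothetical sketch: one primer per queried position (anchor + k universal bases).
UNIVERSAL = "U"  # placeholder symbol for a universal base (assumption for illustration)

def build_primer_set(anchor, read_length):
    """Return (primer, queried_position) pairs for positions 1..read_length."""
    primers = []
    for k in range(read_length):
        primer = anchor + UNIVERSAL * k   # k universal bases reach k bases into the unknown
        primers.append((primer, k + 1))   # single-base-extension then reads base k + 1
    return primers

primer_set = build_primer_set("ACGTACGT", 5)  # made-up anchor sequence
for primer, position in primer_set:
    print(f"{primer:<14} queries base {position}")
```

In this sketch the primer carrying three universal bases is positioned to query the fourth base, and a set of five primers yields a read-length of five.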
The present invention provides many advantages over sequencing methods known in the art. The methods described herein: 1) provide a quick solution to the problem of sequencing homopolymers; 2) enable manual mistakes and biochemical inefficiencies to be non-cumulative; 3) greatly expedite the technology development for longer reads (i.e. don't have to cycle out to test a method for improving read-lengths); 4) provide better signals than are obtained by the FISSEQ system currently used in the art (i.e., in which a desire for signal has to be balanced against a desire to minimize the fraction of extended templates with cleaved linker as it inhibits the polymerase); and 5) greatly increase the choice and amounts of enzyme (polymerase or ligase) due to the lack of a requirement to take extensions to completion.
BRIEF DESCRIPTION OF THE DRAWINGS
The patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawing(s) will be provided by the Office upon request and payment of the necessary fee. The foregoing and other features and advantages of the present invention will be more fully understood from the following detailed description of illustrative embodiments taken in conjunction with the accompanying drawings in which:
Figure 1 depicts primer information. The first column of numbers indicates the cycle number assigned to a given query. The second and third columns indicate the sequencing primer used, and the fourth column indicates the conditions of hybridization. The fifth column indicates the base(s) used to extend, and the 6th column indicates the templates expected to add. The remaining columns indicate the best-fit slope coefficient for adders and non-adders, and finally the ratio of these values. TR = Texas Red.
Figure 2 depicts an extension with 37C.8N.CG, sequencing bases 10,11,12 on T4. Blue indicates bases that were sequenced; yellow indicates bases attempted and failed; uncolored indicates bases that were not attempted.
Figure 3 depicts sequencing on emulsion beads.
Figure 4 depicts primer information for primers that extended either T2, T3 or T4.
Figure 5 depicts bases that were sequenced. Blue indicates bases that were sequenced; yellow indicates bases attempted and failed; uncolored indicates bases that were not attempted.
Figure 6 depicts sequencing on emulsion beads.
Figure 7 is a schematic depicting query of tag positions (-5) by mismatch ligation.
Figures 8A and 8B are schematics depicting unique tags and queries that will ligate.
Figures 9A and 9B are schematics of the method of the present invention.
Figure 10 is a four-color depiction of four possible base calls.
Figure 11 is a graph showing variation in accuracy over each of 26 cycles of non-progressive sequencing.
DETAILED DESCRIPTION
In the methods described herein, DNA sequences of numerous features are obtained in parallel by cycles of hybridization of sequencing primers that contain universal, degenerate, and/or specific bases at positions of unknown sequence, followed by single-base-extension with polymerase and nucleotide. As polymerases generally only extend from terminally-matched nucleotides, when an extension occurs, the identity of the bases complementary to specific bases present at the 3' terminus of a given sequencing primer is revealed. Furthermore, use of modified nucleotides with different fluorescent labels reveals the identity of the incorporated nucleotide. As a given sequencing primer is designed with a known number of universal or degenerate nucleotides, and a known number of specific nucleotides, one knows the specific position within the unknown template that one is sequencing.
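
The logic of a single hybridize-and-extend cycle described above may be sketched as follows. This is an idealized model under stated assumptions, not a description of the actual biochemistry: the unknown region is written as the strand being synthesized (i.e., the complement of the template), mismatched 3' termini are assumed never to extend, and the function name `single_base_extension` is hypothetical.

```python
def single_base_extension(synth_strand, n_degenerate, terminal):
    """Model one hybridize-and-extend cycle.

    synth_strand: the unknown region written as the strand being synthesized
                  (the complement of the template), read 5'->3'.
    Returns {position: base} (1-indexed) for the bases revealed, or {} if the
    fixed 3' bases of the primer are mismatched and no extension occurs.
    """
    start = n_degenerate                 # degenerate bases pair with anything
    end = start + len(terminal)
    if end >= len(synth_strand):
        return {}                        # primer would reach past the unknown region
    if synth_strand[start:end] != terminal:
        return {}                        # idealization: mismatched termini never extend
    revealed = {start + i + 1: base for i, base in enumerate(terminal)}
    revealed[end + 1] = synth_strand[end]   # labeled nucleotide added by the polymerase
    return revealed

calls = single_base_extension("GTCAGGT", 2, "CA")
print(calls)  # {3: 'C', 4: 'A', 5: 'G'}
```

Because the number of degenerate and specific bases is known, a successful extension identifies both the bases matched by the primer terminus and the incorporated base, together with their positions.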
The methods of the invention include the use of "degenerate bases" which are intended to include, but are not limited to, primer mixes that contain all possible sequences at unknown positions. The methods of the invention also include the use of universal bases at some or all of the primer positions. "Universal bases" are intended to include, but are not limited to, synthetic nucleotide analogs that ideally pair with equal affinities to each of the natural nucleotides, and are readily accepted as substrates by natural enzymes. Examples of universal bases include 5-nitroindole, 3-nitropyrrole, deoxyinosine, and the like. The methods of the invention further include the use of natural bases, wherein sequencing primer oligonucleotides are synthesized with fully degenerate positions, such that all possible sequencing primers (or some random subset of all possibilities) are present during hybridization. Without intending to be bound by theory, overall efficiency could be improved by enzyme engineering for greater permissiveness with respect to mismatches (e.g., the M1/M4 variants of Taq) or alterations to the primer design strategy.
In one embodiment, methods of the invention are directed to fixing the terminal two bases of a given sequencing primer, but allowing the remainder of bases at "universal" positions to be synthesized with fully-degenerate natural bases. The disadvantage of this compromise is that 16 separate hybridizations are required for each "reach" length (4^2 combinations of the two terminal bases). This is mitigated by the fact that polymerases do not extend well from mispaired termini, so a given extension set reveals the identity of both the two terminal bases and the extended base. The average efficiency of the process is therefore 3/16 ≈ 0.188 bases per cycle.
Non-terminator FISSEQ, by comparison, yields approximately 0.50 bases-per-cycle (assuming no homopolymer resolution and thus counting multi-base runs as single extensions). By this consideration, achieving an identical read-length would require approximately 2.67 times as many cycles in the 2 bp-matched-wobble-sequencing system. This invention is further illustrated by the following examples, which should not be construed as limiting. The contents of all references, patents and published patent applications cited throughout this application are hereby incorporated by reference in their entirety for all purposes.
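The efficiency comparison above can be reproduced with a short calculation. The following is only an illustrative sketch: the function name and the generalization to other numbers of fixed terminal bases are assumptions, and the 0.50 bases-per-cycle figure for non-terminator FISSEQ is simply the approximate value quoted in the text.

```python
from fractions import Fraction


def wobble_bases_per_cycle(fixed_terminal_bases: int = 2) -> Fraction:
    """Average bases read per cycle when the last `fixed_terminal_bases`
    positions of the primer are fixed and one base is added by extension."""
    # One hybridization per combination of the fixed terminal bases.
    hybridizations = 4 ** fixed_terminal_bases
    # A successful extension reveals the fixed bases plus the extended base.
    bases_revealed = fixed_terminal_bases + 1
    return Fraction(bases_revealed, hybridizations)


wobble = wobble_bases_per_cycle(2)   # 3/16 for two fixed terminal bases
fisseq = Fraction(1, 2)              # approximate value quoted in the text
cycle_ratio = fisseq / wobble        # cycles needed relative to FISSEQ

print(f"wobble sequencing: {float(wobble):.3f} bases per cycle")
print(f"cycles needed relative to FISSEQ: {float(cycle_ratio):.2f}x")
```

With two fixed terminal bases the ratio is 8/3, which matches the approximately 2.67-fold figure stated above.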
EXAMPLE I Cycle Protocol
Typical cycles were as follows:
1. Hybridize sequencing primer (15 minutes, 10 μM primer in 6x SSPE, 40-50 °C)
2. Extend (4 minutes, SSB + polymerase + nucleotide)
3. Wash (2 minutes)
4. Image acquisition
5. Strip primer (5 minutes, Wash IE, 70 °C)
If the wobble bases were fixed (poly-A, poly-G, poly-C, or poly-T instead of poly-N), extensions were no longer efficient. Without intending to be bound by theory, this indicates that some degree of "sorting" occurs during hybridization and is critical to the overall process. To encourage this sorting, the "anchor sequence" is purposefully short (Tm = 37 °C if it were alone), weighting the hybridization process to depend to a greater degree on the "wobble" or degenerate sequences. Initial data indicated that SEQUENASE™ was significantly better than Klenow for this approach. Primer-stripping was initially very inefficient with beads. It only started working when the bead array was fabricated such that the beads were embedded in the gel near the gel-liquid interface (opposite the glass surface, or "top-layered").
EXAMPLE II Primer Nomenclature
A typical primer name below is "37C.2N.CA". For the primers described herein, the anchor sequence is a trimmed version of the original FISSEQ primer for the T1..T5 templates. The "37C" (or "23C" or the like) indicates the extent to which it has been trimmed (i.e., 37C is the Tm of the anchor sequence if it were a stand-alone primer). The "2N" indicates that the anchor sequence is followed by two full "wobble" or degenerate bases, and the CA indicates the fixed two terminal bases. This primer would extend to the 5th base, thus sequencing 3 bases (bases 3, 4 and 5) on 1/16th of the templates of a random library.
In the examples below, primers with even numbers of "wobble" or degenerate bases and terminal bases that match at least one of the five T1..T5 templates were focused on to ensure extension at every cycle. For a given "reach-length," this was approximately 1/4th of the primers that would be required in a real sequencing experiment involving sequencing of genomic fragments. However, this estimate is slightly conservative in that one could do multiples of three for the number of "wobble" or degenerate bases, rather than multiples of two. Some optional redundancy was built in. For example, 37C.2N.XX sequences bases 3, 4 and 5, and 37C.4N.XX sequences bases 5, 6 and 7. Thus, base 5 was sequenced twice (as were base 7, base 9, etc.).
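The naming convention and the resulting base positions can be sketched in a few lines of code. This is a minimal illustration only: the class and function names are hypothetical, and the position rule (wobble count + 1 through wobble count + 3) is inferred from the two worked cases given above (2N reads bases 3 to 5; 4N reads bases 5 to 7).

```python
import re
from typing import List, NamedTuple


class WobblePrimer(NamedTuple):
    anchor_tm_c: int   # Tm of the trimmed anchor sequence, in degrees C
    n_wobble: int      # number of degenerate ("N") positions after the anchor
    terminal: str      # fixed 3'-terminal bases, e.g. "CA" ("XX" = any pair)


def parse_primer_name(name: str) -> WobblePrimer:
    """Parse a name such as '37C.2N.CA' into its three components."""
    match = re.fullmatch(r"(\d+)C\.(\d+)N\.([ACGTX]+)", name)
    if match is None:
        raise ValueError(f"unrecognized primer name: {name!r}")
    return WobblePrimer(int(match.group(1)), int(match.group(2)), match.group(3))


def bases_sequenced(primer: WobblePrimer) -> List[int]:
    """Template positions revealed: the fixed terminal bases plus the
    single extended base that follows them."""
    first = primer.n_wobble + 1
    last = primer.n_wobble + len(primer.terminal) + 1
    return list(range(first, last + 1))


for name in ("37C.2N.CA", "37C.4N.XX"):
    print(name, bases_sequenced(parse_primer_name(name)))
```

Stepping the wobble count by two, as in the examples, makes consecutive primers overlap by one base, which is the optional redundancy noted above.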
EXAMPLE III Proof of Principle on Loaded Beads
Figure 1 depicts results from top-layered, 1 μm beads with loaded T1..T5 templates. These are primers that would be required in a full sequencing experiment on unknown sequence. Primers were ordered to sequence through to the 11th base on all five templates (37C.0N.XX through 37C.8N.XX). Only one primer was ordered for 37C.10N.XX through 37C.18N.XX.
Failures are listed in yellow. Without intending to be bound by theory, the first failure (cycle 17), was likely due to manual error in preparing the extension reagent mix, as its repeat (cycle 24) was successful, and this primer worked well in the emulsion-bead experiment below. Without intending to be bound by theory, the remaining failures correlate with attempts at longer reads. The 37C.12N.CG primer, interestingly, works quite well for one template but not another. In a subsequent experiment, using SEQUENASE™ instead of Klenow resulted in both templates working with this primer. SEQUENASE™ also yields greater signal in general than Klenow in this protocol.
Without intending to be bound by theory, several trends emerged: a) there was poor performance of "G" extensions, which was improved using SEQUENASE™; and b) poor performance of the T5 template in terms of signal yield at any given cycle when it was expected to extend. This outcome may be explained by the shortening of the anchor of the sequencing primer. Approximately 11-base-pair reads were obtained from all five templates, and all observations appear consistent. A 15-bp read was obtained on one of the templates (T4), but results were not consistent (i.e. cycle 28) and failure was experienced beyond base 15 (cycles 29-31). Extension was performed with 37C.8N.CG, sequencing bases 10,11,12 on T4 (Figure 2).
Since the above worked so well, the experiment was repeated on emulsion-generated beads top-layered (Figure 3). The templates were diluted independently and mixed only as they went into the emulsion mix. The reason for this is that they are single-stranded, and this procedure minimizes their binding to one another, which confounds results. However, the ratios of the five templates deviated from 1:1. The initial set of primers used on these templates was the 37C.0N.XX series, which essentially establishes the identity of each bead. As the fraction of beads with 1 or more template was high, it was not surprising that a high fraction of non-clonal beads was observed. Only approximately 1% of the gel (25 frames) was imaged at each cycle. The overall numbers were as follows: no template, 29,658; weakly amplified, 10,164; strong clonal, 13,350 (T1 = 57; T2 = 8,945; T3 = 2,165; T4 = 1,834; T5 = 349); strong non-clonal, 7,668; and total, 60,840.
The numbers are generally consistent with what one would expect from Poisson statistics, but with a modest excess of non-clonal beads. Without intending to be bound by theory, these data indicate that some fraction of the "no template" beads actually don't participate in the distribution (e.g., they are excluded because they are in the oil compartment, or in a compartment that is too small to initiate PCR and the like).
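A rough Poisson comparison of the kind referred to above can be sketched as follows. The bead counts are those reported in the preceding paragraph; everything else is an assumption made for illustration, in particular that every imaged bead sat in a PCR-competent compartment (so that the empty fraction equals the Poisson zero term) and that template loading follows a single mean. A bead carrying two or more template molecules is not necessarily non-clonal, so the last figure is only an upper bound on the expected non-clonal fraction.

```python
import math

# Bead counts reported in the text.
counts = {
    "no template": 29_658,
    "weakly amplified": 10_164,
    "strong clonal": 13_350,
    "strong non-clonal": 7_668,
}
total = sum(counts.values())


def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k template molecules per compartment."""
    return lam ** k * math.exp(-lam) / math.factorial(k)


# Assumption: the observed empty fraction equals P(0), so lam = -ln(P(0)).
p0_observed = counts["no template"] / total
lam = -math.log(p0_observed)

expected_single = poisson_pmf(1, lam)
expected_multiple = 1.0 - poisson_pmf(0, lam) - expected_single

print(f"estimated mean templates per bead: {lam:.3f}")
print(f"expected fraction with exactly one template: {expected_single:.3f}")
print(f"expected fraction with two or more templates: {expected_multiple:.3f}")
print(f"observed strong non-clonal fraction: {counts['strong non-clonal'] / total:.3f}")
```

If some "no template" beads never participated in the distribution, as suggested above, the true mean for the participating beads would be higher than this estimate.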
EXAMPLE IV Primers That Extended Either T2, T3, or T4
The initial analysis of clonality and identity, which were based on the 37C.0N.XX primers, led to the focus on primers that extended either T2, T3, or T4, as these dominated the slide (Figures 4 and 6). Relative to the above there are also changes to the hybridization conditions and modified nucleotides, but the most important difference (other than the fact that these are emulsion-generated beads) was that SEQUENASE™ was utilized instead of Klenow. Extension was performed with 37C.8N.CG, sequencing bases 10,11,12 on T4 using emulsion-beads instead of loaded beads (Figure 5). On cycle 19/20 (Figure 4), stripping was performed before reading the Cy3 signal out. Interestingly, less than 30 seconds in Wash IE at 70 °C was sufficient for stripping, or at least for redistribution of signal amongst the beads. Thus, cycles 22 and 23 were repeated with 37C.12N.CG.
What worked and what did not work was judged by visual inspection of the graphs. Thus, without intending to be bound by theory, even though 37C.12N.CG->T had lower "ratios" than 37C.14N.AT->C, it still appears to have worked, whereas 37C.14N.AT->C appeared not to have worked.
The slide was stripped and sequencing primer was re-annealed at the conclusion to determine to what extent the templates had fallen off due to heat exposure and the like. The difference between the two sets of images (pre-sequencing and post-sequencing) was negligible. The two sets of images were strikingly consistent with one another, which indicated that template was not being lost over the course of the experiment. This inspection also demonstrated quite clearly that the extent of gel warping over the approximately 20 cycles was negligible. Good signal was obtained for nearly all of the cycles.
An additional experiment was performed using the same primer, 37C.8N.CG, sequencing bases 10,11,12 on T4 (except with emulsion beads instead of loaded beads, and showing only well-amplified, clonal beads). The signal on these beads was higher than on the loaded beads. Without intending to be bound by theory, reasons for this include: (a) more template on amplified beads; and (b) the switch to SEQUENASE™ from Klenow.
EXAMPLE V Wobble Ligation Method
The following describes an embodiment of the invention referred to as "Wobble Ligation." Several of the principles are identical or similar to Wobble Extension as previously described herein. These principles are distinguishable from FISSEQ and other sequencing methods, such as that described in Macevicz US Patent No. 5,750,341.
According to the Wobble Ligation embodiment described herein:
(a) At each step of the sequencing, a single base position in the unknown sequence is being queried.
(b) Which base is being queried is directly a function of the structure of the oligonucleotides used in the reaction.
(c) After each cycle of enzymatic treatment and imaging, these oligonucleotides are stripped from the DNA attached to the beads; the method is thus non-progressive, in that any given cycle is not dependent on the efficiency of previous cycles.
There are several differences between Wobble Extension and Wobble Ligation:
(a) Ligases, rather than polymerases, are used as the discriminatory enzyme.
(b) In Wobble Extension, a single primer is hybridized and extended; degenerate bases within the oligonucleotide primer are included to 'reach' a specific distance into the unknown sequence. In Wobble Ligation, a single primer is hybridized that is universal (the 'anchor' primer) and sits such that either its 5' or 3' end is immediately adjacent to the unknown sequence. The position to be queried is encoded in a pool of degenerate nonamers (9-mer) that are ligated to the anchor primer. However, anchor primers having one or several degenerate positions at the terminus to be ligated to can serve as substrates for ligation and so can be used to position the query even further into the unknown sequence.
(c) The assays are always identical, in that the full pool of possible nonamers is being ligated to the anchor primer. What changes between the assays (and determines whether one is sequencing base 4 or base 7 in a particular cycle, for example), is the correlations between specific positions in the degenerate nonamer and fluorescent labels at its end. Figure 7 depicts, for example, the querying of position (-4) relative to the anchor primer.
EXAMPLE VI Ultra Low-Error PCR colonies
There is generally a high error rate for any pre-sequencing amplification method which starts from single templates and employs exponential amplification, including PCR, emulsion PCR, bead emulsion PCR, in situ polonies, digital PCR, bridge PCR, multiple displacement amplification (MDA) and the like. Such methods are described in C. P. Adams, S. J. Kron. (U.S. Patent 5,641,658, Mosaic Technologies, Inc.; Whitehead Institute for Biomedical Research, USA, 1997); D. Dressman, H. Yan, G. Traverso, K. W. Kinzler, B. Vogelstein, Proc. Natl. Acad. Sci. USA, 100, 8817 (July 22, 2003); D. S. Tawfik, A. D. Griffiths, Nat. Biotechnol., 16, 652 (July, 1998); F. J. Ghadessy, J. L. Ong, P. Holliger, Proc. Natl. Acad. Sci. USA, 98, 4552 (April 10, 2001); M. Nakano et al., J. Biotechnol., 102, 117 (April 24, 2003); R. D. Mitra, G. M. Church, Nucleic Acids Res., 27, e34 (Dec 15, 1999); and F. B. Dean et al., Proc. Natl. Acad. Sci. USA, 99, 5261 (April 16, 2002), each of which is hereby incorporated by reference.
Such error establishes an upper limit on the accuracy of any sequencing method which operates on material that is the product of the amplification. For example, during bead emulsion PCR, template is diluted to the point where 1 template molecule and 1 bead will be trapped in an emulsion compartment, and PCR will proceed from this single molecule resulting in many copies bound to the bead. An error arising early during the amplification will result in a bead having either a homogenous population of amplicons bearing the error, or a heterogenous population of amplicons, some bearing the error and some not. In either case, the accuracy of the sequence derived from such a bead will be low.
According to embodiments of the present invention, emulsion PCR will be started with multiple copies of a given template molecule in a compartment. Then, PCR will initiate from each copy independently, and the product bound to the bead in that compartment will be largely homogenous and error-free, even if errors arise early during amplification from 1 of the copies of the template.
To achieve this goal, two techniques are useful. The first is to clone the template desired to be sequenced into a plasmid, transform into bacteria or yeast, and perform emulsion PCR not with naked single-copy template DNA, but rather with individual cells, each of which includes multiple copies of the template. During PCR the cells will rupture and amplification will proceed from each copy of the plasmid present. Since multiple copies of the template were present, and since each was copied independently by the host cell's low-error replication machinery, the probability of obtaining a PCR-based error in a preponderance of amplicons is very low.
The second approach uses linear rolling circle amplification to prepare template molecules which are linear concatemers of independent copies of the original template. PCR then initiates from each site on the concatemer independently. The important constraint (regardless of the method used to get multiple copies of a template into an emulsion compartment or otherwise to initiate a spatially-clustered exponential amplification) is that the initial copies made of the original template are independent of each other and so the probability of two such copies bearing the same error is very low. With a linear rolling circle amplification, the original template (a circular molecule) is iterated over many times, such that all copies are copies of the original template (unlike PCR, which makes copies of copies).
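The benefit of starting amplification from several independent copies can be illustrated with a simple binomial model. This is a sketch under stated assumptions rather than a description of measured behavior: each starting copy is assumed to acquire an early error at a given position independently with probability `p_error` (a hypothetical value), all such errors are pessimistically assumed to be identical, and every copy is assumed to contribute equally to the final bead population.

```python
from math import comb


def prob_majority_error(n_copies: int, p_error: float) -> float:
    """Probability that more than half of n_copies independent starting
    templates carry an early amplification error at a given position."""
    majority = n_copies // 2 + 1
    return sum(
        comb(n_copies, j) * p_error ** j * (1.0 - p_error) ** (n_copies - j)
        for j in range(majority, n_copies + 1)
    )


p = 0.01  # hypothetical per-copy early-error probability
for n in (1, 5, 25):
    print(f"{n:>2} starting copies: P(majority in error) = {prob_majority_error(n, p):.3e}")
```

With a single starting copy the result is simply `p_error`; with several independent copies the probability that errors dominate the bead falls off sharply, which is the rationale for both approaches described above.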
EXAMPLE VII Ligase-Driven DNA Molecular Ruler
Embodiments of the present invention are directed to methods to determine, with single-base resolution, the length of the unique region of a library molecule. To perform polony sequencing, a paired-tag genomic library is constructed where each library molecule is comprised of a unique region flanked by common primer sites. In order to generate a library where all inserts are short and of strictly defined length (which is important for signal homogeneity when using emulsion PCR to load the templates to sequencing beads), the type IIs restriction enzyme MmeI is used. MmeI cuts either 17bp or 18bp from its recognition sequence, and in the embodiment described here thus produces inserts of 17bp or 18bp at a ratio of about 50:50 with little to no sequence-dependence. Knowing the exact length of each insert is advantageous since sequencing methods described herein include the step of reading a certain number of bases from each side of the 17-18bp tag. In order to generate a contiguous sequence from such reads, knowing the exact length of the insert would be beneficial.
According to this embodiment, a ligation-query scheme is used which relies on the specificity of the ligase reaction catalyzed by ampligase, or some other ligase capable of yielding sufficient base pairing specificity, to first 'walk' across the insert with fully degenerate nonamers, and then query the identity of a base in the opposing universal primer sequence. An 'anchor' primer complementary to sequence in universal primer A is first hybridized, degenerate nonamer ligation is then performed to span the unique insert, and finally the length of the insert is queried with a pair of fluorescently-labeled query primers, where each possible length (17 or 18) is coded by a different fluorophore as depicted in Figures 8A and 8B.
EXAMPLE VIII
An additional embodiment of the present invention is described in the following method.
1. Hybridize 5'-phosphorylated, deoxyuridine-containing anchor-primer to target sequence
3'-AGAGUCUACUCA-/5'Phos/
5'-TCTCAGATGAGT???????????????...
2. Perform a base-query by ligating to this, with T4 DNA ligase, fully degenerate nonamers, where an internal base correlates with the identity of one of four fluorophores (four color nonamers) as illustrated in Figure 7.
3. Collect data by four-color imaging or some other means.
4. To remove the primer:degenerate-sequence:fluorophore complex before beginning the next cycle, treat with both Endonuclease 8 and E. coli Uracil-DNA Glycosylase ("UDG"). The UDG will cleave the uracils in the anchor primer, leaving abasic sites that will be cleaved by Endonuclease 8, leaving short fragments with low Tm's that will melt off the immobilized DNA strands at ambient temperatures. Heat, chemical denaturants, or other chemically or enzymatically labile bonds in the anchor primer could also be used in place of deoxyuridines to remove the primer:degenerate-sequence:fluorophore complex.
This embodiment can be carried out in the 5'->3' direction by using a degenerate nonamer population that is phosphorylated at the 5' end (such that that end will ligate to the anchor primer), and the fluorophore resides on its 3' end.
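The base pairing in step 1 and the fragmentation in step 4 can be checked with a short script. This is an idealized sketch: the sequences are those shown in step 1, the helper names are hypothetical, and cleavage is modeled simply as removal of every deoxyuridine with a strand break at each resulting abasic site, ignoring the end chemistry left by the enzymes.

```python
from typing import List

# Deoxyuridine (U) pairs with A, like T.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "U": "A"}


def reverse_complement(seq_5to3: str) -> str:
    """Return the strand (written 5'->3') that pairs with seq_5to3."""
    return "".join(COMPLEMENT[base] for base in reversed(seq_5to3))


def udg_fragments(seq_5to3: str) -> List[str]:
    """Idealized products after uracil excision and abasic-site cleavage."""
    return [fragment for fragment in seq_5to3.split("U") if fragment]


anchor_3to5 = "AGAGUCUACUCA"      # anchor primer as drawn in step 1 (3'->5')
anchor_5to3 = anchor_3to5[::-1]   # same molecule written 5'->3'
target_5to3 = "TCTCAGATGAGT"      # known portion of the target strand

print("anchor pairs with target:", reverse_complement(anchor_5to3) == target_5to3)
print("fragments after stripping:", udg_fragments(anchor_5to3))
```

In this idealized picture no fragment of the anchor is longer than four nucleotides, consistent with the statement that the fragments have low Tm's and melt off at ambient temperature.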
A kit including endonuclease 8 and UDG is commercially available from New England Biolabs under the tradename USER. A schematic of a sample UDG reaction is provided in the figure below.
[Figure: "Base Excision Operates Where a Single Damaged Base Occurs." Labeled steps: uracil deglycosylase; AP endonuclease (variable specificity).]
EXAMPLE IX Non-Progressive Cycling as Described in Example V
Certain polymerase- and ligase-driven cyclic sequencing methods are termed "progressive," in that they interrogate the sequencing template by incorporating onto the end of a growing polynucleotide chain, digesting from the end of the template, or ligating to a growing oligonucleotide primer. See, for example, I. Braslavsky, B. Hebert, E. Kartalov, S. R. Quake, Proc. Natl. Acad. Sci. USA, 100, 3960 (April 1, 2003); R. D. Mitra, J. Shendure, J. Olejnik, O. Edyta Krzymanska, G. M. Church, Anal. Biochem., 320, 55 (Sep 1, 2003); M. Ronaghi, S. Karamohamed, B. Pettersson, M. Uhlen, P. Nyren, Anal. Biochem., 242, 84 (Nov 1, 1996); S. C. C. Macevicz. (U.S. Patent 5,750,341, Lynx Therapeutics, Inc., USA, 1998), and S. Brenner et al., Nat. Biotechnol., 18:630 (Jun, 2000), each of which is hereby incorporated by reference. These "progressive" methods, however, are disadvantageous in that they exhibit amplicon dephasing, which results in decreased sequencing fidelity as the number of bases sequenced into the template increases.
The non-progressive cycling method of the present invention reduces, or in certain embodiments, eliminates, the adverse effects of amplicon dephasing in existing sequencing by synthesis methods (both polymerase- and ligase-driven) by removing the sequencing primer periodically (as often as after each base-position is interrogated). Thus, enzymatic and chemical inefficiencies and other errors do not accumulate as the sequencing run proceeds. Rather, each cycle is independent of previous inefficiencies or misincorporations (assuming the primer is removed after each sequencing cycle). The non-progressive cycling method of the present invention has the added advantage of allowing one to know, with reasonable certainty, which position in the template is being interrogated. This advantageously allows one to resolve homopolymers since the interrogation event has been de-coupled from the positioning event. Furthermore, it allows one to sequence a template out-of-order, rather than requiring one to sequentially query positions 5' to 3' or 3' to 5'.
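The contrast between progressive and non-progressive cycling can be illustrated with a deliberately simple model. It is only a sketch: the per-cycle efficiency of 0.98 is a hypothetical value, each strand is assumed to succeed or fail independently at every cycle, and a strand that fails once in a progressive scheme is assumed to stay out of phase for the rest of the run.

```python
def in_phase_progressive(efficiency: float, cycle: int) -> float:
    """Fraction of strands still in phase at a given cycle (1-based) when
    every earlier cycle must also have succeeded on the same strand."""
    return efficiency ** cycle


def in_phase_nonprogressive(efficiency: float, cycle: int) -> float:
    """Fraction of strands giving the correct signal at a given cycle when
    the primer is stripped and re-annealed, so cycles are independent."""
    return efficiency


efficiency = 0.98  # hypothetical per-cycle efficiency
for cycle in (1, 10, 26):
    progressive = in_phase_progressive(efficiency, cycle)
    independent = in_phase_nonprogressive(efficiency, cycle)
    print(f"cycle {cycle:>2}: progressive {progressive:.3f}, non-progressive {independent:.3f}")
```

Under these assumptions the in-phase fraction decays geometrically in the progressive case, while in the non-progressive case it does not depend on the cycle number.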
According to the non-progressive cycling method of the present invention, the primer can be removed in a number of ways. Heat can be used to melt the primer off the template. Alkali can be used to chemically denature the primer from the template. Numerous other chemical denaturants can be used, which include: methanol, ethanol, isopropanol, n-propanol, allyl alcohol, sec-butyl alcohol, tert-butyl alcohol, isobutyl alcohol, n-butyl alcohol, tert-amyl alcohol, ethylene glycol, glycerol, dithioglycerol, propylene glycol, cyclohexyl alcohol, benzyl alcohol, inositol, phenol, p-methoxyphenol, aniline, pyridine, purine, 1,4-dioxane, gamma-butyrolactone, 3-amino triazole, formamide, N-ethyl formamide, N-N-dimethylformamide, acetamide, N-ethyl acetamide, N-N-dimethyl acetamide, propionamide, butyramide, hexamide, glycolamide, thioacetamide, delta-valerolactam, urethan, N-methyl urethan, N-propylurethan, cyanoguanidine, sulfamide, glycine, acetonitrile, urea, Tween 40, Triton X-100, sodium trichloroacetate, sodium perchlorate, lithium bromide, cesium chloride, lithium chloride, potassium thiocyanate, sodium trifluoroacetate, sodium dodecyl sulfate, salicylate, dimethylsulfoxide, dioxane, and the like. Suitable denaturation methods are described in L. Levine, J. A. Gordon, W. P. Jencks, Biochem. 2:168 (Jan 1963); and J. Shendure et al., Science (published online Aug. 4, 2005).
Chemically-labile linkages, such as phosphorothioate with heavy-metal ion cleavage treatment as described in M. Mag, S. Luking, J. W. Engels, Nucleic Acids Res., 19:1437 (April 11, 1991) can be included in the primer to allow it to be fragmented into many pieces, each of which has a Tm low enough to cause the primer:query complex to denature from the template. Primers can be made enzymatically-labile by the inclusion of ribonucleotides or ribonucleotide stretches (susceptible to cleavage by RNase H or alkali) or the inclusion of deoxyuridines (subject to cleavage by a mixture of uracil DNA glycosylase and endonuclease VIII) or abasic sites (subject to cleavage by endonuclease VIII). The primer can also be removed enzymatically by the use of a suitable exonuclease.
Non-Progressive Sequencing By Ligation Using Deoxyuridine Stripping
According to one aspect of the present invention, the following steps were carried out cyclically to interrogate each base of the template sequentially. An 'anchor primer' was hybridized complementary to common library sequence. A pool of fluorescently-labeled 'query primers' specific to one tag-position was then ligated to the template. Imaging was then used to determine which primer pool ligated to which bead. The anchor::query primer complex was then stripped. The process was then repeated.
Anchor primers used had the following sequences (U = deoxyuridine):
• T30UIA 5'-GGGCCGUACGUCCAACT-3'
• T30UIB 5'-CGCCUUGGCCUCCGACT-3'
• PRlUION 5'-CCCGGGUUCCUCAUUCUCT-3'
• LIGFIXDD 5'-Phos/AUCACCGACUGCCCA-3'
• LIGFIXD2T30A 5'-Phos/AGUUGGAGGUACGGC-3'
• LIGFIXD2T30B 5'-Phos/AGUCGGAGGCCAAGC-3'
Query primers used were nonamers which were degenerate at all positions except the query position. At the query position, only one base was present for a given fluorophore. For example, the pool of probes used to query position five was composed of the following four label-subpools:
• Cy54NA 5'-Phos/NNNNANNNN/Cy5-3'
• Cy34NG 5'-Phos/NNNNGNNNN/Cy3-3'
• TexasRed4NC 5'-Phos/NNNNCNNNN/TR-3'
• FRET4NT 5'-Phos/NNNNTNNNN/FRET-3'
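The structure of a query pool can be sketched programmatically. This is an illustration only: the base-to-fluorophore pairing is taken from the position-five example above, while the function names, the assumption that the same pairing is used at every query position, and the convention of counting positions from the 5' end of the nonamer are assumptions.

```python
from typing import Dict

# Base-to-fluorophore pairing from the position-five example above.
FLUOROPHORE_FOR_BASE = {"A": "Cy5", "G": "Cy3", "C": "TR", "T": "FRET"}


def query_pool(position: int, length: int = 9) -> Dict[str, str]:
    """Return {fluorophore: probe pattern} for one query position, where
    `position` is counted from the 5' end of the probe (1-based)."""
    if not 1 <= position <= length:
        raise ValueError("query position must lie within the probe")
    return {
        fluor: "N" * (position - 1) + base + "N" * (length - position)
        for base, fluor in FLUOROPHORE_FOR_BASE.items()
    }


def subpool_size(pattern: str) -> int:
    """Number of distinct sequences represented by a degenerate pattern."""
    return 4 ** pattern.count("N")


for fluor, pattern in query_pool(5).items():
    print(f"5'-Phos/{pattern}/{fluor}-3'  ({subpool_size(pattern)} sequences)")
```

Changing only the `position` argument moves the fixed base within the nonamer, which is how the same ligation assay can be redirected to a different template position from cycle to cycle.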
Anchor primers were hybridized in a flowcell (100 μM primer in 6x SSPE) for 5 minutes at 56 °C, then cooled to 42 °C and held for 2 minutes. Excess primer was then washed out at room temperature with Wash IE (10 mM Tris-HCl pH 7.5, 50 mM KCl, 2 mM EDTA pH 8.0, 0.01% Triton X-100) for 2 minutes.
Query primers were ligated in the flowcell (8 μM query primer mix (2 μM each subpool), 6000 U T4 DNA ligase (NEB), 1x T4 DNA ligase buffer (NEB)) at 35 °C for 30 minutes. At the end of the reaction, excess query primer was washed out at room temperature with Wash IE for 5 minutes.
Four-color imaging was performed on an epifluorescence microscope with filters appropriate to the fluorophores attached to the nonamers.
The anchor::query primer complex was stripped with USER (NEB), a combination of uracil DNA glycosylase and endonuclease VIII. To perform the stripping reaction, the following protocol was executed in the flowcell:
• Incubate 150 μL stripping mix (3 μL USER (NEB), 150 μL TE) for 5 minutes at 37 °C
• Raise temperature to 56 °C and hold 1 minute
• Wash for 1 minute with Wash IE; temperature gradually decreases
• Incubate 150 μL fresh stripping mix for 5 minutes at 37 °C
• Wash for 5 minutes with Wash IE; temperature gradually decreases
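The hands-on timing of one hybridize, ligate and strip cycle can be tallied from the durations given above. This is a bookkeeping sketch only: step names are paraphrased, imaging time and temperature ramp times are not stated in the text and are therefore excluded, and the 26-cycle total assumes that every cycle uses the same timings.

```python
# (step, minutes) as stated in the protocol above; imaging is excluded
# because its duration is not given.
CYCLE_STEPS = [
    ("hybridize anchor primer at 56 C", 5),
    ("hold at 42 C", 2),
    ("wash out excess anchor primer", 2),
    ("ligate query primers at 35 C", 30),
    ("wash out excess query primer", 5),
    ("stripping mix at 37 C", 5),
    ("hold at 56 C", 1),
    ("wash", 1),
    ("fresh stripping mix at 37 C", 5),
    ("final wash", 5),
]

minutes_per_cycle = sum(minutes for _, minutes in CYCLE_STEPS)
n_cycles = 26  # number of cycles in the experiment described in this example

print(f"stated incubation and wash time per cycle: {minutes_per_cycle} minutes")
print(f"over {n_cycles} cycles: {minutes_per_cycle * n_cycles / 60:.1f} hours, excluding imaging")
```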
With reference to Figure 9A, the cycles consist of the following four steps: (a) hybridization of one of four anchor primers, (b) ligation of fluorescent, degenerate nonamers, (c) four-color imaging on an epifluorescence microscope, and (d) stripping of the anchor primer:nonamer complexes prior to beginning the next cycle. The anchor primers are each designed to be complementary to universal sequence immediately 5' or 3' to one of the two tags. A1, A2, A3 and A4 indicate the four locations to which anchor primers are targeted relative to the amplicon. Arrows indicate the direction sequenced into the tag from each anchor primer. From anchor primers A1 and A3, 7 bases are sequenced into each tag, and from anchor primers A2 and A4, 6 bases are sequenced into each tag. Thus, 13 bp per tag are obtained, and 26 bp per amplicon, with 4 to 5 bp gaps within each tag sequence.
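The read and gap arithmetic above follows directly from the 17 bp or 18 bp tag lengths described in Example VII. The sketch below makes this explicit; which end of the tag each anchor primer reads from is an assumption made purely for illustration, as are the function and parameter names.

```python
from typing import List, Tuple


def tag_coverage(tag_length: int, from_one_end: int = 7,
                 from_other_end: int = 6) -> Tuple[List[int], List[int]]:
    """Return (positions read, positions in the gap) for one tag, with
    positions numbered 1..tag_length."""
    read = set(range(1, from_one_end + 1))
    read |= set(range(tag_length - from_other_end + 1, tag_length + 1))
    gap = [pos for pos in range(1, tag_length + 1) if pos not in read]
    return sorted(read), gap


for length in (17, 18):
    read, gap = tag_coverage(length)
    print(f"{length} bp tag: {len(read)} bases read, {len(gap)} bp gap at positions {gap}")
```

Two tags per amplicon at 13 bases each give the 26 bp per amplicon stated above.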
With reference to Figure 9B, each cycle involves performing a ligation reaction with T4 DNA ligase and a fully degenerate population of nonamers. The nonamer molecules are individually labeled with one of four fluorophores (e.g., Texas Red, Cy5, Cy3, FITC). Depending on which position a given cycle is aiming to interrogate, the nonamers are structured differently. Specifically, a single position within each nonamer is correlated with the identity of the fluorophore with which it is labeled. Additionally, the fluorophore molecule is attached at the opposite end of the nonamer relative to the end targeted to the ligation junction. For example, in Figure 9B, the anchor primer is hybridized such that its 3' end is adjacent to the genomic tag. To query a position five bases into the tag sequence, the four-color population of nonamers is used.
Referring to Figure 10, four-color data from each cycle can be visualized in tetrahedral space, where each point represents a single bead, and the four clusters correspond to the four possible base calls. Figure 10 shows data from a single cycle of non-progressive sequencing by ligation, and in particular the sequencing data from position (-1) of the proximal tag of a complex E. coli derived library. Figure 11 shows variation in accuracy over each of 26 cycles of non-progressive sequencing by ligation in a single experiment resequencing an E. coli genome. It plots the cumulative distribution of raw error as a function of rank-ordered quality, with each of the 26 sequencing-by-ligation cycles in a single sequencing experiment treated as an independent data set. The x-axis indicates percentile bins of beads, sorted on the basis of a confidence metric. The y-axis (log scale) indicates the raw base-calling accuracy of each cumulative bin.
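The cumulative, rank-ordered accuracy described for Figure 11 can be computed along the following lines. This is a sketch only: the confidence metric is not specified in the text, so the function takes per-bead confidence scores and correctness flags as hypothetical inputs, and the toy data at the end are invented solely to exercise the code.

```python
from typing import List, Sequence, Tuple


def cumulative_accuracy(confidences: Sequence[float],
                        correct: Sequence[bool],
                        n_bins: int = 10) -> List[Tuple[float, float]]:
    """Rank beads from most to least confident and report, for each
    cumulative percentile bin, the fraction of correct base calls."""
    if len(confidences) != len(correct) or not confidences:
        raise ValueError("inputs must be non-empty and of equal length")
    order = sorted(range(len(confidences)),
                   key=lambda i: confidences[i], reverse=True)
    results = []
    for b in range(1, n_bins + 1):
        k = max(1, (len(order) * b) // n_bins)   # beads in this cumulative bin
        n_correct = sum(1 for i in order[:k] if correct[i])
        results.append((100.0 * b / n_bins, n_correct / k))
    return results


# Invented toy data for one cycle: four beads, one miscalled.
toy_confidence = [0.9, 0.8, 0.7, 0.6]
toy_correct = [True, True, False, True]
for percentile, accuracy in cumulative_accuracy(toy_confidence, toy_correct, n_bins=4):
    print(f"top {percentile:.0f}% of beads: accuracy {accuracy:.2f}")
```

Running the same calculation separately for each cycle would give one curve per cycle, corresponding to treating each cycle as an independent data set.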

Claims

What is claimed is:
1. A method described above for DNA sequencing, useful for sequencing homopolymeric regions of DNA.
2. A method of sequencing a target nucleic acid comprising: a. providing a sequencing primer, wherein the sequencing primer has at least one anchor sequence and a universal base; b. hybridizing the sequencing primer to a target nucleic acid; and c. extending the sequencing primer.
3. A method of sequencing a target nucleic acid comprising: a. providing a sequencing primer, wherein the sequencing primer has at least one anchor sequence and a degenerate base; b. hybridizing the sequencing primer to a target nucleic acid; and c. extending the sequencing primer.
4. A method of sequencing a target nucleic acid comprising: a. providing a sequencing primer, wherein the sequencing primer has at least one anchor sequence and a natural base; b. hybridizing the sequencing primer to a target nucleic acid; and c. extending the sequencing primer.
5. A method for sequencing a target nucleic acid comprising:
(a) hybridization of one of several anchor primers to a common sequence adjacent to an unknown sequence,
(b) ligation of fluorescently labeled, degenerate oligonucleotides to the anchor primer, such that identity of the fluorophore is informative of the identity of one or more positions within the degenerate oligonucleotide,
(c) imaging to determine primer ligation,
(d) stripping of the anchor primer:degenerate oligonucleotide complexes, and
(e) repeating steps (a)-(d) one or more times.
PCT/US2005/027695 2004-08-04 2005-08-04 Wobble sequencing WO2006073504A2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US11/670,588 US20070207482A1 (en) 2004-08-04 2007-02-02 Wobble sequencing

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US59861004P 2004-08-04 2004-08-04
US60/598,610 2004-08-04
US69271805P 2005-06-22 2005-06-22
US60/692,718 2005-06-22

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US11/670,588 Continuation US20070207482A1 (en) 2004-08-04 2007-02-02 Wobble sequencing

Publications (3)

Publication Number Publication Date
WO2006073504A2 true WO2006073504A2 (en) 2006-07-13
WO2006073504A3 WO2006073504A3 (en) 2007-04-12
WO2006073504A8 WO2006073504A8 (en) 2007-09-27

Family

ID=36647934

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2005/027695 WO2006073504A2 (en) 2004-08-04 2005-08-04 Wobble sequencing

Country Status (2)

Country Link
US (1) US20070207482A1 (en)
WO (1) WO2006073504A2 (en)

Cited By (43)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2009094583A1 (en) * 2008-01-23 2009-07-30 Complete Genomics, Inc. Methods and compositions for preventing bias in amplification and sequencing reactions
EP2189793A1 (en) * 2008-11-21 2010-05-26 Micronas GmbH Method for regenerating a biosensor
US7811810B2 (en) 2007-10-25 2010-10-12 Industrial Technology Research Institute Bioassay system including optical detection apparatuses, and method for detecting biomolecules
WO2010148039A2 (en) 2009-06-15 2010-12-23 Complete Genomics, Inc. Methods and compositions for long fragment read sequencing
US7906285B2 (en) 2003-02-26 2011-03-15 Callida Genomics, Inc. Random array DNA analysis by hybridization
EP2362209A2 (en) 2009-03-11 2011-08-31 Industrial Technology Research Institute Apparatus and method for detection and discrimination of the type of a molecular object
EP2511843A2 (en) 2009-04-29 2012-10-17 Complete Genomics, Inc. Method and system for calling variations in a sample polynucleotide sequence with respect to a reference polynucleotide sequence
EP2565279A1 (en) 2007-12-05 2013-03-06 Complete Genomics, Inc. Efficient base determination in sequencing reactions
US8431691B2 (en) 2005-02-01 2013-04-30 Applied Biosystems Llc Reagents, methods, and libraries for bead-based sequencing
WO2013066975A1 (en) 2011-11-02 2013-05-10 Complete Genomics, Inc. Treatment for stabilizing nucleic acid arrays
EP2657869A2 (en) 2007-08-29 2013-10-30 Applied Biosystems, LLC Alternative nucleic acid sequencing methods
WO2013166517A1 (en) 2012-05-04 2013-11-07 Complete Genomics, Inc. Methods for determining absolute genome-wide copy number variations of complex tumors
US8615365B2 (en) 2009-02-03 2013-12-24 Complete Genomics, Inc. Oligomer sequences mapping
US8725422B2 (en) 2010-10-13 2014-05-13 Complete Genomics, Inc. Methods for estimating genome-wide copy number variations
US8731843B2 (en) 2009-02-03 2014-05-20 Complete Genomics, Inc. Oligomer sequences mapping
US8738296B2 (en) 2009-02-03 2014-05-27 Complete Genomics, Inc. Indexing a reference sequence for oligomer sequence mapping
WO2014145820A2 (en) 2013-03-15 2014-09-18 Complete Genomics, Inc. Multiple tagging of long dna fragments
US8865078B2 (en) 2010-06-11 2014-10-21 Industrial Technology Research Institute Apparatus for single-molecule detection
US8945835B2 (en) 2006-02-08 2015-02-03 Illumina Cambridge Limited Method for sequencing a polynucleotide template
US9023769B2 (en) 2009-11-30 2015-05-05 Complete Genomics, Inc. cDNA library for nucleic acid sequencing
US9222132B2 (en) 2008-01-28 2015-12-29 Complete Genomics, Inc. Methods and compositions for efficient base calling in sequencing reactions
US9267172B2 (en) 2007-11-05 2016-02-23 Complete Genomics, Inc. Efficient base determination in sequencing reactions
EP3037553A1 (en) 2007-10-25 2016-06-29 Industrial Technology Research Institute Bioassay system including optical detection apparatuses, and method for detecting biomolecules
US9382585B2 (en) 2007-10-30 2016-07-05 Complete Genomics, Inc. Apparatus for high throughput sequencing of nucleic acids
EP3043319A1 (en) 2010-04-30 2016-07-13 Complete Genomics, Inc. Method and system for accurate alignment and registration of array for dna sequencing
US9482615B2 (en) 2010-03-15 2016-11-01 Industrial Technology Research Institute Single-molecule detection system and methods
US9524369B2 (en) 2009-06-15 2016-12-20 Complete Genomics, Inc. Processing and analysis of complex nucleic acid sequence data
US9551026B2 (en) 2007-12-03 2017-01-24 Complete Genomics, Inc. Method for nucleic acid detection using voltage enhancement
US9637784B2 (en) 2005-06-15 2017-05-02 Complete Genomics, Inc. Methods for DNA sequencing and analysis using multiple tiers of aliquots
US9803239B2 (en) 2012-03-29 2017-10-31 Complete Genomics, Inc. Flow cells for high density array chips
US9880089B2 (en) 2010-08-31 2018-01-30 Complete Genomics, Inc. High-density devices with synchronous tracks for quad-cell based alignment correction
WO2018129214A1 (en) 2017-01-04 2018-07-12 Complete Genomics, Inc. Stepwise sequencing by non-labeled reversible terminators or natural nucleotides
US10227647B2 (en) 2015-02-17 2019-03-12 Complete Genomics, Inc. DNA sequencing using controlled strand displacement
WO2019071471A1 (en) 2017-10-11 2019-04-18 深圳华大智造科技有限公司 Method for improving loading and stability of nucleic acid on solid support
US10385391B2 (en) 2009-09-22 2019-08-20 President And Fellows Of Harvard College Entangled mate sequencing
US10392726B2 (en) 2010-10-08 2019-08-27 President And Fellows Of Harvard College High-throughput immune sequencing
US10726942B2 (en) 2013-08-23 2020-07-28 Complete Genomics, Inc. Long fragment de novo assembly using short reads
WO2020180813A1 (en) 2019-03-06 2020-09-10 Qiagen Sciences, Llc Compositions and methods for adaptor design and nucleic acid library construction for rolony-based sequencing
EP3746564A1 (en) 2018-01-29 2020-12-09 St. Jude Children's Research Hospital, Inc. Method for nucleic acid amplification
WO2021103695A1 (en) * 2019-11-25 2021-06-03 齐鲁工业大学 Single-base continuous extension flow-type targeted sequencing method
WO2021185320A1 (en) 2020-03-18 2021-09-23 Mgi Tech Co., Ltd. Restoring phase in massively parallel sequencing
US11198855B2 (en) 2014-11-13 2021-12-14 The Board Of Trustees Of The University Of Illinois Bio-engineered hyper-functional “super” helicases
US11389779B2 (en) 2007-12-05 2022-07-19 Complete Genomics, Inc. Methods of preparing a library of nucleic acid fragments tagged with oligonucleotide bar code sequences

Families Citing this family (108)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
SG10201405158QA (en) 2006-02-24 2014-10-30 Callida Genomics Inc High throughput genome sequencing on dna arrays
US7910302B2 (en) 2006-10-27 2011-03-22 Complete Genomics, Inc. Efficient arrays of amplified polynucleotides
US20090111705A1 (en) 2006-11-09 2009-04-30 Complete Genomics, Inc. Selection of dna adaptor orientation by hybrid capture
US8716190B2 (en) 2007-09-14 2014-05-06 Affymetrix, Inc. Amplification and analysis of selected targets on solid supports
US9388457B2 (en) 2007-09-14 2016-07-12 Affymetrix, Inc. Locus specific amplification using array probes
WO2009052214A2 (en) 2007-10-15 2009-04-23 Complete Genomics, Inc. Sequence analysis using decorated nucleic acids
US8298768B2 (en) 2007-11-29 2012-10-30 Complete Genomics, Inc. Efficient shotgun sequencing methods
EP2285979B2 (en) 2008-05-27 2020-02-19 Dako Denmark A/S Hybridization compositions and methods
CN102186992A (en) * 2008-08-15 2011-09-14 康奈尔大学 Device for rapid identification of nucleic acids for binding to specific chemical targets
US9303287B2 (en) 2009-02-26 2016-04-05 Dako Denmark A/S Compositions and methods for RNA hybridization applications
US20190300945A1 (en) 2010-04-05 2019-10-03 Prognosys Biosciences, Inc. Spatially Encoded Biological Assays
US10787701B2 (en) 2010-04-05 2020-09-29 Prognosys Biosciences, Inc. Spatially encoded biological assays
PT2556171E (en) 2010-04-05 2015-12-21 Prognosys Biosciences Inc Spatially encoded biological assays
US8865077B2 (en) 2010-06-11 2014-10-21 Industrial Technology Research Institute Apparatus for single-molecule detection
US10167508B2 (en) 2010-08-06 2019-01-01 Ariosa Diagnostics, Inc. Detection of genetic abnormalities
US11031095B2 (en) 2010-08-06 2021-06-08 Ariosa Diagnostics, Inc. Assay systems for determination of fetal copy number variation
US20120034603A1 (en) 2010-08-06 2012-02-09 Tandem Diagnostics, Inc. Ligation-based detection of genetic variants
US20130040375A1 (en) 2011-08-08 2013-02-14 Tandem Diagnostics, Inc. Assay systems for genetic analysis
US20130261003A1 (en) 2010-08-06 2013-10-03 Ariosa Diagnostics, Inc. Ligation-based detection of genetic variants
US8700338B2 (en) 2011-01-25 2014-04-15 Ariosa Diagnostics, Inc. Risk calculation for evaluation of fetal aneuploidy
US10533223B2 (en) 2010-08-06 2020-01-14 Ariosa Diagnostics, Inc. Detection of target nucleic acids using hybridization
US20140342940A1 (en) 2011-01-25 2014-11-20 Ariosa Diagnostics, Inc. Detection of Target Nucleic Acids using Hybridization
US11203786B2 (en) 2010-08-06 2021-12-21 Ariosa Diagnostics, Inc. Detection of target nucleic acids using hybridization
US9671344B2 (en) 2010-08-31 2017-06-06 Complete Genomics, Inc. High-density biochemical array chips with asynchronous tracks for alignment correction by moiré averaging
WO2012103031A2 (en) 2011-01-25 2012-08-02 Ariosa Diagnostics, Inc. Detection of genetic abnormalities
US11270781B2 (en) 2011-01-25 2022-03-08 Ariosa Diagnostics, Inc. Statistical analysis for non-invasive sex chromosome aneuploidy determination
US10131947B2 (en) 2011-01-25 2018-11-20 Ariosa Diagnostics, Inc. Noninvasive detection of fetal aneuploidy in egg donor pregnancies
US9994897B2 (en) 2013-03-08 2018-06-12 Ariosa Diagnostics, Inc. Non-invasive fetal sex determination
US8756020B2 (en) 2011-01-25 2014-06-17 Ariosa Diagnostics, Inc. Enhanced risk probabilities using biomolecule estimations
WO2012118745A1 (en) 2011-02-28 2012-09-07 Arnold Oliphant Assay systems for detection of aneuploidy and sex determination
EP2694709B1 (en) 2011-04-08 2016-09-14 Prognosys Biosciences, Inc. Peptide constructs and assay systems
GB201106254D0 (en) 2011-04-13 2011-05-25 Frisen Jonas Method and product
US8712697B2 (en) 2011-09-07 2014-04-29 Ariosa Diagnostics, Inc. Determination of copy number variations using binomial probability calculations
EP2761028A1 (en) 2011-09-30 2014-08-06 Dako Denmark A/S Hybridization compositions and methods using formamide
EP2768974B1 (en) * 2011-10-21 2017-07-19 Dako Denmark A/S Hybridization compositions and methods
US10289800B2 (en) 2012-05-21 2019-05-14 Ariosa Diagnostics, Inc. Processes for calculating phased fetal genomic sequences
US9884893B2 (en) 2012-05-21 2018-02-06 Distributed Bio, Inc. Epitope focusing by variable effective antigen surface concentration
CA2874413A1 (en) 2012-05-21 2013-11-28 The Scripps Research Institute Methods of sample preparation
US9488823B2 (en) 2012-06-07 2016-11-08 Complete Genomics, Inc. Techniques for scanned illumination
US9628676B2 (en) 2012-06-07 2017-04-18 Complete Genomics, Inc. Imaging systems with movable scan mirrors
CN104583421A (en) 2012-07-19 2015-04-29 阿瑞奥萨诊断公司 Multiplexed sequential ligation-based detection of genetic variants
USRE50065E1 (en) 2012-10-17 2024-07-30 10X Genomics Sweden Ab Methods and product for optimising localised or spatial detection of gene expression in a tissue sample
WO2014200579A1 (en) 2013-06-13 2014-12-18 Ariosa Diagnostics, Inc. Statistical analysis for non-invasive sex chromosome aneuploidy determination
DK3013984T3 (en) 2013-06-25 2023-06-06 Prognosys Biosciences Inc METHOD FOR DETERMINING SPATIAL PATTERNS IN BIOLOGICAL TARGETS IN A SAMPLE
KR102160389B1 (en) 2013-08-05 2020-09-28 트위스트 바이오사이언스 코포레이션 De novo synthesized gene libraries
WO2015042708A1 (en) 2013-09-25 2015-04-02 Bio-Id Diagnostic Inc. Methods for detecting nucleic acid fragments
EP3191604B1 (en) 2014-09-09 2021-04-14 Igenomx International Genomics Corporation Methods and compositions for rapid nucleic acid library preparation
WO2016126987A1 (en) 2015-02-04 2016-08-11 Twist Bioscience Corporation Compositions and methods for synthetic gene assembly
WO2016126882A1 (en) 2015-02-04 2016-08-11 Twist Bioscience Corporation Methods and devices for de novo oligonucleic acid assembly
FI3901281T3 (en) 2015-04-10 2023-01-31 Spatially distinguished, multiplex nucleic acid analysis of biological specimens
US9981239B2 (en) 2015-04-21 2018-05-29 Twist Bioscience Corporation Devices and methods for oligonucleic acid library synthesis
EP3350314A4 (en) 2015-09-18 2019-02-06 Twist Bioscience Corporation Oligonucleic acid variant libraries and synthesis thereof
KR20180058772A (en) 2015-09-22 2018-06-01 트위스트 바이오사이언스 코포레이션 Flexible substrate for nucleic acid synthesis
CN115920796A (en) 2015-12-01 2023-04-07 特韦斯特生物科学公司 Functionalized surfaces and preparation thereof
CA3034769A1 (en) 2016-08-22 2018-03-01 Twist Bioscience Corporation De novo synthesized nucleic acid libraries
US10417457B2 (en) 2016-09-21 2019-09-17 Twist Bioscience Corporation Nucleic acid based data storage
GB2573069A (en) 2016-12-16 2019-10-23 Twist Bioscience Corp Variant libraries of the immunological synapse and synthesis thereof
CA3054303A1 (en) 2017-02-22 2018-08-30 Twist Bioscience Corporation Nucleic acid based data storage
US10894959B2 (en) 2017-03-15 2021-01-19 Twist Bioscience Corporation Variant libraries of the immunological synapse and synthesis thereof
WO2018231864A1 (en) 2017-06-12 2018-12-20 Twist Bioscience Corporation Methods for seamless nucleic acid assembly
AU2018284227B2 (en) 2017-06-12 2024-05-02 Twist Bioscience Corporation Methods for seamless nucleic acid assembly
CN111566125A (en) 2017-09-11 2020-08-21 特韦斯特生物科学公司 GPCR binding proteins and synthesis thereof
GB2583590A (en) 2017-10-20 2020-11-04 Twist Bioscience Corp Heated nanowells for polynucleotide synthesis
AU2019205269A1 (en) 2018-01-04 2020-07-30 Twist Bioscience Corporation DNA-based digital information storage
CN112639130B (en) 2018-05-18 2024-08-09 特韦斯特生物科学公司 Polynucleotides, reagents and methods for nucleic acid hybridization
US11519033B2 (en) 2018-08-28 2022-12-06 10X Genomics, Inc. Method for transposase-mediated spatial tagging and analyzing genomic DNA in a biological sample
US11649485B2 (en) 2019-01-06 2023-05-16 10X Genomics, Inc. Generating capture probes for spatial analysis
US11926867B2 (en) 2019-01-06 2024-03-12 10X Genomics, Inc. Generating capture probes for spatial analysis
WO2020176678A1 (en) 2019-02-26 2020-09-03 Twist Bioscience Corporation Variant nucleic acid libraries for glp1 receptor
JP2022522668A (en) 2019-02-26 2022-04-20 ツイスト バイオサイエンス コーポレーション Mutant nucleic acid library for antibody optimization
WO2020243579A1 (en) 2019-05-30 2020-12-03 10X Genomics, Inc. Methods of detecting spatial heterogeneity of a biological sample
CA3144644A1 (en) 2019-06-21 2020-12-24 Twist Bioscience Corporation Barcode-based nucleic acid sequence assembly
AU2020356471A1 (en) 2019-09-23 2022-04-21 Twist Bioscience Corporation Variant nucleic acid libraries for CRTH2
WO2021091611A1 (en) 2019-11-08 2021-05-14 10X Genomics, Inc. Spatially-tagged analyte capture agents for analyte multiplexing
EP4025711A2 (en) 2019-11-08 2022-07-13 10X Genomics, Inc. Enhancing specificity of analyte binding
WO2021133842A1 (en) 2019-12-23 2021-07-01 10X Genomics, Inc. Compositions and methods for using fixed biological samples in partition-based assays
EP4424843A3 (en) 2019-12-23 2024-09-25 10X Genomics, Inc. Methods for spatial analysis using rna-templated ligation
US11702693B2 (en) 2020-01-21 2023-07-18 10X Genomics, Inc. Methods for printing cells and generating arrays of barcoded cells
US11732299B2 (en) 2020-01-21 2023-08-22 10X Genomics, Inc. Spatial assays with perturbed cells
US11821035B1 (en) 2020-01-29 2023-11-21 10X Genomics, Inc. Compositions and methods of making gene expression libraries
US12076701B2 (en) 2020-01-31 2024-09-03 10X Genomics, Inc. Capturing oligonucleotides in spatial transcriptomics
US12110541B2 (en) 2020-02-03 2024-10-08 10X Genomics, Inc. Methods for preparing high-resolution spatial arrays
US11898205B2 (en) 2020-02-03 2024-02-13 10X Genomics, Inc. Increasing capture efficiency of spatial assays
US11732300B2 (en) 2020-02-05 2023-08-22 10X Genomics, Inc. Increasing efficiency of spatial analysis in a biological sample
US11835462B2 (en) 2020-02-11 2023-12-05 10X Genomics, Inc. Methods and compositions for partitioning a biological sample
US11891654B2 (en) 2020-02-24 2024-02-06 10X Genomics, Inc. Methods of making gene expression libraries
US11926863B1 (en) 2020-02-27 2024-03-12 10X Genomics, Inc. Solid state single cell method for analyzing fixed biological cells
US11768175B1 (en) 2020-03-04 2023-09-26 10X Genomics, Inc. Electrophoretic methods for spatial analysis
CN115916999A (en) 2020-04-22 2023-04-04 10X基因组学有限公司 Methods for spatial analysis using targeted RNA depletion
AU2021275906A1 (en) 2020-05-22 2022-12-22 10X Genomics, Inc. Spatial analysis to detect sequence variants
EP4414459A3 (en) 2020-05-22 2024-09-18 10X Genomics, Inc. Simultaneous spatio-temporal measurement of gene expression and cellular activity
WO2021242834A1 (en) 2020-05-26 2021-12-02 10X Genomics, Inc. Method for resetting an array
AU2021283184A1 (en) 2020-06-02 2023-01-05 10X Genomics, Inc. Spatial transcriptomics for antigen-receptors
EP4025692A2 (en) 2020-06-02 2022-07-13 10X Genomics, Inc. Nucleic acid library methods
US12031177B1 (en) 2020-06-04 2024-07-09 10X Genomics, Inc. Methods of enhancing spatial resolution of transcripts
WO2021252499A1 (en) 2020-06-08 2021-12-16 10X Genomics, Inc. Methods of determining a surgical margin and methods of use thereof
EP4165207B1 (en) 2020-06-10 2024-09-25 10X Genomics, Inc. Methods for determining a location of an analyte in a biological sample
EP4450639A2 (en) 2020-06-25 2024-10-23 10X Genomics, Inc. Spatial analysis of dna methylation
US11981960B1 (en) 2020-07-06 2024-05-14 10X Genomics, Inc. Spatial analysis utilizing degradable hydrogels
US11761038B1 (en) 2020-07-06 2023-09-19 10X Genomics, Inc. Methods for identifying a location of an RNA in a biological sample
US11981958B1 (en) 2020-08-20 2024-05-14 10X Genomics, Inc. Methods for spatial analysis using DNA capture
US11926822B1 (en) 2020-09-23 2024-03-12 10X Genomics, Inc. Three-dimensional spatial analysis
US11827935B1 (en) 2020-11-19 2023-11-28 10X Genomics, Inc. Methods for spatial analysis using rolling circle amplification and detection probes
AU2021409136A1 (en) 2020-12-21 2023-06-29 10X Genomics, Inc. Methods, compositions, and systems for capturing probes and/or barcodes
WO2022178267A2 (en) 2021-02-19 2022-08-25 10X Genomics, Inc. Modular assay support devices
EP4301870A1 (en) 2021-03-18 2024-01-10 10X Genomics, Inc. Multiplex capture of gene and protein expression from a biological sample
EP4347879A1 (en) 2021-06-03 2024-04-10 10X Genomics, Inc. Methods, compositions, kits, and systems for enhancing analyte capture for spatial analysis
EP4196605A1 (en) 2021-09-01 2023-06-21 10X Genomics, Inc. Methods, compositions, and kits for blocking a capture probe on a spatial array

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6528288B2 (en) * 1999-04-21 2003-03-04 Genome Technologies, Llc Shot-gun sequencing and amplification without cloning

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5641658A (en) * 1994-08-03 1997-06-24 Mosaic Technologies, Inc. Method for performing amplification of nucleic acid with two primers bound to a single solid support
US5750341A (en) * 1995-04-17 1998-05-12 Lynx Therapeutics, Inc. DNA sequencing by parallel oligonucleotide extensions

Non-Patent Citations (7)

* Cited by examiner, † Cited by third party
Title
DAY J.P. ET AL.: 'Nucleotide Analogs and new buffers improve a generalized method to enrich for low abundance mutations' NUCLEIC ACIDS RESEARCH vol. 27, no. 8, 1999, pages 1819 - 1827, XP002159669 *
KACZOROWSKI ET AL.: 'Automated four-color DNA sequencing using primers assembled by hexamer ligation' GENE vol. 179, 1996, pages 195 - 198 *
KACZOROWSKI ET AL.: 'Genomic DNA sequencing by SPEL-6 primer walking using hexamer ligation' GENE vol. 223, 1998, pages 83 - 91 *
KACZOROWSKI T. ET AL.: 'Co-operativity of hexamer ligation' GENE vol. 179, 1996, pages 189 - 193, XP004071982 *
PASTINEN T. ET AL.: 'A System for Specific, High-throughput Genotyping by Allele-Specific Primer Extension on Microarrays' GENOME RESEARCH vol. 10, 2000, pages 1031 - 1042, XP008013561 *
RAJA M.C. ET AL.: 'DNA sequencing using differential extension with nucleotide subsets (DENS)' NUCLEIC ACIDS RESEARCH vol. 25, no. 4, 1997, pages 800 - 805, XP003010704 *
ROSE T.M. ET AL.: 'Consensus-degenerate hybrid oligonucleotide primers for amplification of distantly related sequences' NUCLEIC ACIDS RESEARCH vol. 26, no. 7, 1998, pages 1628 - 1635, XP002141299 *

Cited By (75)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7906285B2 (en) 2003-02-26 2011-03-15 Callida Genomics, Inc. Random array DNA analysis by hybridization
US8431691B2 (en) 2005-02-01 2013-04-30 Applied Biosystems Llc Reagents, methods, and libraries for bead-based sequencing
US10323277B2 (en) 2005-02-01 2019-06-18 Applied Biosystems, Llc Reagents, methods, and libraries for bead-based sequencing
US11414702B2 (en) 2005-06-15 2022-08-16 Complete Genomics, Inc. Nucleic acid analysis by random mixtures of non-overlapping fragments
US10125392B2 (en) 2005-06-15 2018-11-13 Complete Genomics, Inc. Preparing a DNA fragment library for sequencing using tagged primers
US9637784B2 (en) 2005-06-15 2017-05-02 Complete Genomics, Inc. Methods for DNA sequencing and analysis using multiple tiers of aliquots
US10351909B2 (en) 2005-06-15 2019-07-16 Complete Genomics, Inc. DNA sequencing from high density DNA arrays using asynchronous reactions
US9637785B2 (en) 2005-06-15 2017-05-02 Complete Genomics, Inc. Tagged fragment library configured for genome or cDNA sequence analysis
US9944984B2 (en) 2005-06-15 2018-04-17 Complete Genomics, Inc. High density DNA array
US9650673B2 (en) 2005-06-15 2017-05-16 Complete Genomics, Inc. Single molecule arrays for genetic and chemical analysis
US8945835B2 (en) 2006-02-08 2015-02-03 Illumina Cambridge Limited Method for sequencing a polynucleotide template
US9994896B2 (en) 2006-02-08 2018-06-12 Illumina Cambridge Limited Method for sequencing a polynucelotide template
US10876158B2 (en) 2006-02-08 2020-12-29 Illumina Cambridge Limited Method for sequencing a polynucleotide template
EP2657869A2 (en) 2007-08-29 2013-10-30 Applied Biosystems, LLC Alternative nucleic acid sequencing methods
US7811810B2 (en) 2007-10-25 2010-10-12 Industrial Technology Research Institute Bioassay system including optical detection apparatuses, and method for detecting biomolecules
EP3037553A1 (en) 2007-10-25 2016-06-29 Industrial Technology Research Institute Bioassay system including optical detection apparatuses, and method for detecting biomolecules
US10017815B2 (en) 2007-10-30 2018-07-10 Complete Genomics, Inc. Method for high throughput screening of nucleic acids
US9382585B2 (en) 2007-10-30 2016-07-05 Complete Genomics, Inc. Apparatus for high throughput sequencing of nucleic acids
US9267172B2 (en) 2007-11-05 2016-02-23 Complete Genomics, Inc. Efficient base determination in sequencing reactions
US9551026B2 (en) 2007-12-03 2017-01-24 Complete Genomics, Inc. Method for nucleic acid detection using voltage enhancement
US11389779B2 (en) 2007-12-05 2022-07-19 Complete Genomics, Inc. Methods of preparing a library of nucleic acid fragments tagged with oligonucleotide bar code sequences
EP2610351A1 (en) 2007-12-05 2013-07-03 Complete Genomics, Inc. Efficient base determination in sequencing reactions
EP2565279A1 (en) 2007-12-05 2013-03-06 Complete Genomics, Inc. Efficient base determination in sequencing reactions
WO2009094583A1 (en) * 2008-01-23 2009-07-30 Complete Genomics, Inc. Methods and compositions for preventing bias in amplification and sequencing reactions
US9222132B2 (en) 2008-01-28 2015-12-29 Complete Genomics, Inc. Methods and compositions for efficient base calling in sequencing reactions
US11214832B2 (en) 2008-01-28 2022-01-04 Complete Genomics, Inc. Methods and compositions for efficient base calling in sequencing reactions
US10662473B2 (en) 2008-01-28 2020-05-26 Complete Genomics, Inc. Methods and compositions for efficient base calling in sequencing reactions
US11098356B2 (en) 2008-01-28 2021-08-24 Complete Genomics, Inc. Methods and compositions for nucleic acid sequencing
US9523125B2 (en) 2008-01-28 2016-12-20 Complete Genomics, Inc. Methods and compositions for efficient base calling in sequencing reactions
EP2189793A1 (en) * 2008-11-21 2010-05-26 Micronas GmbH Method for regenerating a biosensor
US8518709B2 (en) 2008-11-21 2013-08-27 Endress+Hauser Conducta Gesellschaft Fuer Mess-Und Regeltechnik Mbh+Co. Kg Method for regenerating a biosensor
US8731843B2 (en) 2009-02-03 2014-05-20 Complete Genomics, Inc. Oligomer sequences mapping
US8615365B2 (en) 2009-02-03 2013-12-24 Complete Genomics, Inc. Oligomer sequences mapping
US8738296B2 (en) 2009-02-03 2014-05-27 Complete Genomics, Inc. Indexing a reference sequence for oligomer sequence mapping
US9778188B2 (en) 2009-03-11 2017-10-03 Industrial Technology Research Institute Apparatus and method for detection and discrimination molecular object
US10996166B2 (en) 2009-03-11 2021-05-04 Industrial Technology Research Institute Apparatus and method for detection and discrimination molecular object
EP3159678A1 (en) 2009-03-11 2017-04-26 Industrial Technology Research Institute Apparatus and method for detection and discrimination of a molecular object
EP2362209A2 (en) 2009-03-11 2011-08-31 Industrial Technology Research Institute Apparatus and method for detection and discrimination of the type of a molecular object
EP2511843A2 (en) 2009-04-29 2012-10-17 Complete Genomics, Inc. Method and system for calling variations in a sample polynucleotide sequence with respect to a reference polynucleotide sequence
EP2977455A1 (en) 2009-06-15 2016-01-27 Complete Genomics, Inc. Methods and compositions for long fragment read sequencing
US9524369B2 (en) 2009-06-15 2016-12-20 Complete Genomics, Inc. Processing and analysis of complex nucleic acid sequence data
WO2010148039A2 (en) 2009-06-15 2010-12-23 Complete Genomics, Inc. Methods and compositions for long fragment read sequencing
US10385391B2 (en) 2009-09-22 2019-08-20 President And Fellows Of Harvard College Entangled mate sequencing
US9023769B2 (en) 2009-11-30 2015-05-05 Complete Genomics, Inc. cDNA library for nucleic acid sequencing
US9777321B2 (en) 2010-03-15 2017-10-03 Industrial Technology Research Institute Single molecule detection system and methods
US9482615B2 (en) 2010-03-15 2016-11-01 Industrial Technology Research Institute Single-molecule detection system and methods
EP3043319A1 (en) 2010-04-30 2016-07-13 Complete Genomics, Inc. Method and system for accurate alignment and registration of array for dna sequencing
US8865078B2 (en) 2010-06-11 2014-10-21 Industrial Technology Research Institute Apparatus for single-molecule detection
US9995683B2 (en) 2010-06-11 2018-06-12 Industrial Technology Research Institute Apparatus for single-molecule detection
US9880089B2 (en) 2010-08-31 2018-01-30 Complete Genomics, Inc. High-density devices with synchronous tracks for quad-cell based alignment correction
US12012670B2 (en) 2010-10-08 2024-06-18 President And Fellows Of Harvard College High-throughput immune sequencing
US10392726B2 (en) 2010-10-08 2019-08-27 President And Fellows Of Harvard College High-throughput immune sequencing
US8725422B2 (en) 2010-10-13 2014-05-13 Complete Genomics, Inc. Methods for estimating genome-wide copy number variations
WO2013066975A1 (en) 2011-11-02 2013-05-10 Complete Genomics, Inc. Treatment for stabilizing nucleic acid arrays
US11835437B2 (en) 2011-11-02 2023-12-05 Complete Genomics, Inc. Treatment for stabilizing nucleic acid arrays
US10837879B2 (en) 2011-11-02 2020-11-17 Complete Genomics, Inc. Treatment for stabilizing nucleic acid arrays
US9803239B2 (en) 2012-03-29 2017-10-31 Complete Genomics, Inc. Flow cells for high density array chips
WO2013166517A1 (en) 2012-05-04 2013-11-07 Complete Genomics, Inc. Methods for determining absolute genome-wide copy number variations of complex tumors
EP3741872A1 (en) 2013-03-15 2020-11-25 Complete Genomics, Inc. Multiple tagging of long dna fragments
WO2014145820A2 (en) 2013-03-15 2014-09-18 Complete Genomics, Inc. Multiple tagging of long dna fragments
US10726942B2 (en) 2013-08-23 2020-07-28 Complete Genomics, Inc. Long fragment de novo assembly using short reads
US11198855B2 (en) 2014-11-13 2021-12-14 The Board Of Trustees Of The University Of Illinois Bio-engineered hyper-functional “super” helicases
US10227647B2 (en) 2015-02-17 2019-03-12 Complete Genomics, Inc. DNA sequencing using controlled strand displacement
US11319588B2 (en) 2015-02-17 2022-05-03 Mgi Tech Co., Ltd. DNA sequencing using controlled strand displacement
EP4112741A1 (en) 2017-01-04 2023-01-04 MGI Tech Co., Ltd. Stepwise sequencing by non-labeled reversible terminators or natural nucleotides
WO2018129214A1 (en) 2017-01-04 2018-07-12 Complete Genomics, Inc. Stepwise sequencing by non-labeled reversible terminators or natural nucleotides
WO2019071471A1 (en) 2017-10-11 2019-04-18 深圳华大智造科技有限公司 Method for improving loading and stability of nucleic acid on solid support
EP3995590A1 (en) 2017-10-11 2022-05-11 MGI Tech Co., Ltd. Method for improving loading and stability of nucleic acid
US11905553B2 (en) 2018-01-29 2024-02-20 St. Jude Children's Research Hospital, Inc. Method for nucleic acid amplification
EP3746564A1 (en) 2018-01-29 2020-12-09 St. Jude Children's Research Hospital, Inc. Method for nucleic acid amplification
US11643682B2 (en) 2018-01-29 2023-05-09 St. Jude Children's Research Hospital, Inc. Method for nucleic acid amplification
EP4183886A1 (en) 2018-01-29 2023-05-24 St. Jude Children's Research Hospital, Inc. Method for nucleic acid amplification
WO2020180813A1 (en) 2019-03-06 2020-09-10 Qiagen Sciences, Llc Compositions and methods for adaptor design and nucleic acid library construction for rolony-based sequencing
WO2021103695A1 (en) * 2019-11-25 2021-06-03 齐鲁工业大学 Single-base continuous extension flow-type targeted sequencing method
WO2021185320A1 (en) 2020-03-18 2021-09-23 Mgi Tech Co., Ltd. Restoring phase in massively parallel sequencing

Also Published As

Publication number Publication date
WO2006073504A3 (en) 2007-04-12
WO2006073504A8 (en) 2007-09-27
US20070207482A1 (en) 2007-09-06

Similar Documents

Publication Publication Date Title
US20070207482A1 (en) Wobble sequencing
US20210062186A1 (en) Next-generation sequencing libraries
US20220267845A1 (en) Selective Amplfication of Nucleic Acid Sequences
US8753816B2 (en) Sequencing methods
US7622281B2 (en) Methods and compositions for clonal amplification of nucleic acid
JP7240337B2 (en) LIBRARY PREPARATION METHODS AND COMPOSITIONS AND USES THEREOF
EP2694679A2 (en) Methods and systems for sequencing long nucleic acids
US20120107878A1 (en) Multiplex assembly of high fedelity dna
US20190106744A1 (en) Dna sequencing
EP3956445B1 (en) Multiplex assembly of nucleic acid molecules
US20210017596A1 (en) Sequential sequencing methods and compositions
US20200377935A1 (en) Polynucleotide adapters and methods of use thereof
KR20230124636A (en) Compositions and methods for highly sensitive detection of target sequences in multiplex reactions
US20200123604A1 (en) Dna sequencing
US20230323451A1 (en) Selective amplification of molecularly identifiable nucleic 5 acid sequences
WO2008127901A1 (en) Region-specific hyperbranched amplification
So Universal Sequence Tag Array (U-STAR) platform: strategies towards the development of a universal platform for the absolute quantification of gene expression on a global scale

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application
WWE WIPO information: entry into national phase

Ref document number: 11670588

Country of ref document: US

NENP Non-entry into the national phase

Ref country code: DE

WWP WIPO information: published in national office

Ref document number: 11670588

Country of ref document: US

122 EP: PCT application non-entry in European phase