WO2021083195A1 - Dna linker oligonucleotides - Google Patents

DNA linker oligonucleotides

Info

Publication number
WO2021083195A1
Authority
WO
WIPO (PCT)
Prior art keywords
sequence
strand
stapler
dna
linker oligonucleotides
Prior art date
Application number
PCT/CN2020/124338
Other languages
French (fr)
Inventor
Matthew Callow
Linsu CHEN
Snezana Drmanac
Radoje Drmanac
Original Assignee
Mgi Tech Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Mgi Tech Co., Ltd.
Priority to CN202080075206.4A (patent CN114641581A)
Publication of WO2021083195A1

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6806 Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/10 Processes for the isolation, preparation or purification of DNA or RNA
    • C12N15/1034 Isolating an individual clone by screening libraries
    • C12N15/1093 General methods of preparing gene libraries, not provided for in other subgroups

Definitions

  • This invention relates to the fields of DNA sequencing, genomics, and molecular biology.
  • DNBs: DNA nanoballs
  • Chemical cross-linking has been used to stabilize DNBs in these applications.
  • Current approaches can lower the overall signal intensity and inhibit second strand production. Improvements in methods of stabilizing DNBs are of value.
  • this disclosure provides a method of preparing a stabilized DNA template for nucleic acid analysis, comprising hybridizing a plurality of first strand linker oligonucleotides to the DNA template, wherein the DNA template is a single-stranded concatemer comprising a plurality of monomers, wherein each monomer comprises an adaptor sequence and a DNA target sequence, wherein each first strand linker oligonucleotide comprises a sequence that is complementary to and hybridizes to an adaptor sequence of the DNA template, and wherein at least two first strand linker oligonucleotides of the plurality of first strand linker oligonucleotides that are hybridized to the DNA template are linked to each other.
  • the sequence that is complementary to and hybridizes to the adaptor sequence of the DNA template may be a primer sequence that comprises an extendible 3’ end.
  • the method further comprises extending the at least two first strand linker oligonucleotides to generate at least two second strands by one or more DNA polymerases, wherein the at least two first strand linker oligonucleotides are linked to each other and are hybridized to the DNA template, thereby producing at least two second strands, the 5’ ends of which are linked.
  • the at least two first strand linker oligonucleotides are linked through DNA hybridization, a covalent bond, or both.
  • the at least two first strand linker oligonucleotides each comprises a stapler sequence, wherein the at least two first strand linker oligonucleotides are linked by hybridization of the respective stapler sequences.
  • the linker oligonucleotide comprises a cleavage site, wherein cleaving the linker oligonucleotide releases the stapler sequence.
  • the two stapler sequences are hybridized to different regions of a shared scaffold, thereby linking the at least two first strand linker oligonucleotides.
  • the stapler sequences of the at least two first strand linker oligonucleotides bind to each other after binding of the primer sequences to the DNA template.
  • the method comprises: 1) hybridizing blocker oligonucleotides to the stapler sequences of at least two first strand linker oligonucleotides in a reaction, to produce partially double stranded first strand linker oligonucleotides, each of which comprises a double-stranded region consisting of the blocker oligonucleotide and the stapler sequence, thereby preventing stapler sequences of the first strand linker oligonucleotides from hybridizing to each other, 2) adding DNA template to the reaction, with the first strand linker oligonucleotides, wherein the primer sequences of the partially double stranded first strand linker oligonucleotides bind to the DNA template, 3) disassociating the blocker oligonucleotides from the DNA template, 4) washing to remove the blocker oligonucleotide, and 5) lowering the temperature to allow the hybridization of the stapler sequences of the first strand linker
  • the disassociation is achieved by one or more of raising temperature of the reaction such that the blocker oligonucleotide is dissociated from the DNA template, enzymatic degradation of the blocker oligonucleotide, and chemical degradation of the blocker oligonucleotide.
  • the stapler sequences of the at least two first strand linker oligonucleotides bind to each other to form linked first strand linker oligonucleotides before binding of the primer sequences to the DNA template.
  • the method comprises using the linked first strand linker oligonucleotides at a concentration below a predetermined threshold such that two first strand linker oligonucleotides in the linked pair bind to the same DNB.
  • the stapler sequence is a palindromic stapler sequence. In some embodiments the palindromic stapler sequence is 5’ to the primer sequence on each of the at least two first strand linker oligonucleotides.
  • the at least two first strand linker oligonucleotides comprise two complementary, non-palindromic stapler sequences, one on each first strand linker oligonucleotide; and wherein at least two first strand linker oligonucleotides are linked through hybridization of the two complementary, non-palindromic stapler sequences.
  • the at least two first strand linker oligonucleotides each comprises a non-palindromic stapler sequence and a palindromic stapler sequence.
  • the palindromic linker is interposed between the non-palindromic stapler sequence and the primer sequence on each of the at least two first strand linker oligonucleotides.
  • the at least two first strand linker oligonucleotides each comprise a stapler sequence interposed between two primer sequences, wherein the stapler sequences on the at least two first strand linker oligonucleotides are hybridized to each other.
  • the length of the stapler sequences is in the range from 8 to 50 nucleotides. In some embodiments, the length of the primer sequence is from 15 to 70 nucleotides.
  • a method of preparing a stabilized DNA template for nucleic acid analysis comprising hybridizing a plurality of first strand linker oligonucleotides to the DNA template, wherein the DNA template is a single-stranded concatemer comprising a plurality of monomers, wherein each monomer comprises an adaptor sequence and a DNA target sequence, wherein each first strand linker oligonucleotide comprises a primer sequence that is complementary to and hybridizes to an adaptor sequence of the DNA template, wherein the at least one first strand linker oligonucleotide comprises a blocking group at the 3’ of the primer sequence to prevent extension, and wherein the at least two first strand linker oligonucleotides of the plurality of first strand linker oligonucleotides that are hybridized to the DNA template are linked to each other.
  • the blocking group is a reversible blocking group, wherein the method further comprises removing the blocking group from the at least one first strand linker oligonucleotide, and extending the at least one first strand linker oligonucleotide to generate at least one second strand.
  • the first strand linker oligonucleotides can be cleaved at a site that is in the stapler sequence or in the primer sequence.
  • the method further comprises removing unbound first strand linker oligonucleotides after the hybridizing step and/or heating a reaction mixture comprising the DNA template and first strand linker oligonucleotides, e.g., to 50-65 °C.
  • the method further comprises: 1) extending the at least two first strand linker oligonucleotides of the plurality of first strand linker oligonucleotides by a non-displacement DNA polymerase to generate at least two, partially extended second strands, including a partially extended upstream second strand and a partially extended downstream second strand, wherein the at least two partially extended second strands are fully hybridized to the DNA template, and 2) extending the fully hybridized upstream and downstream second strands with a strand displacement DNA polymerase, wherein extending the upstream second strand partially displaces the downstream second strand, and thereby producing a partially hybridized downstream second strand.
  • a DNA complex comprising a DNA template and a plurality of first strand linker oligonucleotides
  • the DNA template is a single-stranded concatemer comprising a plurality of monomers, wherein each monomer comprises an adaptor sequence and a DNA target sequence, wherein each first strand linker oligonucleotide comprises a primer sequence that is complementary to and hybridizes to an adaptor sequence of the DNA template, wherein the primer sequence includes an extendible 3’ end, and wherein at least two first strand linker oligonucleotides of the plurality of first strand linker oligonucleotides that are hybridized to the DNA template are linked to each other.
  • a DNA complex comprising a DNA template, two or more second strands, wherein each second strand comprises an overhang region and a hybridized region that is hybridized to the DNA template, wherein the two or more second strands are complementary to the DNA template, and wherein at least two second strands are linked at their respective 5’ end.
  • the DNA complex further comprises two or more second strand linker oligonucleotides that are hybridized to two or more second strands, each second strand linker oligonucleotide comprising a second stapler sequence and at least two second strand linker oligonucleotides are linked through the hybridization of the respective second stapler sequences.
  • Also provided herein is a DNA array comprising a plurality of any of the DNA complexes disclosed herein.
  • linker oligonucleotides, each linker oligonucleotide comprising a stapler sequence and a primer sequence, and the stapler sequence is 5’ to the primer sequence, wherein the stapler sequences on the two linker oligonucleotides are complementary to each other, and wherein the two linker oligonucleotides are hybridized to each other via respective stapler sequences.
  • the stapler sequences of the two linker oligonucleotides are palindromic sequences.
  • the primer sequences of both linker oligonucleotides are the same.
  • each linker oligonucleotide comprises an additional, non-palindromic stapler sequence, and the additional non-palindromic stapler sequence is located 5’ to the stapler sequence.
  • the additional non-palindromic stapler sequence in one of the two linker oligonucleotides is hybridized to a stapler sequence in a third linker oligonucleotide.
  • the at least one linker oligonucleotide has a sequence that is selected from the group consisting of SEQ ID NO: 1-10.
  • the disclosure provides a method of preparing a DNA template for nucleic acid analysis comprising immobilizing the DNA template on an array, wherein the DNA template is a DNA concatemer comprising a plurality of monomers, and each monomer comprises an adaptor sequence and a DNA target sequence.
  • a plurality of first strand linker oligonucleotides are hybridized to the DNA template, and each first strand linker oligonucleotide comprises a sequence that is complementary to and hybridizes to an adaptor sequence of the DNA template.
  • At least two first strand linker oligonucleotides of the plurality of first strand linker oligonucleotides that are hybridized to the DNA template are linked to each other.
  • FIG. 1A and 1B illustrate DNB subunits using “Z-linkers” staplers as primers.
  • FIG. 2 illustrates DNB subunit using “X-linkers” and possible linker structures.
  • FIG. 3 illustrates the effect of addition of an X-linker on signal intensity over multiple sequencing cycles.
  • FIG. 4A and 4B illustrate the effect of an X-linker on mapping rate and error rates.
  • FIG. 5A and 5B illustrate the effect of a Z-linker on mapping rate and error rate between the first 50 bases and second 50 bases of a read.
  • FIG. 6A-6E show various linker configurations formed by linker oligonucleotides disclosed herein through hybridization.
  • B and b are stapler sequences that are complementary to each other.
  • “bB” represents a palindromic stapler sequence.
  • A represents a primer sequence, which can hybridize to the DNA template.
  • FIG. 7A-7B illustrate additional linking configurations.
  • A represents a primer sequence.
  • B, C, D, and E represent stapler sequences, which may have the same or different sequence.
  • FIG. 7A represents a linear scaffold that links multiple linker oligonucleotides.
  • FIG. 7B shows a circular scaffold that links multiple linker oligonucleotides.
  • FIG. 7C shows an embodiment where two linker oligonucleotides are linked via a chemical bond.
  • the linker oligonucleotides described in this disclosure can be used to hybridize to long nucleic acid molecules in a predictable fashion to crosslink different regions of the long nucleic acid molecules. In some cases, crosslinking brings spatially distant regions in the nucleic acids into pre-defined shapes and structures.
  • the nucleic acid is a DNA concatemer comprising a plurality of monomers
  • the linker oligonucleotides each comprise a primer sequence that binds to a different monomer of the concatemer. Linking primers eliminates the need to incorporate binding sites for non-primer linking oligonucleotides into the adaptor, which allows the adaptors to be relatively short.
  • Shorter adaptors have multiple advantages, such as providing DNBs with a higher number of adaptor copies for a given DNB size.
  • Linked primers also allow most denatured primers to rehybridize without being washed away. At least two of the linker oligonucleotides are connected such that different monomers of the concatemer are also connected. In addition to primer rehybridization, the resulting cage most likely prevents mechanical breaking and removal of segments of the DNB, but not the loss of smaller segments of the DNB due to other types of DNA cuts.
  • the length of the linker, in both its dsDNA part and its ssDNA part, may be varied to maximize signal preservation benefits under different sequencing conditions (e.g., cleavage chemistry, reaction temperature, time and pH, polymerase properties and others).
  • Linking monomers in a DNA concatemer can stabilize its structure. Sequencing reactions may result in cuts to the DNA concatemers and loss of segments thereof. As more sequencing cycles occur, more cuts to the DNA concatemer accumulate and more DNA segments are lost. By linking two or more monomers, the probability that two nicks/cuts to the DNA concatemer result in DNA loss is reduced, thereby minimizing DNB mass loss. Furthermore, DNBs with linked primers may have fewer mechanical cuts due to the imposed structure. With linked oligonucleotides, typically 4 or more specific cuts are needed to lose a segment of a DNB. In general, the more linking is provided, the more cuts are needed to lose a segment of the DNB.
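The stabilizing effect of linking can be illustrated with a toy connectivity model. This is only a sketch under assumptions that are not part of the disclosure: monomers are treated as nodes in a chain, a cut removes one backbone bond, a link joins two monomers, and the two end monomers are assumed to stay attached to the surface. The function name and the example numbers are hypothetical.

```python
from collections import deque

def lost_monomers(n, cut_bonds, links, anchored):
    """Return the sorted monomer indices no longer connected to an anchor.

    n         -- number of monomers, indexed 0..n-1
    cut_bonds -- set of i meaning the backbone bond between i and i+1 is cut
    links     -- (i, j) monomer pairs held together by linked oligonucleotides
    anchored  -- monomers assumed to stay fixed (a modelling assumption)
    """
    neighbours = {i: set() for i in range(n)}
    for i in range(n - 1):
        if i not in cut_bonds:          # intact backbone bond
            neighbours[i].add(i + 1)
            neighbours[i + 1].add(i)
    for i, j in links:                  # tether provided by a linker pair
        neighbours[i].add(j)
        neighbours[j].add(i)

    # Breadth-first search outward from the anchored monomers.
    reached = set(anchored)
    queue = deque(anchored)
    while queue:
        node = queue.popleft()
        for nxt in neighbours[node]:
            if nxt not in reached:
                reached.add(nxt)
                queue.append(nxt)
    return sorted(set(range(n)) - reached)

# Two cuts release monomers 4-6 from an unlinked chain ...
print(lost_monomers(10, {3, 6}, [], anchored=[0, 9]))        # [4, 5, 6]
# ... but a single link between monomers 5 and 8 keeps them tethered.
print(lost_monomers(10, {3, 6}, [(5, 8)], anchored=[0, 9]))  # []
```

In this simplified picture, every added link is another path that must be severed before a segment detaches, which is the intuition behind needing more cuts when more linking is present.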
  • linked primers are combined with reduced DNA nicking/cutting by the reagents used (adjusting temperature, time, and concentration; avoiding impurities; using specially pure enzymes without endonuclease activities; keeping microbial contamination close to zero), assuring fewer than 4 cuts per 10, 29, 30, 50, 100 or more sequencing cycles.
  • linking two or more subunits also reduces the volume, allowing a higher number of DNA concatemers to be deposited on the substrate.
  • linking may also offer other advantages such as protecting the DNBs from degradation.
  • using linker oligonucleotides as a means of crosslinking does not inhibit production of the reverse complement or primer extension.
  • the linker oligonucleotides comprise one or more primer sequences, which can be extended to form reverse complement strands.
  • the connection of the linker oligonucleotides results in second strands that are also linked at the 5’ ends by virtue of primer sequences becoming the 5’ terminal sequences of the second strand.
  • the linking of the second strands further stabilizes the DNA template (e.g., a single-stranded DNA concatemer) .
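Why the linkage carries over to the second strands can be seen in a small string model of primer extension. This is a sketch with assumptions: sequences are plain A/C/G/T strings written 5'→3', the primer binds by exact reverse-complement match, and extension runs to the 5' end of the template. The function names, the 6-base primer, and the toy template are hypothetical; only the 14-base stapler is taken from SEQ ID NO: 8.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(seq):
    """Return the reverse complement of an A/C/G/T sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def extend_linker(linker, primer_len, template):
    """Extend the 3' primer of `linker` along `template` (both 5'->3').

    The last `primer_len` bases of the linker are the primer. The new
    strand copies the template bases lying 5' of the binding site.
    """
    primer = linker[-primer_len:]
    site = template.find(reverse_complement(primer))
    if site < 0:
        raise ValueError("primer does not bind the template")
    return linker + reverse_complement(template[:site])

stapler = "GGAACCATGGTTCC"   # SEQ ID NO: 8
primer = "AAGTCG"            # hypothetical short primer
template = "GATTACA" + reverse_complement(primer) + "GG"  # toy template

second_strand = extend_linker(stapler + primer, len(primer), template)
print(second_strand)                      # GGAACCATGGTTCCAAGTCGTGTAATC
print(second_strand.startswith(stapler))  # True
```

Because new bases are only added at the 3' end, the stapler and primer remain the 5' terminal sequence of the second strand, so two second strands made from a linked pair stay joined at their 5' ends.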
  • compositions and methods described herein thus can be used to stabilize the DNA template and minimize structural loss of DNB mass. Sequencing technologies using these compositions and methods extend the read length (number of sequencing cycles) , reduce error rates and increase mapping rates during sequencing.
  • a DNA template used in the invention is a DNA concatemer.
  • the term “concatemer” refers to a continuous DNA molecule that contains multiple copies of the same DNA sequences (the “monomer” or “subunit” linked in series) .
  • a “DNA concatemer” may comprise at least two, at least three, at least four, at least 10, at least 25 monomers, at least 50 monomers, at least 200 monomers, or at least 500 monomers.
  • the DNA concatemer comprises 25-1000 monomers, such as 50-800 monomers or 300-600 monomers.
  • a DNA concatemer used in the methods of the invention may be a DNA nanoball, or “DNB.”
  • DNA nanoballs are described in Drmanac et al. “Methods and compositions for long fragment read sequencing” U.S. Patent No. 8,592,150 (Nov. 26, 2013), the entire content of which is incorporated herein by reference.
  • “DNA nanoballs” or “DNBs” are single-stranded DNA concatemers of sufficient length to form random coils that fill a roughly spherical volume in solution (e.g., SSC buffer at room temperature) .
  • DNA nanoballs typically have a diameter of from about 100 to 300 nm.
  • each monomer comprises at least one target DNA sequence.
  • DNA nanoballs are single-stranded copies of a DNA sequence that are concatemerized into a linear DNA structure.
  • DNBs are produced by copying a single-stranded circular DNA using a strand displacement polymerase, such as phi29 polymerase or Bst polymerase, in a process called rolling circle replication.
  • the polymerase begins extension of a primer that is hybridized to the single-stranded circle and creates a reverse complementary strand that is hybridized to the circle. Once one complete extension around the circle is made, the polymerase continues extension by displacing the newly made strand ahead of the direction of travel. As the polymerase continues to extend the strand around the circle, multiple, reverse complement copies are created, joined in a linear fashion to each other. This strategy creates a target with many probe or primer binding sites.
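The rolling circle process can be sketched as a simple string operation. The sketch assumes an idealized polymerase that makes whole laps and a start point chosen so that each lap reads as the reverse complement of the circle as written (a different start point would only rotate the monomer). The function names and the toy circle are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(seq):
    """Return the reverse complement of an A/C/G/T sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def rolling_circle_product(circle, laps):
    """Return the single-stranded concatemer made after `laps` full trips
    around a circular template written 5'->3' as `circle`.

    Each lap adds one reverse-complement copy of the circle, so the
    product is that copy repeated head to tail.
    """
    if laps < 1:
        raise ValueError("laps must be at least 1")
    return reverse_complement(circle) * laps

circle = "ACGTTGCA" + "GGATTC"   # hypothetical adaptor + target
concatemer = rolling_circle_product(circle, laps=3)
print(len(concatemer))           # 42
```

Every copy in the product carries the same adaptor-derived site, which is why the concatemer offers many binding sites for probes or primers.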
  • DNA concatemers can be produced by any suitable method.
  • a single genomic fragment is used to generate a single-stranded circular DNA with adaptors interspersed between target sequences that are contiguous or close together in the genome.
  • a monomer of the concatemer comprises one adaptor sequence and one target DNA sequence. Because monomers are linked in series, target DNA sequences will be flanked by two adaptor sequences. In some approaches, the target DNA sequence in the monomer is flanked by two “half-adaptor” sequences, such that each target sequence linked in series in the concatemer is flanked by two adaptors. In some approaches, the monomeric unit comprises one, two, three, four, or more adaptors. In some embodiments, all of the adaptors of a monomer (and concatemer) have the same sequence. In other embodiments, adaptors may have different sequences, such as two, three or four different sequences. It will be recognized that individual monomers may comprise more than one DNA template sequence.
  • a monomer may comprise the structure A1-T1-A2-T2 where T1 and T2 are DNA templates with the same or different sequences, and A1 and A2 are adaptors with the same or different sequence.
  • Various configurations of the DNA concatemer that can be used are disclosed in U.S. Pat. No. 10,227,647, the relevant disclosure of which is incorporated by reference in its entirety.
  • the corresponding concatemer will have the structure A1-T1-A2-T2-A1-T1-A2-T2-A1-T1-A2-T2....
  • the monomer may comprise the structure A1-T1-A2-T2-A3 where T1 and T2 are DNA templates with the same or different sequences, A2 is an adaptor and A1 and A3 are “half adaptors.”
  • the corresponding concatemer will include the structure A2-T2-A3A1-T1-A2-T2-A3A1-T1-A2-T2-A3A1... where the A3A1 half adaptors together function as an adaptor.
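The two layouts can be generated mechanically from a monomer written as a list of element labels. This is a minimal sketch; the function names are hypothetical, and the labels simply follow the A/T notation used in the text.

```python
from typing import List

def concatemer_layout(monomer: List[str], n: int) -> str:
    """Join n copies of a monomer, separating every element with '-'."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return "-".join(monomer * n)

def half_adaptor_layout(monomer: List[str], n: int) -> str:
    """Join n copies of a monomer whose first and last elements are half
    adaptors; at each junction the two halves are written together
    (for example A3A1) to show that they act as one adaptor."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return "".join(["-".join(monomer)] * n)

print(concatemer_layout(["A1", "T1", "A2", "T2"], 3))
# A1-T1-A2-T2-A1-T1-A2-T2-A1-T1-A2-T2

print(half_adaptor_layout(["A1", "T1", "A2", "T2", "A3"], 3))
# A1-T1-A2-T2-A3A1-T1-A2-T2-A3A1-T1-A2-T2-A3
```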
  • TABLE 1 illustrates exemplary concatemer structures. In TABLE 1, N is greater than 1.
  • N is at least 3 (at least 3 monomers) , often at least 4, at least 10, at least 25, at least 50, at least 200, or at least 500.
  • N is in the range of 25-1000, such as 50-800, or 300-600.
  • N is at least 25, usually at least 50, and often in the range 50-800, or 300-600.
  • the DNA template (e.g., a concatemer) may comprise a target DNA.
  • the target DNA may be from any source, including naturally occurring sequences (such as genomic DNA, cDNA, mitochondrial DNA, cell free DNA, etc. ) , artificial sequences (e.g., synthetic sequences, products of gene shuffling or molecular evolution, etc. ) or combinations thereof.
  • Target DNA may be derived from sources such as an organism or cell (e.g., from plants, animals, viruses, bacteria, fungi, humans, mammals, insects) , forensic sources, etc.
  • Target DNA sequences may be from a population of organisms, such as a population of gut bacteria.
  • a target DNA sequence may be obtained directly from a sample, or may be a product of an amplification reaction, a fragmentation reaction, and the like.
  • a target DNA may have a length within a particular size range, such as 50 to 600 nucleotides in length.
  • Other exemplary size ranges include 25 to 2000, 50 to 1000, 100 to 600, 50-100, 50-300, 100-300, and 100-400 nucleotides in length.
  • the target DNAs may be the same length or different lengths.
  • the members of the library may have, in some embodiments, similar lengths (e.g., all in the range of 25 to 2000 nucleotides, or another range) .
  • target DNAs may be prepared by fragmenting a larger source DNA (e.g., genomic DNA) to produce fragments in a desired size range.
  • a size-selection step is used to obtain a pool of fragments within a particular size range.
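In silico, a size-selection step amounts to a length filter. The sketch below uses the 50 to 600 nucleotide range mentioned above as its default; the function name and the toy fragment pool are hypothetical.

```python
def size_select(fragments, min_len=50, max_len=600):
    """Keep fragments whose length lies within [min_len, max_len] nucleotides."""
    if min_len > max_len:
        raise ValueError("min_len must not exceed max_len")
    return [frag for frag in fragments if min_len <= len(frag) <= max_len]

# Hypothetical fragments of 30, 50, 320, 600 and 601 nucleotides.
pool = ["A" * 30, "C" * 50, "G" * 320, "T" * 600, "A" * 601]
selected = size_select(pool)
print([len(frag) for frag in selected])   # [50, 320, 600]
```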
  • an adaptor sequence refers to the nucleic acid sequence of an adaptor.
  • Adaptors may comprise elements for immobilizing DNA template polynucleotides on a substrate, elements for binding oligonucleotides used in sequence determination (e.g., binding sites for primers extended in sequencing by synthesis methods and/or probes for cPAL or other ligation based sequencing methods, and the like) , or both elements for immobilization and sequencing.
  • Adaptors may include additional features such as, without limitation, restriction endonuclease recognition sites, extension primer hybridization sites (for use in analysis) , bar code sequences, unique molecular identifier sequences, and polymerase recognition sequences.
  • Adaptor sequences may have a length, structure, and other properties appropriate for a particular sequencing platform and intended use.
  • adaptors may be single-stranded, double-stranded, or partially-double stranded, and may be of a length suitable for the intended use.
  • adaptors may have length in the range of 10-200 nucleotides, 20-100 nucleotides, 40-100 nucleotides, or 50-80 nucleotides.
  • an adaptor may comprise one or more modified nucleotides that contain modifications to the base, sugar, and/or phosphate moieties.
  • An individual adaptor sequence may include multiple functionally distinct subsequences.
  • a single adaptor sequence may contain two or more primer stapler sequences (which can be recognized by different complementary primers or probes) .
  • Functionally distinct sequences within an adaptor may be overlapping or non-overlapping.
  • bases 1-20 are a first primer binding site and bases 21-40 are a second primer binding site.
  • bases 1-15 are a first primer binding site and bases 21-40 are a second primer binding site.
  • bases 5-25 are a first primer binding site and bases 15-35 are a second primer binding site.
  • bases 1-20 can be an immobilization sequence and bases 21-40 can be a primer binding site.
  • Different primer stapler sequences in an adaptor may have the same or different lengths.
  • Adaptors may comprise one, two or more than two primer stapler sequences.
  • a primer stapler sequence is defined functionally as the site or sequence to which a primer (or oligonucleotide) specifically binds.
  • an adaptor with two primer stapler sequences may be specifically bound by two different primers.
  • the two primer stapler sequences in the same adaptor are overlapping, i.e., sharing part of the nucleotide sequence.
  • the overlapped region is no more than 50%, or 40%, or 30%, or 20%, or 10% or 5% of either of the two overlapping primer stapler sequences.
  • the more than one primer stapler sequences are non-overlapping.
  • the non-overlapping primer stapler sequences are immediately adjacent to each other; in some other embodiments, the non-overlapping primer stapler sequences are separated by 1-10, 10-20, 30-40, or 40-50 nucleotides.
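The adjacent, separated, and overlapping arrangements of binding sites can be checked with simple interval arithmetic. The sketch treats each site as an inclusive, 1-based (start, end) pair of base positions within the adaptor, matching the "bases 1-20" style used above; the function names are hypothetical.

```python
def site_length(site):
    """Length of an inclusive, 1-based (start, end) site."""
    return site[1] - site[0] + 1

def overlap_length(site_a, site_b):
    """Number of bases shared by two sites (0 if they do not overlap)."""
    start = max(site_a[0], site_b[0])
    end = min(site_a[1], site_b[1])
    return max(0, end - start + 1)

def overlap_fractions(site_a, site_b):
    """Shared bases as a fraction of each site, in the order given."""
    shared = overlap_length(site_a, site_b)
    return shared / site_length(site_a), shared / site_length(site_b)

def gap_between(site_a, site_b):
    """Bases separating two non-overlapping sites (0 means adjacent)."""
    if overlap_length(site_a, site_b) > 0:
        raise ValueError("sites overlap")
    first, second = sorted([site_a, site_b])
    return second[0] - first[1] - 1

print(gap_between((1, 20), (21, 40)))      # 0  (immediately adjacent)
print(gap_between((1, 15), (21, 40)))      # 5  (separated by 5 bases)
print(overlap_length((5, 25), (15, 35)))   # 11 (bases 15-25 are shared)
```

For the third case each site is 21 bases long, so `overlap_fractions((5, 25), (15, 35))` gives 11/21 for both sites, slightly above a 50% limit.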
  • a “linker oligonucleotide” is an oligonucleotide that can be linked to another oligonucleotide via covalent or non-covalent means.
  • a linker oligonucleotide is a “first strand linker oligonucleotide, ” which hybridizes to the first strand.
  • a linker oligonucleotide is a “second strand linker oligonucleotide, ” which hybridizes to the second strand, as further described below.
  • a linker oligonucleotide may comprise a primer sequence having an extendible 3’ end, said primer sequence being complementary to an adaptor sequence of the DNA template and being able to hybridize to the DNA template. At least two linker oligonucleotides can be linked.
  • the term “link, ” refers to both non-covalent and covalent interactions through which the two nucleic acid molecules are brought together. In some cases the linkage is through DNA hybridization, a chemical bond, or both.
  • These linker oligonucleotides can link contiguous or non-contiguous monomers of a DNA concatemer, thereby stabilizing the DNA concatemer. Exemplary linker oligonucleotides are shown in FIG. 6A-6E and also in Table 2 below, in 5’ to 3’ direction. b and B represent stapler sequences and A represents a primer sequence.
  • FIG. 6A-6E illustrate various linker configurations the above linker oligonucleotides can form through hybridization.
  • B and b are stapler sequences that are complementary to each other.
  • bB represents a palindromic stapler sequence.
  • A represents a primer sequence, which can hybridize to the DNA template.
  • D and d are complementary to each other.
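The segment notation can be made concrete by assembling linkers from their parts. In this sketch B is taken as the 3' half of SEQ ID NO: 8 and b as its reverse complement, so that bB reproduces SEQ ID NO: 8; the primer A, the function names, and the requirement of a full-length exact duplex are illustrative assumptions.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(seq):
    """Return the reverse complement of an A/C/G/T sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def build_linker(stapler, primer):
    """Write a linker 5'->3' with the stapler located 5' to the primer."""
    return stapler + primer

def staplers_pair(stapler_1, stapler_2):
    """True if the two staplers can form a full-length antiparallel duplex."""
    return stapler_1 == reverse_complement(stapler_2)

B = "TGGTTCC"               # 3' half of SEQ ID NO: 8
b = reverse_complement(B)   # "GGAACCA", the 5' half of SEQ ID NO: 8
A = "CTGACTTGAGCATCGA"      # hypothetical primer, not from the disclosure

linker_bA = build_linker(b, A)        # non-palindromic stapler b
linker_BA = build_linker(B, A)        # complementary stapler B
linker_bBA = build_linker(b + B, A)   # palindromic stapler bB

print(staplers_pair(b, B))            # True: b-A pairs with B-A
print(staplers_pair(b + B, b + B))    # True: bB-A pairs with a second copy of itself
print(staplers_pair(b, b))            # False: b-A cannot pair with another b-A
```

A palindromic stapler therefore needs only one oligonucleotide species to form linked pairs, whereas complementary non-palindromic staplers need two.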
  • FIG. 7A illustrates additional embodiments of the linking configuration.
  • A represents a primer sequence and B, C, D, and E represent stapler sequences.
  • B, C, D, and E may have the same or different sequences.
  • FIG. 7A shows a linear scaffold that links multiple linker oligonucleotides.
  • FIG. 7B shows a circular scaffold that links multiple linker oligonucleotides.
  • FIG. 7C shows an embodiment where two linker oligonucleotides are linked via a chemical bond (i.e., a covalent bond) .
  • the linker oligonucleotides disclosed herein comprise at least one primer sequence and at least one stapler sequence. In some embodiments, at least one primer sequence is located 3’ to the stapler sequence. In some embodiments, the linker oligonucleotide hybridizes to a single-stranded DNA and the primer sequence of the linker oligonucleotide comprises an extendible 3’ end, from which a DNA strand that is reverse complementary to the single-stranded DNA is made.
  • a DNA strand generated by extending a first strand linker oligonucleotide is referred to as a second strand; and the DNA strand generated by extending a second strand linker oligonucleotide is referred to as a third strand.
  • the number of primer sequences per each linker oligonucleotide may vary.
  • the linker oligonucleotide comprises one primer sequence.
  • the oligonucleotide comprises two primer sequences.
  • the primer sequence of the linker oligonucleotide will be of sufficient length to allow hybridization of a primer, with the precise length and sequence dependent on the intended functions of the primer (e.g., extension primer, ligation substrate, indexing sequence, etc. ) .
  • the primer sequence that binds to the DNB would generally be stable under all conditions of temperature, salt and pH that are used throughout the sequencing run. Dissociation and re-association of an individual oligo or region thereof is generally not desirable because of the possibility of creating out of phase reads or loss of the extending primer.
  • the length and Tm of the primer sequence of the oligonucleotide should be sufficient such that the linker oligonucleotide remains hybridized under temperature, salt and pH conditions used throughout the assay process.
  • Primer sequences are often at least 10, at least 12, at least 15 or at least 18 bases in length.
  • the primer sequence has a length that ranges from 8 to 60 nucleotides, e.g., from 10 to 25 nucleotides, or from 40 to 60 nucleotides long.
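A rough feel for how length and base composition affect primer Tm can be had from two textbook rules of thumb. Neither formula comes from the disclosure: the Wallace rule (2 °C per A/T, 4 °C per G/C) is used here for primers shorter than 14 bases, and a GC-content formula for longer ones. Both ignore salt, oligonucleotide concentration and base modifications, so the values are only indicative; nearest-neighbor models would be more accurate. The function name and test primers are hypothetical.

```python
def basic_tm(primer):
    """Approximate melting temperature (deg C) of an A/C/G/T primer."""
    primer = primer.upper()
    if not primer or set(primer) - set("ACGT"):
        raise ValueError("primer must be a non-empty A/C/G/T string")
    n = len(primer)
    gc = primer.count("G") + primer.count("C")
    at = n - gc
    if n < 14:
        # Wallace rule for short oligonucleotides.
        return 2.0 * at + 4.0 * gc
    # GC-content rule of thumb for longer oligonucleotides.
    return 64.9 + 41.0 * (gc - 16.4) / n

short_primer = "ACGTACGTAC"          # hypothetical 10-mer
long_primer = "AAGTCGGAGGCCAAGCGG"   # hypothetical 18-mer
print(basic_tm(short_primer))        # 30.0
print(round(basic_tm(long_primer), 1))
```

An estimate of this kind can be compared against the highest temperature used in the assay to judge whether a candidate primer is likely to remain hybridized.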
  • the linker oligonucleotides are connected via various means. In some embodiments, they are connected via chemical linkage. In some cases, the chemical linkage used to connect the linker oligonucleotides described herein is non-specific, formed by using chemicals such as nitrogen mustards or chloroethylnitrosourea (CENU) derivatives. In some embodiments, chemical linkage used for the method and compositions disclosed herein is targeted crosslinking of oligonucleotides, which could be achieved with, e.g., thionucleobases (Beilstein J. Org. Chem. 2014, 10, 2293–2306).
  • any modifications that allow attachment of oligonucleotides to surfaces could be modified to allow oligonucleotide to oligonucleotide attachment, and these modifications can be used to connect the linker oligonucleotides.
  • the NHS (N-hydroxysuccinimide) group can react with an amine group of a second molecule (also allows for crosslinking to protein) .
  • Click chemistry, such as a conjugation between an azide-modified oligo and an alkyne-modified oligo, could also be used to conjugate two oligonucleotides together (Acc. Chem. Res. 2012, 45, 8, 1258-1267).
  • the linker oligonucleotides are connected via DNA hybridization. In some embodiments, the linker oligonucleotides are connected via hybridization of a stapler sequence located on each of the linker oligonucleotides.
  • a stapler sequence ( “A” ) in linker oligonucleotide, seq 1 is complementary to and can hybridize to the stapler sequence ( “a” ) in the other linker oligonucleotide, seq 2.
  • the stapler sequence is a palindromic sequence. In some embodiments, the stapler sequence is a non-palindromic sequence.
  • a palindromic sequence refers to a sequence, one half of which is complementary to the other half of the sequence.
  • a linker oligonucleotide in Table 2 comprising a sequence “b” that is followed by a sequence of “B” , wherein b is complementary to B.
  • One exemplary palindromic sequence is GGAACCATGGTTCC (SEQ ID NO: 8) .
  • a linker oligonucleotide with a palindromic sequence could form a hairpin with internal complementarity (i.e., form an intramolecular hairpin) or could be complementary to the stapler sequence on another linker oligonucleotide that has an identical sequence (i.e., forming an intermolecular hybrid).
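The palindrome definition above (one half of the sequence is complementary to the other half) is equivalent to the sequence being equal to its own reverse complement. The following is a minimal sketch of that check, applied to SEQ ID NO: 8; the function names are illustrative and do not come from the source.

```python
# Watson-Crick partners for the four DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def is_palindromic(seq: str) -> bool:
    """True when the sequence equals its own reverse complement."""
    return seq.upper() == reverse_complement(seq)


STAPLER = "GGAACCATGGTTCC"  # SEQ ID NO: 8, as given in the text

if __name__ == "__main__":
    print(is_palindromic(STAPLER))  # True
```

Because the stapler is its own reverse complement, two copies of the same oligonucleotide can pair with each other (intermolecular hybrid), and a single copy can also fold back on itself (intramolecular hairpin).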
  • the linker oligonucleotide pair are connected via the palindromic sequences. See FIG.
  • a linker oligonucleotide having a sequence of 5’-GGAACCATGGTTCCAAGTCGGAGGCCAAGCGGTCTUAGGA-3’ (SEQ ID NO: 1) (belonging to the category of Linker oligonucleotide 3 or seq 3) comprises a palindromic sequence GGAACCATGGTTCC (SEQ ID NO: 8).
  • SEQ ID NO: 1 further comprises a sequence of AAGTCGGAGGCCAAGCGGTCTUAGGA (SEQ ID NO: 10) , which could act as a primer sequence recognizing a portion of the DNB adapter.
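As a quick consistency check on the sequences listed above, SEQ ID NO: 1 can be split into its 5’ palindromic stapler (SEQ ID NO: 8) and its 3’ primer sequence (SEQ ID NO: 10). The sketch below uses only the sequences given in the text; the helper name `split_linker` is an illustrative assumption.

```python
SEQ_ID_1 = "GGAACCATGGTTCCAAGTCGGAGGCCAAGCGGTCTUAGGA"  # full linker oligonucleotide (seq 3)
SEQ_ID_8 = "GGAACCATGGTTCC"                            # palindromic stapler sequence
SEQ_ID_10 = "AAGTCGGAGGCCAAGCGGTCTUAGGA"               # primer sequence containing one dU


def split_linker(linker: str, stapler: str) -> tuple[str, str]:
    """Split a linker whose stapler sits at the 5' end into (stapler, primer)."""
    if not linker.startswith(stapler):
        raise ValueError("stapler is not at the 5' end of the linker")
    return stapler, linker[len(stapler):]


if __name__ == "__main__":
    stapler, primer = split_linker(SEQ_ID_1, SEQ_ID_8)
    print(len(stapler), len(primer))  # 14 26
    print(primer == SEQ_ID_10)        # True
```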
  • the palindromic sequence in seq 3 would be self-complementary under appropriate conditions and could form a hairpin with internal complementarity; alternatively the palindromic sequence could be complementary to a second oligonucleotide of the same sequence.
  • at higher temperatures (for example, 50 °C to 65 °C), the internal hairpin structure is unstable and the longer intermolecular hybrid will remain hybridized.
  • raising temperature would favor the formation of intermolecular hybrid (desired for stabilizing DNA templates used in various sequencing applications) over internal hairpin.
  • linker oligonucleotides comprising palindromic stapler sequences are hybridized to a DNA template on a solid support. The hybridization is performed at the temperature of 10-30 °C. The solid support is then washed to remove unbound primers and then temperature is raised to 50 to 65 °C to reduce the formation of the intramolecular hybrid.
  • the length of the stapler sequence may vary.
  • the length of the stapler sequence is chosen such that the Tm of the stapler sequence is between 50 °C and 72 °C. This ensures that the linker oligonucleotides can remain hybridized throughout the assay procedure.
  • the length of the stapler sequence may range from 20 to 150 nucleotides, e.g., from 40 to 120 nucleotides, from 50 to 100 nucleotides.
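To screen candidate stapler sequences against the 50 °C to 72 °C window, a rough melting-temperature estimate can be computed from base composition. The sketch below uses two common rules of thumb (the Wallace rule for short oligos and a GC-content formula for longer ones). These are assumptions introduced here for illustration, not the method of the source; they ignore salt and oligo concentration, so a nearest-neighbor model would be preferable for real design work.

```python
def wallace_tm(seq: str) -> float:
    """Wallace rule of thumb, Tm = 2(A+T) + 4(G+C), for short oligos."""
    s = seq.upper()
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    return 2.0 * at + 4.0 * gc


def gc_content_tm(seq: str) -> float:
    """Basic GC-content estimate, Tm = 64.9 + 41 * (G + C - 16.4) / N."""
    s = seq.upper()
    gc = s.count("G") + s.count("C")
    return 64.9 + 41.0 * (gc - 16.4) / len(s)


def estimate_tm(seq: str) -> float:
    """Pick a rule by length; the 14 nt cutoff is a common convention."""
    return wallace_tm(seq) if len(seq) < 14 else gc_content_tm(seq)


def in_target_window(tm_c: float, low_c: float = 50.0, high_c: float = 72.0) -> bool:
    """Check a Tm estimate against the 50-72 degC window stated in the text."""
    return low_c <= tm_c <= high_c


if __name__ == "__main__":
    candidate = "GGAACCATGGTTCC" * 2  # hypothetical 28 nt stapler, for illustration only
    tm = estimate_tm(candidate)
    print(round(tm, 1), in_target_window(tm))
```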
  • the relative position of the stapler sequence and the template hybridizing sequence (e.g., a primer sequence) in the linker oligonucleotide may vary.
  • the stapler sequence is 5’relative to the primer sequence (e.g., Linker oligonucleotides 1-3 in Table 2) .
  • the stapler sequence is interposed between the two template hybridizing sequences (e.g., the two primer sequences) in the linker oligonucleotide, and the stapler sequence is 3’ relative to a first primer sequence and 5’ relative to a second primer sequence.
  • Illustrative examples are shown in Table 2 (e.g., linker oligonucleotides 4-6) and in FIG. 6C and 6D.
  • a scaffold refers to a molecular structure through which individual oligonucleotides are associated with one another.
  • two or more linker oligonucleotides (e.g., three, four, or five linker oligonucleotides) are linked through a scaffold.
  • the stapler sequences of these linker oligonucleotides linked through the scaffold are different.
  • the stapler sequences are non-palindromic.
  • the stapler sequences are identical among the two or more linker oligonucleotides.
  • a scaffold used in the methods and compositions herein may exist in various forms.
  • the scaffold is a linear scaffold.
  • the scaffold is a circular scaffold.
  • the scaffold is a dendrimer scaffold.
  • the scaffold can be linear or circular or dendrimers.
  • the linker oligonucleotides are hybridized to the DNA template (e.g., a DNA concatemer) via the primer sequence, the scaffold is added and the linker oligonucleotides are hybridized to the scaffold.
  • FIG. 7A shows one illustrative example of linear scaffolds, in which linker oligonucleotides BA, CA, DA, are hybridized to a linear scaffold.
  • FIG. 7B shows an illustrative example of a circular scaffold, in which linker oligonucleotides BA, CA, DA, and EA are hybridized to a circular scaffold.
  • A represents a primer sequence that is complementary to a DNA template
  • B, C, D and E are stapler sequences.
  • the scaffold is typically used in a relatively low concentration (e.g., a concentration that is lower than the concentration of the linker oligonucleotides in the reaction) , and a longer hybridization time to the scaffold would provide more linked primers.
  • a scaffold disclosed herein can be made of any material or substrate (e.g., a protein or a nucleic acid) .
  • the scaffold is a protein scaffold.
  • the scaffold is a nucleic acid scaffold.
  • Nonlimiting examples of nucleic acid scaffolds include DNA, RNA, and peptide nucleic acid (PNA) .
  • scaffold can attach, either covalently or non-covalently, to the stapler sequences of the linker oligonucleotide.
  • the scaffold is a nucleic acid that comprises two or more copies of a sequence complementary to the stapler sequence so that the linker oligonucleotide is anchored to the scaffold through hybridization.
  • a plurality of scaffold molecules are used, each linking multiple linker oligonucleotides.
  • a scaffold disclosed herein can be generated as linear repeats or as a circular structure containing multiple repeats.
  • the scaffold is a nucleic acid concatemer (e.g., a DNB) .
  • the molar ratio of the linker oligonucleotides to the scaffold may be in a range from 2 to 50, e.g., 3 to 25, from 3 to 15, or from 4 to 10.
  • Suitable concentrations of the scaffold for this purpose can be determined empirically. For example, having multiple (e.g., 3-4 or 4-6) different stapler sequences for the same scaffold increases the chance that distant, individual DNA templates (e.g., DNBs) can be linked.
  • two Linker oligonucleotides are connected and form a linker.
  • Linkers and linker oligonucleotides may take different forms and can be classified in different ways. For example, based on the number of subunits in the DNA concatemer that the linkers can bind, the linkers can be classified as 2-arm linkers or 4-arm linkers. Based on the relative sequence components in the linkers themselves, they can be classified as Z-linkers or X-linkers. Z-linkers can be 2-arm linkers or 4-arm linkers. X-linkers are generally 2-arm linkers.
  • each linker oligonucleotide contains only one primer sequence and thus two oligonucleotides can link two subunits of the DNA concatemer (“2-arm linkers”).
  • two linker oligonucleotides, seq 1 and seq 2, each contain a primer sequence (A) that serves to hybridize to one subunit(s) of a DNB.
  • Seq 1 also contains a stapler sequence (b)
  • seq 2 contains a stapler sequence (B) , with (b) being complementary to (B) . See FIG. 6.
  • seq 1 and seq 2 are added to a reaction comprising the template DNA (e.g., a DNA concatemer).
  • one single linker oligonucleotide by virtue of having two primer sequences that are complementary to the adaptor sequence of the DNA concatemer, could connect 2 subunits of the concatemer.
  • seq 4 or seq 5 in FIG. 6 belong to this category of linker oligonucleotides.
  • each of the linker oligonucleotides has two primer sequences, with a stapler sequence interposed in between.
  • Each linker oligonucleotide can hybridize to two subunits of a DNB. This configuration allows the two connected linker oligonucleotides to bind to four separate sequences of the DNA template, and such linkers are thus referred to as “4-arm linkers”.
  • the stapler sequences of the two linker oligonucleotides of the 4-arm linker are not identical, such as seq 4 and seq 5 in FIG. 6.
  • the stapler sequences of the two linker oligonucleotides of the 4-arm linker are identical and palindromic, which allows hybridization of two linker oligonucleotides of the same sequence.
  • two linker oligonucleotides are connected to form a Z linker.
  • Each linker oligonucleotide in the Z-linker comprises a primer sequence that is complementary to and can hybridize to the DNA template, e.g., the adaptor of the DNA concatemer.
  • each primer sequence comprises an extendible 3’ end and can serve as a primer to make a second strand based on the DNA template.
  • Each linker oligonucleotide of the Z-linker also comprises a stapler sequence, which is complementary to the stapler sequence of the other linker oligonucleotide such that hybridization of the two stapler sequences results in formation of a partial hybrid between the two linker oligonucleotides.
  • the stapler sequence of the linker oligonucleotide is palindromic, and the two linker oligonucleotides of the Z-linker are of identical sequence. In some embodiments the stapler sequence of linker oligonucleotide is non-palindromic and the two linker oligonucleotides are of different sequences.
  • each of the linker oligonucleotides can be extended to form second strands.
  • the two second strands so formed are linked on the 5’ end via the Z-linker.
  • Figure 1A shows an illustrative example of a Z-linker consisting of a pair of linker oligonucleotides, each comprising a stapler sequence and a primer sequence, with the stapler sequence located 5’ to the primer sequence.
  • the stapler sequences of the pair of linker oligonucleotides are complementary and annealing thereof connects the linker oligonucleotides at the 5’ end and forms a 2-arm Z-linker.
  • each linker oligonucleotide hybridizes to the DNA template and is extended to form a second strand, which results in two second strands linked at the 5’ end (bottom right panel).
  • the extension is carried out by a strand-displacement DNA polymerase, which forms a branched structure, in which each second strand is partially hybridized to the DNA template.
  • the linker oligonucleotide can be cleaved at defined positions to allow removal of a 3’ block, thus forming a primer for polymerization.
  • the linker oligonucleotides can be cleaved, for example, by an enzyme, (e.g., phosphatase or esterase) , chemical reaction, heat, light, and the like. Cleavage could enable release of the cross linked structures for applications that require a lesser degree of crosslinking between the subunits of the concatemer. That is, the linking between the subunits of the concatemer can be reversed or converted so that the secondary function can be performed. For example, a new priming site could be generated as result of formation of a new 3’ hydroxyl group at the cleavage site, which allows extension by polymerases.
  • the cleavage site can be at any location on the linker oligonucleotide.
  • At least one of the two linker oligonucleotides comprises two cleavage sites flanking the stapler sequences, and cleavage on the sites results in release of the stapler sequences.
  • the cleavage site is on the stapler sequences of the at least two linker oligonucleotides, and the cleavage releases the linker oligonucleotide-linker oligonucleotide hybrid or the linker-DNB hybrid.
  • the cleavage site is on the primer sequences of the linker oligonucleotides, and the cleavage on the sites creates shorter unstable hybrids and results in the release of the crosslinked structures.
  • Cleavage could be achieved by enzymatic recognition of nucleotide bases and/or abasic sites, or by modifications to the phosphodiester bond to create site-specific scission.
  • a 3’ block of the linker oligonucleotide could be engineered by incorporating a 3’ phosphate group during oligonucleotide synthesis, and the 3’ phosphate group can be cleaved by kinases or phosphatase enzymes.
  • the cleavage site comprises a uracil nucleotide base, which allows cleavage of the base and phosphodiester bond with uracil DNA glycosylase (UDG) and an endonuclease.
  • the endonuclease is one that can generate a 3’ hydroxyl group after cleavage at the cleavage site, e.g., APE1.
  • the cleavage of the phosphodiester backbone at the uracil nucleotide base causes the release of the stapler sequence. This is useful to situations where the de-crosslinking is desired.
  • the linker oligonucleotide is A-bB-A.
  • A represents the primer sequence and bB is the palindromic stapler sequence. Cleavage at the dU base and sugar excision with an endonuclease of the two primer sequences allows release of the stapler sequence and the second hybridizing region.
  • the resultant structure will be one of two forms.
  • the first sequence is sequence 11.4, having a sequence of AAGTCGGAGGCCAAGCGGTCT (SEQ ID NO: 2) which can remain hybridized to the DNB and could possess a 3’ hydroxyl if the appropriate endonuclease is chosen.
  • the second sequence (seq 11.5) AGGAGGAACCATGGTTCCAAGTCGGAGGCCAAGCGGTCT (SEQ ID NO: 3) could also remain hybridized to the DNB and could possess a 3’ hydroxyl that will allow extension by polymerases. Seq 11.5 would still possess a 5’ tail sequence that could remain hybridized to a second 5’ tail sequence of a second oligo.
  • the residual cleaved oligo sequences, seq 11.4 and seq 11.5 as described above, can then serve as primers for subsequent polymerase extension such as generation of a second strand with a strand-displacement polymerase.
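The two residual sequences described above can be reproduced with a simple in-silico cleavage. The sketch below assumes an A-bB-A oligonucleotide built by concatenating the primer sequence (SEQ ID NO: 10), the palindromic stapler (SEQ ID NO: 8), and the primer sequence again. That full-length sequence is a reconstruction made here for illustration and is not stated in the source. Cleavage is idealized: each dU is removed and the backbone is broken at that position.

```python
PRIMER_WITH_DU = "AAGTCGGAGGCCAAGCGGTCTUAGGA"  # SEQ ID NO: 10 (A), contains one dU
STAPLER = "GGAACCATGGTTCC"                     # SEQ ID NO: 8 (bB)

# Hypothetical A-bB-A linker, reconstructed for illustration only.
LINKER = PRIMER_WITH_DU + STAPLER + PRIMER_WITH_DU

SEQ_11_4 = "AAGTCGGAGGCCAAGCGGTCT"                    # SEQ ID NO: 2
SEQ_11_5 = "AGGAGGAACCATGGTTCCAAGTCGGAGGCCAAGCGGTCT"  # SEQ ID NO: 3


def cleave_at_uracil(seq: str) -> list[str]:
    """Idealized UDG/endonuclease digest: drop each U and split the strand there."""
    return [fragment for fragment in seq.upper().split("U") if fragment]


if __name__ == "__main__":
    fragments = cleave_at_uracil(LINKER)
    print(fragments[0] == SEQ_11_4)  # True
    print(fragments[1] == SEQ_11_5)  # True
    print(fragments[2])              # AGGA
```

Under these assumptions, the first fragment corresponds to seq 11.4, the second to seq 11.5 (which retains the stapler as a 5’ tail), and the short 3’ remainder is the portion downstream of the second dU.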
  • the dU cleavage step is optional; as in some cases, subsequent second strand generation could still occur by binding a primer to an alternative region of the DNB.
  • the primer can be extended to form a second strand by a strand-displacement polymerase, and said extension can displace a downstream second strand, for example, one formed by extending a linker oligonucleotide as disclosed herein.
  • linker oligonucleotides hybridize to the DNA template before they hybridize to each other.
  • One approach to achieve this is to first block the hybridization between the linker oligonucleotides during the process when the linker oligonucleotides are contacted with and hybridized to the DNA template. After the completion of hybridization of the linker oligonucleotides to the DNA template, and after an optional step of removing excess linker oligonucleotides from the reaction, the block is reversed to permit hybridization between linker oligonucleotides.
  • prevention of hybridization of the oligonucleotide linkers before hybridization to the DNA template can be achieved by using a blocker oligonucleotide.
  • Said blocker oligonucleotide is complementary to the stapler sequence of the linker oligonucleotides but not to the primer sequence.
  • the blocker oligonucleotide may also be of a length that is similar to the length of the stapler sequence.
  • the partially double stranded linker oligonucleotide is allowed to contact with the DNA template, where the primer sequence binds to the complementary sequence in the DNA template. After hybridization to the DNA template, the blocker sequence is then removed so that the linker oligonucleotides can hybridize to each other through the complementary stapler sequences.
  • Blocker oligonucleotides can be removed via a number of means.
  • the blocker oligonucleotide is designed with a sequence hybridized to the stapler sequence of the linker oligonucleotide to form a double-stranded hybrid, and the double-stranded hybrid having a melting temperature that is lower than the melting temperature of the double stranded hybrids formed between the DNA template and the linker oligonucleotide.
  • the removal of blocker oligonucleotides can be achieved by raising the temperature of the reaction so that the blocker oligonucleotide is dissociated from the linker oligonucleotide, whereas the DNA template remains hybridized to the linker oligonucleotide.
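The temperature-based removal above works only if there is a usable gap between the two melting temperatures. The following is a minimal sketch of that reasoning; the function name and the safety margin are hypothetical, and a Tm is treated here as a sharp threshold, which is a simplification because duplex melting is gradual.

```python
from typing import Optional, Tuple


def blocker_removal_window(
    tm_blocker_c: float,
    tm_template_c: float,
    margin_c: float = 2.0,  # hypothetical safety margin, not from the source
) -> Optional[Tuple[float, float]]:
    """Return a (low, high) temperature range in degC that melts the
    blocker/stapler duplex but keeps the primer/template duplex intact,
    or None when no such range exists."""
    low = tm_blocker_c + margin_c
    high = tm_template_c - margin_c
    if low >= high:
        return None
    return (low, high)


if __name__ == "__main__":
    # Illustrative values only.
    print(blocker_removal_window(45.0, 65.0))  # (47.0, 63.0)
    print(blocker_removal_window(60.0, 61.0))  # None
```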
  • the blocker oligonucleotide can be removed through enzymatic cleavage, for example with uracil-DNA glycosylase/endonuclease IV or UDG/APE1.
  • the phosphodiester bonds at various positions in the blocker oligonucleotide can be replaced with chemically cleavable bonds (e.g., disulfide, azido) so that the blocker oligonucleotide can be cleaved and removed.
  • the method comprises 1) hybridizing blocker oligonucleotides to the stapler sequences of at least two linker oligonucleotides in a reaction to produce partially double stranded linker oligonucleotides, each comprising a double-stranded region consisting of the blocker oligonucleotide and the stapler sequence, thereby preventing the stapler sequences of the linker oligonucleotides from hybridizing to each other, 2) adding a DNA template to the reaction so that the primer sequences of the partially double stranded linker oligonucleotides bind to the DNA template, 3) removing the blocker oligonucleotide by one or more of the following: raising the temperature of the reaction such that the blocker oligonucleotide or multiple oligonucleotides are dissociated from the linker oligonucleotides, enzymatic or chemical degradation of the blocker
  • the stapler sequences of at least two linker oligonucleotides bind to each other to form a linked pair before binding of the primer sequences to the DNA template.
  • two linker oligonucleotides from the same linked pair bind to an individual DNB at a higher rate than two linker oligonucleotides from different linked pairs bind to an individual DNB.
  • the linked pair of linker oligonucleotides is used at a suitable concentration such that two linker oligonucleotides in the linked pair bind to the same DNB, rather than two linker oligonucleotides from different linked pairs binding to the same DNB.
  • suitable concentrations for this purpose can be determined empirically.
  • Linking of subunits could occur after deposition of the DNB on the slide or in solution before loading of the DNBs onto the surface. Linking in solution may minimize splitting of DNBs across multiple surface binding sites. Linking on the surface would minimize the risk of linking multiple DNBs.
  • linker oligonucleotides are connected and form a linker.
  • a linker refers to a complex consisting of two or more linker oligonucleotides that are connected by covalent or non-covalent means.
  • Linkers and linker oligonucleotides may take different forms. For example, based on the number of subunits in the DNA concatemer that the linkers can bind, the linkers can be classified as 2-arm linkers or 4-arm linkers. Based on the relative sequence components on the linkers themselves, they can be classified as Z-linkers or X-linkers.
  • each linker oligonucleotide contains only one primer sequence and thus two oligonucleotides can link two subunits of the DNA concatemer (“2-arm linkers”).
  • two linker oligonucleotides, seq 1 and seq 2, each contain a primer sequence (A) that serves to hybridize to one subunit(s) of a DNB.
  • Seq 1 also contains a stapler sequence (b)
  • seq 2 contains a stapler sequence (B) , with (b) being complementary to (B) . See FIG. 6A.
  • each of the 2-arm linkers comprises a palindromic sequence, see FIG. 6B.
  • one single linker oligonucleotide by virtue of having two primer sequences that are complementary to the adaptor sequence of the DNA concatemer, could connect two subunits of the concatemer. For example seq 4 or seq 5 in FIG. 6C.
  • each of the linker oligonucleotides has two primer sequences, with a stapler sequence interposed in between.
  • Each linker oligonucleotide can hybridize to two subunits of a DNB. This configuration allows the two connected linker oligonucleotides to bind to four separate sequences of the DNA template, thus is referred to as “4-arm linkers” .
  • the stapler sequences of the two linker oligonucleotides of the 4-arm linker are not identical, such as seq 4 and seq 5 in FIG. 6C.
  • the stapler sequences of the two linker oligonucleotides of the 4-arm linker are identical and palindromic, which allows hybridization of two linker oligonucleotides of the same sequence. See, e.g., seq 6 in FIG. 6D and seq 7 and seq 8 in FIG. 6E.
  • two linker oligonucleotides are connected to form a Z linker.
  • Each linker oligonucleotide in the Z-linker comprises a primer sequence that is complementary to and can hybridize to the DNA template, e.g., the adaptor of the DNA concatemer.
  • each primer sequence comprises an extendible 3’ end and can serve as a primer to make a second strand based on the DNA template.
  • Each linker oligonucleotide of the Z-linker also comprises a stapler sequence, which is complementary to the stapler sequence of the other linker oligonucleotide such that hybridization of the two stapler sequences results in formation of a partial hybrid between the two linker oligonucleotides.
  • the stapler sequence of the linker oligonucleotide is palindromic, and the two linker oligonucleotides of the Z-linker are of identical sequence. In some embodiments the stapler sequence of linker oligonucleotide is non-palindromic and the two linker oligonucleotides are different.
  • the 3’ ends of the linker oligonucleotides can be extended to form second strands.
  • the two second strands so formed are linked on the 5’ end via the Z-linker.
  • Figure 1A shows an illustrative example of a Z-linker consisting of a pair of linker oligonucleotides, each comprising a stapler sequence and a primer sequence, with the stapler sequence located 5’ to the primer sequence.
  • the stapler sequences of the pair of linker oligonucleotides are complementary and annealing thereof connects the linker oligonucleotides at the 5’ end and forms a 2-arm Z-linker.
  • the primer sequences hybridize to a DNB template (e.g., the first strand) and each linker oligonucleotide is extended to form a second strand, which results in two second strands linked at 5’ (bottom right panel) .
  • the extension is carried out by a strand-displacement DNA polymerase, which forms a branched structure, in which each second strand is partially hybridized to the DNA template.
  • the linker oligonucleotides form an X linker, in which two linker oligonucleotides that are identical in sequence are linked through a palindromic sequence, and each of the linker oligonucleotides possesses an extra sequence at the 5’ end that can hybridize to another linker oligonucleotide having a non-palindromic stapler sequence.
  • Each linker oligonucleotide also comprises a primer sequence that is complementary to and can hybridize to the DNA template (e.g., an adaptor of the DNA concatemer).
  • This X-linker structure thus allows for multiple 3’ extendable primer sequences, possibly four or more.
  • an X-linker may comprise a pair of D-Bb-A linker oligonucleotides ( “Oligo Linker 1” ) and a pair of d-cC-A linker oligonucleotides ( “Oligo Linker 2” ) .
  • the two D-Bb-A linker oligonucleotides hybridize to each other via the palindromic stapler sequences Bb, and the two d-cC-A linker oligonucleotides hybridize to each other via the palindromic stapler sequences cC.
  • Each of the D-Bb-A linker oligonucleotides is also hybridized to one of the d-cC-A linker oligonucleotides via the complementary stapler sequences D and d.
  • This results in a structure ( “structure 1” , as shown in the bottom left panel in FIG. 2) with four primer sequences that can hybridize to a DNA template.
  • Each of the four primer sequences comprises an extendible 3’ end and can be extended to produce a second strand, which is a reverse complement of the DNA template.
  • the X-linker may also take a form as shown in “structure 2” (the bottom right panel of FIG. 2) .
  • Two D-Bb-A linker oligonucleotides are annealed with four d-cC-A linker oligonucleotides, which results in a structure that has multiple primer sequences (A) , and excess single-stranded arms (e.g., “D” or “d” ) .
  • These excess strand arms allow for continued structure growth as a random network.
  • “D” is readily complementary to any “d” region and is capable of annealing to “d” to expand the structure in any form.
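The arm bookkeeping for the two X-linker structures can be sketched with a small model. It assumes that every oligonucleotide contributes one primer sequence (A) and that D and d arms pair one-to-one until one type runs out; the class and method names are illustrative, and the counts for structure 2 follow from this simplified model rather than from figures stated in the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class XLinkerAssembly:
    n_d_upper: int  # number of D-Bb-A oligonucleotides (carry a "D" arm)
    n_d_lower: int  # number of d-cC-A oligonucleotides (carry a "d" arm)

    def primer_arms(self) -> int:
        """Each oligonucleotide carries one primer sequence (A)."""
        return self.n_d_upper + self.n_d_lower

    def free_arms(self) -> dict:
        """Unpaired D/d arms left over after one-to-one D:d pairing."""
        paired = min(self.n_d_upper, self.n_d_lower)
        return {"D": self.n_d_upper - paired, "d": self.n_d_lower - paired}


if __name__ == "__main__":
    structure_1 = XLinkerAssembly(n_d_upper=2, n_d_lower=2)
    structure_2 = XLinkerAssembly(n_d_upper=2, n_d_lower=4)
    print(structure_1.primer_arms(), structure_1.free_arms())  # 4 {'D': 0, 'd': 0}
    print(structure_2.primer_arms(), structure_2.free_arms())  # 6 {'D': 0, 'd': 2}
```

Any leftover arm is a site where another oligonucleotide with the complementary arm can anneal, which is how the structure can keep growing as a network.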
  • the single underline sequence is the primer sequence (A) ; the bold type is palindromic stapler sequence (Bb) ; and the double underline sequence is the non-palindromic stapler sequence (D or d) .
  • the method further comprises hybridizing linker oligonucleotides to the second strands (these linker oligonucleotides are referred to as second linker oligonucleotides) .
  • Second linker oligonucleotides may comprise any of the components arranged in any of the configurations as described above, i.e., they may also comprise a stapler sequence and a primer sequence having an extendible 3’ end.
  • the second linker oligonucleotides are sequencing primers and are used to generate sequence reads of the second strand. As described below, the sequence reads of the second strand can be combined with the sequence reads of the first strand to construct sequence information of the target DNA. See FIG. 1B.
  • Seq 12 is an example of one embodiment of a second strand linker oligonucleotide.
  • Seq 12 consists of subsequences Seq 12.1, Seq 12.2, and Seq 12.3.
  • Seq 12.1 (SEQ ID NO: 7) and seq 12.3 (SEQ ID NO: 9) are identical repeated sequences that hybridize to a region of the second strand (also referred to as a second strand spur) of a DNB (reverse complement strand of the original DNB) .
  • Seq 12.2 is a region of internal complementarity that allows 2 oligonucleotide molecules to come together and hybridize to form a 4-arm structure.
  • the linker oligonucleotides of the invention may comprise a primer sequence with an extendible 3’ end, thus, linker oligonucleotides can serve as primers.
  • the linker oligonucleotide may possess a 3’-hydroxyl chemical group that allows extension as a primer with one or more DNA polymerases.
  • the linker oligonucleotide comprises a reversible 3’ blocking group, which can be cleaved to produce an extendible 3’ end.
  • at least two linker oligonucleotides are extended to produce two second strands, i.e., strands that have a sequence that is reverse complementary to the DNA template.
  • FIG. 1A shows two second strands produced by extending two linker oligonucleotides (indicated as “linked second strand spurs” in FIG. 1A) that are linked.
  • the strand shown on the left is located 5’ to the strand on the right; in this configuration, the strand on the left is the upstream strand, and the strand on the right is the downstream strand.
  • a plurality of linker oligonucleotides can be extended to generate a series of second strands. Any individual second strand in the series can be deemed a downstream second strand (relative to a second strand upstream) and also an upstream second strand (relative to a second strand downstream) .
  • extending linker oligonucleotides produces a series of second strands, including second strand #1, second strand #2, second strand #3. #1, #2, and #3 are hybridized or partially hybridized to a DNA template and are present in the order from 5’ to 3’. #2 is the upstream second strand relative to #3; meanwhile, #2 is also the downstream second strand relative to second strand #1.
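The relative upstream/downstream terminology above can be expressed as neighbor lookup in an ordered list. The sketch below assumes the strands are listed in their 5’ to 3’ order; the function name is illustrative.

```python
from typing import List, Optional, Tuple


def neighbors(strands_5_to_3: List[str], name: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (upstream neighbor, downstream neighbor) of a second strand,
    given all second strands listed in 5'->3' order."""
    i = strands_5_to_3.index(name)
    upstream = strands_5_to_3[i - 1] if i > 0 else None
    downstream = strands_5_to_3[i + 1] if i + 1 < len(strands_5_to_3) else None
    return upstream, downstream


if __name__ == "__main__":
    series = ["#1", "#2", "#3"]
    print(neighbors(series, "#2"))  # ('#1', '#3')
```

Here #2 is downstream of #1 and upstream of #3, matching the description in the text.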
  • producing second strands involves at least two steps.
  • the first step includes extending at least two first strand linker oligonucleotides by a DNA polymerase (e.g., a non-strand-displacement polymerase or a strand-displacement polymerase) to generate at least two partially extended second strands: a partially extended upstream second strand and a partially extended downstream second strand. Both strands are fully hybridized to the DNA template.
  • the second step includes further extending the two partially extended second strands with a strand-displacement polymerase, during which extension of the partially extended upstream second strand partially displaces the partially extended downstream second strand, thereby producing a partially hybridized downstream second strand.
  • an extension primer is a substrate for a DNA polymerase and is extendible by addition of nucleotides.
  • extension primers often have a length in the range of 10-100 nucleotides, often 12-80 nucleotides, and often 15-80 nucleotides.
  • primers and probes may be fully or partially complementary to the stapler sequence in an adaptor to which they hybridize.
  • a primer may have at least 85%, 90%, 95%, or 100% identity to the sequence to which it hybridizes.
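One way to quantify partial complementarity is to compare the primer, position by position, with the perfect complement of its binding site. The sketch below assumes equal-length, ungapped sequences and interprets the percentages above as identity to the reverse complement of the binding site; that interpretation and the function names are assumptions made here for illustration.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def percent_match(primer: str, binding_site: str) -> float:
    """Percent of primer positions that pair with the binding site,
    with both sequences written 5'->3' and no gaps allowed."""
    if len(primer) != len(binding_site):
        raise ValueError("primer and binding site must have equal length")
    ideal_primer = reverse_complement(binding_site)
    matches = sum(p == q for p, q in zip(primer.upper(), ideal_primer))
    return 100.0 * matches / len(primer)


if __name__ == "__main__":
    print(percent_match("ACTT", "AAGT"))  # 100.0
    print(percent_match("ACTA", "AAGT"))  # 75.0
```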
  • a primer may also contain additional sequence at the 5’ end of the primer that is not complementary to the primer binding sequence (i.e., the sequence of the primer binding site) in the adaptor.
  • the non-complementary portion of a primer may be at a length that does not interfere with the hybridization between the primer and its primer stapler sequence. In general, the non-complementary portion is 1 to 100 nucleotides long. In some embodiments, the non-complementary portion is 4 to 8 nucleotides long.
  • Primers may comprise DNA and/or RNA moieties, and in some approaches primers used in the invention may have one or more modified nucleotides that contain modifications to the base, sugar, and/or phosphate moieties.
  • a “sequencing oligonucleotide” may be an extension primer used in sequencing-by-synthesis reactions (also called “sequencing by extension” ) .
  • a “sequencing oligonucleotide” may be an oligonucleotide used in a sequencing-by-ligation method such as “combinatorial probe-anchor ligation reaction” (cPAL) (including single, double and multiple cPAL) as described in US Patent Publication 20140213461, incorporated herein by reference for all purposes.
  • cPAL comprises cycling of the following steps: First, a “sequencing oligonucleotide” (or “anchor”) is hybridized to a complementary sequence in an adaptor of the second DNA strand described above. Enzymatic ligation reactions are then performed with the anchor to a fully degenerate probe population of, e.g., 8-mer probes that are labeled, e.g., with fluorescent dyes. Probes may be, e.g., about 6 to about 20 bases in length, or about 7 to about 12 bases in length.
  • the population of 8-mer probes that is used is structured such that the identity of one or more of its positions is correlated with the identity of the fluorophore attached to that (e.g., 8-mer) probe.
  • in variations of basic cPAL well known in the art, such as multiple cPAL, partially or fully degenerate secondary anchors are used to increase the readable sequence.
  • a strand displacement polymerase is used to produce partially-displaced second strands (follow-on fragments) with both overhangs and duplex portions attached to the DNA template polynucleotide (e.g., DNB DNA strands) .
  • the extension reaction may be controlled to avoid complete displacement of the second strands (i.e., “following strands” or “follow-on fragments” ) and to produce second strands having lengths of overhangs suitable for sequencing.
  • DNA polymerases having suitable strand displacement activity include, but are not limited to, Phi29, Bst DNA polymerase, Klenow fragment of DNA polymerase I, and Deep-VentR DNA polymerase (NEB #M0258). These DNA polymerases are known to differ in the strength of their strand displacement activity. See Kornberg and Baker (1992, DNA Replication, Second Edition, pp. 113-225, Freeman, N.Y.). It is within the ability of a person of ordinary skill in the art guided by this disclosure to select a DNA polymerase suitable for carrying out the method.
  • Another approach to controlling the extension-displacement reaction is to use suitable concentrations of the DNA polymerase having strand displacement activity, or to control the concentration of dNTPs or of the linker oligonucleotides, which serve as primers.
  • the extension reaction rate is controlled by including an agent that affects the duplex formation between extension primers and DNA template, such as DMSO (e.g., 1%-2%), betaine (e.g., 0.5 M), glycerol (e.g., 10%-20%), T4 G32 SSB (e.g., 10-20 ng/µl), and volume exclusion agents, in the reaction buffer.
  • reaction temperatures may also be controlled to allow appropriate speed of polymerization and strand displacement. Higher temperature typically results in greater extent of strand displacement. In some embodiments, reaction temperatures are maintained to be within the range of 20°C –37°C, for example, 32°C, 33°C, 34°C, 35°C, 36°C, or 37°C, in order to avoid complete displacement.
  • extension reactions are controlled by using a mixture of conventional (extendible) primers and non-extendible primers, i.e. 3’ end blocked primers.
  • a non-extendible primer blocks elongation via, for example, a chemical blocking group that prevents polymerization by a DNA polymerase.
  • the length of duplex (hybridized) portion of the newly synthesized complementary DNA strand (follow-on fragments) can be controlled.
  • a mixture of first primers is used in which 50-70% are non-extendible (“blocked”) and 30-50% can be extended (“unblocked”).
  • Many types of non-extendible primers are known in the art and would be suitable for the present invention.
  • the extension-displacement reaction is controlled by terminating the reaction after a certain period of time during which the desired length of the second strands is achieved. In some embodiments, the reaction is terminated after 5 min, 10 min, 20 min, 30 min, 40 min or 60 min from initiation. Methods of termination of the reaction are well known in the art, for example, by incorporation of ddNTPs or by adding chemical solutions, e.g., a Tris buffer containing 1.5 M NaCl. In one embodiment, the termination is achieved by incorporating ddNTPs after adding to the reaction a Tris buffer containing 1.5M NaCl.
  • the claimed invention provides methods of determining the sequence of the second strands produced as described above.
  • the method comprises hybridizing a sequencing oligonucleotide to the sequence in the second strand that is complementary to at least part of the adaptor of the DNA template (e.g., a DNA concatemer) , and determining the nucleotide sequence of at least part of the sequence complementary to the target DNA sequence. Sequence determination may be carried out using sequencing-by-synthesis methods or using sequencing-by-ligation methods, or both.
  • any of the linker oligonucleotides as described above can be used as sequencing oligonucleotides.
  • overhangs of the second strands are sequenced by extending primers (e.g., a second strand linker oligonucleotide) hybridized to the complementary sequences of the adaptor of a monomer, for example, as illustrated in FIG. 1B.
  • the DNA template strand is also sequenced using primers hybridized to the adaptor of a monomer.
  • the sequence information from the second strands is paired with sequences generated from sequencing the DNA template to determine the entire target DNA sequence.
  • extension primers (e.g., the first strand linker oligonucleotide) and sequencing oligonucleotides bind to different portions of an adaptor sequence.
  • the extension primers and sequencing oligonucleotides bind to the same portion of the adaptor sequence (e.g., a portion of the adaptor sequence for extension and the complement of same portion of the adaptor sequence for sequencing) .
  • any suitable sequence determination method may be used to determine the sequence of the overhang, for example, SBS, pyrosequencing, sequencing by ligation, and others.
  • more than one sequencing approach is used.
  • the DNA template strand may be sequenced using one method (e.g., cPAL) and the second strands are sequenced using a different method (e.g., SBS) .
  • Sequencing-by-synthesis may rely on DNA polymerase activity to perform chain extension during the sequencing reaction step.
  • SBS is well known in the art. See, e.g., U.S. Pat. Nos. 6,787,308 and US8241573B2 and Shendure et al., 2005, Science, 309: 1728-1739. Sequencing on DNA nanoballs can occur through a variety of processes. In one approach the circle used to generate the DNB is prepared with a DNA region of known sequence (the adapter) and an adjacent sequence of unknown identity which is to be determined.
  • One function of the adapter is to provide a primer hybridization site such that extension of the primer will lead to addition of nucleotides into the “unknown” or “to be determined” region.
  • the nucleotides, if reversibly blocked at the 3’ position, are added one position at a time and are complementary to the base position in the DNB. After removal of the 3’ blocking group, an additional position can be read in the next cycle.
  • a fluorescent moiety characteristic of the base type is used for detection of the incorporating base and so reveals the base at that position in the DNB.
  • sequencing by ligation can be used.
  • the primer or anchor can be extended by ligating fluorescent oligonucleotides that extend into the unknown sequence.
  • fluorescent oligonucleotides with degenerate bases are ligated to the initiating anchor, however one base of the oligonucleotide is defined, and is associated with the fluorescent moiety.
  • Ligation of the oligo probe to the anchor creates a stable fluorescence after washing away excess probes and is dependent upon the defined base being complementary to the base at the same position of the DNB. Sequencing by ligation is described, for example, in Shendure et al., 2005, Science, 309: 1728-1739.
  • sequencing methods can also be used, e.g., pyrosequencing (See, e.g., Ronaghi et al., Anal. Biochem. (1996) 242: 84–89) and sequencing by hybridization (see, e.g., Drmanac et al, Advances in Biochemical Engineering/Biotechnology (2002) 77: 75-101) .
  • the order in which first linker oligonucleotides and second linker oligonucleotides are added may vary.
  • a first linker oligonucleotide and polymerase are added and synthesis of the second strand occurs (at least in part) prior to addition of the second linker oligonucleotide.
  • the first and second linker oligonucleotides are added at about the same time. For example, they may be added together in the same composition, or may be added separately within about 1 minute of each other, or within about 5 minutes of each other.
  • the first and second extension primers may be added in any order.
  • Sequential addition of the primers may be necessary in approaches in which second strand is to be produced using a DNA polymerase that has no strand displacement activity, while the second strand is to be produced using a DNA polymerase having strand displacement activity.
  • a single oligonucleotide may function as both an extension primer for producing the second strand and for sequencing.
  • sequencing oligonucleotide (s) for the second strand is typically added after the extension-displacement of the second strand is terminated using the methods disclosed herein.
  • a second strand linker oligonucleotide serves as a sequencing oligonucleotide that hybridizes to the overhang portion of the second strand.
  • the sequencing oligonucleotide has a sequence that is complementary to and thus hybridizes to a known sequence within the second strand.
  • the sequencing oligonucleotide hybridizes to a sequence in the second strand that is complementary to at least part of the adaptor in the DNA concatemer.
  • the sequencing oligonucleotide is complementary, partially or completely, to the first strand linker oligonucleotide.
  • the methods of the present invention may be carried out using methods, tools and reagents well known to those of ordinary skill in the art of molecular biology and MPS sequencing, including nucleic acid polymerases (RNA polymerase, DNA polymerase, reverse transcriptase) , phosphatases and phosphorylases, DNA ligases, and the like.
  • certain primer extension steps may be carried out using one or more DNA polymerases.
  • Certain extension steps are carried out using DNA polymerase with strand displacement activity.
  • the methods disclosed herein use one or more DNA polymerases and strand displacement activities of the DNA polymerase (s) to generate DNA strands complementary to a DNA template.
  • the present invention uses a DNA polymerase with a strong 5’→3’ strand displacement activity.
  • the polymerase does not have 5’→3’ exonuclease activity.
  • DNA polymerases having 5’-3’ exonuclease activity may be used when the activity does not prevent the implementation of the method of the invention, e.g., by using reaction conditions that inhibit the exonuclease activity.
  • Strand displacement activity describes the ability of the polymerase to displace downstream DNA encountered during synthesis.
  • Strand displacement activity is described in US Pat. Pub. No. 20120115145, incorporated herein by reference, as follows: “Strand displacement activity” designates the phenomenon by which a biological, chemical or physical agent, for example a DNA polymerase, causes the dissociation of a paired nucleic acid from its complementary strand in a direction from 5′ towards 3′, in conjunction with, and close to, the template-dependent nucleic acid synthesis. The strand displacement starts at the 5′ end of a paired nucleic acid sequence and the enzyme therefore carries out the nucleic acid synthesis immediately in 5′ of the displacement site.
  • the neosynthesized nucleic acid and the displaced nucleic acid generally have the same nucleotide sequence, which is complementary to the template nucleic acid strand.
  • the strand displacement activity may be situated on the same molecule as that conferring the activity of nucleic acid synthesis, and particularly the DNA synthesis, or it may be a separate and independent activity.
  • DNA polymerases such as E. coli DNA polymerase I, Klenow fragment of DNA polymerase I, T7 or T5 bacteriophage DNA polymerase, and HIV virus reverse transcriptase are enzymes, which possess both the polymerase activity and the strand displacement activity.
  • Agents such as helicases can be used in conjunction with inducing agents that do not possess strand displacement activity in order to produce the strand displacement effect, that is to say the displacement of a nucleic acid coupled to the synthesis of a nucleic acid of the same sequence.
  • proteins such as Rec A or Single-Strand Binding Protein from E. coli or from another organism could be used to produce or to promote the strand displacement, in conjunction with other inducing agents (Kornberg and Baker, 1992, DNA Replication, 2nd Edition, pp 113-225, Freeman, N.Y. ) .
  • the polymerase is Phi29 polymerase.
  • Phi29 polymerase has a strong displacement activity at moderate temperatures (e.g., 20-37°C) .
  • Bst DNA Polymerase Large Fragment (NEB #M0275) is used.
  • Bst DNA Polymerase is active at elevated temperatures (~65°C).
  • the polymerase is Deep-VentR DNA polymerase (NEB #M0258) (Hommelsheim et al., Scientific Reports 4: 5052 (2014) ) .
  • DNA template polynucleotides are immobilized on a substrate.
  • the immobilization occurs prior to synthesis of the second strands discussed above.
  • Exemplary substrates may be substantially planar (e.g., slides) or nonplanar and unitary or formed from a plurality of distinct units (e.g., beads) .
  • Exemplary materials include glass, ceramic, silica, silicon, metal, elastomer (e.g., silicone) , polyacrylamide (e.g., a polyacrylamide hydrogel; see WO 2005/065814) .
  • the substrate comprises an ordered or non-ordered array of immobilization sites or wells.
  • target DNA polynucleotides are immobilized on a substantially planar substrate, such as a substrate comprising an ordered or non-ordered array of immobilization sites or wells. In some approaches, target DNA polynucleotides are immobilized on beads.
  • Polynucleotides can be immobilized on a substrate by a variety of techniques, including covalent and non-covalent attachment.
  • a surface may include capture probes that form complexes, e.g., double stranded duplexes, with component of the polynucleotide molecule, such as an adaptor oligonucleotide.
  • a surface may have reactive functionalities that react with complementary functionalities on the polynucleotide molecules to form a covalent linkage.
  • Long DNA molecules e.g., several nucleotides or larger, may also be efficiently attached to hydrophobic surfaces, such as a clean glass surface that has a low concentration of various reactive functionalities, such as –OH groups.
  • polynucleotide molecules can be adsorbed to a surface through non-specific interactions with the surface, or through non-covalent interactions such as hydrogen bonding, van der Waals forces, and the like.
  • a DNA nanoball may be immobilized to a discrete spaced apart region as described in US Pat. No. 8,609,335 to Drmanac et al.
  • the DNBs are immobilized on a substrate by hybridization to immobilized probe sequences, and solid-phase nucleic acid amplification methods are used to produce clonal clusters comprising DNA template polynucleotides. See, e.g., WO 98/44151 and WO 00/18957.
  • DNA template polynucleotides are compartmentalized in an emulsion, droplets, on beads and/or in microwells (Margulies et al. “Genome sequencing in microfabricated high-density picolitre reactors.” Nature 437: 7057 (2005); Shendure et al. “Accurate multiplex polony sequencing of an evolved bacterial genome” Science 309, 1728–1732 (2005)) prior to the primer extension steps.
  • DNA nanoballs are arrayed on a substrate in either an ordered or random array.
  • the adsorption to the substrate is mediated through substrate-protein-DNA interactions.
  • post attachment deposition of a protein layer can improve stability of the DNA array, see WO2013066975A1, the entire disclosure of which is herein incorporated by reference.
  • the invention comprises an array of DNA complexes.
  • the array is a support comprising an array of discrete areas, wherein a plurality of the areas comprise (a) a clonal cluster of single-stranded DNA templates and a plurality of linker oligonucleotides, wherein the DNA template is a single-stranded concatemer comprising a plurality of monomers, wherein each monomer comprises an adaptor sequence and a DNA target sequence, wherein each linker oligonucleotide comprises a primer sequence that is complementary to and hybridizes to an adaptor sequence of the DNA template, wherein the primer sequence includes an extendible 3’ end, and wherein at least two linker oligonucleotides of the plurality of linker oligonucleotides that are hybridized to the DNA template are connected to each other.
  • the invention comprises a DNA complex comprising a DNA template and two or more second strands, wherein each second strand comprises an overhang region and a hybridized region that is hybridized to the DNA template, wherein the two or more second strands are complementary to the DNA template, and wherein at least two second strands are connected at their respective 5’ ends.
  • each of a plurality of second linker oligonucleotides is used as a primer for primer extension (e.g., a sequencing by synthesis reaction), or is an extension product of such a primer, or is an oligonucleotide capable of acting as an anchor for sequencing by ligation, or is a ligation product of such an oligonucleotide and a labeled probe (e.g., a labeled cPAL probe).
  • the second linker oligonucleotide comprises a portion complementary to the adaptor sequence and can be extended for sequencing the second strand.
  • the DNA complexes of the array may comprise any of the properties of complexes described herein or made according to methods described herein. Additionally, the complexes may have any combination of one or more of the following features: (i) the array comprises at least 10^6 discrete areas, (ii) the DNAs are single-stranded, (iii) the second linker oligonucleotide comprises at least 10 bases of sequence of the adaptor, preferably at least 12 bases, and optionally at least 15 bases, and (iv) the second linker oligonucleotide is completely complementary to the second DNA strand to which it is hybridized.
  • the disclosure provides a composition comprising an array as described above in Section 9 and an enzyme selected from DNA ligase and DNA polymerase.
  • the composition comprises two DNA polymerases, one with strand displacement activity and one without strand displacement activity.
  • the composition further comprises fluorescently tagged dNTPs (e.g., dNTP analogs) and/or a pool of tagged oligonucleotide probes.
  • Example 1 Effect of a Z-linker on RhoA (Adenosine ( “A” ) base intensity) decline, mapping rates and error rates
  • DNBs were produced by rolling circle amplification using a library of single-stranded circles comprising human genomic DNA fragments. DNBs were immobilized on a DNB array chip and sequenced using BGIseq500. Sequencing was performed by cycles of sequencing by synthesis with a DNA polymerase and addition of reversibly blocked fluorescent terminators. Incorporation and de-blocking occurred at temperatures between 50°C and 60°C. Standard primers (e.g., primers that do not include a stapler sequence and do not link multiple subunits of a DNB) or linker oligonucleotides having the configuration of seq 6 (A-bB-A) were used as sequencing oligonucleotides. Two seq 6 linker oligonucleotides hybridize to each other and form a Z-linker. Mapping rates and error rates were determined from individual reads according to the instructions provided by the BGIseq500 software.
  • Sequencing oligonucleotides (1 µM) were hybridized to the DNBs and SBS was performed for 175 cycles at temperatures of 20°C to 57°C.
  • unlabeled nucleotides were incorporated in each cycle of sequencing to further incorporate at every subunit of each DNB during each cycle of sequencing.
  • the 3’ blocking group was removed with a phosphine reagent before the next incorporation event.
  • FIG. 3 shows the effect of the Z-linker on the reduction in intensity over multiple cycles (e.g., over 175 cycles) of sequencing.
  • the Y-axis represents intensity measurement (referred to as “Rho” ) for one base group (A base) .
  • Rho was processed to indicate the average intensity of DNBs after being assigned to a base group per BGIseq500 software.
  • Lane 1 (A L1) (i.e., signals from sequencing with the standard primer) showed a more rapid reduction in intensity values over 170 cycles of sequencing compared with lane 2 (A L2) (i.e., signals from sequencing with the Z-linker).
  • the decline of signal as sequencing cycles progress can have multiple potential causes, such as DNB shearing or structural loss of DNB mass, loss of the extending strand, irreversible termination of nucleotides, and out-of-phase base reading within DNBs.
  • the slower decline in signals from sequencing with the Z-linker indicates less loss of DNBs.
  • FIG. 4A shows that, for the first 66 bases, the flow cell lane with the linker oligonucleotides had a slightly higher mapping rate (82%) than that of the lane with standard primer (81.2%) .
  • the mapping rate for sequencing using the linker oligonucleotides was 82%, which was substantially higher than that of sequencing with the standard primer, which showed a 72% mapping rate.
  • error rates with the Z-linker oligonucleotides were generally lower than with standard primers.
  • the error rates were about half when a linker oligonucleotide was included, with error rates of 0.57% with the linker oligonucleotides and 1.17% with the standard primers.
  • DNBs were produced and sequencing was performed as described in Example 1, except that the linker oligonucleotides are seq3 (bB-A) , which comprises a palindromic stapler sequence bB and a primer sequence A.
  • the hybridization of two seq3 linker oligonucleotides forms a Z-linker.
  • mapping and discordance were determined for the first strand 50 bases of a read and the second strand 50 bases of the read.
  • the flowcell lane with the Z-linker stapler had a slightly higher mapping rate (94%) than that of the lane with the standard primers (93%) .
  • the mapping rate for the sequencing reaction with the Z-linker was 91%, which was substantially higher than that of the sequencing reaction with the standard primers, which had an 83% mapping rate.
  • error rates for sequencing reactions with the Z-linker were generally lower than the one with standard primers.
  • the error rate for sequencing reactions with the Z-linker was 0.31%, about one half of the rate with standard primers (about 0.63%).
  • Embodiment 1 A method of preparing a DNA template for nucleic acid analysis comprising hybridizing a plurality of first strand linker oligonucleotides to the DNA template,
  • the DNA template is a single-stranded concatemer comprising a plurality of monomers, wherein each monomer comprises an adaptor sequence and a DNA target sequence,
  • each first strand linker oligonucleotide comprises a template-hybridizing sequence, wherein the template-hybridizing sequence is complementary to and hybridizes to an adaptor sequence of the DNA template, and
  • Embodiment 2 The method of embodiment 1, wherein the template-hybridizing sequence is a primer sequence, and wherein the primer sequence includes an extendible 3’ end.
  • Embodiment 3 The method of embodiment 1, wherein the method further comprises extending the at least two first strand linker oligonucleotides to generate at least two second strands by one or more DNA polymerases,
  • Embodiment 4 The method of embodiment 1, 2 or 3, wherein the at least two first strand linker oligonucleotides are linked through DNA hybridization, a covalent bond, or both.
  • Embodiment 5 The method of embodiment 1, 2 or 3, wherein the at least two first strand linker oligonucleotides each comprises a stapler sequence, wherein the at least two first strand linker oligonucleotides are linked by hybridization of the respective stapler sequences.
  • Embodiment 6 The method of any of the embodiments 1-3, wherein the at least two first strand linker oligonucleotides are linked by a shared scaffold.
  • Embodiment 7 The method of embodiment 5, wherein at least one of the first strand linker oligonucleotides comprises two cleavage sites flanking the stapler sequence, wherein cleaving at the cleavage sites releases the stapler sequence.
  • Embodiment 8 The method of embodiment 5, wherein the stapler sequences of the at least two first strand linker oligonucleotides are hybridized to different regions of a shared scaffold, thereby linking the at least two first strand linker oligonucleotides.
  • Embodiment 9 The method of embodiment 5, wherein the at least two first strand linker oligonucleotides bind to the DNA template before they bind to each other via the respective stapler sequences.
  • Embodiment 10 The method of embodiment 9, wherein the method comprises:
  • each of the partially double stranded first strand linker oligonucleotide comprises i) a double-stranded region consisting of the blocker oligonucleotide and the stapler sequence, thereby preventing stapler sequences of the first strand linker oligonucleotides from hybridizing to each other, and ii) a single-stranded region comprising the sequence that is a template-hybridizing sequence,
  • removing the blocker oligonucleotide by one or more of the following: raising temperature of the reaction such that the blocker oligonucleotide is dissociated from the DNA template, enzymatic or chemical degradation of the blocker oligonucleotide,
  • Embodiment 11 The method of embodiment 5, wherein the stapler sequences of the at least two first strand linker oligonucleotides bind to each other to form linked first strand linker oligonucleotides before binding of the primer sequences to the DNA template.
  • Embodiment 12 The method of embodiment 11, wherein the method comprises using the linked first strand linker oligonucleotides at a concentration below a predetermined threshold such that two first strand linker oligonucleotides bind to a single DNA template molecule.
  • Embodiment 13 The method of embodiment 5, wherein the stapler sequence is a palindromic stapler sequence, and
  • the palindromic stapler sequence is 5’ to the template-hybridizing sequence.
  • Embodiment 14 The method of embodiment 5, wherein the at least two first strand linker oligonucleotides comprise two complementary, non-palindromic stapler sequences, one on each first strand linker oligonucleotide; and
  • Embodiment 15 The method of embodiment 5, wherein the at least two first strand linker oligonucleotides each comprises a non-palindromic stapler sequence and a palindromic stapler sequence.
  • Embodiment 16 The method of embodiment 15, wherein for each of the at least two first strand linker oligonucleotides, the palindromic stapler sequence is interposed between the non-palindromic stapler sequence and the template-hybridizing sequence.
  • Embodiment 17 The method of embodiment 5, wherein the at least two first strand linker oligonucleotides each comprises a stapler sequence interposed between two primer sequences, wherein the stapler sequences on the at least two first strand linker oligonucleotides are hybridized to each other.
  • Embodiment 18 The method of embodiment 5, wherein the stapler sequence has a length that ranges between 8 and 50 nucleotides.
  • Embodiment 19 The method of embodiment 5, wherein the template-hybridizing sequence has a length that ranges from 15 to 70 nucleotides.
  • Embodiment 20 A method of preparing a DNA template for nucleic acid analysis comprising hybridizing a plurality of first strand linker oligonucleotides to the DNA template in a reaction mixture,
  • the DNA template is a single-stranded concatemer comprising a plurality of monomers, wherein each monomer comprises an adaptor sequence and a DNA target sequence,
  • each first strand linker oligonucleotide comprises a primer sequence that is complementary to and hybridizes to an adaptor sequence of the DNA template
  • At least one first strand linker oligonucleotide comprises a blocking group at 3’ of the primer sequence to prevent extension
  • Embodiment 21 The method of embodiment 20, wherein the blocking group is a reversible blocking group, wherein the method further comprises
  • Embodiment 22 The method of embodiment 5, wherein the first strand linker oligonucleotides can be cleaved at a site that is in the stapler sequence or in the template-hybridizing sequence.
  • Embodiment 23 The method of embodiment 20, wherein the method further comprises removing unbound first strand linker oligonucleotides from the reaction mixture from the DNA template.
  • Embodiment 24 The method of embodiment 1, wherein the method further comprises
  • the at least two fully hybridized second strands include an upstream second strand and a downstream second strand
  • Embodiment 25 A DNA complex comprising a DNA template and a plurality of first strand linker oligonucleotides, wherein the DNA template is a single-stranded concatemer comprising a plurality of monomers, wherein each monomer comprises an adaptor sequence and a DNA target sequence,
  • each first strand linker oligonucleotide comprises a template-hybridizing sequence
  • template-hybridizing sequence is complementary to and hybridizes to an adaptor sequence of the DNA template
  • Embodiment 26 The DNA complex of embodiment 25, wherein the template-hybridizing sequence is a primer sequence, and wherein the primer sequence includes an extendible 3’ end.
  • Embodiment 27 A DNA complex comprising a DNA template, two or more second strands, wherein each second strand comprises an overhang region and a hybridized region that is hybridized to the DNA template,
  • Embodiment 28 The DNA complex of embodiment 27, wherein the DNA complex further comprises two or more second strand linker oligonucleotides that are hybridized to two or more second strands,
  • each second strand linker oligonucleotide comprising a second stapler sequence and at least two second strand linker oligonucleotides are linked through hybridization of the respective second stapler sequences.
  • Embodiment 29 A DNA array comprising the DNA complex of any of embodiments 25-28.
  • Embodiment 30 Two linker oligonucleotides, each linker oligonucleotide comprising a stapler sequence and a primer sequence, and the stapler sequence is 5’ to the primer sequence,
  • Embodiment 31 The two linker oligonucleotides of embodiment 30, wherein the stapler sequences are palindromic sequences.
  • Embodiment 32 The two linker oligonucleotides of embodiment 30, wherein the primer sequences on both linker oligonucleotides have the same sequence.
  • Embodiment 33 The two linker oligonucleotides of embodiment 30, wherein each oligonucleotide comprises an additional stapler sequence that is a non-palindromic stapler sequence, and wherein the additional non-palindromic stapler sequence is 5’ to the stapler sequence.
  • Embodiment 34 The two linker oligonucleotides of embodiment 33, wherein the additional non-palindromic stapler sequence in one of the two linker oligonucleotides is hybridized to a stapler sequence in a third linker oligonucleotide.
  • Embodiment 35 The two linker oligonucleotides of embodiment 30, at least one of which comprises a sequence that is selected from the group consisting of SEQ ID NO: 1-10.
  • Embodiment 36 A method of preparing a DNA template for nucleic acid analysis comprising immobilizing the DNA template on an array, wherein the DNA template is a DNA concatemer comprising a plurality of monomers, wherein each monomer comprises an adaptor sequence and a DNA target sequence,
  • each first strand linker oligonucleotide comprises a template-hybridizing sequence
  • the template-hybridizing sequence is complementary to and hybridizes to an adaptor sequence of the DNA template, and wherein at least two first strand linker oligonucleotides of the plurality of first strand linker oligonucleotides that are hybridized to the DNA template are linked to each other.
  • Embodiment 37 The method of embodiment 36, wherein the template-hybridizing sequence is a primer sequence, and wherein the primer sequence includes an extendible 3’ end.

Abstract

Provided are methods and compositions relating to stabilizing DNBs for sequencing, especially minimizing loss of DNB segments.

Description

DNA Linker Oligonucleotides
CROSS REFERENCE TO RELATED APPLICATION
This application claims priority to and is entitled to the benefit of U.S. Provisional Application No. 62/927,060, filed on October 28, 2019, which is herein incorporated by reference in its entirety for all purposes.
SEQUENCE LISTING
This application contains a Sequence Listing which has been submitted electronically in ASCII format and is hereby incorporated by reference in its entirety. Said ASCII copy, created on October 16, 2020, is named 092171-1215354_(5078-WOCN)_SL.txt and is 2,923 bytes in size.
FIELD OF THE INVENTION
This invention relates to the fields of DNA sequencing, genomics, and molecular biology.
BACKGROUND
DNBs (DNA nanoballs) are useful for a number of applications, including DNA sequencing. Chemical cross-linking has been used to stabilize DNBs in these applications. However, current approaches can lower the overall intensity of signal and inhibit second strand production. Improvements in methods of stabilizing DNBs are of value.
BRIEF SUMMARY OF INVENTION
In some embodiments, this disclosure provides a method of preparing a stabilized DNA template for nucleic acid analysis, comprising hybridizing a plurality of first strand linker oligonucleotides to the DNA template, wherein the DNA template is a single-stranded concatemer comprising a plurality of monomers, wherein each monomer comprises an adaptor sequence and a DNA target sequence, wherein each first strand linker oligonucleotide comprises a sequence that is complementary to and hybridizes to an adaptor sequence of the DNA template, and wherein at least two first strand linker oligonucleotides of the plurality of first strand linker oligonucleotides that are hybridized to the DNA  template are linked to each other. In some embodiments, the sequence that is complementary to and hybridizes to the adaptor sequence of the DNA template may be a primer sequence that comprises an extendible 3’ end.
In some embodiments, the method further comprises extending the at least two first strand linker oligonucleotides to generate at least two second strands by one or more DNA polymerases, wherein the at least two first strand linker oligonucleotides are linked to each other and are hybridized to the DNA template, thereby producing at least two second strands, the 5’ ends of which are linked.
In some embodiments, the at least two first strand linker oligonucleotides are linked through DNA hybridization, a covalent bond, or both. In some embodiments, the at least two first strand linker oligonucleotides each comprises a stapler sequence, wherein the at least two first strand linker oligonucleotides are linked by hybridization of the respective stapler sequences.
In some embodiments, the linker oligonucleotide comprises a cleavage site, wherein cleaving the linker oligonucleotide releases the stapler sequence.
In some embodiments, the two stapler sequences are hybridized to different regions of a shared scaffold, thereby linking the at least two first strand linker oligonucleotides.
In some embodiments, the stapler sequences of the at least two first strand linker oligonucleotides bind to each other after binding of the primer sequences to the DNA template.
In some embodiments, the method comprises: 1) hybridizing blocker oligonucleotides to the stapler sequences of at least two first strand linker oligonucleotides in a reaction, to produce partially double stranded first strand linker oligonucleotides, each of which comprises a double-stranded region consisting of the blocker oligonucleotide and the stapler sequence, thereby preventing stapler sequences of the first strand linker oligonucleotides from hybridizing to each other, 2) adding DNA template to the reaction, with the first strand linker oligonucleotides, wherein the primer sequences of the partially double stranded first strand linker oligonucleotides bind to the DNA template, 3) disassociating the blocker oligonucleotides from the DNA template, 4) washing to remove the blocker oligonucleotide, and 5) lowering the temperature to allow the hybridization of the stapler sequences of the first strand linker oligonucleotides to each other. In some embodiments, the disassociation is achieved by one or more of raising the temperature of the reaction such that the blocker oligonucleotide is dissociated from the DNA template, enzymatic degradation of the blocker oligonucleotide, and chemical degradation of the blocker oligonucleotide.
In some embodiments, the stapler sequences of the at least two first strand linker oligonucleotides bind to each other to form linked first strand linker oligonucleotides before binding of the primer sequences to the DNA template.
In some embodiments, the method comprises using the linked first strand linker oligonucleotides at a concentration below a predetermined threshold such that two first strand linker oligonucleotides in the linked pair bind to the same DNB.
In some embodiments, the stapler sequence is a palindromic stapler sequence. In some embodiments the palindromic stapler sequence is 5’ to the primer sequence on each of the at least two first strand linker oligonucleotides.
In some embodiments, the at least two first strand linker oligonucleotides comprise two complementary, non-palindromic stapler sequences, one on each first strand linker oligonucleotide; and wherein at least two first strand linker oligonucleotides are linked through hybridization of the two complementary, non-palindromic stapler sequences.
In some embodiments, the at least two first strand linker oligonucleotides each comprises a non-palindromic stapler sequence and a palindromic stapler sequence. In some embodiments, the palindromic linker is interposed between the non-palindromic stapler sequence and the primer sequence on each of the at least two first strand linker oligonucleotides.
In some embodiments, the at least two first strand linker oligonucleotides each comprise a stapler sequence interposed between two primer sequences, wherein the stapler sequences on the at least two first strand linker oligonucleotides are hybridized to each other. In some embodiments, the length of the stapler sequences is in the range from 8 to 50 nucleotides. In some embodiments, the length of the primer sequence is from 15 to 70 nucleotides.
In some embodiments, provided herein is a method of preparing a stabilized DNA template for nucleic acid analysis comprising hybridizing a plurality of first strand linker oligonucleotides to the DNA template, wherein the DNA template is a single-stranded concatemer comprising a plurality of monomers, wherein each monomer comprises an adaptor sequence and a DNA target sequence, wherein each first strand linker oligonucleotide comprises a primer sequence that is complementary to and hybridizes to an adaptor sequence of the DNA template, wherein at least one first strand linker oligonucleotide comprises a blocking group at the 3’ end of the primer sequence to prevent extension, and wherein at least two first strand linker oligonucleotides of the plurality of first strand linker oligonucleotides that are hybridized to the DNA template are linked to each other.
In some embodiments, the blocking group is a reversible blocking group, wherein the method further comprises removing the blocking group from the at least one first strand linker oligonucleotide, and extending the at least one first strand linker oligonucleotide to generate at least one second strand.
In some embodiments, the first strand linker oligonucleotides can be cleaved at a site that is in the stapler sequence or in the primer sequence.
In some embodiments, the method further comprises removing unbound first strand linker oligonucleotides after the hybridizing step and/or heating a reaction mixture comprising the DNA template and first strand linker oligonucleotides, e.g., to 50-65 ℃.
In some embodiments, the method further comprises: 1) extending the at least two first strand linker oligonucleotides of the plurality of first strand linker oligonucleotides by a non-displacement DNA polymerase to generate at least two, partially extended second strands, including a partially extended upstream second strand and a partially extended downstream second strand, wherein the at least two partially extended second strands are fully hybridized to the DNA template, and 2) extending the fully hybridized upstream and downstream second strands with a strand displacement DNA polymerase, wherein extending the upstream second strand partially displaces the downstream second strand, and thereby producing a partially hybridized downstream second strand.
Also disclosed herein is a DNA complex comprising a DNA template and a plurality of first strand linker oligonucleotides, wherein the DNA template is a single-stranded concatemer comprising a plurality of monomers, wherein each monomer comprises an adaptor sequence and a DNA target sequence, wherein each first strand linker oligonucleotide comprises a primer sequence that is complementary to and hybridizes to an adaptor sequence of the DNA template, wherein the primer sequence includes an extendible 3’ end, and wherein at least two first strand linker oligonucleotides of the plurality of first strand linker oligonucleotides that are hybridized to the DNA template are linked to each other.
Also provided herein is a DNA complex comprising a DNA template, two or more second strands, wherein each second strand comprises an overhang region and a hybridized region that is hybridized to the DNA template, wherein the two or more second strands are complementary to the DNA template, and wherein at least two second strands are linked at their respective 5’ end.
In some embodiments, the DNA complex further comprises two or more second strand linker oligonucleotides that are hybridized to two or more second strands, each second strand linker oligonucleotide comprising a second stapler sequence and at least two second strand linker oligonucleotides are linked through the hybridization of the respective second stapler sequences.
Also provided herein is a DNA array comprising a plurality of any of the DNA complexes disclosed herein.
Also provided herein are linker oligonucleotides, each linker oligonucleotide comprising a stapler sequence and a primer sequence, and the stapler sequence is 5’ to the primer sequence, wherein the stapler sequences on the two linker oligonucleotides are complementary to each other, and wherein the two linker oligonucleotides are hybridized to each other via respective stapler sequences.
In some embodiments, the stapler sequences of the two linker oligonucleotides are palindromic sequences. In some embodiments, the primer sequences of both linker oligonucleotides are the same. In some embodiments, each linker oligonucleotide comprises an additional, non-palindromic stapler sequence, and the additional non-palindromic stapler sequence is located 5’ to the stapler sequence. In some embodiments, the additional non-palindromic stapler sequence in one of the two linker oligonucleotides is hybridized to a stapler sequence in a third linker oligonucleotide. In some embodiments, the at least one linker oligonucleotide has a sequence that is selected from the group consisting of SEQ ID NO: 1-10.
In some embodiments, the disclosure provides a method of preparing a DNA template for nucleic acid analysis comprising immobilizing the DNA template on an array, wherein the DNA template is a DNA concatemer comprising a plurality of monomers, and each monomer comprises an adaptor sequence and a DNA target sequence. A plurality of first strand linker oligonucleotides are hybridized to the DNA template, and each first strand linker oligonucleotide comprises a sequence that is complementary to and hybridizes to an adaptor sequence of the DNA template. At least two first strand linker oligonucleotides of the plurality of first strand linker oligonucleotides that are hybridized to the DNA template are linked to each other.
DESCRIPTION OF DRAWINGS
FIG. 1A and 1B illustrate DNB subunits using “Z-linkers” staplers as primers.
FIG. 2 illustrates DNB subunit using “X-linkers” and possible linker structures.
FIG. 3 illustrates the effect of addition of an X-linker on signal intensity over multiple sequencing cycles.
FIG. 4A and 4B illustrate the effect of an X-linker on mapping rate and error rates.
FIG. 5A and 5B illustrate the effect of a Z-linker on mapping rate and error rate between the first 50 bases and second 50 bases of a read.
FIG. 6A-6E show various linker configurations formed by linker oligonucleotides disclosed herein through hybridization. B and b are stapler sequences that are complementary to each other. “bB” represents a palindromic stapler sequence. A represents a primer sequence, which can hybridize to the DNA template.
FIG. 7A-7C illustrate additional linking configurations. A represents a primer sequence. B, C, D, and E represent stapler sequences, which may have the same or different sequence. FIG. 7A represents a linear scaffold that links multiple linker oligonucleotides. FIG. 7B shows a circular scaffold that links multiple linker oligonucleotides. FIG. 7C shows an embodiment where two linker oligonucleotides are linked via a chemical bond.
DETAILED DESCRIPTION OF THE INVENTION
1. Overview
The linker oligonucleotides described in this disclosure can be used to hybridize to long nucleic acid molecules in a predictable fashion to crosslink different regions of the long nucleic acid molecules. In some cases, crosslinking brings spatially distant regions in the nucleic acids into pre-defined shapes and structures. In some cases, the nucleic acid is a DNA concatemer comprising a plurality of monomers, and the linker oligonucleotides each comprise a primer sequence that binds to a different monomer of the concatemer. Linking primers eliminates the need to incorporate binding sites for non-primer linking oligonucleotides in the adaptor, which allows the adaptors to be relatively short. Shorter adaptors have multiple advantages, such as providing DNBs with a higher number of copies of adaptors for a given DNB size. Linked primers also allow most denatured primers to rehybridize without being washed away. At least two of the linker oligonucleotides are connected such that different monomers of the concatemer are also connected. In addition to allowing primers to rehybridize, the cage formed by the linked oligonucleotides most likely prevents mechanical breaking and removal of DNB segments, although not the loss of smaller DNB segments due to other types of DNA cuts. The length of the linker, in both its dsDNA part and its ssDNA part, may be varied to maximize signal preservation benefits under different sequencing conditions (e.g., cleavage chemistry, reaction temperature, time and pH, polymerase properties, and others).
Linking monomers in a DNA concatemer can stabilize its structure. Sequencing reactions may result in cuts to the DNA concatemers and loss of segments thereof. As more sequencing cycles occur, more cuts to the DNA concatemer will accumulate and more loss of the DNA segments will occur. By linking two or more monomers, the probability of two nicks/cuts to the DNA concatemer resulting in DNA loss will be reduced, therefore minimizing DNB mass loss. Furthermore, DNBs with linked primers may have fewer mechanical cuts due to the imposed structure. With linked oligonucleotides, typically 4 or more specific cuts are needed to lose a segment of a DNB. In general, the more linking is provided, the more cuts are needed to lose a segment of the DNB. For maximal preservation of DNB mass, linked primers are combined with reduced DNA nicking/cutting by the reagents used (adjusting temperature, time, and concentration; avoiding impurities; using especially pure enzymes without endonuclease activity; keeping microbial contamination close to zero), ensuring fewer than 4 cuts per 10, 29, 30, 50, 100 or more sequencing cycles. Further, linking two or more subunits also reduces the volume, allowing a high number of DNA concatemers to be deposited on the substrate. Lastly, linking may also offer other advantages such as protecting the DNBs from degradation.
Unlike chemical crosslinking, which often inhibits the making of the reverse complement strand of the DNA template and inhibits sequencing by primer extension, linker oligonucleotides as a means of crosslinking do not inhibit production of the reverse complement or primer extension. On the contrary, linker oligonucleotides comprise one or more primer sequences, which can be extended to form reverse complement strands. In addition, the connection of the linker oligonucleotides results in second strands that are also linked at the 5’ ends by virtue of primer sequences becoming the 5’ terminal sequences of the second strand. The linking of the second strands further stabilizes the DNA template (e.g., a single-stranded DNA concatemer).
The compositions and methods described herein thus can be used to stabilize the DNA template and minimize structural loss of DNB mass. Sequencing technologies using these compositions and methods extend the read length (number of sequencing cycles), reduce error rates and increase mapping rates during sequencing.
2. DNA Template Polynucleotides: Concatemers and DNBs
In some embodiments, a DNA template used in the invention is a DNA concatemer. As used in this context, the term “concatemer” refers to a continuous DNA molecule that contains multiple copies of the same DNA sequences (the “monomer” or “subunit” linked in series). A “DNA concatemer” may comprise at least two, at least three, at least four, at least 10, at least 25, at least 50, at least 200, or at least 500 monomers. In some embodiments, the DNA concatemer comprises 25-1000 monomers, such as 50-800 monomers or 300-600 monomers. A DNA concatemer used in the methods of the invention may be a DNA nanoball, or “DNB.” Without intending to limit the present invention in any fashion, DNA nanoballs are described in Drmanac et al. “Methods and compositions for long fragment read sequencing” U.S. Patent No. 8,592,150 (Nov. 26, 2013), the entire content of which is incorporated herein by reference. “DNA nanoballs” or “DNBs” are single-stranded DNA concatemers of sufficient length to form random coils that fill a roughly spherical volume in solution (e.g., SSC buffer at room temperature). In some embodiments, DNA nanoballs have a diameter of from about 100 to 300 nm. Typically, each monomer comprises at least one target DNA sequence.
DNA nanoballs are single-stranded copies of a DNA sequence that are concatemerized into a linear DNA structure. Typically, DNBs are produced by copying a single-stranded circular DNA using a strand displacement polymerase, such as phi29 polymerase or Bst polymerase, in a process called rolling circle replication. The polymerase begins extension of a primer that is hybridized to the single-stranded circle and creates a reverse complementary strand that is hybridized to the circle. Once one complete extension around the circle is made, the polymerase continues extension by displacing the newly made strand ahead of the direction of travel. As the polymerase continues to extend the strand around the circle, multiple, reverse complement copies are created, joined in a linear fashion to each other. This strategy creates a target with many probe or primer binding sites.
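The copying process described above can be pictured with a toy string model. The sketch below is illustrative only: the circle sequence, the function names, and the assumption that the primer starts exactly at the chosen start point of the circle are all hypothetical, and real rolling circle replication is not limited to whole laps.

```python
# Toy model of rolling circle replication (illustrative only).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def rolling_circle_product(circle, laps):
    """Return the concatemer made by copying a circular template `laps` times.

    `circle` is the circular template written 5' to 3' from an arbitrary
    start point. Assumption: synthesis begins at that start point, so each
    lap appends one full reverse-complement copy of the circle.
    """
    return reverse_complement(circle) * laps


circle = "ACGTTGCAAGGC"  # hypothetical 12-nt circle, not from the disclosure
concatemer = rolling_circle_product(circle, laps=3)
print(len(concatemer))  # three 12-nt monomers joined in series
```

Each monomer of the product is the reverse complement of the circle, so any site in the circle appears once per lap in the concatemer, which is why the product offers many probe or primer binding sites.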
DNA concatemers (including DNA nanoballs) can be produced by any suitable method. In one approach, a single genomic fragment is used to generate a single-stranded circular DNA with adaptors interspersed between target sequences that are contiguous or close together in the genome.
In one embodiment, a monomer of the concatemer comprises one adaptor sequence and one target DNA sequence. Because monomers are linked in series, target DNA sequences will be flanked by two adaptor sequences. In some approaches, the target DNA sequence in the monomer is flanked by two “half-adaptor” sequences, such that each target sequence linked in series in the concatemer is flanked by two adaptors. In some approaches, the monomeric unit comprises one, two, three, or four, or more adaptors. In some embodiments, all of the adaptors of a monomer (and concatemer) have the same sequence. In other embodiments, adaptors may have different sequences, such as two, three or four different sequences. It will be recognized that individual monomers may comprise more than one DNA template sequence. For example, a monomer may comprise the structure A1-T1-A2-T2 where T1 and T2 are DNA templates with the same or different sequences, and A1 and A2 are adaptors with the same or different sequence. Various configurations of the DNA concatemer that can be used are disclosed in U.S. Pat. No. 10,227,647, the relevant disclosure of which is incorporated by reference in its entirety. The corresponding concatemer will have the structure A1-T1-A2-T2-A1-T1-A2-T2-A1-T1-A2-T2 .... In a related embodiment, the monomer may comprise the structure A1-T1-A2-T2-A3 where T1 and T2 are DNA templates with the same or different sequences, A2 is an adaptor and A1 and A3 are “half adaptors.” The corresponding concatemer will include the structure A2-T2-A3A1-T1-A2-T2-A3A1-T1-A2-T2-A3A1 ... where the A3A1 half adaptors together function as an adaptor. For illustration and not limitation, TABLE 1 illustrates exemplary concatemer structures. In TABLE 1, N is greater than 1. Usually N is at least 3 (at least 3 monomers), often at least 4, at least 10, at least 25, at least 50, at least 200, or at least 500.
In some embodiments, N is in the range of 25-1000, such as 50-800, or 300-600. In cases in which the DNA template polynucleotide is a DNA nanoball, N is at least 25, usually at least 50, and often in the range 50-800, or 300-600.
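The monomer notation used above (A1-T1-A2-T2 and the half-adaptor variant A1-T1-A2-T2-A3) can be expanded mechanically. The following minimal sketch treats each element as a label; the function names and the choice of N = 3 are assumptions made for illustration.

```python
def build_concatemer(monomer, n):
    """Repeat the monomer's element labels n times, in series."""
    return list(monomer) * n


def count_junction_adaptors(elements, trailing_half="A3", leading_half="A1"):
    """Count junctions where a trailing half adaptor meets a leading half adaptor."""
    pairs = zip(elements, elements[1:])
    return sum(1 for a, b in pairs if a == trailing_half and b == leading_half)


# Monomer with two full adaptors and two targets.
simple = build_concatemer(["A1", "T1", "A2", "T2"], 3)
print("-".join(simple))

# Monomer flanked by half adaptors A1 and A3; adjacent A3/A1 halves
# together act as one adaptor at each monomer junction.
half = build_concatemer(["A1", "T1", "A2", "T2", "A3"], 3)
print(count_junction_adaptors(half))
```

With N monomers in series, the half-adaptor layout yields N-1 internal A3A1 junctions in addition to the N copies of A2.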
TABLE 1
Exemplary Concatemer Structures
Figure PCTCN2020124338-appb-000001
3. Target DNA Sequence
As described above, the DNA template (e.g., a concatemer) may comprise a target DNA. The target DNA may be from any source, including naturally occurring sequences (such as genomic DNA, cDNA, mitochondrial DNA, cell free DNA, etc.), artificial sequences (e.g., synthetic sequences, products of gene shuffling or molecular evolution, etc.) or combinations thereof. Target DNA may be derived from sources such as an organism or cell (e.g., from plants, animals, viruses, bacteria, fungi, humans, mammals, insects), forensic sources, etc. Target DNA sequences may be from a population of organisms, such as a population of gut bacteria. A target DNA sequence may be obtained directly from a sample, or may be a product of an amplification reaction, a fragmentation reaction, and the like.
A target DNA may have a length within a particular size range, such as 50 to 600 nucleotides in length. Other exemplary size ranges include 25 to 2000, 50 to 1000, 100 to 600, 50-100, 50-300, 100-300, and 100-400 nucleotides in length. In a DNA template polynucleotide having two or more different target DNAs, the target DNAs may be the same length or different lengths. In a library of a DNA template polynucleotide, the members of the library may have, in some embodiments, similar lengths (e.g., all in the range of 25 to 2000 nucleotides, or another range) .
In one approach, target DNAs may be prepared by fragmenting a larger source DNA (e.g., genomic DNA) to produce fragments in a desired size range. In some approaches, a size-selection step is used to obtain a pool of fragments within a particular size range.
4. Adaptors
A DNA template, or DNA template polynucleotide, as used in the methods disclosed herein, includes two or more adaptor sequences. As used herein, an adaptor sequence refers to the nucleic acid sequence of an adaptor. Adaptors may comprise elements for immobilizing DNA template polynucleotides on a substrate, elements for binding oligonucleotides used in sequence determination (e.g., binding sites for primers extended in sequencing by synthesis methods and/or probes for cPAL or other ligation based sequencing methods, and the like), or both elements for immobilization and sequencing. Adaptors may include additional features such as, without limitation, restriction endonuclease recognition sites, extension primer hybridization sites (for use in analysis), bar code sequences, unique molecular identifier sequences, and polymerase recognition sequences.
Adaptor sequences may have a length, structure, and other properties appropriate for a particular sequencing platform and intended use. For example, adaptors may be single-stranded,  double-stranded, or partially-double stranded, and may be of a length suitable for the intended use. For example, adaptors may have length in the range of 10-200 nucleotides, 20-100 nucleotides, 40-100 nucleotides, or 50-80 nucleotides. In some embodiments, an adaptor may comprise one or more modified nucleotides that contain modifications to the base, sugar, and/or phosphate moieties.
It will be appreciated by the skilled reader that different members of a library will typically contain common adaptor sequences, although different species or subgenera in the library may have unique features such as sub-genera-specific bar codes.
An individual adaptor sequence may include multiple functionally distinct subsequences. For example, as discussed in detail in this disclosure, a single adaptor sequence may contain two or more primer stapler sequences (which can be recognized by different complementary primers or probes). Functionally distinct sequences within an adaptor may be overlapping or non-overlapping. For illustration, given a 40-base long adaptor, in one embodiment, bases 1-20 are a first primer binding site and bases 21-40 are a second primer binding site. In a different embodiment, bases 1-15 are a first primer binding site and bases 21-40 are a second primer binding site. In a different embodiment, bases 5-25 are a first primer binding site and bases 15-35 are a second primer binding site. Likewise, given a 40-base long adaptor, bases 1-20 can be an immobilization sequence and bases 21-40 can be a primer binding site. Different primer stapler sequences in an adaptor (or in different adaptors of a DNA template polynucleotide) may have the same or different lengths.
Adaptors (e.g., first adaptors, second adaptors, third adaptors, etc.) may comprise one, two or more than two primer stapler sequences. A primer stapler sequence is defined functionally as the site or sequence to which a primer (or oligonucleotide) specifically binds. For example, an adaptor with two primer stapler sequences may be specifically bound by two different primers. In one approach the two primer stapler sequences in the same adaptor are overlapping, i.e., sharing part of the nucleotide sequence. In some embodiments, the overlapped region is no more than 50%, or 40%, or 30%, or 20%, or 10%, or 5% of either of the two overlapping primer stapler sequences. In one approach the more than one primer stapler sequences are non-overlapping. In some embodiments, the non-overlapping primer stapler sequences are immediately adjacent to each other; in some other embodiments, the non-overlapping primer stapler sequences are separated by 1-10, 10-20, 30-40, or 40-50 nucleotides.
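The overlap and spacing relationships between two binding sites on an adaptor reduce to simple interval arithmetic. The sketch below assumes 1-based, inclusive base coordinates (so bases 5-25 span 21 bases); the helper names are hypothetical and the coordinate pairs are the illustrative ones from the 40-base adaptor example.

```python
def site_length(site):
    """Number of bases in a site given as (start, end), 1-based inclusive."""
    start, end = site
    return end - start + 1


def overlap_length(a, b):
    """Number of bases shared by two sites (0 if they do not overlap)."""
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    return max(0, end - start + 1)


def overlap_fractions(a, b):
    """Shared bases as a fraction of each site's own length."""
    shared = overlap_length(a, b)
    return shared / site_length(a), shared / site_length(b)


def gap_between(a, b):
    """Number of bases separating two sites (0 if adjacent or overlapping)."""
    first, second = sorted([a, b])
    return max(0, second[0] - first[1] - 1)


print(gap_between((1, 20), (21, 40)))        # immediately adjacent sites
print(gap_between((1, 15), (21, 40)))        # sites separated by a spacer
print(overlap_length((5, 25), (15, 35)))     # overlapping sites
print(overlap_fractions((5, 25), (15, 35)))
```

Under this inclusive counting, the 5-25 and 15-35 sites share 11 bases, which is about 52% of each 21-base site; a different coordinate convention would shift these numbers slightly.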
It will be apparent that within a given DNA template polynucleotide, different adaptors may have the same sequence or different sequences, and may have the same primer stapler sequences, or different primer stapler sequences. See, e.g., Sec. 7 below. Although certain drawings are provided to illustrate the invention, representations of adaptors using similar cross-hatching and the like should not be construed as indicating identity of sequences.
5. Linker Oligonucleotides
A “linker oligonucleotide” is an oligonucleotide that can be linked to another oligonucleotide via covalent or non-covalent means. In some embodiments, a linker oligonucleotide is a “first strand linker oligonucleotide,” which hybridizes to the first strand. In some embodiments, a linker oligonucleotide is a “second strand linker oligonucleotide,” which hybridizes to the second strand, as further described below.
A linker oligonucleotide may comprise a primer sequence having an extendible 3’ end, said primer sequence being complementary to an adaptor sequence of the DNA template and being able to hybridize to the DNA template. At least two linker oligonucleotides can be linked. The term “link” refers to both non-covalent and covalent interactions through which the two nucleic acid molecules are brought together. In some cases the linkage is through DNA hybridization, a chemical bond, or both. These linker oligonucleotides can link contiguous or non-contiguous monomers of a DNA concatemer, thereby stabilizing the DNA concatemer. Exemplary linker oligonucleotides are shown in FIG. 6A-6E and also in Table 2 below, in 5’ to 3’ direction. b and B represent stapler sequences and A represents a primer sequence.
Figure PCTCN2020124338-appb-000002
FIG. 6A-6E illustrate various linker configurations the above linker oligonucleotides can form through hybridization. B and b are stapler sequences that are complementary to each other. “bB” represents a palindromic stapler sequence. “A” represents a primer sequence, which can hybridize to the DNA template. D and d are complementary to each other. Each of the components is further described in detail below.
FIG. 7A-7C illustrate additional embodiments of the linking configuration. A represents a primer sequence and B, C, D, and E represent stapler sequences. B, C, D, and E may have the same or different sequences. FIG. 7A shows a linear scaffold that links multiple linker oligonucleotides. FIG. 7B shows a circular scaffold that links multiple linker oligonucleotides. FIG. 7C shows an embodiment where two linker oligonucleotides are linked via a chemical bond (i.e., a covalent bond).
A. Primer sequence
The linker oligonucleotides disclosed herein comprise at least one primer sequence and at least one stapler sequence. In some embodiments, at least one primer sequence is located 3’ to the stapler sequence. In some embodiments, the linker oligonucleotide hybridizes to a single-stranded DNA and the primer sequence of the linker oligonucleotide comprises an extendible 3’ end, from which a DNA strand that is reverse complementary to the single-stranded DNA is made. As used herein, a DNA strand generated by extending a first strand linker oligonucleotide is referred to as a second strand; and the DNA strand generated by extending a second strand linker oligonucleotide is referred to as a third strand.
The number of primer sequences per each linker oligonucleotide may vary. In some embodiments, the linker oligonucleotide comprises one primer sequence. In some embodiments, the oligonucleotide comprises two primer sequences.
The primer sequence of the linker oligonucleotide will be of sufficient length to allow hybridization of a primer, with the precise length and sequence dependent on the intended functions of the primer (e.g., extension primer, ligation substrate, indexing sequence, etc. ) . The primer sequence that binds to the DNB would generally be stable under all conditions of temperature, salt and pH that are used throughout the sequencing run. Dissociation and re-association of an individual oligo or region thereof is generally not desirable because of the possibility of creating out of phase reads or loss of the extending primer. For example, the length and Tm of the primer sequence of the oligonucleotide should be sufficient such that the linker oligonucleotide remains hybridized under temperature, salt and pH conditions used throughout the assay process. Primer sequences are often at least 10, at least 12, at least 15 or at least 18 bases in length. In some embodiments, the primer sequence has a length that ranges from 8 to 60 nucleotides, e.g., from 10 to 25 nucleotides, or from 40 to 60 nucleotides long.
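As a rough way to reason about whether a primer sequence is long enough to stay hybridized, a melting temperature can be estimated from base composition. The sketch below uses the Wallace rule (2 °C per A/T, 4 °C per G/C), a crude rule of thumb intended for short oligonucleotides that ignores salt, pH, and oligonucleotide concentration; it is not a method taken from this disclosure. The primer sequence, the run temperatures, and the 5 °C safety margin are hypothetical.

```python
def wallace_tm(seq):
    """Rough melting temperature in degrees C: 2*(A+T) + 4*(G+C)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc != len(seq):
        raise ValueError("sequence must contain only A, C, G, T")
    return 2 * at + 4 * gc


def stays_bound(seq, max_run_temp_c, margin_c=5):
    """True if the estimated Tm exceeds the hottest run step by a margin.

    The margin is an arbitrary illustrative value, not a recommendation.
    """
    return wallace_tm(seq) >= max_run_temp_c + margin_c


primer = "ACGTACGTACGTACGTAC"  # hypothetical 18-nt primer sequence
print(wallace_tm(primer))
print(stays_bound(primer, max_run_temp_c=45))
print(stays_bound(primer, max_run_temp_c=55))
```

For longer primer sequences, such as the 40 to 60 nucleotide range mentioned above, a nearest-neighbor thermodynamic model would be a more appropriate estimate than this rule of thumb.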
B. Connecting the linker oligonucleotides
The linker oligonucleotides are connected via various means. In some embodiments, they are connected via chemical linkage. In some cases, the chemical linkage used to connect the linker oligonucleotides described herein is non-specific and is formed by using chemicals such as nitrogen mustards or chloroethylnitrosourea (CENU) derivatives. In some embodiments, the chemical linkage used for the methods and compositions disclosed herein is targeted crosslinking of oligonucleotides, which could be achieved with, e.g., thionucleobases (Beilstein J. Org. Chem. 2014, 10, 2293–2306). In general, any modification that allows attachment of oligonucleotides to surfaces could be adapted to allow oligonucleotide-to-oligonucleotide attachment, and these modifications can be used to connect the linker oligonucleotides. For example, the NHS (N-hydroxysuccinimide) group can react with an amine group of a second molecule (which also allows for crosslinking to protein). Click chemistry, such as a conjugation between an azide-modified oligo and an alkyne-modified oligo, could also be used to conjugate two oligonucleotides together (Acc. Chem. Res. 2012, 45, 8, 1258-1267).
In some embodiments, the linker oligonucleotides are connected via DNA hybridization. In some embodiments, the linker oligonucleotides are connected via hybridization of a stapler sequence located on each of the linker oligonucleotides. For example, a stapler sequence ( “A” ) in linker oligonucleotide, seq 1, is complementary to and can hybridize to the stapler sequence ( “a” ) in the other linker oligonucleotide, seq 2. In some embodiments, the stapler sequence is a palindromic sequence. In some embodiments, the stapler sequence is a non-palindromic sequence.
A palindromic sequence refers to a sequence, one half of which is complementary to the other half of the sequence. For example, a linker oligonucleotide in Table 2 comprises a sequence “b” that is followed by a sequence “B”, wherein b is complementary to B. One exemplary palindromic sequence is GGAACCATGGTTCC (SEQ ID NO: 8). A linker oligonucleotide with a palindromic sequence could form a hairpin with internal complementarity (i.e., form an intramolecular hairpin) or could be complementary to the stapler sequence on another linker oligonucleotide that has an identical sequence (i.e., forming an intermolecular hybrid). When the palindromic sequence GGAACCATGGTTCC (SEQ ID NO: 8) of the first linker oligonucleotide hybridizes to the palindromic sequence on a second oligonucleotide of the same sequence, the linker oligonucleotide pair is connected via the palindromic sequences. See FIG. 2 (two linker oligonucleotides “oligo linker 1” hybridize to each other via the palindromic sequence “Bb” to form a first linker oligonucleotide pair, and two linker oligonucleotides “oligo linker 2” hybridize to each other via the palindromic sequence “Cc” to form a second linker oligonucleotide pair). Another illustrative example is shown in FIG. 6B, in which the sequence “bB” is self-complementary and allows hybridization to a second oligonucleotide.
In one illustrative example, a linker oligonucleotide having a sequence of 5’ GGAACCATGGTTCCAAGTCGGAGGCCAAGCGGTCTUAGGA-3’ (SEQ ID NO: 1) (belonging to the category of Linker oligonucleotide 3 or seq 3) comprises a palindromic sequence GGAACCATGGTTCC (SEQ ID NO: 8) .  SEQ ID NO: 1 further comprises a sequence of AAGTCGGAGGCCAAGCGGTCTUAGGA (SEQ ID NO: 10) , which could act as a primer sequence recognizing a portion of the DNB adapter.
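The composition of SEQ ID NO: 1 described above can be checked with a short script. This is a minimal sketch: the sequences are quoted from the text, while the helper names (`reverse_complement`, `is_palindromic`) are illustrative choices, and the dU residue is simply treated as pairing with A.

```python
# Sequences quoted from the text; helper names are illustrative only.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G", "U": "A"}

def reverse_complement(seq):
    """Return the DNA reverse complement (a U is read as pairing with A)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def is_palindromic(seq):
    """True when the sequence equals its own reverse complement."""
    return seq == reverse_complement(seq)

SEQ_ID_1 = "GGAACCATGGTTCCAAGTCGGAGGCCAAGCGGTCTUAGGA"
STAPLER = "GGAACCATGGTTCC"              # SEQ ID NO: 8
PRIMER = "AAGTCGGAGGCCAAGCGGTCTUAGGA"   # SEQ ID NO: 10

# The stapler is self-complementary, so two copies can anneal to each other.
print(is_palindromic(STAPLER))
# SEQ ID NO: 1 is the stapler followed directly by the primer sequence.
print(SEQ_ID_1 == STAPLER + PRIMER)
# The primer portion is not self-complementary.
print(is_palindromic(PRIMER))
```

Because the stapler equals its own reverse complement, the same 14 bases can pair either within one molecule (hairpin) or between two identical molecules, which is the behavior the text relies on.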
The palindromic sequence in seq 3 would be self-complementary under appropriate conditions and could form a hairpin with internal complementarity; alternatively the palindromic sequence could be complementary to a second oligonucleotide of the same sequence. Under higher temperatures, for example, 50 ℃ to 65 ℃, the internal hairpin structure is unstable and the longer intermolecular hybrid will remain hybridized. Thus raising temperature would favor the formation of intermolecular hybrid (desired for stabilizing DNA templates used in various sequencing applications) over internal hairpin. Accordingly, in some embodiments, linker oligonucleotides comprising palindromic stapler sequences are hybridized to a DNA template on a solid support. The hybridization is performed at the temperature of 10-30 ℃. The solid support is then washed to remove unbound primers and then temperature is raised to 50 to 65 ℃ to reduce the formation of the intramolecular hybrid.
The length of the stapler sequence may vary. The length of the stapler sequence is chosen such that the Tm of the stapler sequence is between 50 ℃ and 72 ℃. This ensures that the linker oligonucleotides can remain hybridized throughout the assay procedure. In some embodiments, the length of the stapler sequence may range from 20 to 150 nucleotides, e.g., from 40 to 120 nucleotides, from 50 to 100 nucleotides.
The relative position of the stapler sequence and the template hybridizing sequence (e.g., a primer sequence) in the linker oligonucleotide may vary. In some embodiments, the stapler sequence is 5’ relative to the primer sequence (e.g., Linker oligonucleotides 1-3 in Table 2). In some embodiments, the stapler sequence is interposed between the two template hybridizing sequences (e.g., the two primer sequences) in the linker oligonucleotide, and the stapler sequence is 3’ relative to a first primer sequence and 5’ relative to a second primer sequence. Illustrative examples are shown in Table 2 (e.g., linker oligonucleotides 4-6) and in FIG. 6C and 6D.
C. Alternative linking option --Scaffold
As used herein, a scaffold refers to a molecular structure through which individual oligonucleotides are associated with one another. In some embodiments, two or more linker oligonucleotides (e.g., three, four, or five linker oligonucleotides) are linked through hybridization to sequences in a scaffold, the sequences being complementary to the stapler sequences of the linker oligonucleotides. In some embodiments, the stapler sequences of these linker oligonucleotides linked  through the scaffold are different. In some embodiments, the stapler sequences are non-palindromic. In some embodiments, the stapler sequences are identical among the two or more linker oligonucleotides.
A scaffold used in the methods and compositions herein may exist in various forms. In some embodiments, the scaffold is a linear scaffold. In some embodiments, the scaffold is a circular scaffold. In some embodiments, the scaffold is a dendrimer scaffold. In some embodiments, after the linker oligonucleotides are hybridized to the DNA template (e.g., a DNA concatemer) via the primer sequence, the scaffold is added and the linker oligonucleotides are hybridized to the scaffold. FIG. 7A shows one illustrative example of a linear scaffold, in which linker oligonucleotides BA, CA, and DA are hybridized to a linear scaffold. FIG. 7B shows an illustrative example of a circular scaffold, in which linker oligonucleotides BA, CA, DA, and EA are hybridized to a circular scaffold. In both examples, A represents a primer sequence that is complementary to a DNA template, and B, C, D and E are stapler sequences. The scaffold is typically used at a relatively low concentration (e.g., a concentration that is lower than the concentration of the linker oligonucleotides in the reaction), and a longer hybridization time to the scaffold would provide more linked primers.
A scaffold disclosed herein can be made of any material or substrate (e.g., a protein or a nucleic acid). In some embodiments, the scaffold is a protein scaffold. In some embodiments, the scaffold is a nucleic acid scaffold. Nonlimiting examples of nucleic acid scaffolds include DNA, RNA, and peptide nucleic acid (PNA). The scaffold can attach, either covalently or non-covalently, to the stapler sequences of the linker oligonucleotide. In some embodiments, the scaffold is a nucleic acid that comprises two or more copies of a sequence complementary to the stapler sequence so that the linker oligonucleotide is anchored to the scaffold through hybridization. In some embodiments, a plurality of scaffold molecules are used, each linking multiple linker oligonucleotides.
A scaffold disclosed herein can be generated as linear repeats or as a circular structure containing multiple repeats. In some embodiments, the scaffold is a nucleic acid concatemer (e.g., a DNB). In some cases, it is desirable to control the hybridization rate of the scaffold to ensure that each linker oligonucleotide is not hybridized to its own independent scaffold molecule (“independent single hybridization events”) but to the same scaffold molecule that links other linker oligonucleotides (“bridging events”). Promoting bridging events rather than independent single hybridization events can be achieved by, e.g., keeping the concentration of the scaffold relatively low. In some embodiments, the molar ratio of the linker oligonucleotides to the scaffold may be in a range from 2 to 50, e.g., from 3 to 25, from 3 to 15, or from 4 to 10. Suitable concentrations of the scaffold for this purpose can be determined empirically. For example, having multiple (e.g., 3-4 or 4-6) different stapler sequences for the same scaffold increases the chance that distant, individual DNA templates (e.g., DNBs) can be linked.
D. Specific linked primer configurations without using scaffolds
In some embodiments, two Linker oligonucleotides are connected and form a linker. Linkers and linker oligonucleotides may take different forms and can be classified in different ways. For example, based on the number of subunits in the DNA concatemer that the linkers can bind, the linkers can be classified as 2-arm linkers or 4-arm linkers. Based on the relative sequence components in the linkers themselves, they can be classified as Z-linkers or X-linkers. Z-linkers can be 2-arm linkers or 4-arm linkers. X-linkers are generally 2-arm linkers.
2-arm linkers versus 4-arm linkers
In some embodiments, each linker oligonucleotide contains only one primer sequence and thus two oligonucleotides can link two subunits of the DNA concatemer (“2-arm linkers”). As an illustrative example, two linker oligonucleotides, seq 1 and seq 2, each contain a primer sequence (A) that serves to hybridize to one subunit of a DNB. Seq 1 also contains a stapler sequence (b) and seq 2 contains a stapler sequence (B), with (b) being complementary to (B). See FIG. 6. To form functional linkers, seq 1 and seq 2 are added to a reaction comprising the DNA template (e.g., a DNA concatemer).
In some embodiments, one single linker oligonucleotide, by virtue of having two primer sequences that are complementary to the adaptor sequence of the DNA concatemer, could connect 2 subunits of the concatemer. For example, seq 4 or seq 5 in FIG. 6 belong to this category of linker oligonucleotides.
In some embodiments, each of the linker oligonucleotides has two primer sequences, with a stapler sequence interposed in between. Each linker oligonucleotide can hybridize to two subunits of a DNB. This configuration allows the two connected linker oligonucleotides to bind to four separate sequences of the DNA template, and is thus referred to as a “4-arm linker”. In some embodiments, the stapler sequences of the two linker oligonucleotides of the 4-arm linker are not identical, such as seq 4 and seq 5 in FIG. 6. In some embodiments, the stapler sequences of the two linker oligonucleotides of the 4-arm linker are identical and palindromic, which allows hybridization of two linker oligonucleotides of the same sequence.
In some embodiments, two linker oligonucleotides are connected to form a Z-linker. Each linker oligonucleotide in the Z-linker comprises a primer sequence that is complementary to and can hybridize to the DNA template, e.g., the adaptor of the DNA concatemer. In some embodiments, each primer sequence comprises an extendible 3’ end and can serve as a primer to make a second strand based on the DNA template. Each linker oligonucleotide of the Z-linker also comprises a stapler sequence, which is complementary to the stapler sequence of the other linker oligonucleotide such that hybridization of the two stapler sequences results in formation of a partial hybrid between the two linker oligonucleotides. In some embodiments, the stapler sequence of the linker oligonucleotide is palindromic, and the two linker oligonucleotides of the Z-linker are of identical sequence. In some embodiments, the stapler sequence of the linker oligonucleotide is non-palindromic and the two linker oligonucleotides are of different sequences.
The 3’ of each of the linker oligonucleotides can be extended to form second strands. The two second strands so formed are linked on the 5’ end via the Z-linker.
Figure 1A shows an illustrative example of a Z-linker consisting of a pair of linker oligonucleotides, each comprising a stapler sequence and a primer sequence, with the stapler sequence located 5’ to the primer sequence. The stapler sequences of the pair of linker oligonucleotides are complementary and annealing thereof connects the linker oligonucleotides at the 5’ end and forms a 2-arm Z-linker. The primer sequences hybridize to a DNB template (e.g., the first strand) and each linker oligonucleotide is extended to form a second strand, which results in two second strands linked at 5’ (bottom right panel). In this instance, the extension is carried out by a strand-displacement DNA polymerase, which forms a branched structure, in which each second strand is partially hybridized to the DNA template.
E. Controlling the degree of crosslinking
In one embodiment, the linker oligonucleotide can be cleaved at defined positions to allow removal of a 3’ block, thus forming a primer for polymerization.
In some embodiments, the linker oligonucleotides can be cleaved, for example, by an enzyme (e.g., phosphatase or esterase), chemical reaction, heat, light, and the like. Cleavage could enable release of the crosslinked structures for applications that require a lesser degree of crosslinking between the subunits of the concatemer. That is, the linking between the subunits of the concatemer can be reversed or converted so that a secondary function can be performed. For example, a new priming site could be generated as a result of formation of a new 3’ hydroxyl group at the cleavage site, which allows extension by polymerases. The cleavage site can be at any location on the linker oligonucleotide. In some embodiments, at least one of the two linker oligonucleotides comprises two cleavage sites flanking the stapler sequence, and cleavage at the sites results in release of the stapler sequence. In some embodiments, the cleavage site is on the stapler sequences of the at least two linker oligonucleotides, and the cleavage releases the linker oligonucleotide-linker oligonucleotide hybrid or the linker-DNB hybrid. In some embodiments, the cleavage site is on the primer sequences of the linker oligonucleotides, and the cleavage at the sites creates shorter unstable hybrids and results in the release of the crosslinked structures.
Cleavage could be achieved by enzymatic recognition of nucleotide bases and/or abasic sites, or by modifications to the phosphodiester bond to create site-specific scission. For example, a 3’ block of the linker oligonucleotide could be engineered by incorporating a 3’ phosphate group during oligonucleotide synthesis, and the 3’ phosphate group can be cleaved by kinases or phosphatase enzymes.
In some embodiments, the cleavage site comprises a uracil nucleotide base, which allows cleavage of the base and phosphodiester bond with uracil DNA glycosylase (UDG) and an endonuclease. In some embodiments, the endonuclease is one that can generate a 3’ hydroxyl group after cleavage at the cleavage site, e.g., APE1. The cleavage of the phosphodiester backbone at the uracil nucleotide base causes the release of the stapler sequence. This is useful to situations where the de-crosslinking is desired.
In some embodiments, the linker oligonucleotide is A-bB-A, in which A represents the primer sequence and bB is the palindromic stapler sequence. Cleavage at the dU base of each of the two primer sequences, followed by sugar excision with an endonuclease, allows release of the stapler sequence and the second hybridizing region. The resultant structure will be one of two forms. In one example, the first sequence is seq 11.4, having a sequence of AAGTCGGAGGCCAAGCGGTCT (SEQ ID NO: 2), which can remain hybridized to the DNB and could possess a 3’ hydroxyl if the appropriate endonuclease is chosen. The second sequence (seq 11.5), AGGAGGAACCATGGTTCCAAGTCGGAGGCCAAGCGGTCT (SEQ ID NO: 3), could also remain hybridized to the DNB and could possess a 3’ hydroxyl that will allow extension by polymerases. Seq 11.5 would still possess a 5’ tail sequence that could remain hybridized to a second 5’ tail sequence of a second oligo.
The residual cleaved oligo sequences, seq 11.4 and seq 11.5 as described above, can then serve as primers for subsequent polymerase extension such as generation of a second strand with a  strand-displacement polymerase. The dU cleavage step is optional; as in some cases, subsequent second strand generation could still occur by binding a primer to an alternative region of the DNB. The primer can be extended to form a second strand by a strand-displacement polymerase, and said extension can displace a downstream second strand, for example, one formed by extending a linker oligonucleotide as disclosed herein.
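The dU cleavage described above for an A-bB-A linker oligonucleotide can be sketched as a simple string operation. This is a simplified model under stated assumptions: cleavage is treated as removing each U and cutting the backbone at that position, and the full-length A-bB-A sequence is not given in this passage, so it is reconstructed here by joining seq 11.4, seq 11.5, and a trailing AGGA (taken from the end of SEQ ID NO: 10) with dU residues. The function name `cleave_at_uracil` is hypothetical.

```python
def cleave_at_uracil(oligo):
    """Split an oligo at every U and drop the U itself.

    Simplified model: UDG removes the uracil base and an endonuclease
    cuts the backbone at the resulting abasic site.
    """
    return [fragment for fragment in oligo.split("U") if fragment]

SEQ_11_4 = "AAGTCGGAGGCCAAGCGGTCT"                     # SEQ ID NO: 2
SEQ_11_5 = "AGGAGGAACCATGGTTCCAAGTCGGAGGCCAAGCGGTCT"   # SEQ ID NO: 3
STAPLER = "GGAACCATGGTTCC"                             # SEQ ID NO: 8

# Assumption: an A-bB-A oligo rebuilt from the fragments named in the text.
A_BB_A = SEQ_11_4 + "U" + SEQ_11_5 + "U" + "AGGA"

fragments = cleave_at_uracil(A_BB_A)
for fragment in fragments:
    print(len(fragment), fragment)

# Seq 11.5 keeps the palindromic stapler as a 5' tail ahead of its primer.
print(STAPLER in SEQ_11_5, SEQ_11_5.endswith(SEQ_11_4))
```

In this model both seq 11.4 and seq 11.5 end in the same primer bases, which is consistent with the statement that either fragment can remain hybridized to the DNB and serve as a primer.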
F. Control the timing of crosslinking
Blocker oligonucleotide
In some cases, to maximize the crosslinking efficiency of the DNA template, it is desirable to ensure that the linker oligonucleotides hybridize to the DNA template before they hybridize to each other. One approach to achieve this is to first block the hybridization between the linker oligonucleotides during the process when the linker oligonucleotides are contacted with and hybridized to the DNA template. After the completion of hybridization of the linker oligonucleotides to the DNA template, and after an optional step of removing excess linker oligonucleotides from the reaction, the block is reversed to permit hybridization between linker oligonucleotides. In some cases, prevention of hybridization of the linker oligonucleotides to each other before hybridization to the DNA template can be achieved by using a blocker oligonucleotide. Said blocker oligonucleotide is complementary to the stapler sequence of the linker oligonucleotides but not to the primer sequence. The blocker oligonucleotide may also be of a length that is similar to the length of the stapler sequence. Thus, incubating the blocker oligonucleotide with the linker oligonucleotides forms a double-stranded stapler region but leaves the primer sequence single-stranded. The partially double-stranded linker oligonucleotide is allowed to contact the DNA template, where the primer sequence binds to the complementary sequence in the DNA template. After hybridization to the DNA template, the blocker oligonucleotide is then removed so that the linker oligonucleotides can hybridize to each other through the complementary stapler sequences.
Blocker oligonucleotides can be removed via a number of means. In some cases, the blocker oligonucleotide is designed with a sequence that hybridizes to the stapler sequence of the linker oligonucleotide to form a double-stranded hybrid, and the double-stranded hybrid has a melting temperature that is lower than the melting temperature of the double-stranded hybrid formed between the DNA template and the linker oligonucleotide. In those cases, the removal of blocker oligonucleotides can be achieved by raising the temperature of the reaction so that the blocker oligonucleotide is dissociated from the linker oligonucleotide, whereas the DNA template remains hybridized to the linker oligonucleotide. In some cases, the blocker oligonucleotide can be removed through enzymatic cleavage, for example with uracil-DNA glycosylase/endonuclease IV or UDG/APE1. In some cases, the phosphodiester bonds at various positions in the blocker oligonucleotide can be replaced with chemically cleavable bonds (e.g., disulfide, azido) so that the blocker oligonucleotide can be cleaved and removed.
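The temperature-based removal described above depends on the blocker/stapler hybrid melting below the primer/template hybrid. The sketch below illustrates that comparison only. It uses a crude GC-content estimate, Tm = 64.9 + 41 × (G + C − 16.4) / N, which is a textbook approximation and not a method stated in this disclosure; it ignores salt, strand concentration, and nearest-neighbor effects. The choice of sequences (the stapler of SEQ ID NO: 8 as the blocked region and the primer of SEQ ID NO: 2) and the function names are assumptions made for illustration.

```python
def approx_tm(seq):
    """Crude melting temperature estimate in degrees Celsius.

    Tm = 64.9 + 41 * (G + C - 16.4) / N. A rough approximation only.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    return 64.9 + 41.0 * (gc - 16.4) / len(seq)

def melt_off_window(blocked_region, primer_region):
    """Return (low, high) estimated Tm values if the blocker hybrid is
    expected to melt before the primer hybrid, otherwise None."""
    low, high = approx_tm(blocked_region), approx_tm(primer_region)
    return (low, high) if low < high else None

STAPLER = "GGAACCATGGTTCC"          # region covered by the blocker (SEQ ID NO: 8)
PRIMER = "AAGTCGGAGGCCAAGCGGTCT"    # region bound to the template (SEQ ID NO: 2)

window = melt_off_window(STAPLER, PRIMER)
if window is None:
    print("No temperature window under this estimate")
else:
    print("Estimated window: %.1f to %.1f degrees C" % window)
```

A temperature between the two estimates would, under this rough model, dissociate the blocker while leaving the linker oligonucleotide on the template; actual conditions would need to be determined empirically.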
In some embodiments, the method comprises 1) hybridizing blocker oligonucleotides to the stapler sequences of at least two linker oligonucleotides in a reaction to produce partially double-stranded linker oligonucleotides, which comprise a double-stranded region consisting of the blocker oligonucleotide and the stapler sequence, thereby preventing the stapler sequences of the linker oligonucleotides from hybridizing to each other, 2) adding a DNA template to the reaction with the linker oligonucleotides, whereby the primer sequences of the partially double-stranded linker oligonucleotides bind to the DNA template, 3) removing the blocker oligonucleotide by one or more of the following: raising the temperature of the reaction such that the blocker oligonucleotide or multiple blocker oligonucleotides are dissociated from the linker oligonucleotides, or enzymatic or chemical degradation of the blocker oligonucleotide, 4) washing to remove the blocker oligonucleotide, and 5) lowering the temperature to allow the hybridization of the stapler sequences of the linker oligonucleotides to each other.
In some embodiments, the stapler sequences of at least two linker oligonucleotides bind to each other to form a linked pair before binding of the primer sequences to the DNA template. In general, events in which two linker oligonucleotides from the same linked pair bind to an individual DNB occur at a higher rate than events in which two linker oligonucleotides from different linked pairs bind to an individual DNB. Accordingly, in some embodiments, the linked pair of linker oligonucleotides (or the linker oligonucleotides) are used at a suitable concentration such that two linker oligonucleotides in the linked pair bind to the same DNB, rather than two linker oligonucleotides from different linked pairs binding to the same DNB. Suitable concentrations for this purpose can be determined empirically.
Linking of subunits could occur after deposition of the DNB on the slide or in solution before loading of the DNBs onto the surface. Linking in solution may minimize splitting of DNBs across multiple surface binding sites. Linking on the surface would minimize the risk of linking multiple DNBs.
G. Specific configurations
In some embodiments, two linker oligonucleotides are connected and form a linker. A linker, as used herein, refers to a complex consisting of two or more linker oligonucleotides that are connected  by covalent or non-covalent means. Linkers and linker oligonucleotides may take different forms. For example, based on the number of subunits in the DNA concatemer that the linkers can bind, the linkers can be classified as 2-arm linkers or 4-arm linkers. Based on the relative sequence components on the linkers themselves, they can be classified as Z-linkers or X-linkers.
2-arm linkers and 4-arm linkers
In some embodiments, each linker oligonucleotide contains only one primer sequence and thus two oligonucleotides can link two subunits of the DNA concatemer (“2-arm linkers”). As an illustrative example, two linker oligonucleotides, seq 1 and seq 2, each contain a primer sequence (A) that serves to hybridize to one subunit of a DNB. Seq 1 also contains a stapler sequence (b) and seq 2 contains a stapler sequence (B), with (b) being complementary to (B). See FIG. 6A. To form functional linkers, seq 1 and seq 2 are added to the reaction comprising the DNA template (e.g., a DNA concatemer). In some embodiments, each of the 2-arm linkers comprises a palindromic sequence, see FIG. 6B.
In some embodiments, one single linker oligonucleotide, by virtue of having two primer sequences that are complementary to the adaptor sequence of the DNA concatemer, could connect two subunits of the concatemer. For example, seq 4 or seq 5 in FIG. 6C belongs to this category of linker oligonucleotides.
In some embodiments, each of the linker oligonucleotides has two primer sequences, with a stapler sequence interposed in between. Each linker oligonucleotide can hybridize to two subunits of a DNB. This configuration allows the two connected linker oligonucleotides to bind to four separate sequences of the DNA template, thus is referred to as “4-arm linkers” . In some embodiments, the stapler sequences of the two linker oligonucleotides of the 4-arm linker are not identical, such as seq 4 and seq 5 in FIG. 6C. In some embodiments, the stapler sequences of the two linker oligonucleotides of the 4-arm linker are identical and palindromic, which allows hybridization of two linker oligonucleotides of the same sequence. See, e.g., seq 6 in FIG. 6D and seq 7 and seq 8 in FIG. 6E.
Z linkers and X-linkers
In some embodiments, two linker oligonucleotides are connected to form a Z linker. Each linker oligonucleotide in the Z-linker comprises a primer sequence that is complementary to and can hybridize to the DNA template, e.g., the adaptor of the DNA concatemer. In some embodiments, each primer sequence comprises an extendible 3’ end and can serve as a primer to make a second strand based on the DNA template. Each linker oligonucleotide of the Z-linker also comprises a stapler  sequence, which is complementary to the stapler sequence of the other linker oligonucleotide such that hybridization of the two stapler sequences results in formation of a partial hybrid between the two linker oligonucleotides. In some embodiments, the stapler sequence of the linker oligonucleotide is palindromic, and the two linker oligonucleotides of the Z-linker are of identical sequence. In some embodiments the stapler sequence of linker oligonucleotide is non-palindromic and the two linker oligonucleotides are different.
The 3’ ends of the linker oligonucleotides can be extended to form second strands. In some cases, the two second strands so formed are linked on the 5’ end via the Z-linker.
Figure 1A shows an illustrative example of a Z-linker consisting of a pair of linker oligonucleotides, each comprising a stapler sequence and a primer sequence, with the stapler sequence located 5’ to the primer sequence. The stapler sequences of the pair of linker oligonucleotides are complementary and annealing thereof connects the linker oligonucleotides at the 5’ end and forms a 2-arm Z-linker. The primer sequences hybridize to a DNB template (e.g., the first strand) and each linker oligonucleotide is extended to form a second strand, which results in two second strands linked at 5’ (bottom right panel) . In this instance, the extension is carried out by a strand-displacement DNA polymerase, which forms a branched structure, in which each second strand is partially hybridized to the DNA template.
In some embodiments, the linker oligonucleotides form an X-linker, in which two linker oligonucleotides that are identical in sequence are linked through a palindromic sequence, and each of the linker oligonucleotides possesses extra sequence at the 5’ end that can hybridize to another linker oligonucleotide having a non-palindromic stapler sequence. Each linker oligonucleotide also comprises a primer sequence that is complementary to and can hybridize to the DNA template (e.g., an adaptor of the DNA concatemer). This X-linker structure thus allows for multiple 3’ extendible primer sequences, possibly four or more.
For example, as illustrated in FIG. 2, an X-linker may comprise a pair of D-Bb-A linker oligonucleotides (“Oligo Linker 1”) and a pair of d-cC-A linker oligonucleotides (“Oligo Linker 2”). The two D-Bb-A linker oligonucleotides hybridize to each other via the palindromic stapler sequences Bb, and the two d-cC-A linker oligonucleotides hybridize to each other via the palindromic stapler sequences cC. Each of the D-Bb-A linker oligonucleotides is also hybridized to one of the d-cC-A linker oligonucleotides via the complementary stapler sequences D and d. This results in a structure (“structure 1”, as shown in the bottom left panel in FIG. 2) with four primer sequences that can hybridize to a DNA template. Each of the four primer sequences comprises an extendible 3’ end and can be extended to produce a second strand, which is a reverse complement of the DNA template.
The X-linker may also take a form as shown in “structure 2” (the bottom right panel of FIG. 2) . Two D-Bb-A linker oligonucleotides are annealed with four d-cC-A linker oligonucleotides, which results in a structure that has multiple primer sequences (A) , and excess single-stranded arms (e.g., “D” or “d” ) . These excess strand arms allow for continued structure growth as a random network. For example, “D” is readily complementary to any “d” region and is capable of annealing to “d” to expand the structure in any form.
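The domain pairing that underlies the X-linker structures can be sketched with a small model. This is an illustration under a stated assumption: each domain is a label in which an uppercase letter pairs with the same lowercase letter, so the reverse complement of a label is the label reversed with its case swapped. Under that convention “Bb” and “cC” are self-complementary and “D” pairs with “d”. The function names are hypothetical.

```python
def rc_domain(domain):
    """Reverse-complement a domain label: reverse it and swap letter case."""
    return domain[::-1].swapcase()

def hybridizing_domains(oligo_a, oligo_b):
    """List (domain_a, domain_b) pairs that can anneal between two oligos."""
    return [(x, y) for x in oligo_a for y in oligo_b if rc_domain(x) == y]

# Domain order 5' to 3', as named in the text.
OLIGO_LINKER_1 = ["D", "Bb", "A"]
OLIGO_LINKER_2 = ["d", "cC", "A"]

print(hybridizing_domains(OLIGO_LINKER_1, OLIGO_LINKER_1))  # palindromic Bb
print(hybridizing_domains(OLIGO_LINKER_2, OLIGO_LINKER_2))  # palindromic cC
print(hybridizing_domains(OLIGO_LINKER_1, OLIGO_LINKER_2))  # D pairs with d

# Structure 1: two copies of each oligo, each contributing one primer arm A.
structure_1 = [OLIGO_LINKER_1, OLIGO_LINKER_1, OLIGO_LINKER_2, OLIGO_LINKER_2]
primer_arms = sum(oligo.count("A") for oligo in structure_1)
print(primer_arms)
```

The primer domain A pairs with neither oligo in this model, so it stays free to hybridize to the DNA template, while any unpaired D or d arm remains available for further growth of the network.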
An exemplary sequence is shown below. The single underline sequence is the primer sequence (A) ; the bold type is palindromic stapler sequence (Bb) ; and the double underline sequence is the non-palindromic stapler sequence (D or d) .
Figure PCTCN2020124338-appb-000003
and
Figure PCTCN2020124338-appb-000004
Second strand linker oligonucleotides
In some instances, the method further comprises hybridizing linker oligonucleotides to the second strands (these linker oligonucleotides are referred to as second linker oligonucleotides) . Second linker oligonucleotides may comprise any of the components arranged in any of the configuration as described above, i.e., they may also comprise a stapler sequence and a primer sequence having an extendible 3’ end. In some cases, the second linker oligonucleotides are sequencing primers and are used to generate sequence reads of the second strand. As described below, the sequence reads of the second strand can be combined with the sequence reads of the first strand to construct sequence information of the target DNA. See FIG. 1B.
Seq 12, below, is an example of one embodiment of a second strand linker oligonucleotide. Seq 12 consists of subsequences Seq 12.1, Seq 12.2, and Seq 12.3.
Figure PCTCN2020124338-appb-000005
Seq 12.1
Figure PCTCN2020124338-appb-000006
Seq 12.2
Figure PCTCN2020124338-appb-000007
Seq 12.3
Figure PCTCN2020124338-appb-000008
Seq 12.1 (SEQ ID NO: 7) and seq 12.3 (SEQ ID NO: 9) are identical repeated sequences that hybridize to a region of the second strand (also referred to as a second strand spur) of a DNB (reverse complement strand of the original DNB) . Seq 12.2 is a region of internal complementarity that allows 2 oligonucleotide molecules to come together and hybridize to form a 4-arm structure.
6. Production of Partially Displaced Second Strands by Extending Linker Oligonucleotides
As described above, the linker oligonucleotides of the invention may comprise a primer sequence with an extendible 3’ end, thus, linker oligonucleotides can serve as primers. As described above, the linker oligonucleotide may possess a 3’-hydroxyl chemical group that allows extension as a primer with one or more DNA polymerases. In some embodiments, the linker oligonucleotide comprises a reversible 3’ blocking group, which can be cleaved to produce an extendible 3’ end. Thus, in some embodiments, at least two linker oligonucleotides are extended to produce two second strands, i.e., strands that have a sequence that is reverse complementary to the DNA template. Based on the relative positions of the two second strands, the strand that is located 5’ to the other strand is referred to as the upstream strand, and the other strand is referred to as the downstream strand. For example, FIG. 1A shows two second strands produced by extending two linker oligonucleotides (indicated as “linked second strand spurs” in FIG. 1A) that are linked. The strand shown on the left is located 5’ to the strand on the right; in this configuration, the strand on the left is the upstream strand, and the strand on the right is the downstream strand.
A plurality of linker oligonucleotides can be extended to generate a series of second strands. Any individual second strand in the series can be deemed a downstream second strand (relative to a second strand upstream) and also an upstream second strand (relative to a second strand downstream) . By way of an example, extending linker oligonucleotides produces a series of second strands, including second strand #1, second strand #2, second strand #3. #1, #2, and #3 are hybridized or partially hybridized to a DNA template and are present in the order from 5’ to 3’. #2 is the upstream second strand relative to #3; meanwhile, #2 is also the downstream second strand relative to second strand #1.
In some embodiments, producing second strands involves at least two steps. The first step includes extending at least two first strand linker oligonucleotides with a DNA polymerase (e.g., a non-strand-displacement polymerase or a strand-displacement polymerase) to generate at least two partially extended second strands, namely a partially extended upstream second strand and a partially extended downstream second strand. Both strands are fully hybridized to the DNA template.
The second step includes further extending the two partially extended second strands with a strand-displacement polymerase, during which extending the partially extended upstream second strand partially displaces the partially extended downstream second strand, thereby producing a partially hybridized downstream second strand.
These primers may be “extension primers” or “sequencing oligonucleotides.” “Extension primers” are used in primer extension reactions to generate the second strands described above. Thus, an extension primer is a substrate for a DNA polymerase and is extendible by addition of nucleotides.
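The relationship between the two steps can be illustrated with a toy coordinate model. This is only a sketch under stated assumptions: positions are hypothetical template coordinates that increase in the direction of second-strand synthesis, and the class and function names are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class SecondStrand:
    """A second strand placed on hypothetical template coordinates.

    Coordinates increase in the direction of second-strand synthesis,
    so `start` is the 5' end and `end` is one past the 3' end.
    """
    start: int
    end: int

    @property
    def length(self) -> int:
        return self.end - self.start


def overhang_length(upstream: SecondStrand, downstream: SecondStrand) -> int:
    """Nucleotides at the 5' end of the downstream strand that have been
    displaced from the template by the extending upstream strand."""
    invaded = upstream.end - downstream.start
    return max(0, min(invaded, downstream.length))


def duplex_length(upstream: SecondStrand, downstream: SecondStrand) -> int:
    """Nucleotides of the downstream strand still hybridized to the template."""
    return downstream.length - overhang_length(upstream, downstream)


# Step 1 (no displacement): the upstream strand stops at the downstream 5' end.
up, down = SecondStrand(0, 100), SecondStrand(100, 180)
print(overhang_length(up, down), duplex_length(up, down))  # 0 80

# Step 2 (strand displacement): both strands grow; the upstream strand
# advances 50 nt into the downstream strand.
up, down = SecondStrand(0, 150), SecondStrand(100, 260)
print(overhang_length(up, down), duplex_length(up, down))  # 50 110
```

In this model, complete displacement corresponds to a duplex length of zero, which is the outcome the controlled extension is intended to avoid.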
It will be well within the ability of one of ordinary skill in the art guided by this disclosure to select or design primers and probes for use in the present invention (e.g., primers capable of extension or ligation under sequencing assay conditions that are well known) . Without intending to limit the invention, extension primers often have a length in the range of 10-100 nucleotides, often 12-80 nucleotides, and often 15-80 nucleotides.
It will be appreciated that primers and probes may be fully or partially complementary to the stapler sequence in an adaptor to which they hybridize. For example, a primer may have at least 85%, 90%, 95%, or 100% identity to the sequence to which it hybridizes.
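As an illustration of the percentage thresholds above, the following sketch scores a primer against the perfect complement of its binding site. It assumes an ungapped, position-by-position comparison of equal-length sequences; the function names and the example site are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def percent_identity(a: str, b: str) -> float:
    """Ungapped percent identity between two equal-length sequences."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def primer_complementarity(primer: str, binding_site: str) -> float:
    """Percent identity of a primer to the perfect complement of its site."""
    return percent_identity(primer, reverse_complement(binding_site))


if __name__ == "__main__":
    site = "AACCGGTTAGCTAGCTAGGA"  # invented 20-nt binding site
    perfect = reverse_complement(site)
    # Introduce a single mismatch at the 5' end of the primer.
    one_mismatch = ("A" if perfect[0] != "A" else "C") + perfect[1:]
    print(primer_complementarity(perfect, site))       # 100.0
    print(primer_complementarity(one_mismatch, site))  # 95.0
```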
A primer may also contain additional sequence at the 5’ end of the primer that is not complementary to the primer binding sequence (i.e., the sequence of the primer binding site) in the adaptor. The non-complementary portion of a primer may be at a length that does not interfere with the hybridization between the primer and its primer stapler sequence. In general, the non-complementary portion is 1 to 100 nucleotides long. In some embodiments, the non-complementary portion is 4 to 8 nucleotides long. Primers may comprise DNA and/or RNA moieties, and in some approaches primers used in the invention may have one or more modified nucleotides that contain modifications to the base, sugar, and/or phosphate moieties.
A “sequencing oligonucleotide” may be an extension primer used in sequencing-by-synthesis reactions (also called “sequencing by extension”). A “sequencing oligonucleotide” may also be an oligonucleotide used in a sequencing-by-ligation method such as “combinatorial probe-anchor ligation reaction” (cPAL) (including single, double and multiple cPAL) as described in US Patent Publication 20140213461, incorporated herein by reference for all purposes. In brief, cPAL comprises cycling of the following steps: First, a “sequencing oligonucleotide” (or “anchor”) is hybridized to a complementary sequence in an adaptor of the second DNA strand described above. Enzymatic ligation reactions are then performed with the anchor to a fully degenerate probe population of, e.g., 8-mer probes that are labeled, e.g., with fluorescent dyes. Probes may be, e.g., about 6 to about 20 bases in length, or about 7 to about 12 bases in length. At any given cycle, the population of 8-mer probes that is used is structured such that the identity of one or more of its positions is correlated with the identity of the fluorophore attached to that, e.g., 8-mer probe. In variations of basic cPAL well known in the art, such as multiple cPAL, partially or fully degenerate secondary anchors are used to increase the readable sequence.
In some embodiments, a strand displacement polymerase is used to produce partially-displaced second strands (follow-on fragments) with both overhangs and duplex portions attached to the DNA template polynucleotide (e.g., DNB DNA strands). The extension reaction may be controlled to avoid complete displacement of the second strands (i.e., “following strands” or “follow-on fragments”) and to produce second strands having lengths of overhangs suitable for sequencing. This can be achieved by controlling progression of the reaction by selecting a polymerase(s) with a suitable polymerization rate or other properties, and by using a variety of reaction parameters including (but not limited to) reaction temperature, duration of the reaction, primer composition, DNA polymerase, primer and dNTP concentrations, additives and buffer composition. Optimal conditions may be determined empirically.
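The structure of a fully degenerate, dye-encoded probe population can be sketched as follows. This is an illustrative enumeration only: the dye names and the choice of interrogated position are assumptions, not details taken from the cPAL reference.

```python
from itertools import product

BASES = "ACGT"
# Hypothetical dye assignment: one fluorophore per base at the query position.
DYE_FOR_BASE = {"A": "dye_1", "C": "dye_2", "G": "dye_3", "T": "dye_4"}


def probe_pool(length: int, query_pos: int):
    """Yield (sequence, dye) for every probe of the given length, where the
    dye encodes the base at `query_pos` and all other positions are degenerate."""
    if not 0 <= query_pos < length:
        raise ValueError("query_pos must index a position within the probe")
    for bases in product(BASES, repeat=length):
        seq = "".join(bases)
        yield seq, DYE_FOR_BASE[seq[query_pos]]


def pool_summary(length: int, query_pos: int) -> dict:
    """Count how many distinct probe sequences carry each dye."""
    counts: dict = {}
    for _, dye in probe_pool(length, query_pos):
        counts[dye] = counts.get(dye, 0) + 1
    return counts


if __name__ == "__main__":
    # An 8-mer pool has 4**8 = 65536 sequences, 4**7 = 16384 per dye.
    print(pool_summary(length=8, query_pos=0))
```

Changing `query_pos` between cycles corresponds, in this sketch, to interrogating a different base position relative to the anchor.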
6.1 DNA Polymerase
One approach to control the extension-displacement reaction is to use a DNA polymerase having suitable strand displacement activities to produce the second strands. DNA polymerases having strand displacement activity include, but are not limited to, Phi29, Bst DNA polymerase, Klenow fragment of DNA polymerase I, and Deep-VentR DNA polymerase (NEB #M0258). These DNA polymerases are known to differ in the strength of their strand displacement activity. See, Kornberg and Baker (1992, DNA Replication, Second Edition, pp. 113-225, Freeman, N.Y.). It is within the ability of a person of ordinary skill in the art guided by this disclosure to select a DNA polymerase suitable for carrying out the method.
6.2 Polymerase, Primer and dNTP Concentrations
Another approach to control the extension-displacement reaction is to use suitable concentrations of the DNA polymerase having strand displacement activity, or to control the concentrations of dNTPs or of the linker oligonucleotides, which serve as primers.
6.3 Additives
In some embodiments, the extension reaction rate is controlled by including an agent that affects the duplex formation between extension primers and DNA template, such as DMSO (e.g., 1%-2%) , Betaine (e.g., 0.5 M) , glycerol (e.g., 10%-20%) , T4 G32 SSB (e.g., 10-20 ng/μl) , and volume exclusion agents, in the reaction buffer.
6.4 Temperature
The reaction temperatures may also be controlled to allow appropriate speed of polymerization and strand displacement. Higher temperature typically results in greater extent of strand displacement. In some embodiments, reaction temperatures are maintained within the range of 20℃ to 37℃, for example, 32℃, 33℃, 34℃, 35℃, 36℃, or 37℃, in order to avoid complete displacement.
In some approaches, extension reactions are controlled by using a mixture of conventional (extendible) primers and non-extendible primers, i.e., 3’ end blocked primers. A non-extendible primer blocks elongation via, for example, a chemical blocking group that prevents polymerization by a DNA polymerase. By mixing these two different primers at different ratios, the length of the duplex (hybridized) portion of the newly synthesized complementary DNA strand (follow-on fragments) can be controlled. For example, in one approach a mixture of first primers is used in which 50-70% are non-extendible (“blocked”) and 30-50% can be extended (“unblocked”). Many types of non-extendible primers are known in the art and would be suitable for the present invention.
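The arithmetic behind a blocked/unblocked primer mixture can be sketched as below. This is a simplified illustration that assumes blocked and unblocked primers hybridize to adaptor sites equally well; the total concentration and the number of sites are hypothetical values chosen only for the example.

```python
def primer_mix(total_um: float, blocked_fraction: float) -> dict:
    """Split a total primer concentration (in uM) into blocked and
    unblocked components for a given blocked fraction (0 to 1)."""
    if not 0.0 <= blocked_fraction <= 1.0:
        raise ValueError("blocked_fraction must be between 0 and 1")
    blocked = total_um * blocked_fraction
    return {"blocked_uM": blocked, "unblocked_uM": total_um - blocked}


def expected_extendible_sites(n_sites: int, blocked_fraction: float) -> float:
    """Expected number of adaptor sites carrying an extendible primer,
    assuming (as a simplification) that every site is occupied and that
    both primer types bind with equal probability."""
    if not 0.0 <= blocked_fraction <= 1.0:
        raise ValueError("blocked_fraction must be between 0 and 1")
    return n_sites * (1.0 - blocked_fraction)


if __name__ == "__main__":
    # Hypothetical 1 uM total primer with 60% blocked, within the 50-70% range.
    print(primer_mix(1.0, 0.6))
    # Hypothetical template with 100 adaptor sites, 50% blocked primers.
    print(expected_extendible_sites(100, 0.5))  # 50.0
```

Fewer extendible sites means, on average, more template between neighboring extending strands, which is one way to read the statement that the ratio controls the duplex length.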
6.5 Reaction time
In some embodiments, the extension-displacement reaction is controlled by terminating the reaction after a certain period of time during which the desired length of the second strands is achieved. In some embodiments, the reaction is terminated after 5 min, 10 min, 20 min, 30 min, 40 min or 60 min from initiation. Methods of termination of the reaction are well known in the art, for example, by incorporation of ddNTPs or by adding chemical solutions, e.g., a Tris buffer containing 1.5 M NaCl. In one embodiment, the termination is achieved by incorporating ddNTPs after adding to the reaction a Tris buffer containing 1.5 M NaCl.
7. Sequence Determination
In some embodiments, the claimed invention provides methods of determining the sequence of the second strands produced as described above. The method comprises hybridizing a sequencing oligonucleotide to the sequence in the second strand that is complementary to at least part of the adaptor of the DNA template (e.g., a DNA concatemer) , and determining the nucleotide sequence of at least part of the sequence complementary to the target DNA sequence. Sequence determination may be carried out using sequencing-by-synthesis methods or using sequencing-by-ligation methods, or both.
In some embodiments, any of the linker oligonucleotides as described above can be used as sequencing oligonucleotides.
In one embodiment, overhangs of the second strands are sequenced by extending primers (e.g., a second strand linker oligonucleotide) hybridized to the complementary sequences of the adaptor of a monomer, for example, as illustrated in FIG. 1B.
In another embodiment, the DNA template strand is also sequenced using primers hybridized to the adaptor of a monomer. The sequence information from the second strands is paired with sequences generated from sequencing the DNA template to determine the entire target DNA sequence.
It will be apparent to the reader that variations of the specific embodiments outlined herein may be used. In one approach, the extension primers (e.g., the first strand linker oligonucleotide) and sequencing oligonucleotides (e.g., the second strand oligonucleotide) bind to different portions of an adaptor sequence. In one approach, the extension primers and sequencing oligonucleotides bind to the same portion of the adaptor sequence (e.g., a portion of the adaptor sequence for extension and the complement of same portion of the adaptor sequence for sequencing) .
Any suitable sequence determination method may be used to determine the sequence of the overhang, for example, SBS, pyrosequencing, sequencing by ligation, and others. In some embodiments, more than one sequencing approach is used. For example, the DNA template strand may be sequenced using one method (e.g., cPAL) and the second strands are sequenced using a different method (e.g., SBS) .
Sequencing-by-synthesis (SBS) may rely on DNA polymerase activity to perform chain extension during the sequencing reaction step. SBS is well known in the art. See, e.g., U.S. Pat. Nos. 6,787,308 and 8,241,573 and Shendure et al., 2005, Science, 309: 1728-1732. Sequencing on DNA nanoballs can occur through a variety of processes. In one approach the circle used to generate the DNB is prepared with a DNA region of known sequence (the adapter) and an adjacent sequence of unknown identity which is to be determined. One function of the adapter is to provide a primer hybridization site such that extension of the primer will lead to addition of nucleotides into the “unknown” or “to be determined” region. The nucleotides, if reversibly blocked at the 3’ position, are added one position at a time and are complementary to the base position in the DNB. After removal of the 3’ blocking group an additional position can be read in the next cycle. A fluorescent moiety characteristic of the base type is used for detection of the incorporated base and so reveals the base at that position in the DNB.
Alternatively, sequencing by ligation can be used. The primer or anchor can be extended by ligating fluorescent oligonucleotides that extend into the unknown sequence. In this sequencing method fluorescent oligonucleotides with degenerate bases are ligated to the initiating anchor; however, one base of the oligonucleotide is defined and is associated with the fluorescent moiety. Ligation of the oligo probe to the anchor creates a stable fluorescence after washing away excess probes and is dependent upon the defined base being complementary to the base at the same position of the DNB. Sequencing by ligation is described, for example, in Shendure et al., 2005, Science, 309: 1728-1732.
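The one-base-per-cycle logic of reversible-terminator SBS can be sketched as follows. This is a toy simulation, not the instrument's base-calling method: it assumes error-free incorporation, and the template is given in the order in which the polymerase encounters it (3'→5'). The function name and input are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def sbs_read(template_3to5: str, cycles: int) -> str:
    """Simulate idealized reversible-terminator SBS.

    `template_3to5` is the unknown region next to the primer site, listed in
    the order the extending primer meets it. Each cycle incorporates exactly
    one blocked nucleotide complementary to the current template base; the
    base is recorded (imaging) and the block is then removed (deblocking).
    """
    read = []
    for cycle in range(min(cycles, len(template_3to5))):
        template_base = template_3to5[cycle].upper()
        incorporated = COMPLEMENT[template_base]  # 3' block stops further extension
        read.append(incorporated)                 # dye identifies the base
        # deblocking happens here, enabling the next cycle
    return "".join(read)


if __name__ == "__main__":
    print(sbs_read("TACG", cycles=4))  # ATGC
    print(sbs_read("TACG", cycles=2))  # AT
```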
Other sequencing methods can also be used, e.g., pyrosequencing (See, e.g., Ronaghi et al., Anal. Biochem. (1996) 242: 84–89) and sequencing by hybridization (see, e.g., Drmanac et al, Advances in Biochemical Engineering/Biotechnology (2002) 77: 75-101) .
Order of addition of primers
The order of addition of the extension primers (e.g., first linker oligonucleotides and second linker oligonucleotides) may vary. For example, in some embodiments, a first linker oligonucleotide and polymerase are added and synthesis of the second strand occurs (at least in part) prior to addition of the second linker oligonucleotide. In another approach, the first and second linker oligonucleotides are added at about the same time. For example, they may be added together in the same composition, or may be added separately within about 1 minute of each other, or within about 5 minutes of each other. The first and second extension primers may be added in any order.
Sequential addition of the primers may be necessary in approaches in which one second strand is to be produced using a DNA polymerase that has no strand displacement activity, while another second strand is to be produced using a DNA polymerase having strand displacement activity.
It will be recognized that a single oligonucleotide may function as both an extension primer for producing the second strand and for sequencing.
It will be further recognized that multiple different primers and/or multiple different sequencing oligonucleotides may be used in the same sequencing reaction.
The sequencing oligonucleotide (s) for the second strand is typically added after the extension-displacement of the second strand is terminated using the methods disclosed herein.
In some embodiments, a second strand linker oligonucleotide serves as a sequencing oligonucleotide that hybridizes to the overhang portion of the second strand. In some embodiments, the sequencing oligonucleotide has a sequence that is complementary to and thus hybridizes to a known sequence within the second strand. In some embodiments, the sequencing oligonucleotide hybridizes to a sequence in the second strand that is complementary to at least part of the adaptor in the DNA concatemer. In some embodiments, the sequencing oligonucleotide is complementary, partially or completely, to the first strand linker oligonucleotide.
8. DNA Polymerase
The methods of the present invention may be carried out using methods, tools and reagents well known to those of ordinary skill in the art of molecular biology and MPS sequencing, including nucleic acid polymerases (RNA polymerase, DNA polymerase, reverse transcriptase) , phosphatases and phosphorylases, DNA ligases, and the like. In particular, certain primer extension steps may be carried out using one or more DNA polymerases. Certain extension steps are carried out using DNA polymerase with strand displacement activity.
In some embodiments, the methods disclosed herein use one or more DNA polymerases and strand displacement activities of the DNA polymerase (s) to generate DNA strands complementary to a DNA template. In one approach, the present invention uses a DNA polymerase with a strong 5’→3’ strand displacement activity. Preferably, the polymerase does not have 5’→3’ exonuclease activity. However, DNA polymerases having 5’-3’ exonuclease activity may be used when the activity does not prevent the implementation of the method of the invention, e.g., by using reaction conditions that inhibit the exonuclease activity.
The term “strand displacement activity” describes the ability of the polymerase to displace downstream DNA encountered during synthesis. Strand displacement activity is described in US Pat. Pub. No. 20120115145, incorporated herein by reference, as follows: “Strand displacement activity” designates the phenomenon by which a biological, chemical or physical agent, for example a DNA polymerase, causes the dissociation of a paired nucleic acid from its complementary strand in a direction from 5′ towards 3′, in conjunction with, and close to, the template-dependent nucleic acid synthesis. The strand displacement starts at the 5′ end of a paired nucleic acid sequence and the enzyme therefore carries out the nucleic acid synthesis immediately in 5′ of the displacement site. The neosynthesized nucleic acid and the displaced nucleic acid generally have the same nucleotide sequence, which is complementary to the template nucleic acid strand. The strand displacement activity may be situated on the same molecule as that conferring the activity of nucleic acid synthesis, and particularly the DNA synthesis, or it may be a separate and independent activity. DNA polymerases such as E. coli DNA polymerase I, Klenow fragment of DNA polymerase I, T7 or T5 bacteriophage DNA polymerase, and HIV virus reverse transcriptase are enzymes, which possess both the polymerase activity and the strand displacement activity. Agents such as helicases can be used in conjunction with inducing agents that do not possess strand displacement activity in order to produce the strand displacement effect, that is to say the displacement of a nucleic acid coupled to the synthesis of a nucleic acid of the same sequence. Likewise, proteins such as Rec A or Single-Strand Binding Protein from E. coli or from another organism could be used to produce or to promote the strand displacement, in conjunction with other inducing agents (Kornberg and Baker, 1992, DNA Replication, 2nd Edition, pp 113-225, Freeman, N.Y.).
In one approach, the polymerase is Phi29 polymerase. Phi29 polymerase has a strong displacement activity at moderate temperatures (e.g., 20-37℃) .
In one approach, Bst DNA Polymerase, Large Fragment (NEB #M0275) is used. Bst DNA Polymerase is active at elevated temperatures (~65℃) .
In one approach, the polymerase is Deep-VentR DNA polymerase (NEB #M0258) (Hommelsheim et al., Scientific Reports 4: 5052 (2014) ) .
9. Substrates and Compartments
In some applications, DNA template polynucleotides are immobilized on a substrate. Generally, the immobilization occurs prior to synthesis of the second strands discussed above. Exemplary substrates may be substantially planar (e.g., slides) or nonplanar and unitary or formed from a plurality of distinct units (e.g., beads) . Exemplary materials include glass, ceramic, silica, silicon, metal,  elastomer (e.g., silicone) , polyacrylamide (e.g., a polyacrylamide hydrogel; see WO 2005/065814) . In some embodiments, the substrate comprises an ordered or non-ordered array of immobilization sites or wells. In some approaches, target DNA polynucleotides are immobilized on a substantially planar substrate, such as a substrate comprising an ordered or non-ordered array of immobilization sites or wells. In some approaches, target DNA polynucleotides are immobilized on beads.
Polynucleotides can be immobilized on a substrate by a variety of techniques, including covalent and non-covalent attachment. In one embodiment, a surface may include capture probes that form complexes, e.g., double stranded duplexes, with a component of the polynucleotide molecule, such as an adaptor oligonucleotide. In another embodiment, a surface may have reactive functionalities that react with complementary functionalities on the polynucleotide molecules to form a covalent linkage. Long DNA molecules, e.g., several nucleotides or larger, may also be efficiently attached to hydrophobic surfaces, such as a clean glass surface that has a low concentration of various reactive functionalities, such as –OH groups. In still another embodiment, polynucleotide molecules can be adsorbed to a surface through non-specific interactions with the surface, or through non-covalent interactions such as hydrogen bonding, van der Waals forces, and the like.
For example, a DNA nanoball may be immobilized to a discrete spaced apart region as described in US Pat. No. 8,609,335 to Drmanac et al. In one approach, the DNBs are immobilized on a substrate by hybridization to immobilized probe sequences, and solid-phase nucleic acid amplification methods are used to produce clonal clusters comprising DNA template polynucleotides. See, e.g., WO 98/44151 and WO 00/18957.
In some embodiments, DNA template polynucleotides are compartmentalized in an emulsion, droplets, on beads and/or in microwells (Margulies et al. "Genome sequencing in microfabricated high-density picolitre reactors." Nature 437: 7057 (2005); Shendure et al. “Accurate multiplex polony sequencing of an evolved bacterial genome” Science 309, 1728–1732 (2005)) prior to the primer extension steps.
Typically, DNA nanoballs are arrayed on a substrate in either an ordered or random array. In many applications the adsorption to the substrate is mediated through substrate-protein-DNA interactions. In addition, to achieve stable nanoball arrays through cycles of sequencing, post-attachment deposition of a protein layer can improve stability of the DNA array; see WO2013066975A1, the entire disclosure of which is herein incorporated by reference.
10. Arrays of DNA Complexes
In one aspect the invention comprises an array of DNA complexes. In one aspect, the array is a support comprising an array of discrete areas, wherein a plurality of the areas comprise (a) a clonal cluster of single-stranded DNA templates and a plurality of linker oligonucleotides, wherein the DNA template is a single-stranded concatemer comprising a plurality of monomers, wherein each monomer comprises an adaptor sequence and a DNA target sequence, wherein each linker oligonucleotide comprises a primer sequence that is complementary to and hybridizes to an adaptor sequence of the DNA template, wherein the primer sequence includes an extendible 3’ end, and wherein at least two linker oligonucleotides of the plurality of linker oligonucleotides that are hybridized to the DNA template are connected to each other.
In one aspect the invention comprises a DNA complex comprising a DNA template and two or more second strands, wherein each second strand comprises an overhang region and a hybridized region that is hybridized to the DNA template, wherein the two or more second strands are complementary to the DNA template, and wherein at least two second strands are connected at their respective 5’ ends.
In some embodiments, each of a plurality of second linker oligonucleotides is used as a primer for primer extension (e.g., a sequencing by synthesis reaction), or is an extension product of such a primer, or is an oligonucleotide capable of acting as an anchor for sequencing by ligation, or is a ligation product of such an oligonucleotide and a labeled probe (e.g., a labeled cPAL probe). In one approach the second linker oligonucleotide comprises a portion complementary to the adaptor sequence and can be extended for sequencing the second strand.
It will be appreciated that the DNA complexes of the array may comprise any of the properties of complexes described herein or made according to methods described herein. Additionally, the complexes may have any combination of one or more of the following features: (i) the array comprises at least 10^6 discrete areas, (ii) the DNAs are single-stranded, (iii) the second linker oligonucleotide comprises at least 10 bases of sequence of the adaptor, preferably at least 12 bases, and optionally at least 15 bases, and (iv) the second linker oligonucleotide is completely complementary to the second DNA strand to which it is hybridized.
10. Compositions
In one aspect the disclosure provides a composition comprising an array as described above in Section 9 and an enzyme selected from DNA ligase and DNA polymerase. In some cases, the composition comprises two DNA polymerases, one with strand displacement activity and one without strand displacement activity. In some embodiments, the composition further comprises fluorescently tagged dNTPs (e.g., dNTP analogs) and/or a pool of tagged oligonucleotide probes.
11. Examples
11.1 Example 1: Effect of a Z-linker on RhoA (adenosine (“A”) base intensity) decline, mapping rates and error rates
DNBs were produced by rolling circle amplification using a library of single-stranded circles comprising human genomic DNA fragments. DNBs were immobilized on a DNB array chip and sequenced using BGIseq500. Sequencing was performed by cycles of sequencing by synthesis with a DNA polymerase and addition of reversibly blocked fluorescent terminators. Incorporation and de-blocking occurred at temperatures between 50℃ and 60℃. Standard primers, e.g., primers that do not include a stapler sequence and do not link multiple subunits of a DNB, or linker oligonucleotides having the configuration of seq 6 (A-bB-A) were used as sequencing oligonucleotides. Two seq 6 linker oligonucleotides hybridize to each other and form a Z-linker. Mapping rates and error rates were determined from individual reads according to the instructions provided by the BGIseq500 software.
Sequencing oligonucleotides (1 μM) were hybridized to the DNBs and SBS was performed for 175 cycles at temperatures of 20℃ to 57℃. Reversible Terminator nucleotides (RTs) labeled with 4 different fluorescent dyes were incorporated during each sequencing cycle. In addition, unlabeled nucleotides were incorporated in each cycle of sequencing to further incorporate at every subunit of each DNB during each cycle of sequencing. After imaging, the 3’ blocking group was removed with a phosphine reagent before the next incorporation event.
FIG. 3 shows the effect of the Z-linker on the reduction in intensity over multiple cycles (e.g., over 175 cycles) of sequencing. The Y-axis represents the intensity measurement (referred to as “Rho”) for one base group (A base). Rho was processed to indicate the average intensity of DNBs after being assigned to a base group per BGIseq500 software. Lane 1 (A L1) (i.e., signals from sequencing with the standard primer) showed a more rapid reduction in intensity values over 170 cycles of sequencing compared with lane 2 (A L2) (i.e., signals from sequencing with the Z-linker). Without being bound by any theory, the decline of signal as sequencing cycles progress can potentially have multiple causes, such as DNB shearing or structural loss of DNB mass, loss of the extending strand, irreversible termination of nucleotides, and out-of-phase base reading within DNBs. The slower decline in signals from sequencing with the Z-linker indicates less loss of DNBs.
Stabilizing the structure of the DNB may help to keep DNB intensity well above background, which results in lower error rates and higher mapping rates. FIG. 4A shows that, for the first 66 bases, the flow cell lane with the linker oligonucleotides had a slightly higher mapping rate (82%) than that of the lane with the standard primer (81.2%). For the second 66 bases, the mapping rate for sequencing using the linker oligonucleotides was 82%, which was substantially higher than sequencing with the standard primer, which showed a 72% mapping rate.
As shown in FIG. 4B, error rates with the Z-linker oligonucleotides were generally lower than with standard primers. For example, for the second 66 bases, the error rate was about half when a linker oligonucleotide was included: 0.57% with the linker oligonucleotides versus 1.17% with the standard primers.
11.2 Example 2: Effect of another Z-linker on mapping rates and error rates
DNBs were produced and sequencing was performed as described in Example 1, except that the linker oligonucleotides were seq3 (bB-A), which comprises a palindromic stapler sequence bB and a primer sequence A. Hybridization of two seq3 linker oligonucleotides forms a Z-linker.
Mapping and discordance were determined for the first strand 50 bases of a read and the second strand 50 bases of the read. As shown in FIG. 5A, for the first strand 50 bases, the flowcell lane with the Z-linker stapler had a slightly higher mapping rate (94%) than that of the lane with the standard primers (93%). For the second strand 50 bases, the mapping rate for the sequencing reaction with the Z-linker was 91%, which was substantially higher than the sequencing reaction with the standard primers, which had an 83% mapping rate.
As shown in FIG. 5B, error rates for sequencing reactions with the Z-linker were generally lower than those with standard primers. For example, for the second strand 50 bases, the error rate for sequencing reactions with the Z-linker was 0.31%, only about one half of that with standard primers (about 0.63%).
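The comparisons reported in Examples 1 and 2 reduce to simple arithmetic, sketched below using only the percentages stated in the text. The dictionary layout and function names are illustrative choices, not part of the described method.

```python
# Percentages copied from Examples 1 and 2 (later-read portions only).
RESULTS = {
    "Example 1, second 66 bases": {
        "standard": {"mapping": 72.0, "error": 1.17},
        "z_linker": {"mapping": 82.0, "error": 0.57},
    },
    "Example 2, second strand 50 bases": {
        "standard": {"mapping": 83.0, "error": 0.63},
        "z_linker": {"mapping": 91.0, "error": 0.31},
    },
}


def mapping_gain(entry: dict) -> float:
    """Mapping-rate improvement with the Z-linker, in percentage points."""
    return entry["z_linker"]["mapping"] - entry["standard"]["mapping"]


def error_ratio(entry: dict) -> float:
    """Error rate with the Z-linker divided by the standard-primer error rate."""
    return entry["z_linker"]["error"] / entry["standard"]["error"]


if __name__ == "__main__":
    for name, entry in RESULTS.items():
        print(f"{name}: +{mapping_gain(entry):.1f} points mapping, "
              f"error ratio {error_ratio(entry):.2f}")
```

Both error ratios come out near 0.49, consistent with the statements that error rates were about half with the Z-linker.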
Illustrative Embodiments of the invention
The following are the non-limiting embodiments of the invention.
Embodiment 1. A method of preparing a DNA template for nucleic acid analysis comprising hybridizing a plurality of first strand linker oligonucleotides to the DNA template,
wherein the DNA template is a single-stranded concatemer comprising a plurality of monomers, wherein each monomer comprises an adaptor sequence and a DNA target sequence,
wherein each first strand linker oligonucleotide comprises a template-hybridizing sequence, wherein the template-hybridizing sequence is complementary to and hybridizes to an adaptor sequence of the DNA template, and
wherein at least two first strand linker oligonucleotides of the plurality of first strand linker oligonucleotides that are hybridized to the DNA template are linked to each other.
Embodiment 2. The method of embodiment 1, wherein the template-hybridizing sequence is a primer sequence, and wherein the primer sequence includes an extendible 3’ end.
Embodiment 3. The method of embodiment 1, wherein the method further comprises extending the at least two first strand linker oligonucleotides to generate at least two second strands by one or more DNA polymerases,
wherein the at least two first strand linker oligonucleotides are linked to each other and are hybridized to the DNA template,
thereby producing at least two second strands each having a 5’ end, wherein the 5’ ends of the two second strands are linked.
Embodiment 4. The method of embodiment 1, 2 or 3, wherein the at least two first strand linker oligonucleotides are linked through DNA hybridization, a covalent bond, or both.
Embodiment 5. The method of embodiment 1, 2 or 3, wherein the at least two first strand linker oligonucleotides each comprises a stapler sequence, wherein the at least two first strand linker oligonucleotides are linked by hybridization of the respective stapler sequences.
Embodiment 6. The method of any of the embodiments 1-3, wherein the at least two first strand linker oligonucleotides are linked by a shared scaffold.
Embodiment 7. The method of embodiment 5, wherein at least one of the first strand linker oligonucleotides comprises two cleavage sites flanking the stapler sequence, wherein cleaving at the cleavage sites releases the stapler sequence.
Embodiment 8. The method of embodiment 5, wherein the stapler sequences of the at least two first strand linker oligonucleotides are hybridized to different regions of a shared scaffold, thereby linking the at least two first strand linker oligonucleotides.
Embodiment 9. The method of embodiment 5, wherein the at least two first strand linker oligonucleotides bind to the DNA template before they bind to each other via the respective stapler sequences.
Embodiment 10. The method of embodiment 9, wherein the method comprises:
1) hybridizing blocker oligonucleotides to the stapler sequences of the at least two first strand linker oligonucleotides in a reaction, thereby forming partially double stranded first strand linker oligonucleotides,
wherein each of the partially double stranded first strand linker oligonucleotides comprises i) a double-stranded region consisting of the blocker oligonucleotide and the stapler sequence, thereby preventing stapler sequences of the first strand linker oligonucleotides from hybridizing to each other, and ii) a single-stranded region comprising the sequence that is a template-hybridizing sequence,
2) adding DNA template to the reaction, with the first strand linker oligonucleotides, wherein the primer sequences of the partially double stranded first strand linker oligonucleotides bind to the DNA template,
3) removing the blocker oligonucleotide by one or more of the following: raising temperature of the reaction such that the blocker oligonucleotide is dissociated from the DNA template, enzymatic or chemical degradation of the blocker oligonucleotide,
4) washing to remove the blocker oligonucleotide, and
5) adjusting the temperature to allow the hybridization of the stapler sequences of the first strand linker oligonucleotides to each other.
Embodiment 11. The method of embodiment 5, wherein the stapler sequences of the at least two first strand linker oligonucleotides bind to each other to form linked first strand linker oligonucleotides before binding of the primer sequences to the DNA template.
Embodiment 12. The method of embodiment 11, wherein the method comprises using the linked first strand linker oligonucleotides at a concentration below a predetermined threshold such that two first strand linker oligonucleotides bind to a single DNA template molecule.
Embodiment 13. The method of embodiment 5, wherein the stapler sequence is a palindromic stapler sequence, and
wherein for each of the at least two first strand linker oligonucleotides, the palindromic stapler sequence is 5’ to the template-hybridizing sequence.
Embodiment 14. The method of embodiment 5, wherein the at least two first strand linker oligonucleotides comprise two complementary, non-palindromic stapler sequences, one on each first strand linker oligonucleotide; and
wherein at least two first strand linker oligonucleotides are linked through hybridization of the two complementary, non-palindromic stapler sequences.
Embodiment 15. The method of embodiment 5, wherein the at least two first strand linker oligonucleotides each comprises a non-palindromic stapler sequence and a palindromic stapler sequence.
Embodiment 16. The method of embodiment 15, wherein for each of the at least two first strand linker oligonucleotides, the palindromic stapler sequence is interposed between the non-palindromic stapler sequence and the template-hybridizing sequence.
Embodiment 17. The method of embodiment 5, wherein the at least two first strand linker oligonucleotides each comprises a stapler sequence interposed between two primer sequences, wherein the stapler sequences on the at least two first strand linker oligonucleotides are hybridized to each other.
Embodiment 18. The method of embodiment 5, wherein the stapler sequence has a length that ranges between 8 and 50 nucleotides.
Embodiment 19. The method of embodiment 5, wherein the template-hybridizing sequence has a length that ranges from 15 to 70 nucleotides.
Embodiment 20. A method of preparing a DNA template for nucleic acid analysis comprising hybridizing a plurality of first strand linker oligonucleotides to the DNA template in a reaction mixture,
wherein the DNA template is a single-stranded concatemer comprising a plurality of monomers, wherein each monomer comprises an adaptor sequence and a DNA target sequence,
wherein each first strand linker oligonucleotide comprises a primer sequence that is complementary to and hybridizes to an adaptor sequence of the DNA template,
wherein at least one first strand linker oligonucleotide comprises a blocking group at 3’ of the primer sequence to prevent extension, and
wherein at least two first strand linker oligonucleotides of the plurality of first strand linker oligonucleotides that are hybridized to the DNA template are linked to each other.
Embodiment 21. The method of embodiment 20, wherein the blocking group is a reversible blocking group, wherein the method further comprises
removing the blocking group from the at least one first strand linker oligonucleotide, and
extending the at least one first strand linker oligonucleotide to generate at least one second strand.
Embodiment 22. The method of embodiment 5, wherein the first strand linker oligonucleotides can be cleaved at a site that is in the stapler sequence or in the template-hybridizing sequence.
Embodiment 23. The method of embodiment 20, wherein the method further comprises removing, from the reaction mixture, first strand linker oligonucleotides that are not bound to the DNA template.
Embodiment 24. The method of embodiment 1, wherein the method further comprises
1) extending at least two of the plurality of first strand linker oligonucleotides by a non-displacement DNA polymerase to generate at least two partially extended second strands that are fully hybridized to the DNA template,
wherein the at least two fully hybridized second strands include an upstream second strand and a downstream second strand, and
2) further extending the partially extended upstream second strand and downstream second strand, wherein extending the partially extended upstream second strand partially displaces the partially extended downstream second strand, thereby producing a partially hybridized downstream second strand.
Embodiment 25. A DNA complex comprising a DNA template and a plurality of first strand linker oligonucleotides, wherein the DNA template is a single-stranded concatemer comprising a plurality of monomers, wherein each monomer comprises an adaptor sequence and a DNA target sequence,
wherein each first strand linker oligonucleotide comprises a template-hybridizing sequence,
wherein the template-hybridizing sequence is complementary to and hybridizes to an adaptor sequence of the DNA template, and
wherein at least two of the plurality of first strand linker oligonucleotides hybridized to the DNA template are linked to each other.
Embodiment 26. The DNA complex of embodiment 25, wherein the template-hybridizing sequence is a primer sequence, and wherein the primer sequence includes an extendible 3’ end.
Embodiment 27. A DNA complex comprising a DNA template, two or more second strands, wherein each second strand comprises an overhang region and a hybridized region that is hybridized to the DNA template,
wherein the two or more second strands are complementary to the DNA template, and
wherein the 5’ ends of the at least two second strands are linked.
Embodiment 28. The DNA complex of embodiment 27, wherein the DNA complex further comprises two or more second strand linker oligonucleotides that are hybridized to two or more second strands,
each second strand linker oligonucleotide comprising a second stapler sequence and at least two second strand linker oligonucleotides are linked through hybridization of the respective second stapler sequences.
Embodiment 29. A DNA array comprising the DNA complex of any of embodiments 25-28.
Embodiment 30. Two linker oligonucleotides, each linker oligonucleotide comprising a stapler sequence and a primer sequence, and the stapler sequence is 5’ to the primer sequence,
wherein the stapler sequences in the two linker oligonucleotides are complementary to each other and hybridize to each other, resulting in two linker oligonucleotides hybridized to each other.
Embodiment 31. The two linker oligonucleotides of embodiment 30, wherein the stapler sequences are palindromic sequences.
Embodiment 32. The two linker oligonucleotides of embodiment 30, wherein the primer sequences on both linker oligonucleotides have the same sequence.
Embodiment 33. The two linker oligonucleotides of embodiment 30, wherein each oligonucleotide comprises an additional stapler sequence that is a non-palindromic stapler sequence, and wherein the additional non-palindromic stapler sequence is 5’ to the stapler sequence.
Embodiment 34. The two linker oligonucleotides of embodiment 33, wherein the additional non-palindromic stapler sequence in one of the two linker oligonucleotides is hybridized to a stapler sequence in a third linker oligonucleotide.
Embodiment 35. The two linker oligonucleotides of embodiment 30, at least one of which comprises a sequence that is selected from the group consisting of SEQ ID NO: 1-10.
Embodiment 36. A method of preparing a DNA template for nucleic acid analysis comprising immobilizing the DNA template on an array, wherein the DNA template is a DNA concatemer comprising a plurality of monomers, wherein each monomer comprises an adaptor sequence and a DNA target sequence,
hybridizing a plurality of first strand linker oligonucleotides to the DNA template, wherein each first strand linker oligonucleotide comprises a template-hybridizing sequence,
wherein the template-hybridizing sequence is complementary to and hybridizes to an adaptor sequence of the DNA template, and wherein at least two first strand linker oligonucleotides of the plurality of first strand linker oligonucleotides that are hybridized to the DNA template are linked to each other.
Embodiment 37. The method of embodiment 36, wherein the template-hybridizing sequence is a primer sequence, and wherein the primer sequence includes an extendible 3’ end.
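The distinction drawn in Embodiments 13, 14, 18 and 19 between palindromic and complementary non-palindromic stapler sequences, and the recited length ranges, can be illustrated with a short sketch. It is offered for illustration only and is not part of the disclosed methods: the function names and example sequences are hypothetical (they are not SEQ ID NO: 1-10), and the length ranges are assumed here to be inclusive.

```python
# Illustrative sketch only. All sequences are hypothetical examples,
# not sequences taken from the disclosure.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindromic(stapler: str) -> bool:
    """A palindromic stapler equals its own reverse complement,
    so two copies of the same oligonucleotide can hybridize to each other."""
    return stapler.upper() == reverse_complement(stapler)


def staplers_can_hybridize(stapler_a: str, stapler_b: str) -> bool:
    """Two staplers are fully complementary when one is the
    reverse complement of the other."""
    return stapler_b.upper() == reverse_complement(stapler_a)


def length_in_range(seq: str, low: int, high: int) -> bool:
    """Check a length range; inclusive bounds are an assumption."""
    return low <= len(seq) <= high


# Hypothetical palindromic stapler (Embodiment 13).
palindromic_stapler = "GAATTCGAATTC"

# Hypothetical complementary, non-palindromic pair (Embodiment 14).
stapler_a = "AGGTCACTTGCA"
stapler_b = reverse_complement(stapler_a)

print(is_palindromic(palindromic_stapler))           # palindromic stapler
print(is_palindromic(stapler_a))                     # non-palindromic stapler
print(staplers_can_hybridize(stapler_a, stapler_b))  # complementary pair
print(length_in_range(palindromic_stapler, 8, 50))   # Embodiment 18 range
```

A palindromic stapler lets a single oligonucleotide species pair with itself, whereas a non-palindromic pair requires two distinct species, one carrying each stapler.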
***
All publications and patent documents cited herein are incorporated herein by reference as if each such publication or document was specifically and individually indicated to be incorporated herein by reference. Although the present invention is described primarily with reference to specific embodiments, it is also envisioned that other embodiments will become apparent to those skilled in the art upon reading the present disclosure, and it is intended that such embodiments be contained within the present inventive methods.

Claims (37)

  1. A method of preparing a DNA template for nucleic acid analysis comprising hybridizing a plurality of first strand linker oligonucleotides to the DNA template,
    wherein the DNA template is a single-stranded concatemer comprising a plurality of monomers, wherein each monomer comprises an adaptor sequence and a DNA target sequence,
    wherein each first strand linker oligonucleotide comprises a template-hybridizing sequence, wherein the template-hybridizing sequence is complementary to and hybridizes to an adaptor sequence of the DNA template, and
    wherein at least two first strand linker oligonucleotides of the plurality of first strand linker oligonucleotides that are hybridized to the DNA template are linked to each other.
  2. The method of claim 1, wherein the template-hybridizing sequence is a primer sequence, and wherein the primer sequence includes an extendible 3’ end.
  3. The method of claim 1, wherein the method further comprises extending the at least two first strand linker oligonucleotides to generate at least two second strands by one or more DNA polymerases,
    wherein the at least two first strand linker oligonucleotides are linked to each other and are hybridized to the DNA template,
    thereby producing at least two second strands each having a 5’ end, wherein the 5’ ends of the two second strands are linked.
  4. The method of claim 1, 2 or 3, wherein the at least two first strand linker oligonucleotides are linked through DNA hybridization, a covalent bond, or both.
  5. The method of claim 1, 2 or 3, wherein the at least two first strand linker oligonucleotides each comprises a stapler sequence, wherein the at least two first strand linker oligonucleotides are linked by hybridization of the respective stapler sequences.
  6. The method of any of the claims 1-3, wherein the at least two first strand linker oligonucleotides are linked by a shared scaffold.
  7. The method of claim 5, wherein at least one of the first strand linker oligonucleotides comprises two cleavage sites flanking the stapler sequence, wherein cleaving at the cleavage sites releases the stapler sequence.
  8. The method of claim 5, wherein the stapler sequences of the at least two first strand linker oligonucleotides are hybridized to different regions of a shared scaffold, thereby linking the at least two first strand linker oligonucleotides.
  9. The method of claim 5, wherein the at least two first strand linker oligonucleotides bind to the DNA template before they bind to each other via the respective stapler sequences.
  10. The method of claim 9, wherein the method comprises:
    1) hybridizing blocker oligonucleotides to the stapler sequences of the at least two first strand linker oligonucleotides in a reaction, thereby forming partially double stranded first strand linker oligonucleotides,
    wherein each of the partially double stranded first strand linker oligonucleotides comprises i) a double-stranded region consisting of the blocker oligonucleotide and the stapler sequence, thereby preventing stapler sequences of the first strand linker oligonucleotides from hybridizing to each other, and ii) a single-stranded region comprising the sequence that is a template-hybridizing sequence,
    2) adding DNA template to the reaction, with the first strand linker oligonucleotides, wherein the primer sequences of the partially double stranded first strand linker oligonucleotides bind to the DNA template,
    3) removing the blocker oligonucleotide by one or more of the following: raising temperature of the reaction such that the blocker oligonucleotide is dissociated from the DNA template, enzymatic or chemical degradation of the blocker oligonucleotide,
    4) washing to remove the blocker oligonucleotide, and
    5) adjusting the temperature to allow the hybridization of the stapler sequences of the first strand linker oligonucleotides to each other.
  11. The method of claim 5, wherein the stapler sequences of the at least two first strand linker oligonucleotides bind to each other to form linked first strand linker oligonucleotides before binding of the primer sequences to the DNA template.
  12. The method of claim 11, wherein the method comprises using the linked first strand linker oligonucleotides at a concentration below a predetermined threshold such that two first strand linker oligonucleotides bind to a single DNA template molecule.
  13. The method of claim 5, wherein the stapler sequence is a palindromic stapler sequence, and
    wherein for each of the at least two first strand linker oligonucleotides, the palindromic stapler sequence is 5’ to the template-hybridizing sequence.
  14. The method of claim 5, wherein the at least two first strand linker oligonucleotides comprise two complementary, non-palindromic stapler sequences, one on each first strand linker oligonucleotide; and
    wherein at least two first strand linker oligonucleotides are linked through hybridization of the two complementary, non-palindromic stapler sequences.
  15. The method of claim 5, wherein the at least two first strand linker oligonucleotides each comprises a non-palindromic stapler sequence and a palindromic stapler sequence.
  16. The method of claim 15, wherein for each of the at least two first strand linker oligonucleotides, the palindromic stapler sequence is interposed between the non-palindromic stapler sequence and the template-hybridizing sequence.
  17. The method of claim 5, wherein the at least two first strand linker oligonucleotides each comprises a stapler sequence interposed between two primer sequences, wherein the stapler sequences on the at least two first strand linker oligonucleotides are hybridized to each other.
  18. The method of claim 5, wherein the stapler sequence has a length that ranges between 8 and 50 nucleotides.
  19. The method of claim 5, wherein the template-hybridizing sequence has a length that ranges from 15 to 70 nucleotides.
  20. A method of preparing a DNA template for nucleic acid analysis comprising hybridizing a plurality of first strand linker oligonucleotides to the DNA template in a reaction mixture,
    wherein the DNA template is a single-stranded concatemer comprising a plurality of monomers, wherein each monomer comprises an adaptor sequence and a DNA target sequence,
    wherein each first strand linker oligonucleotide comprises a primer sequence that is complementary to and hybridizes to an adaptor sequence of the DNA template,
    wherein at least one first strand linker oligonucleotide comprises a blocking group at 3’ of the primer sequence to prevent extension, and
    wherein at least two first strand linker oligonucleotides of the plurality of first strand linker oligonucleotides that are hybridized to the DNA template are linked to each other.
  21. The method of claim 20, wherein the blocking group is a reversible blocking group, wherein the method further comprises
    removing the blocking group from the at least one first strand linker oligonucleotide, and
    extending the at least one first strand linker oligonucleotide to generate at least one second strand.
  22. The method of claim 5, wherein the first strand linker oligonucleotides can be cleaved at a site that is in the stapler sequence or in the template-hybridizing sequence.
  23. The method of claim 20, wherein the method further comprises removing, from the reaction mixture, first strand linker oligonucleotides that are not bound to the DNA template.
  24. The method of claim 1, wherein the method further comprises
    1) extending at least two of the plurality of first strand linker oligonucleotides by a non-displacement DNA polymerase to generate at least two partially extended second strands that are fully hybridized to the DNA template,
    wherein the at least two fully hybridized second strands include an upstream second strand and a downstream second strand, and
    2) further extending the partially extended upstream second strand and downstream second strand, wherein extending the partially extended upstream second strand partially displaces the partially extended downstream second strand, thereby producing a partially hybridized downstream second strand.
  25. A DNA complex comprising a DNA template and a plurality of first strand linker oligonucleotides, wherein the DNA template is a single-stranded concatemer comprising a plurality of monomers, wherein each monomer comprises an adaptor sequence and a DNA target sequence,
    wherein each first strand linker oligonucleotide comprises a template-hybridizing sequence,
    wherein the template-hybridizing sequence is complementary to and hybridizes to an adaptor sequence of the DNA template, and
    wherein at least two of the plurality of first strand linker oligonucleotides hybridized to the DNA template are linked to each other.
  26. The DNA complex of claim 25, wherein the template-hybridizing sequence is a primer sequence, and wherein the primer sequence includes an extendible 3’ end.
  27. A DNA complex comprising a DNA template, two or more second strands, wherein each second strand comprises an overhang region and a hybridized region that is hybridized to the DNA template,
    wherein the two or more second strands are complementary to the DNA template, and
    wherein the 5’ ends of the at least two second strands are linked.
  28. The DNA complex of claim 27, wherein the DNA complex further comprises two or more second strand linker oligonucleotides that are hybridized to two or more second strands,
    each second strand linker oligonucleotide comprising a second stapler sequence and at least two second strand linker oligonucleotides are linked through hybridization of the respective second stapler sequences.
  29. A DNA array comprising the DNA complex of any of claims 25-28.
  30. Two linker oligonucleotides, each linker oligonucleotide comprising a stapler sequence and a primer sequence, and the stapler sequence is 5’ to the primer sequence,
    wherein the stapler sequences in the two linker oligonucleotides are complementary to each other and hybridize to each other, resulting in two linker oligonucleotides hybridized to each other.
  31. The two linker oligonucleotides of claim 30, wherein the stapler sequences are palindromic sequences.
  32. The two linker oligonucleotides of claim 30, wherein the primer sequences on both linker oligonucleotides have the same sequence.
  33. The two linker oligonucleotides of claim 30, wherein each oligonucleotide comprises an additional stapler sequence that is a non-palindromic stapler sequence, and wherein the additional non-palindromic stapler sequence is 5’ to the stapler sequence.
  34. The two linker oligonucleotides of claim 33, wherein the additional non-palindromic stapler sequence in one of the two linker oligonucleotides is hybridized to a stapler sequence in a third linker oligonucleotide.
  35. The two linker oligonucleotides of claim 30, at least one of which comprises a sequence that is selected from the group consisting of SEQ ID NO: 1-10.
  36. A method of preparing a DNA template for nucleic acid analysis comprising immobilizing the DNA template on an array, wherein the DNA template is a DNA concatemer comprising a plurality of monomers, wherein each monomer comprises an adaptor sequence and a DNA target sequence,
    hybridizing a plurality of first strand linker oligonucleotides to the DNA template, wherein each first strand linker oligonucleotide comprises a template-hybridizing sequence,
    wherein the template-hybridizing sequence is complementary to and hybridizes to an adaptor sequence of the DNA template, and wherein at least two first strand linker oligonucleotides of the plurality of first strand linker oligonucleotides that are hybridized to the DNA template are linked to each other.
  37. The method of claim 36, wherein the template-hybridizing sequence is a primer sequence, and wherein the primer sequence includes an extendible 3’ end.
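
The arrangement recited in claims 1, 13 and 30 (a stapler sequence 5’ to a primer sequence, with the primer complementary to an adaptor that recurs in every monomer of a concatemer) can be pictured with the following sketch. It is illustrative only and forms no part of the claims: the class, function, adaptor, target and stapler sequences are all hypothetical, and a three-monomer concatemer is used purely to keep the example small.

```python
# Illustrative sketch only. Names and sequences are hypothetical.
from dataclasses import dataclass

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


@dataclass(frozen=True)
class LinkerOligo:
    stapler: str  # stapler sequence, located 5' to the primer
    primer: str   # template-hybridizing (primer) sequence

    @property
    def sequence(self) -> str:
        """Full oligonucleotide, written 5' to 3'."""
        return self.stapler + self.primer


def binding_sites(template: str, primer: str) -> list[int]:
    """Start positions on the template where the primer can hybridize,
    i.e. where the reverse complement of the primer occurs."""
    site = reverse_complement(primer)
    positions = []
    start = template.find(site)
    while start != -1:
        positions.append(start)
        start = template.find(site, start + len(site))
    return positions


# Hypothetical monomer: adaptor sequence followed by a DNA target sequence.
ADAPTOR = "AAGTCGGAGGCCAAGCGG"
TARGET = "TTCAGTACCATGATTCAGCA"
MONOMER = ADAPTOR + TARGET
CONCATEMER = MONOMER * 3  # single-stranded concatemer of three monomers

# The primer is complementary to the adaptor; the stapler is palindromic.
linker = LinkerOligo(stapler="GAATTCGAATTC", primer=reverse_complement(ADAPTOR))

sites = binding_sites(CONCATEMER, linker.primer)
print(linker.sequence)
print(sites)  # one binding site per monomer
```

Because the adaptor recurs once per monomer, several copies of the same linker oligonucleotide can bind one template, and any two of them can then pair through their stapler sequences.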

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202080075206.4A CN114641581A (en) 2019-10-28 2020-10-28 DNA adaptor oligonucleotide

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201962927060P 2019-10-28 2019-10-28
US62/927,060 2019-10-28

Publications (1)

Publication Number Publication Date
WO2021083195A1 (en) 2021-05-06

Family

ID=75714576

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/124338 WO2021083195A1 (en) 2019-10-28 2020-10-28 Dna linker oligonucleotides

Country Status (2)

Country Link
CN (1) CN114641581A (en)
WO (1) WO2021083195A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101580829A (en) * 2009-02-27 2009-11-18 深圳大学 Gene site-directed multi-site mutation method
WO2016133764A1 (en) * 2015-02-17 2016-08-25 Complete Genomics, Inc. Dna sequencing using controlled strand displacement
CN106661631A (en) * 2014-06-06 2017-05-10 康奈尔大学 Method for identification and enumeration of nucleic acid sequence, expression, copy, or dna methylation changes, using combined nuclease, ligase, polymerase, and sequencing reactions
WO2017143006A1 (en) * 2016-02-17 2017-08-24 President And Fellows Of Harvard College Molecular programming tools
CN108070642A (en) * 2016-11-17 2018-05-25 深圳华大基因研究院 The method and the double end sequencing methods of DNB and kit of the double end sequencing quality of raising DNB

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
PORRECA, G. J.: "Genome sequencing on nanoballs", NATURE BIOTECHNOLOGY, vol. 28, no. 1, 31 January 2010 (2010-01-31), pages 43 - 44, XP037104026, DOI: 10.1038/nbt0110-43 *

Also Published As

Publication number Publication date
CN114641581A (en) 2022-06-17

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 20880846; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: PCT application non-entry in European phase (Ref document number: 20880846; Country of ref document: EP; Kind code of ref document: A1)