WO2020180813A1

WO2020180813A1 - Compositions and methods for adaptor design and nucleic acid library construction for rolony-based sequencing

Info

Publication number: WO2020180813A1
Application number: PCT/US2020/020694
Authority: WO
Inventors: Yanhong Tong; Thomas PERROUD; Dietrich Wilhelm Karl Lueerssen
Original assignee: Qiagen Sciences, Llc; Qiagen Manchester Ltd.
Priority date: 2019-03-06
Filing date: 2020-03-02
Publication date: 2020-09-10

Abstract

The present disclosure provides compositions and methods for adaptor design and nucleic acid library construction for rolony-based sequencing. Also provided are kits for preparing a library of rolonies for sequencing.

Description

COMPOSITIONS AND METHODS FOR ADAPTOR DESIGN AND NUCLEIC ACID LIBRARY CONSTRUCTION FOR ROLONY-BASED SEQUENCING

STATEMENT REGARDING SEQUENCE LISTING

The Sequence Listing associated with this application is provided in text format in lieu of a paper copy, and is hereby incorporated by reference into the specification. The name of the text file containing the Sequence Listing is 830109_417WO_SEQUENCE_LISTING.txt. The text file is 7.8 KB, was created on February 27, 2020, and is being submitted electronically via EFS-Web.

BACKGROUND

Next generation sequencing (NGS) has been widely used for the detection and confirmation of genetic changes. Rolonies (rolling circle colonies), which are single stranded DNA concatemers produced by rolling circle amplification of a circularized DNA fragment, offer certain advantages as a template for sequencing, including high image efficiency due to bright signals from hundreds of reaction sites being in the compact rolony, reduced reagent consumption due to the compactness of rolonies allowing for high density arrays, and improved sequencing accuracy. However, current adaptor designs and library construction methods for rolony-based sequencing do not accommodate all of the following features: (i) ability to sequence multiple, distinct, separate fragments in the same sequencing run (e.g., target sequence, index sequence, unique molecular identifier sequence); (ii) compatibility with use of single primer extension (SPE); (iii) compatibility with use of unique molecular identifiers; (iv) suitability for paired-end sequencing; and (v) suitability for use with dual sample indexes.

BRIEF SUMMARY

The present disclosure provides adaptors, kits, and methods for nucleic acid library construction for rolony-based sequencing.

In one aspect, the present disclosure provides a method of producing a library of circular, single-stranded nucleic acid templates, each circular, single-stranded nucleic acid template comprising a strand of a double-stranded target nucleic acid, a strand of a first adaptor or the complement thereof, and a strand of a second adaptor or the complement thereof, the method comprising:

a. providing a plurality of fragments of double-stranded target nucleic acids;

b. adding a first adaptor to a 5’ terminus of a sense strand and to a 3’ terminus of an antisense strand of the plurality of fragments of double-stranded nucleic acids, wherein the first adaptor comprises:

(i) a single-stranded region comprising a first sequencing primer binding site, an optional unique molecular identifier (UMI), and a first sample index sequence, wherein the first sequencing primer binding site comprises a first universal primer binding site and a first portion of a bridge oligonucleotide binding site;

(ii) a double stranded linker region of about 15 to about 35 bases for ligation to the plurality of fragments of double-stranded nucleic acids, wherein the double stranded linker region comprises a second sequencing primer binding site;

c. adding a second adaptor to a 3’ terminus of the sense strand and to a 5’ terminus of the antisense strand of the plurality of fragments of double-stranded nucleic acids to produce a library of linear, double-stranded nucleic acid templates, wherein the second adaptor comprises a second universal primer binding site, and wherein the second universal primer binding site comprises a second portion of the bridge oligonucleotide binding site and optionally a third sequencing primer binding site;

d. optionally amplifying the library of linear, double-stranded nucleic acid templates with a first universal primer that binds to the first primer binding site and a second universal primer that binds to the second primer binding site;

e. denaturing the library of linear, double-stranded nucleic acid templates to produce a library of linear, single-stranded nucleic acid templates; and

f. circularizing the library of linear, single-stranded nucleic acid templates by adding a bridge oligonucleotide and ligating the first adaptor and second adaptor, thereby producing the library of circular, single-stranded nucleic acid templates.

In another aspect, the present disclosure provides a set of partially double-stranded adaptors for producing a library of circular, single-stranded nucleic acid templates,

wherein the set comprises a plurality of partially double-stranded adaptors; wherein each adaptor of the set comprises:

(i) a single-stranded region comprising a first sequencing primer binding site, a unique molecular identifier (UMI), and a sample index sequence, wherein the first sequencing primer binding site further comprises a first universal primer binding site and a first portion of a bridge oligonucleotide binding site;

(ii) a double stranded linker region of about 15 to about 35 bases for ligation to a double-stranded nucleic acid, wherein the double stranded linker region comprises a second sequencing primer binding site; and

wherein the plurality of the adaptors are identical to each other except their UMI sequences are different from each other.

In a further aspect, the present disclosure provides a kit for producing a library of circular, single-stranded nucleic acid templates, comprising:

(i) the set of partially double-stranded adaptors provided herein,

(ii) a second adaptor comprising a second universal primer binding site, wherein the second universal primer binding site comprises a second portion of the bridge oligonucleotide binding site and optionally a third sequencing primer binding site, and

(iii) the bridge oligonucleotide.

BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWINGS

FIG. 1 shows an exemplary first adaptor scheme, including a linker, molecular barcode (also referred to as unique molecular identifier (UMI)), sample index, and first sequencing primer.

FIG. 2 shows an exemplary second adaptor for rolony-based sequencing, including a target-specific sequence and a second universal primer sequence.

FIG. 3 shows exemplary steps (steps A to F) using a target specific PCR primer during library construction and clonal amplification to generate a bottom-strand rolony. The target specific PCR primer is used to generate a double stranded nucleic acid molecule containing a region of interest (ROI) and 1st and 2nd adaptors. The 5’ end of the 1st adaptor of the top strand is phosphorylated for circularization and ligation. A rolling circle amplification (RCA) primer hybridizes to the single-stranded, circular nucleic acid template (top strand) and is used to generate a bottom-strand rolony having sequencing primer binding sites for Seq 1A primer and Seq 2A primer.

FIG. 4 shows exemplary steps (steps A to F) using a target specific PCR primer during library construction and clonal amplification to generate a top-strand rolony. The target specific PCR primer is used to generate a double stranded nucleic acid molecule containing a region of interest (ROI) and 1st and 2nd adaptors. The 5’ end of the 2nd adaptor of the bottom strand is phosphorylated for circularization and ligation. An RCA primer hybridizes to the single-stranded, circular nucleic acid template (bottom strand) and is used to generate a top-strand rolony having sequencing primer binding sites for Seq 1B primer and Seq 2B primer.

FIG. 5 shows an embodiment of a library construction and clonal amplification workflow. Steps labeled 3A-3F refer to steps or products depicted in Figure 3 with the corresponding label. Steps labeled 4A-4F refer to steps or products depicted in Figure 4 with the corresponding label.

FIG. 6 shows an embodiment of library construction and clonal amplification for paired-end sequencing. Steps labeled 3A-3F refer to steps or products depicted in Figure 3 with the corresponding label. Steps labeled 4A-4F refer to steps or products depicted in Figure 4 with the corresponding label. Top and bottom rolonies are seeded on the same flow cell, with separate inlets and outlets. Sequencing for each strand is performed in separate areas of the flow cell at the same time in the sequencer.

FIG. 7 shows an embodiment of library construction with a first adaptor and a universal adaptor (second adaptor). The first adaptor and universal adaptor (second adaptor) are joined to a region of interest (ROI) via blunt ligation. The ligation product is amplified using a pair of universal primers, one of which is 5’ phosphorylated. The top strand is circularized for clonal amplification (rolling circle amplification (RCA)) to generate a bottom-strand rolony having sequencing primer binding sites for Seq 1A primer and Seq 2A primer.

FIG. 8 shows an embodiment of library construction with a first adaptor and a universal adaptor (second adaptor). The first adaptor and universal adaptor (second adaptor) are joined to a region of interest (ROI) via blunt ligation. The ligation product is amplified using a pair of universal primers, one of which is 5’ phosphorylated. The bottom strand is circularized for clonal amplification (RCA) to generate a top-strand rolony having sequencing primer binding sites for Seq 1B primer and Seq 2B primer.

FIG. 9 shows an embodiment of library construction compatible for use with dual indices and production of a bottom-strand rolony. The bottom-strand rolony can be sequentially sequenced by sequencing primer Seq 1A (hybridizes to the linker region of the first adaptor) for sequencing the ROI, sequencing primer Seq 2A (hybridizes to the first sequencing primer site of the first adaptor) for sequencing the first sample index and UMI, and sequencing primer Seq 3A (hybridizes to the 3rd universal primer binding site of the second adaptor) for sequencing the second sample index.

FIG. 10 shows an embodiment of library construction compatible for use with dual indices and production of a top-strand rolony. The top-strand rolony can be sequentially sequenced using sequencing primer Seq 1B (hybridizes to the linker region of the first adaptor) for sequencing the first sample index and UMI, sequencing primer Seq 2B (hybridizes to the bridge oligonucleotide binding site of the 2nd adaptor and optionally a portion of the first sequencing primer site of the first adaptor) for sequencing the second sample index, and sequencing primer Seq 3B (hybridizes to the 3rd universal primer binding site of the second adaptor) for sequencing the ROI.

FIG. 11 shows an exemplary second adaptor comprising a second sample index.

FIG. 12 shows an exemplary library construct as described in Example 1 and depicted in step D of Figure 3 (“3D structure”), which is circularized by a bridge oligonucleotide (see step E of Figure 3), ligated, and amplified by RCA using an RCA amplification primer to produce a bottom-strand rolony. The sequencing primers Seq 1 and Seq 2 (corresponding to Seq 1A and Seq 2A in step F of Figure 3, respectively) bind to primer binding sites within the first adaptor sequence. BC = bar code (unique molecular identifier (UMI)). Index = sample index. I = insert sequence or region of interest sequence. Seq 2 = sequencing primer #2 (for sequencing region of interest). Seq 1 = sequencing primer #1 (for sequencing UMI and sample index).

FIG. 13 shows an exemplary library construct as described in Example 2 and depicted in step D of Figure 3 (“3D structure”), which is circularized by a bridge oligonucleotide (see step E of Figure 3) and amplified by RCA using an RCA amplification primer to produce a bottom-strand rolony. Sequencing primer Seq 2 (corresponding to Seq 2A in step F of Figure 3) binds to a primer binding site within the first adaptor sequence (linker region), while sequencing primer Seq 1 (corresponding to Seq 1A in step F of Figure 3) binds to a primer binding site created by the junction of the first adaptor and second adaptor upon circularization and ligation. BC = bar code (unique molecular identifier (UMI)). Index = sample index. I = insert sequence or region of interest sequence. Seq 2 = sequencing primer #2 (for sequencing region of interest). Seq 1 = sequencing primer #1 (for sequencing UMI and sample index).

FIG. 14 shows an exemplary library construct as described in Example 3 and depicted in step D of Figure 4 (“4D structure”), which is circularized by a bridge oligonucleotide (see step E of Figure 4) and amplified by RCA using an RCA amplification primer to produce a top-strand rolony. Sequencing primer Seq 2 (corresponding to Seq 2B in step F of Figure 4) binds to a primer binding site within the first adaptor sequence, while sequencing primer Seq 1 (corresponding to Seq 1B of Figure 4) binds to a primer binding site created by the junction of the first adaptor and SPE primer upon circularization. BC = bar code (unique molecular identifier (UMI)). Index = sample index. I = insert sequence or region of interest sequence. Seq 1 = sequencing primer #1 (for sequencing sample index and UMI). Seq 2 = sequencing primer #2 (for sequencing region of interest).

FIG. 15 shows the sequence of the first adaptor in Table 2.

FIG. 16 shows the sequence of the first adaptor in Table 3.

FIG. 17 shows the sequence of the first adaptor in Table 4.

DETAILED DESCRIPTION

The present disclosure provides adaptor design and nucleic acid library construction for rolony-based sequencing. Specifically, the present disclosure provides inter alia a partially double-stranded adaptor (referred to as “first adaptor” below) for generating circular, single-stranded nucleic acid templates. The first adaptor comprises: (i) a single-stranded region comprising a first sequencing primer binding site, an optional unique molecular identifier (UMI), and a first sample index sequence, wherein the first sequencing primer binding site comprises a first universal primer binding site and a first portion of a bridge oligonucleotide binding site; and (ii) a double-stranded linker region of about 15 to about 35 bases for ligation to the plurality of fragments of double-stranded nucleic acids, wherein the double stranded linker region comprises a second sequencing primer binding site. Such an adaptor design, especially the relatively long double-stranded linker region, provides multiple advantages as described in detail below.

The present disclosure also provides inter alia a method for constructing a nucleic acid library for rolony-based sequencing. According to such a method, the first adaptor provided herein may be added to one end of a double-stranded target nucleic acid fragment, while a second adaptor may be added to the other end of the fragment. The second adaptor comprises a second universal primer binding site, which in turn comprises a second portion of the bridge oligonucleotide binding site and optionally a third sequencing primer binding site. The resulting target nucleic acid fragment flanked by the first and second adaptors may be optionally amplified, denatured, and circularized in the presence of the bridge oligonucleotide to generate a circular, single-stranded nucleic acid template. Such a template may be further amplified to generate rolonies via rolling circle amplification (RCA) and subsequently sequenced.
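The role of the bridge oligonucleotide in circularization can be illustrated with a minimal Python sketch. It assumes, for illustration only, that the bridge is the exact reverse complement of the junction formed when the 3’ end of the linear, single-stranded template is brought next to its 5’ end. The 5’ portion reuses the 10-base bridge-binding sequence quoted for the first adaptor of Table 2; the insert, the second adaptor 3’ end, and all function names are hypothetical placeholders rather than sequences or methods from this disclosure.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def design_bridge(template, five_prime_len, three_prime_len):
    """Bridge complementary to the template's 3' end followed by its 5' end."""
    junction = template[-three_prime_len:] + template[:five_prime_len]
    return reverse_complement(junction)


def bridge_spans_junction(template, bridge):
    """True only if the bridge footprint exists in the circle but not in the linear strand."""
    footprint = reverse_complement(bridge)
    circular = template + template  # doubling exposes the 3'-to-5' junction
    return footprint in circular and footprint not in template


FIRST_PORTION = "CTCACACTCA"       # bases 1-10 of the first adaptor (from the text)
INSERT = "ACGTACGTTTGACC"          # hypothetical region of interest
SECOND_PORTION = "GGATCCGTAA"      # hypothetical 3' end of the second adaptor

template = FIRST_PORTION + INSERT + SECOND_PORTION
bridge = design_bridge(template, len(FIRST_PORTION), len(SECOND_PORTION))
print(bridge, bridge_spans_junction(template, bridge))
```

In this sketch the bridge can pair with both portions only when the two template ends are adjacent, which is the geometry that positions them for ligation.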

The libraries of nucleic acid templates constructed according to the methods disclosed herein have one or more of the following characteristics:

i) possessing an asymmetric structure (target nucleic acid interposed between a longer first adaptor and a shorter second adaptor),

ii) being multifunctional, permitting sequencing of multiple, distinct, separate fragments by, for example, sequential sequencing, thus minimizing signal loss with increased sequencing cycles,

iii) being compatible with single primer extension,

iv) being compatible with use of unique molecular identifiers,

v) being compatible with dual sample indices,

vi) being able to be used for paired-end sequencing based on the rolony formation from different strands of double-stranded library input,

vii) avoiding the sequencing of the low diversity region (linker region of the first adaptor) and reducing the refocus frequency and turnaround time of an image-based focusing sequencer,

viii) allowing sequencing the target nucleic acid first rather than the UMI and sample index, thus providing the feasibility to sequence the regions of interest using the cycles with the lowest phasing and highest quality,

ix) allowing flexible universal primer and bridge oligonucleotide design, and

x) improving the ligation efficiency and consistency by for example including a relatively long linker region in designing adaptors.

The libraries of nucleic acid templates constructed according to the method disclosed herein may be used in sequencing target nucleic acids useful in diagnosing and monitoring diseases (e.g., cancers), characterizing diseases (e.g., responsiveness to particular treatments), and other areas where obtaining target nucleic acid sequences is desirable.

In the following description, any ranges provided herein include all the values in the ranges. It should also be noted that the term “or” is generally employed in its sense including “and/or” (i.e., to mean either one, both, or any combination thereof of the alternatives) unless the content dictates otherwise. Also, as used in this specification and the appended claims, the singular forms “a,” “an,” and “the” include plural referents unless the content dictates otherwise. The terms “include,” “have,” “comprise” and their variants are used synonymously and are to be construed as non-limiting. The term “about” refers to ± 10% of a reference value. For example, “about 50°C” refers to “50°C ± 5°C” (i.e., 50°C ± 10% of 50°C).
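As a small worked illustration of the “about” definition, the following Python sketch computes the interval implied by a reference value. The function name and the default tolerance argument are illustrative assumptions; only the 10% figure and the 50°C example come from the text.

```python
def about_range(reference, tolerance=0.10):
    """Return the (low, high) interval covered by "about <reference>"."""
    delta = abs(reference) * tolerance
    return reference - delta, reference + delta


low, high = about_range(50)          # "about 50°C"
print(low, high)

linker_low, _ = about_range(15)      # lower bound of "about 15 bases"
_, linker_high = about_range(35)     # upper bound of "about 35 bases"
print(linker_low, linker_high)
```

Applied to the linker length of “about 15 to about 35 bases,” the same arithmetic gives outer bounds of 13.5 and 38.5 bases under this reading.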

A. Target Nucleic Acids and Template Nucleic Acids

The term “nucleic acid,” “nucleic acids,” or “polynucleotide” as used herein refers to a polymer comprising ribonucleosides or deoxyribonucleosides that are covalently bonded typically by phosphodiester linkages between subunits. Nucleic acids include DNA and RNA. DNA includes, but is not limited to, genomic DNA, linear DNA, circular DNA, plasmid DNA, cDNA, and cell free DNA (e.g., tumor derived or fetal DNA). RNA includes, but is not limited to, hnRNA, mRNA, noncoding RNA, and cell free RNA (e.g., tumor derived RNA). Noncoding RNA includes, but is not limited to, rRNA, tRNA, lncRNA (long noncoding RNA), lincRNA (long intergenic noncoding RNA), miRNA, and siRNA.

A “target nucleic acid,” also referred to as “target sequence,” “region of interest” (ROI), or “insert sequence,” refers to a nucleic acid molecule of interest. A target nucleic acid may be from any source, such as a cell sample, tissue sample, fluid sample, or organism from a plant, animal, virus, bacteria, fungus, parasite, insect, mammal, bird, reptile, amphibian, or human, or a forensic sample or environmental sample. Exemplary samples include whole blood, blood products, plasma, serum, red blood cells, white blood cells, buffy coat, urine, sputum, saliva, semen, lymphatic fluid, amniotic fluid, cerebrospinal fluid, peritoneal effusions, pleural effusions, fluid from cysts, synovial fluid, vitreous humor, aqueous humor, bursa fluid, eye washes, eye aspirates, pulmonary lavage, bone marrow aspirates, lung aspirates, biopsy samples, swab samples, animal (including human) or plant tissues, including but not limited to samples from liver, spleen, kidney, lung, intestine, brain, heart, muscle, pancreas, cell cultures, lysates, extracts, or materials and fractions obtained from the samples described above or any cells and microorganisms and viruses that may be present on or in a sample and the like. A target nucleic acid may be a naturally occurring sequence (e.g., DNA, genomic DNA (gDNA), cDNA, mitochondrial DNA, cell free DNA (cfDNA), RNA, mRNA, rRNA, tRNA, cfRNA, long non-coding RNA, microRNA), artificial sequence, or a combination thereof. A target nucleic acid may be from a gene, a regulatory element, a non-coding sequence, or a combination thereof. A target nucleic acid may be single-stranded or double-stranded.

A target nucleic acid may be obtained or isolated directly from a sample, or may be a product of a fragmentation reaction, a reverse transcription reaction, an amplification reaction, and the like, of nucleic acids obtained from a sample. Target nucleic acids can be isolated from a sample according to methods known in the art to provide a nucleic acid sample (e.g., DNA, RNA).

A target nucleic acid may be of any appropriate length. In certain embodiments, a target nucleic acid may have a length in a particular size range, for example, about 50 to about 2,000 nucleotides, about 50 to about 1,000 nucleotides, about 50 to about 750 nucleotides, about 50 to about 600 nucleotides, about 50 to about 500 nucleotides, about 50 to about 400 nucleotides, about 50 to about 300 nucleotides, about 50 to about 200 nucleotides, about 100 to about 2,000 nucleotides, about 100 to about 1,000 nucleotides, about 100 to about 750 nucleotides, about 100 to about 600 nucleotides, about 100 to about 500 nucleotides, about 100 to about 400 nucleotides, about 100 to about 300 nucleotides, about 100 to about 200 nucleotides, about 150 to about 2,000 nucleotides, about 150 to about 1,000 nucleotides, about 150 to about 750 nucleotides, about 150 to about 600 nucleotides, about 150 to about 500 nucleotides, about 150 to about 400 nucleotides, about 150 to about 300 nucleotides, or about 150 to about 200 nucleotides in length. Preferably, a target nucleic acid may have a length in the range of about 30 to 400 nucleotides. In a library of nucleic acid templates, each comprising a target nucleic acid sequence, the members of the library may have similar lengths, e.g., within a specific length range. The optimal target nucleic acid size for the library is determined by a number of factors, including sequencing application (e.g., de novo sequencing vs. re-sequencing) and selected next generation sequencing platform.

In certain embodiments, target nucleic acids (e.g., genomic DNA, RNA, or cDNA) are fragmented. Fragmenting nucleic acids may be performed physically, enzymatically, or chemically from larger nucleic acids to a desired size range. Physical fragmentation includes acoustic shearing, sonication, and hydrodynamic shearing. Enzymatic fragmentation may use an endonuclease that cleaves target nucleic acids into small fragments with 5’ phosphate and 3’ hydroxyl groups. Chemical fragmentation may be accomplished using heat or divalent metal cation (e.g., magnesium or zinc). In certain embodiments, target nucleic acids are subjected to size selection to obtain target nucleic acids within a defined or desired size range.

A “nucleic acid template” refers to a nucleic acid construct that comprises a target nucleic acid flanked by a “first adaptor” and a “second adaptor.” A first adaptor refers to an adaptor sequence 5’ to the target nucleic acid, and a second adaptor refers to an adaptor 3’ to the target nucleic acid. In embodiments involving a double stranded target nucleic acid, a first adaptor refers to an adaptor sequence 5’ to one strand (e.g., the sense strand) of the target nucleic acid and a second adaptor refers to an adaptor sequence 3’ to the strand of the target nucleic acid. The sense strand of a double-stranded target nucleic acid may be either of the two strands of the target nucleic acid. The antisense strand of the target nucleic acid is the strand other than the sense strand. A nucleic acid template may be linear or circular. A nucleic acid template may be single stranded or double stranded. In certain embodiments, the target nucleic acid is directly adjacent to a first adaptor, a second adaptor, or both the first adaptor and second adaptor. In certain embodiments, additional bases (e.g., 1, 2 or more bases) are present between the target nucleic acid and the first adaptor, between the target nucleic acid and the second adaptor, or both. In certain embodiments, a nucleic acid template is a member of a library of nucleic acid templates. In certain embodiments, a nucleic acid template is DNA.

B. Adaptors

An “adaptor” refers to an engineered nucleic acid that is added to each end of a target nucleic acid to produce a nucleic acid template for sequencing. An adaptor may comprise a subsequence for a particular function, e.g., library construction, library amplification, immobilization on a substrate, sequencing of nucleic acid templates, or any combination thereof. For example, an adaptor may comprise a restriction endonuclease recognition site, primer binding site for amplification during library construction (e.g., universal primer, target specific primer, single primer extension primer), binding site for a bridge oligonucleotide for circularization of a template nucleic acid, binding site for immobilizing a template nucleic acid on a substrate, primer binding site for sequencing (e.g., primer binding site for sequencing by synthesis methods or probe binding site for combinatorial probe anchor ligation (cPAL) methods), sample index sequence, unique molecular identifier (UMI) sequence, or any combination thereof. An adaptor may comprise multiple, functionally distinct subsequences. Functionally distinct subsequences may be completely overlapping, partially overlapping, or non-overlapping within an adaptor. For example, in the first adaptor shown in Table 2, bases 1-26 at the 5’ end of the top strand (i.e., 5’ CTC ACA CTC ACC ACG TCG GCT CGC AG) are the sequence of a first sequencing primer (Seq 2) (SEQ ID NO: 10); bases 1-17 at the 5’ end of the top strand (i.e., 5’ CTC ACA CTC ACC ACG TC) are the sequence of a first universal primer (universal primer 1) (SEQ ID NO: 3); bases 1-10 at the 5’ end of the top strand (i.e., 5’ CTC ACA CTC A) (SEQ ID NO: 36) are a portion of a bridge oligonucleotide binding site; and the 26 bases at the 3’ terminus of the top strand (not including the T-overhang) (i.e., CTC ACT CGT CAC AGC ACC TCC TCC GC) are a ligation linker sequence and the sequence of a second sequencing primer (Seq 1) (SEQ ID NO: 9).

An adaptor may be single-stranded, double-stranded, or partially double-stranded. The length of a single-stranded or double-stranded adaptor may vary depending upon the particular sequencing platform selected and intended use, but may range from about 3 nucleotides to about 200 nucleotides, from about 5 nucleotides to about 150 nucleotides, from about 10 nucleotides to about 100 nucleotides, from about 15 nucleotides to about 100 nucleotides, from about 20 nucleotides to about 100 nucleotides, from about 40 nucleotides to about 100 nucleotides, from about 5 nucleotides to about 80 nucleotides, from about 10 nucleotides to about 80 nucleotides, or from about 15 nucleotides to about 80 nucleotides. Preferably, the adaptor length is 15-100 nucleotides. For a partially double-stranded adaptor, one of the strands may have a length as described above for a single-stranded or double-stranded adaptor. In certain embodiments, an adaptor may comprise one or more modified nucleotides, e.g., having modifications to the nitrogenous base, 5-carbon sugar, phosphate moiety, or any combination thereof.
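The overlapping subsequences described above for the first adaptor of Table 2 can be checked with a short Python sketch. The sequences are the ones quoted in the text with the spacing removed; the variable names and the helper `bases` are illustrative assumptions.

```python
# Sequences quoted in the text for the top strand of the first adaptor (5' to 3').
SEQ2_PRIMER = "CTCACACTCACCACGTCGGCTCGCAG"   # bases 1-26, first sequencing primer (Seq 2)
UNIVERSAL_PRIMER_1 = "CTCACACTCACCACGTC"     # bases 1-17, first universal primer
BRIDGE_PORTION = "CTCACACTCA"                # bases 1-10, portion of bridge binding site
LINKER = "CTCACTCGTCACAGCACCTCCTCCGC"        # 26 bases at the 3' terminus (Seq 1)


def bases(seq, start, end):
    """Return bases start..end using the 1-based, inclusive numbering of the text."""
    return seq[start - 1:end]


# The universal primer and the bridge portion are nested inside the Seq 2 primer site.
print(bases(SEQ2_PRIMER, 1, 17) == UNIVERSAL_PRIMER_1)
print(bases(SEQ2_PRIMER, 1, 10) == BRIDGE_PORTION)
print(len(SEQ2_PRIMER), len(LINKER))
```

The three 5’ subsequences share the same start, so in this example they are completely overlapping in the sense used above, while the linker is a separate, non-overlapping subsequence at the 3’ terminus.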

As used herein, a “primer binding site” or “primer binding sequence” refers to a sequence to which a primer (or oligonucleotide) specifically binds. Primer binding sequences are of sufficient length to allow hybridization of a primer. In certain embodiments, the primer or a portion thereof is completely complementary to the primer binding sequence. In certain other embodiments, the primer or a portion thereof is substantially complementary to the primer binding site, that is, at least 90% of the nucleotides of the primer or the portion thereof are complementary to the nucleotides of the primer binding site. In certain embodiments, a primer binding site is at least about 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 nucleotides long and/or at most about 60, 55, 50, 45, 40, 35, or 30 nucleotides long. In embodiments where an adaptor comprises two or more primer binding sites, the two or more primer binding sites may be overlapping, partially overlapping, or non-overlapping. In embodiments wherein the two or more primer binding sites are non-overlapping in the same adaptor, they may be immediately adjacent to each other or separated by one or more nucleotides (e.g., about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 or more nucleotides) and/or about 40, 35, or 30 or fewer nucleotides.
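The 90% criterion for substantial complementarity can be sketched in Python as follows. This is a simplified illustration that assumes a primer and binding site of equal length aligned end to end without gaps; the example site sequence and the function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def fraction_complementary(primer, site):
    """Fraction of primer bases that pair with the site in antiparallel alignment."""
    if len(primer) != len(site):
        raise ValueError("this sketch only compares sequences of equal length")
    ideal = reverse_complement(site)  # the perfectly complementary primer
    matches = sum(p == q for p, q in zip(primer, ideal))
    return matches / len(primer)


def is_substantially_complementary(primer, site, threshold=0.90):
    """Apply the 'at least 90%' criterion from the text."""
    return fraction_complementary(primer, site) >= threshold


site = "ACGTTGCAAGCTTAGGCTAC"             # hypothetical 20-nt primer binding site
perfect_primer = reverse_complement(site)
print(fraction_complementary(perfect_primer, site))
```

For a 20-nucleotide site, this criterion tolerates up to two mismatched positions (18/20 = 90%) and rejects three (17/20 = 85%).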

As used herein, a “sample index,” also referred to as “index” or “index sequence,” refers to a component of an adaptor comprising a unique combination of bases that identifies template nucleic acids belonging to a common library or sample. The use of sample indexes in template nucleic acids allows for multiplexing, e.g., sequencing of multiple different libraries or multiple different samples in a single reaction. In some embodiments, an index sequence can be used to orientate a sequence imager for purposes of detecting individual sequencing reactions. In certain embodiments, an index sequence is about 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, or 25 nucleotides long. An index sequence may be from about 2 nucleotides to about 25 nucleotides in length, from about 5 nucleotides to about 20 nucleotides in length, or from about 8 nucleotides to about 15 nucleotides in length. In an embodiment, a template nucleic acid comprises a single sample index. Sample multiplexing has the inherent risk of index mis-assignment (cross-talk), which occurs when a sequence read derived from one sample in a pool of samples is incorrectly matched to a sample index from a different sample in the pool of samples. Index cross-talk can be introduced by a variety of mechanisms. Dual sample indices (dual indices) may minimize the incidence of index cross-talk and improve sequencing accuracy and sensitivity. The use of dual indices may also increase multiplexing capability by combination of the two indices. In an embodiment, a template nucleic acid comprises dual sample indices.
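A minimal Python sketch can illustrate both points made about dual indices: the number of combinations grows as the product of the two index sets, and a read whose index pair was never assigned to a sample can be flagged rather than mis-assigned. The index sequences, sample names, and function names are hypothetical, and the flagging step assumes each sample was given its own pair of indices.

```python
from itertools import product


def combinatorial_pairs(first_indices, second_indices):
    """All (first index, second index) combinations available for multiplexing."""
    return list(product(first_indices, second_indices))


def assign_sample(index_pair, sample_sheet):
    """Return the sample for an index pair, or None if the pair was never assigned."""
    return sample_sheet.get(index_pair)


FIRST = ["ACGTACGT", "TGCATGCA", "GATCGATC"]   # hypothetical first sample indices
SECOND = ["AAGGCCTT", "CCTTAAGG"]              # hypothetical second sample indices
pairs = combinatorial_pairs(FIRST, SECOND)     # 3 x 2 combinations

SAMPLE_SHEET = {
    ("ACGTACGT", "AAGGCCTT"): "sample_A",
    ("TGCATGCA", "CCTTAAGG"): "sample_B",
}

print(len(pairs))
print(assign_sample(("ACGTACGT", "AAGGCCTT"), SAMPLE_SHEET))
print(assign_sample(("ACGTACGT", "CCTTAAGG"), SAMPLE_SHEET))  # unexpected pairing
```

With a single index, the last read would have been assigned to sample_A on the strength of its first index alone; with two indices the unexpected pairing is detectable.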

As used herein, a “unique molecular identifier” (UMI), also referred to as “bar code” or “molecular bar code,” refers to a component of an adaptor comprising a unique combination of bases that is used to identify unique nucleic acid molecules. A UMI may be used to identify PCR duplicates derived from the same nucleic acid molecule that were generated during library amplification. Thus, a UMI may be used to de-duplicate sequencing reads derived from a single molecule. In certain embodiments, a UMI is about 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, or 25 nucleotides long. A UMI may be from about 2 nucleotides to about 25 nucleotides in length, from about 5 nucleotides to about 20 nucleotides in length, or from about 8 nucleotides to about 15 nucleotides in length.
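De-duplication with UMIs can be sketched in Python as follows. This is a simplified illustration, not the analysis method of this disclosure: reads are represented as hypothetical dictionaries, and the insert sequence itself stands in for the mapped position that a real pipeline would typically use as part of the grouping key.

```python
from collections import defaultdict


def deduplicate(reads):
    """Keep one representative read per (sample index, UMI, insert) group."""
    groups = defaultdict(list)
    for read in reads:
        key = (read["index"], read["umi"], read["insert"])
        groups[key].append(read)
    return [members[0] for members in groups.values()]


def count_molecules(reads):
    """Count distinct (sample index, UMI) tags observed for each insert sequence."""
    tags = defaultdict(set)
    for read in reads:
        tags[read["insert"]].add((read["index"], read["umi"]))
    return {insert: len(seen) for insert, seen in tags.items()}


reads = [
    {"index": "ACGTACGT", "umi": "AAACCCGGGTTT", "insert": "GATTACA"},
    {"index": "ACGTACGT", "umi": "AAACCCGGGTTT", "insert": "GATTACA"},  # PCR duplicate
    {"index": "ACGTACGT", "umi": "TTTGGGCCCAAA", "insert": "GATTACA"},
    {"index": "ACGTACGT", "umi": "AAACCCGGGTTT", "insert": "CATTAGA"},
]

unique_reads = deduplicate(reads)
print(len(unique_reads), count_molecules(reads))
```

The second read collapses into the first because it carries the same index, UMI, and insert, whereas the third read has the same insert but a different UMI and is therefore counted as a separate original molecule.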

A UMI is designed to have between 2 and 15 degenerate base positions, but preferably has between 6 and 12 base positions. A “degenerate base position” is a base position that more than 1 nucleotide (e.g., 2, 3, or 4 different nucleotides) may occupy. In certain embodiments, a UMI is designed to assign a completely unique sequence tag to each target nucleic acid molecule. In certain other embodiments, a UMI is not designed to assign a completely unique sequence tag to each molecule, but rather is designed to have a low probability of assigning any given sequence tag to a particular molecule. The greater the number of possible UMI sequences, the lower the probability of any particular sequence being assigned to a molecule. When many target nucleic acid molecules are copied and tagged, the same UMI sequence can be assigned to more than one template molecule. UMI sequences are used to track the lineage of molecules from initial copying through amplification, processing and sequencing. They can be used to distinguish sequences that arise from polymerase misincorporations or sequencer errors from sequences that are derived from true mutant template molecules. UMIs can also be used to distinguish sequences that have the wrong sample index assignment as a result of cross-over of sample indices during pooled amplification. Because the same UMI sequence can be assigned to more than one target nucleic acid molecule, meaningful analysis of UMI sequences requires first identifying target nucleic acid sequences (e.g., nucleic acid variants) and then analyzing the distribution of UMI sequences associated with those target nucleic acid sequences. The number of different UMIs in a first adaptor may be at least 100, 1,000, 5,000, 100,000, 500,000, 1,000,000, or 5,000,000.
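The relationship between the number of degenerate positions and the chance that two molecules receive the same UMI can be worked through in a short Python sketch. It assumes, as a simplification not stated in the text, that every possible UMI is equally likely to be assigned; the function names are illustrative.

```python
def umi_diversity(degenerate_positions, bases_per_position=4):
    """Number of possible UMI sequences for the given number of degenerate positions."""
    return bases_per_position ** degenerate_positions


def collision_probability(molecules, diversity):
    """Probability that at least two molecules share a UMI (uniform random assignment)."""
    if molecules > diversity:
        return 1.0
    p_all_unique = 1.0
    for i in range(molecules):
        p_all_unique *= (diversity - i) / diversity
    return 1.0 - p_all_unique


for positions in (6, 12):
    n = umi_diversity(positions)
    print(positions, n, collision_probability(100, n))
```

Under this assumption, 12 fully degenerate positions give 4^12 = 16,777,216 possible UMIs, while 6 positions give 4,096, so the same number of tagged molecules is far more likely to share a UMI with the shorter design.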

As disclosed above, nucleic acid templates are flanked by a first adaptor and a second adaptor. A first adaptor comprises from 5’ to 3’: a single-stranded region comprising a first sequencing primer binding site and a first sample index sequence, wherein the first sequencing primer binding site further comprises a first universal primer binding site and a first portion of a bridge oligonucleotide binding site; and a double-stranded linker region of about 20 nucleotides to about 30 nucleotides for ligation, wherein the double-stranded linker region further comprises a second sequencing primer binding site (Figure 1). The double-stranded linker region of the first adaptor is designed for dual purposes: to allow direct ligation of the first adaptor to double-stranded target nucleic acids and to provide a sequencing primer binding site for sequencing the target nucleic acid (region of interest) (see, e.g., Figure 3, bottom-strand rolony and Figure 4, top-strand rolony). Preferably, the UMI, sample index, or any portion thereof is not contained within the double-stranded linker region of the first adaptor. In certain embodiments, the first adaptor may further comprise a UMI between the first sequencing primer binding site and the first sample index sequence or between the first sample index sequence and the double-stranded linker region. The first sequencing primer binding site of the first adaptor is designed for multiple purposes: to provide a sequencing primer binding site for sequencing the UMI and sample index (see, e.g., Figure 3, bottom-strand rolony); to provide a universal primer binding site for library enrichment; to provide a first portion of a bridge oligonucleotide binding site for circularization; and optionally to provide a sequencing primer binding site for sequencing the region of interest (see, e.g., Figure 4, top-strand rolony).

As used herein, “linker” or “linker region” generally refers to the double-stranded nucleic acid sequence that is part of an adaptor and directly ligated with a target nucleic acid. In some embodiments, the first adaptors present in one or more libraries of nucleic acid templates comprise a shared or common linker region sequence. In certain embodiments, the double-stranded linker region is about 15 to about 35 nucleotides in length, such as 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, or 35 nucleotides in length, preferably 20-30 nucleotides in length. The double-stranded linker region of the first adaptor may be formed by annealing the first adaptor’s two complementary strands of different lengths that possess a complementary linker region. In some embodiments, it may be advantageous for the double-stranded linker region of the first adaptor to be as short as possible without loss of function. “Function” in this context means that the double-stranded linker region forms a stable duplex under standard reaction conditions for an enzyme-catalyzed nucleic acid ligation reaction (e.g., incubation at a temperature ranging from about 4°C to about 40°C in a ligation buffer appropriate for the enzyme), such that the two strands forming the double-stranded linker region of the first adaptor remain partially annealed during ligation of the first adaptor to a target nucleic acid.

In some embodiments, it may be advantageous for the double-stranded linker region of the first adaptor to be of sufficient length and have a certain percent of GC content to reach the desired Tm for sequencing primer hybridization on the selected sequencing instrument. The Tm requirement depends on the sequencing temperature, which is defined by the enzymes and buffers utilized during sequencing. For example, the double-stranded linker region of the first adaptor can be about 20-30 nucleotides in length, with about 50-80% GC content and a Tm above 60°C in the sequencing buffer. The relatively lengthy double-stranded linker region can also improve adaptor structure uniformity during the annealing process of adaptor production/manufacturing, which can improve ligation efficiency. Specifically, if the double-stranded linker region is shortened, different sample index sequences or different UMI sequences of the first adaptors may require different optimal conditions for ligating the first adaptors to target nucleic acids.
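By way of illustration, the example length and GC-content window given above (about 20-30 nucleotides, about 50-80% GC) can be checked for a candidate linker sequence as sketched below. The function names and the 24-nucleotide test sequence are hypothetical and are not taken from the disclosure; the sketch does not evaluate the Tm criterion, which depends on the sequencing buffer.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

def linker_in_window(seq, min_len=20, max_len=30, min_gc=0.50, max_gc=0.80):
    """True if the sequence meets the example length and GC-content window."""
    return (min_len <= len(seq) <= max_len
            and min_gc <= gc_fraction(seq) <= max_gc)

# Hypothetical 24-nucleotide candidate linker strand (illustration only).
candidate = "GCCAGTCCGATGGCACCTGACGGA"
ok = linker_in_window(candidate)
```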

In some embodiments, it may be advantageous for the first sequencing primer binding site of the first adaptor to be of sufficient length and have a certain percent of GC content sufficient to reach the desired Tm for sequencing primer hybridization on the selected sequencing instrument (see, e.g., Example 1).

In some embodiments, the length and GC content of the first sequencing primer binding site of the first adaptor can be reduced, because a portion of the first sequencing primer binding site is provided by the second universal primer binding site of the second adaptor following circularization of the template nucleic acid (see Example 2 and Figure 12). For example, the first sequencing primer binding site of the first adaptor can be about 10-20 nucleotides in length, with about 30-80% GC content. In some embodiments, modified nucleotides (e.g., having modifications to the nitrogenous base, 5-carbon sugar, phosphate moiety, or any combination thereof), spacers, or both are incorporated into the first adaptor to improve system working performance, automation and surface fixation. Examples of spacer modifications include C3 spacer, C6 spacer, C12 spacer, spacer 9, spacer 18 (hexaethyleneglycol), dSpacer (abasic furan), ribospacer (rSpacer), PC spacer, and hexanediol.

As noted above, the first sequencing primer binding site of the first adaptor further comprises a first portion of a bridge oligonucleotide binding site. As used herein, “bridge oligonucleotide,” also known as “guide oligonucleotide,” refers to a nucleic acid sequence designed for circularization of linear, single-stranded nucleic acid templates. The bridge oligonucleotide comprises a sequence complementary to the 5’ end and 3’ end of the two flanking adaptors. The 5’ end and 3’ end of the single-stranded nucleic acid template hybridize to the bridge oligonucleotide, which brings the 5’ end and 3’ end of the single-stranded nucleic acid template in close proximity for ligation. Preferably, the 5’ end of the single-stranded nucleic acid template is phosphorylated prior to the ligation reaction to enhance ligation efficiency (see, e.g., Figures 3, 4 and 7-10).

A second adaptor comprises a second universal primer binding site, wherein the second universal primer binding site in turn comprises a second portion of the bridge oligonucleotide binding site and optionally a third sequencing primer binding site. In some embodiments, the second adaptor is single-stranded. The second universal primer binding site is designed for multiple purposes: to provide a universal primer binding site for library enrichment; to provide a second portion of a bridge oligonucleotide binding site for circularization; and optionally to provide a third sequencing primer binding site for sequencing the target nucleic acid.

In some embodiments, it may be advantageous for the third sequencing primer binding site of the second adaptor to be of sufficient length and have a certain percent of GC content sufficient to reach the desired Tm for sequencing primer hybridization on the selected sequencing instrument. In some embodiments, the length and GC content of the third sequencing primer binding site of the second adaptor can be reduced, because part of the third sequencing primer binding site is provided by the first universal primer binding site of the first adaptor following circularization of the template nucleic acid (see Figures 13 and 14). For example, the third sequencing primer binding site of the second adaptor can be about 10-20 nucleotides in length, with about 30-80% GC content.

In certain embodiments, the second adaptor further comprises a target-specific sequence 5’ to the second universal primer binding site (see “Target specific PCR primer” in Figure 2). The presence of the target-specific sequence in the second adaptor allows target enrichment via PCR using a first universal primer and the target specific PCR primer.

In certain embodiments, a second adaptor comprises from 5’ to 3’: a second portion of a bridge oligonucleotide binding site; a second sample index; a third universal primer binding site; and a target nucleic acid specific sequence, wherein the bridge oligonucleotide binding site further comprises a fourth sequencing primer binding site (for sequencing sample index) and the third universal primer binding site further comprises a fifth sequencing primer binding site (for sequencing ROI) (see Figure 11). In embodiments using dual sample indices, a second adaptor may comprise a third sequencing primer binding site, a second sample index, a second universal primer binding site, and a target-nucleic acid specific sequence, wherein the third sequencing primer binding site further comprises a second portion of the bridge oligonucleotide binding site (Figures 9-11).

In certain other embodiments, a second adaptor is a universal adaptor that comprises a second universal primer binding site without any target nucleic acid specific sequence (see Figures 7 and 8). Such a second adaptor may be useful in whole genome sequencing or other assays that do not require target enrichment. An exemplary universal adaptor is as follows:

GTAAAACGACGGCCAGTCAAGCTATGGAACACCACGTCCA (SEQ ID NO: 34)

CATTTTGCTGCCGGTCAGTTCGATACCTTGTGGTGCAGGT (SEQ ID NO: 35)
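As a quick consistency check on the two strands printed above, the following sketch compares them base by base. The observation that SEQ ID NO: 35, as printed, reads as the position-by-position complement of SEQ ID NO: 34 (i.e., it appears to be shown aligned 3’ to 5’ beneath the first strand) is drawn from the printed text only and is not a statement made in the disclosure; the helper names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def complement(seq):
    """Base-by-base Watson-Crick complement, same left-to-right order."""
    return seq.translate(COMPLEMENT)

def reverse_complement(seq):
    """Complementary strand written 5' to 3'."""
    return complement(seq)[::-1]

SEQ_ID_34 = "GTAAAACGACGGCCAGTCAAGCTATGGAACACCACGTCCA"
SEQ_ID_35 = "CATTTTGCTGCCGGTCAGTTCGATACCTTGTGGTGCAGGT"

# True if the second printed strand pairs with the first, position by position.
strands_pair = complement(SEQ_ID_34) == SEQ_ID_35
```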

An adaptor may be added to a target nucleic acid using a variety of methods including enzymatic ligation (blunt-end ligation, sticky-end ligation), chemical ligation, or primer extension. For example, the first adaptor is preferably added to a target nucleic acid via ligation (see, e.g., Figures 3, 4, and 7-10). In certain embodiments where a second adaptor comprises a target-specific sequence, it is preferably added to a target nucleic acid via primer extension using the second adaptor as a primer (see, e.g., Figures 3 and 4). An adaptor may be added to a target nucleic acid in whole (see, e.g., Figures 3, 4, 7 and 8) or in phases where adjacent or overlapping pieces are assembled (see, e.g., Figures 9 and 10 wherein the second adaptor is added via target-enrichment PCR amplification and universal PCR amplification).

Exemplary adaptors that can be used according to the methods of the present disclosure are provided in Tables 2-4.

C. Primers

As used herein, the term “primer” refers to an oligonucleotide that is complementary to a primer binding site of a template nucleic acid and capable of being extended using the template nucleic acid as a template. A primer may have about 10 to about 100 nucleotides in length, about 12 to about 80 nucleotides in length, or about 15 to about 50 nucleotides in length. In certain embodiments, a primer may have about 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, or 100 nucleotides in length.

A primer may comprise DNA, RNA, one or more modified nucleotides that contain modifications to the nitrogenous base, 5-carbon sugar, and/or phosphate moieties, or a combination thereof. Examples of modified nucleotides include nucleotides comprising 2’-O-methylribose, 5-hydroxybutynyl-2’-deoxyuridine (Integrated DNA Technologies), 2-Amino-2’-deoxyadenosine (IBA Lifesciences), 5-Methyl-2’-deoxycytidine (IBA Lifesciences), locked nucleic acids (LNA), peptide nucleic acid, and phosphorodiamidate morpholinos.

As used herein, “complementary” and “complementarity” refer to polynucleotides (i.e., a sequence of nucleotides) related by Watson-Crick base-pairing rules. For example, the sequence “A-G-T” is complementary to the sequence “T-C-A.” Complementarity may be “partial,” in which only some of the nucleic acids’ bases are matched according to the base pairing rules. Or, there may be “complete” or “total” complementarity between the nucleic acids. The degree of complementarity between nucleic acid strands has significant effects on the efficiency and strength of hybridization between nucleic acid strands. Two sequences are described as “complementary” to one another when hybridization occurs in an antiparallel configuration.

A primer “specifically hybridizes” or “specifically binds” to a primer binding site if the primer hybridizes to the target under reaction conditions for which the primer is used (e.g., amplification conditions, primer extension conditions, and sequencing reaction conditions) with a Tm substantially greater than 45°C, preferably at least 50°C, and typically 60°C-80°C or higher. Such hybridization preferably corresponds to stringent hybridization conditions. Again, such hybridization may occur with “near” or “substantial” complementarity of the antisense oligomer to the target sequence, as well as with exact complementarity.

The melting temperature (Tm) of an oligonucleotide used in the present disclosure is the temperature at which 50% of the oligonucleotide is duplexed with its perfect complement and 50% is free in a solution, such as 115 mM KCl. Tm is determined by measuring the absorbance change of the oligonucleotide with its complement as a function of temperature (i.e., generation of a melting curve). The Tm is the reading halfway between the double-stranded DNA and single-stranded DNA plateaus in the melting curve. Factors influencing Tm include the length of the oligonucleotide molecule, the specific sequence of the oligonucleotide, and buffer components, etc. Alternatively, the Tm of an oligonucleotide (e.g., an oligonucleotide that is 14-20 nucleotides in length) may be calculated based on the following formula:

Tm = 2 °C(A + T) + 4 °C(G + C)

The above formula assigns 2°C to each A-T pair and 4°C to each G-C pair. The Tm then is the sum of these values for all individual pairs in a DNA double strand.
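The formula above can be applied directly, as in the following sketch. The function name is hypothetical, and the example input is simply the first 17 nucleotides of SEQ ID NO: 34, chosen here only because its length falls in the 14-20 nucleotide range mentioned above; it is not presented in the disclosure as a primer.

```python
def tm_2_plus_4(seq):
    """Estimate Tm (degrees C) as 2*(A+T) + 4*(G+C), per the formula above."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc

# First 17 nucleotides of SEQ ID NO: 34 (8 A/T and 9 G/C bases).
example = "GTAAAACGACGGCCAGT"
tm_example = tm_2_plus_4(example)  # 2*8 + 4*9 = 52
```

This estimate ignores buffer composition and nearest-neighbor effects, so it is only a rough guide compared with a measured melting curve.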

A primer may be 100% complementary or partially complementary to the primer binding sequence in an adaptor to which it hybridizes. In certain embodiments, a primer is at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% complementary to the primer binding sequence in an adaptor to which it hybridizes.

The “% complementary” is determined based on the length of the primer binding sequence. For example, if a 20-nucleotide primer has 14 nucleotides complementary to a 15-nucleotide primer binding sequence in an adaptor, the % complementary is 93% (i.e., 14/15).
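The calculation in the example above can be sketched as follows. The sketch assumes an ungapped alignment in which the portion of the primer that overlaps the binding sequence has the same length as the binding sequence; the function names and the 15-nucleotide sequences are hypothetical and used for illustration only.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Complementary strand written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

def percent_complementary(primer_segment, binding_site):
    """Percent of binding-site positions paired by the aligned primer segment.

    Both arguments are written 5' to 3'. The denominator is the length of
    the primer binding sequence, as in the definition above.
    """
    perfect = reverse_complement(binding_site)
    matches = sum(p == q for p, q in zip(primer_segment, perfect))
    return 100.0 * matches / len(binding_site)

site = "ACGTTGCAAGGCTCA"              # hypothetical 15-nt binding sequence
perfect_segment = "TGAGCCTTGCAACGT"   # its exact reverse complement
one_mismatch = "TGAGCCTTGCAACGA"      # last base changed: 14 of 15 pair
```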

In certain embodiments, a primer further comprises additional sequence at the 5' end of the primer that is not complementary to the template nucleic acid sequence (e.g., primer binding site in the adaptor). The non-complementary portion of a primer may be at a length that does not interfere with the hybridization between the primer and its primer binding site. In some embodiments, the non-complementary portion is about 1 to 50, 1 to 40, 1 to 30, or 1 to 20 nucleotides long.

Examples of primers include but are not limited to an “extension primer,” a “universal primer,” a “target-specific primer,” an “RCA amplification primer,” or a “sequencing primer.”

An “extension primer” is used in a primer extension reaction by a DNA polymerase. In some embodiments, a primer extension reaction is a single primer extension (SPE) reaction where an SPE primer comprising a target nucleic acid specific sequence and a 5’ universal primer binding site repeatedly hybridizes to the same target locus from different nucleic acid templates resulting in target nucleic acid enrichment. An extension primer may be referred to as a “PCR primer,” an “amplification primer” or the like when used in an amplification reaction such as PCR. Preferably, an extension primer is about 10 to about 50 nucleotides long, such as about 15 to about 35 nucleotides long.

A “universal primer” or a “universal PCR primer” refers to a primer that binds to sequence present in the nucleic acid template. Typically, the universal primer hybridizes to common sequences present in adaptors or target-specific primers. The universal primer can bind to and direct primer extension from the universal priming site. Universal primers may be used to amplify a library of target nucleic acid templates to be sequenced. A universal primer may be referred to as a “boosting primer” when used in combination with a target specific primer for target enrichment PCR.

Preferably, a universal primer is about 15 to about 25 nucleotides long.

A “target-specific primer,” “target-specific nucleic acid primer,” or the like refers to a primer that hybridizes to target nucleic acid specific sequence, rather than adaptor specific sequence. In addition to a region that is specific to a target nucleic acid sequence, a target-specific primer may comprise an additional region, such as a universal primer binding sequence. Preferably, the region specific to a target nucleic acid sequence in a target-specific primer is about 13 to about 25 nucleotides long, and the additional region if present is about 10 to about 20 nucleotides long. The overall length of a target-specific primer is preferably about 23 to about 45 nucleotides if the primer comprises the additional region.

An “RCA amplification primer” or the like refers to a primer used for RCA amplification. Its sequence may be a portion of a first adaptor (e.g., the linker sequence of the first adaptor or a substantial portion thereof) or a second adaptor or a substantial portion thereof. A substantial portion of a first or second adaptor refers to a portion of the first or second adaptor that is at least 10, 11, 12, 13, 14, or 15 nucleotides in length. Preferably, an RCA primer is about 13 to about 20 nucleotides long. Additional description of RCA primers may be found in the subsection “Rolling Circle Amplification” and the Examples below.

A “sequencing primer” refers to a primer that is used in sequencing reactions, e.g., sequencing-by-synthesis reactions or sequencing-by-ligation reactions, such as a combinatorial probe-anchor ligation reaction (cPAL). Preferably, a sequencing primer is about 15 to about 35 nucleotides long, such as about 15 to about 30 nucleotides long.

Exemplary primers that can be used in the methods according to the present disclosure are provided in Tables 2-4.

D. Library Construction

The present disclosure provides methods of producing a library of circular, single-stranded nucleic acid templates, each circular single-stranded nucleic acid template comprising a strand of a double-stranded target nucleic acid, a strand of a first adaptor or the complement thereof, and a strand of a second adaptor or the complement thereof. In massively parallel sequencing (MPS) methods, at least one library of nucleic acid templates is produced and individual constructs in the library are sequenced in parallel. Frequently, large numbers of libraries are pooled together and sequenced simultaneously during a single sequencing run. Thus, while reference may be made with respect to a target nucleic acid or a nucleic acid template, it will be recognized that MPS methods are typically performed on a large library or pool of libraries of nucleic acid templates.

The complement of a strand of an adaptor is an oligonucleotide that is of about the same (including the same) length as the strand of the adaptor and is completely complementary to the strand of the adaptor. For example, if an exemplary first adaptor is a partially double-stranded oligonucleotide with the longer strand that is 60 nucleotides long and the shorter strand that is 30 nucleotides long, then the complement of the longer strand of the first adaptor would be about 60 nucleotides long and is completely complementary to the longer strand (i.e., contains no mismatch, no internal insertion, and no internal deletion). If an exemplary second adaptor is a target-specific primer that is 40 nucleotides in length and comprises a target-specific sequence and a universal primer sequence, then the complement of the second adaptor is about 40 nucleotides in length and is completely complementary to the second adaptor.

In one aspect, the method comprises: a) providing a plurality of fragments of double-stranded target nucleic acids; b) adding a first adaptor to a 5’ terminus of a sense strand and to a 3’ terminus of an antisense strand of the plurality of fragments of double-stranded nucleic acids, wherein the first adaptor comprises: (i) a single-stranded region comprising a first sequencing primer binding site, an optional unique molecular identifier (UMI), and a first sample index sequence, wherein the first sequencing primer binding site comprises a first universal primer binding site and a first portion of a bridge oligonucleotide binding site; (ii) a double-stranded linker region of about 15 to about 35 bases (preferably about 20 to 30 bases) for ligation to the plurality of fragments of double-stranded nucleic acids, wherein the double-stranded linker region comprises a second sequencing primer binding site; c) adding a second adaptor to a 3’ terminus of the sense strand and to a 5’ terminus of the antisense strand of the plurality of fragments of double-stranded nucleic acids to produce a library of linear, double-stranded nucleic acid templates, wherein the second adaptor comprises a second universal primer binding site, wherein the second universal primer binding site comprises a second portion of the bridge oligonucleotide binding site and optionally a third sequencing primer binding site; d) optionally amplifying the library of linear, double-stranded nucleic acid templates with a first universal primer that binds to the first primer binding site and a second universal primer that binds to the second primer binding site; e) denaturing the library of linear, double-stranded nucleic acid templates to produce a library of linear, single-stranded nucleic acid templates; and f) circularizing the library of linear, single-stranded nucleic acid templates by adding a bridge oligonucleotide and ligating the first adaptor and second adaptor, thereby producing the library of circular, single-stranded nucleic acid templates.

The double-stranded target nucleic acids are obtained from isolated nucleic acids from a sample (e.g., genomic DNA). The double-stranded target nucleic acids may be fragmented by physical, chemical, or enzymatic means and fragments of double-stranded target nucleic acids of a desired size range are selected. The ends of the size-selected fragments of double-stranded target nucleic acids may then be repaired to produce blunt-ended, size-selected double-stranded target nucleic acids. In certain embodiments, 3’ A-tails may then be added to the blunt-ended, size-selected fragments of double-stranded target nucleic acids using a DNA polymerase. Matching 3’ T overhangs may be added to the first adaptor to facilitate ligation with the A-tailed, double-stranded, target nucleic acids. At this time, the first adaptor may be present at one or both ends of the A-tailed, double-stranded target nucleic acids. “Ligation” means to form a covalent bond or linkage between the termini of two or more nucleic acids. The ligation may be an enzymatic ligation, which forms a phosphodiester linkage between a 5’ carbon terminal nucleotide of one DNA strand and a 3’ carbon of another DNA strand, or a chemical ligation. After the step of ligation with the first adaptor, a library of nucleic acid templates is generated which have common sequences at their 5' and 3' ends (step B of Figures 3 and 4, ligation products). In this context the term “common” is interpreted as meaning common to all templates in the library. As explained in further detail below, all templates within the library will contain regions of common sequence at (or proximal to) their 5' and 3' ends. In certain other embodiments, blunt-ended, size-selected double-stranded target nucleic acids may be directly ligated to the double-stranded linker region of the first adaptor (see, e.g., Figures 7 and 8). To improve ligation efficiency, the end of the first adaptor and/or the end of the universal adaptor can be modified to prevent non-desired ligation. The modification can be a chemical modification, for example, but not limited to, a C3 spacer. In certain embodiments, the second adaptor may comprise a target nucleic acid specific sequence (see, e.g., Figures 3 and 4). The second adaptor hybridizes to the double-stranded target nucleic acids via the target nucleic acid specific sequence, which is used to enrich target nucleic acids with the first universal primer via single primer extension. After the step of target enrichment, the library of nucleic acid templates comprises the first adaptor at only one end (step C of Figures 3 and 4, SPE amplification products). The other end of the nucleic acid templates is replaced by the second adaptor (step C of Figures 3 and 4, SPE amplification products).

The library of nucleic acid templates undergoes another round of amplification using the first universal primer and the second universal primer (step D of Figures 3 and 4). In some embodiments, a proof-reading DNA polymerase is used during the step of universal PCR amplification. In embodiments where a non-proof-reading DNA polymerase is used during the step of universal PCR amplification, it should be noted that a non-templated 3’ A is added to the 3’ end of the amplicons. Any corresponding bridge oligonucleotide used for circularization should be designed to accommodate this “A” addition (e.g., a corresponding “T” may be added in the bridge oligonucleotide at the junction of the first adaptor and second adaptor, see also Tables 2-4).

The amplified library of double-stranded nucleic acid templates may then be prepared for circularization. The library is denatured, for example, by heat or chemical treatment (e.g., NaOH, high salt concentration, high pH), to produce a library of linear, single-stranded nucleic acid templates. The single-stranded nucleic acid templates may preferably undergo 5’ phosphorylation to facilitate circularization and increase ligation strand specificity. The 5’ phosphorylation group can be added enzymatically, for example using a T4 polynucleotide kinase. The 5’ phosphorylation group can also be added to the strand that is to be circularized during the universal PCR step by using a 5’ phosphorylated universal primer in the universal amplification reaction (step D of Figures 3 and 4, universal amplification products). In some embodiments, the linear, single-stranded nucleic acid template is circularized by ligating the first adaptor and second adaptor (step E of Figures 3 and 4). A single-stranded DNA ligase (e.g., CircLigase™) or double-stranded DNA ligase (e.g., T4 DNA ligase) may be used. In some embodiments, a bridge oligonucleotide hybridizes to the 5’ end and 3’ end of the two flanking adaptor molecules and brings the 5’ end and 3’ end of the single-stranded nucleic acid template in close proximity to facilitate ligation (see, e.g., Figures 12-14).
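The geometry of bridge-assisted circularization can be illustrated with a short sketch. Joining the 3’ end of a linear single-stranded template to its own 5’ end creates a junction, and an oligonucleotide that spans the junction is the reverse complement of the last bases of the template followed by its first bases. The function names, the arm length, and the 26-nucleotide template below are hypothetical and chosen only for illustration; actual bridge oligonucleotides are those provided in the Tables.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Complementary strand written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

def bridge_oligo(template, arm=10):
    """Oligonucleotide (5' to 3') spanning the 3'-end/5'-end junction.

    `template` is the linear single-stranded template written 5' to 3';
    `arm` is the number of bases paired on each side of the junction.
    """
    junction = template[-arm:] + template[:arm]
    return reverse_complement(junction)

# Hypothetical 26-nt linear template (illustration only).
template = "ACGGTCATTGCAAGCTTGGATCCGTA"
bridge = bridge_oligo(template, arm=6)
```

If the template string already ends with a non-templated 3’ A, the bridge computed this way carries the matching T at the junction.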

Step D of Figures 3 and 4 shows that different strands of the double-stranded DNA templates are phosphorylated. When the top strand is phosphorylated by PCR using universal primers with phosphorylation at the 5’ end of the first adaptor, the corresponding rolonies are concatemers of the bottom strand, named “bottom-strand rolony” (step F of Figure 3). When the bottom strand is phosphorylated by PCR using primers with phosphorylation at the 5’ end of the second adaptor, the corresponding rolonies are concatemers of the top strand, named “top-strand rolony” (step F of Figure 4).

In some other embodiments, the second adaptor does not comprise any target nucleic acid specific sequence. For example, a library of circular, single-stranded nucleic acid templates may be constructed in a method using a first adaptor and a second adaptor that is a universal adaptor (see Figures 7 and 8). The method comprises: a) providing a plurality of fragments of double-stranded target nucleic acids; b) adding a first adaptor to a 5’ terminus of a sense strand and to a 3’ terminus of an antisense strand of the plurality of fragments of double-stranded nucleic acids, wherein the first adaptor comprises: (i) a single-stranded region comprising a first sequencing primer binding site, an optional unique molecular identifier (UMI), and a first sample index sequence, wherein the first sequencing primer binding site comprises a first universal primer binding site and a first portion of a bridge oligonucleotide binding site; (ii) a double-stranded linker region of about 15 to about 35 bases for ligation to the plurality of fragments of double-stranded nucleic acids, wherein the double-stranded linker region comprises a second sequencing primer binding site; c) adding a second adaptor to a 3’ terminus of the sense strand and to a 5’ terminus of the antisense strand of the plurality of fragments of double-stranded nucleic acids to produce a library of linear, double-stranded nucleic acid templates, wherein the second adaptor comprises a second universal primer binding site; d) amplifying the library of linear, double-stranded nucleic acid templates with a first universal primer that binds to the first primer binding site and a second universal primer that binds to the second primer binding site; e) denaturing the library of linear, double-stranded nucleic acid templates to produce a library of linear, single-stranded nucleic acid templates; and f) circularizing the library of linear, single-stranded nucleic acid templates by adding a bridge oligonucleotide and ligating the first adaptor and second adaptor, thereby producing the library of circular, single-stranded nucleic acid templates.

In certain further embodiments, a library of circular, single-stranded nucleic acid templates may be constructed for use with dual sample indices (see Figures 9 and 10).

A second sample index may be introduced into library constructs in a variety of ways. An exemplary second adaptor comprising a second sample index is shown in Figure 11. In some embodiments, the second adaptor comprises from 5’ to 3’: (i) a bridge oligonucleotide binding site; (ii) a second sample index; (iii) a 3rd universal primer binding site; and (iv) a target nucleic acid specific sequence, wherein the bridge oligonucleotide binding site comprises a portion of the bridge oligonucleotide binding site and a 4th sequencing primer binding site, or portion thereof (e.g., for sequencing the second sample index) and the 3rd universal primer binding site further comprises a 5th sequencing primer binding site (e.g., for sequencing the ROI). In some embodiments, the second adaptor is added to the target nucleic acid in a series of PCR steps with portions of the second adaptor as shown in steps C and D of Figures 9 and 10, and Figure 11. For example, in Figure 9, steps A to C are the same as those in Figure 3 except that the “target specific PCR primer” is referred to as the “2nd adaptor” in Figure 3. In step D of Figure 9, a 5’ phosphorylated universal primer and a primer comprising (i) a bridge oligonucleotide binding site; (ii) a second sample index; and (iii) a 3rd universal primer binding site of the second adaptor as described above are used in universal PCR amplification to generate double-stranded template nucleic acids.

While the methods described in this section pertain to producing a library of nucleic acid templates, it is understood that these methods could also be readily applied to a method of producing a circular, single-stranded nucleic acid template, which may be used for production of a rolony for sequencing.

E. Rolling Circle Amplification

Rolonies may then be produced from the library of circular, single-stranded nucleic acid templates prepared as described above. A “rolony” or “rolling circle colony” is a single-stranded DNA concatemer that is produced by rolling circle amplification (RCA) of a circularized DNA fragment. A “concatemer” refers to a long, continuous DNA molecule that comprises multiple copies of the same DNA sequence linked in series. A concatemer may comprise at least 2, 5, 10, 15, 20, 25, 50, 75, 100, 200, 300, 400, 500, 600, 700, 800, 900, or 1000 monomers, wherein each monomer comprises a nucleic acid template (e.g., first adaptor-target nucleic acid-second adaptor). “DNA nanoballs” or “DNBs” are single-stranded DNA concatemers of sufficient length to form random coils that fill a roughly spherical volume in solution (e.g., SSC buffer at room temperature). In some embodiments, DNA nanoballs have a diameter of from about 100 to 300 nm. As used herein, “concatemer,” “DNA nanoball,” and “rolony” may be used interchangeably.

“Rolling circle amplification” (RCA) refers to amplification of a circular nucleic acid template using at least one primer that hybridizes to one strand of the circular nucleic acid template to produce rolonies that represent the other strand of the circular nucleic acid template. A rolling circle amplification primer may comprise random sequence, sequence that hybridizes to an adaptor, or sequence that hybridizes to a junction region of two adaptors created when the nucleic acid template was circularized. In some embodiments, the RCA primer hybridizes to the linker region of the first adaptor, the first sequencing primer binding site of the first adaptor, or the second universal primer binding site of the second adaptor. Using an RCA primer that hybridizes to the “sense” or “top” strand of the circular nucleic acid template for RCA produces a “bottom-strand rolony” (steps E and F of Figure 3 and Figure 12). Using an RCA primer that hybridizes to the “antisense” or “bottom” strand of the circular nucleic acid template for RCA produces a “top-strand rolony” (steps E and F of Figure 4 and Figure 14). Each monomer in the rolonies produced according to the methods provided in the present disclosure comprises two separate sequencing primer binding sites on the same strand (see “Seq 1A” and “Seq 2A” in step F of Figure 3 for a bottom-strand rolony and “Seq 1B” and “Seq 2B” in step F of Figure 4 for a top-strand rolony). The rolonies may be used as templates for sequencing reactions.
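The strand relationship described above can be illustrated with a minimal Python sketch. It assumes an idealized reaction in which the primer hybridizes at exactly one site and the polymerase copies the circle a fixed number of times; the function names and sequences are hypothetical.

```python
# Illustrative sketch of rolling circle amplification on a circular template.
# Assumptions: ideal priming at a single site, a fixed number of laps, and
# hypothetical placeholder sequences.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def rolling_circle(circle: str, primer: str, n_monomers: int) -> str:
    """Return the 5'->3' product of extending `primer` around `circle`.

    `circle` is the circular template strand, linearized at an arbitrary point.
    The product is the complementary strand, starting with the primer itself.
    """
    site = reverse_complement(primer)
    # Search across the linearization point because the template is circular.
    doubled = circle + circle[:len(site) - 1]
    start = doubled.find(site)
    if start < 0:
        raise ValueError("primer does not hybridize to the circular template")
    end = (start + len(site)) % len(circle)
    # Rotate the circle so that it ends with the primer binding site.
    rotated = circle[end:] + circle[:end]
    return reverse_complement(rotated) * n_monomers


top_circle = "ATGGCGTACCTTAGGA"            # hypothetical top-strand circle
rca_primer = reverse_complement("TACCTT")  # hybridizes to the top strand

bottom_rolony = rolling_circle(top_circle, rca_primer, 3)
print(bottom_rolony.startswith(rca_primer))
print(top_circle in reverse_complement(bottom_rolony))
```

Because the primer hybridizes to the top strand, every monomer of the product is the complement of the top strand, that is, a bottom-strand rolony in the terminology used above.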

RCA-based clonal amplification provides a simple solution that can often eliminate the need for emulsion PCR (ePCR), an often expensive and labor-intensive step in many next generation sequencing methods.

Preferably, a DNA polymerase having suitable strand displacement activities is used to produce the rolonies. DNA polymerases having strand displacement activity include, but are not limited to, Phi29, Bst DNA polymerase, SensiPhi DNA polymerase, Klenow fragment of DNA polymerase I, and Deep-VentR DNA polymerase (NEB#M0258).

Table 1 shows the differences between the rolonies generated from the two different DNA strands. With use of the first adaptor, sequencing of both kinds of rolonies (top strand and bottom strand) can avoid sequencing the low diversity linker region of the first adaptor.

Table 1: Comparison of Rolonies Generated from Two Different Strands

F. Substrates

Rolonies produced according to the present disclosure may be immobilized on a substrate. A substrate comprises a plurality of sites for attachment of a plurality of rolonies. Exemplary substrates include planar substrates (e.g., slides), non-planar substrates, bead substrates, or arrays comprising spots or wells. Exemplary materials used for substrates include glass, ceramic, silica, silicon, quartz, various plastics, metal, elastomer (e.g., silicone), and polyacrylamide.

Rolonies may be immobilized to the surface of a substrate using a variety of techniques, including covalent and non-covalent attachment. In one embodiment, a substrate surface may comprise short oligonucleotides that form complexes, e.g., double-stranded duplexes, with a component (e.g., an adaptor sequence or a portion thereof) of the rolonies. In one embodiment, a substrate surface may comprise reactive functionalities that interact with complementary functionalities on the rolonies to form a covalent linkage (chemical attachment). For example, during RCA, modified nucleotides may be used to incorporate moieties such as bromide or thiol that can then be used in a crosslinking reaction. Thiol-modified DNA can be covalently linked to a mercaptosilanized glass via an alkylating reagent such as iodoacetamide. In another embodiment, rolonies are immobilized through non-specific interactions with the substrate surface, such as via electrostatic interactions, hydrogen bonding, van der Waals forces, etc. For example, rolonies can be non-specifically, electrostatically deposited onto glass surfaces with polyamine attached.

In certain embodiments, rolonies are deposited onto a solid substrate randomly so that the rolonies on the resulting substrate do not form a defined pattern. In certain other embodiments, rolonies may be confined to discrete regions on a substrate. The discrete regions may be arranged in a pattern, e.g., rectilinear pattern, hexagonal pattern, etc. A regular pattern or array may be advantageous for detection and analysis of sequencing data.

In certain embodiments, rolonies are immobilized on a flow cell. A flow cell is a glass slide containing small fluidic channels, through which polymerases, dNTPs and buffers can be pumped. The glass inside the channels may be dotted with short oligonucleotides complementary to at least a portion of an adaptor sequence of rolonies. Rolonies may be hybridized to these oligonucleotides and thus immobilized onto the flow cell. Alternatively, a flow cell or its fluidic channels may be coated with moieties that non-specifically (i.e., not in a sequence-dependent manner) bind to rolonies. The coating may be uniform on the flow cell surface or its fluidic channel surface or may be patterned with areas capable of binding rolonies separated by those incapable of binding rolonies.

Methods of forming arrays of rolonies have also been described in Patent Publication Nos. WO2007120208, WO2006073504, WO2007133831, and US2007099208, each of which is incorporated herein by reference in its entirety.

G. Sequencing

In certain embodiments, following production of rolonies as described above, and optionally immobilizing the rolonies on a substrate surface, at least a portion of the rolony is sequenced. In certain embodiments, the method comprises hybridizing a sequencing primer that is complementary to at least a portion of at least one adapter.

The output of a sequencing reaction is called a “sequence read,” which is a single, uninterrupted series of nucleotides representing the sequence of at least a portion of the rolony.

Any suitable sequencing method may be used to determine the sequence of at least a portion of the rolonies produced from the library of circular nucleic acid templates, including, for example, sequencing by synthesis, sequencing by ligation, combinatorial probe anchor ligation (cPAL), pyrosequencing, etc. Sequencing by synthesis has been described in U.S. Pat. Nos. 6,210,891; 6,828,100; 6,833,246; 6,911,345; 6,969,488; 6,897,023; and 6,787,308; Patent Publication Nos. 20040106130; 20030064398; and 20030022207; Margulies et al., 2005, Nature 437:376-380; Ronaghi et al., 1996, Anal. Biochem. 242:84-89; Constans, A, 2003, The Scientist 17(13):36; and Bentley et al., 2008, Nature 456(7218):53-59. Sequencing by ligation has been described in Patent Publication Nos. WO 1999019341, WO2005082098, WO2006073504, and WO2011/044437, and in Shendure et al., 2005, Science 309:1728-1739. Pyrosequencing has been described in Ronaghi et al., 1996, Anal. Biochem. 242:84-89.

In one embodiment, the sequencing method comprises sequential sequencing.

As used herein, “sequential sequencing” refers to a sequencing process involving multiple different sequencing primers sequentially used in a sequencing run on the same substrate (e.g., flow cell).

In embodiments where multiple sequencing primers are used in the same sequencing run, the order of addition of the sequencing primers may vary. An exemplary sequential sequencing method using two different sequencing primers (Seq 1 and Seq 2) provides: 1) hybridization of the second sequencing primer (Seq 2) to concatemers produced from a library of circular nucleic acid templates; 2) sequencing at least one portion of the concatemer with X cycles following the second sequencing primer, thereby generating a first sequencing fragment; 3) removing the first sequencing fragment in the sequencing instrument; 4) hybridization of the first sequencing primer (Seq 1) to the concatemers produced from the library of circular nucleic acid templates; and 5) sequencing at least one portion of the concatemer with Y cycles following the first sequencing primer, thereby generating a second sequencing fragment. In some embodiments, X and/or Y is/are more than 2, 3, 4, 5, 6, 7, 8, 9, or 10 cycles. In some embodiments where the order of addition of sequencing primers calls for Seq 2 first, the regions of interest may be sequenced first (see Figures 12 and 13). This outcome is a significant difference from single-read sequencing, which sequences the sample index and unique molecular identifier first. Sequencing the region of interest first may offer the advantage of a high quality signal. In some embodiments, the order of the sequencing primers can be changed, e.g., to change which portion of the rolony is sequenced first (e.g., ROI or UMI/sample index).
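The five steps above can be sketched in Python as follows. This is a simplified illustration, not the instrument workflow: the monomer is written in the orientation of the reads so that each primer sequence appears literally and the read is the stretch immediately 3' of it, and the monomer layout, primer sequences, and cycle numbers are all hypothetical.

```python
# Illustrative sketch of sequential sequencing with two primers on one rolony.
# Assumptions: read-sense monomer, error-free reads, hypothetical sequences.

def read_after_primer(monomer: str, primer: str, cycles: int) -> str:
    """Return the `cycles` bases that follow `primer` on a concatemer of `monomer`."""
    # Use enough tandem copies that a read starting in the first monomer
    # can continue into the following monomers.
    copies = 3 + cycles // len(monomer)
    concatemer = monomer * copies
    pos = concatemer.find(primer)
    if pos < 0:
        raise ValueError("primer binding site not found")
    start = pos + len(primer)
    return concatemer[start:start + cycles]


def sequential_run(monomer: str, steps):
    """Apply (name, primer, cycles) steps in order.

    Each read is independent of the previous one because the extended
    fragment is removed before the next primer is hybridized.
    """
    return [(name, read_after_primer(monomer, primer, cycles))
            for name, primer, cycles in steps]


SEQ2 = "GATTACAGG"    # hypothetical Seq 2 primer (reads the ROI)
ROI = "ACGTTGCAAC"    # hypothetical region of interest
SEQ1 = "CCTAGGTCA"    # hypothetical Seq 1 primer (reads UMI and sample index)
UMI = "TGCATG"        # hypothetical unique molecular identifier
INDEX = "AACCGG"      # hypothetical sample index
monomer = SEQ2 + ROI + SEQ1 + UMI + INDEX

# Seq 2 first with X = 10 cycles, then Seq 1 with Y = 12 cycles.
reads = sequential_run(monomer, [("Seq 2", SEQ2, 10), ("Seq 1", SEQ1, 12)])
for name, read in reads:
    print(name, read)
```

Swapping the two entries in the list of steps changes which portion of the rolony is read first without changing the content of either read.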

A library of nucleic acid templates constructed with the first adaptor and second adaptor according to the methods provided in the present disclosure may be used for single-end or paired-end, rolony-based sequencing (see Figure 6). As used herein, “paired-end sequencing,” also referred to as “pairwise sequencing,” generally refers to obtaining two sequencing “reads” of a template nucleic acid from both ends or strands of a single template nucleic acid. In embodiments involving a circular template nucleic acid, paired-end sequencing may involve obtaining sequencing reads from a top strand rolony and a bottom strand rolony produced from a single double stranded template nucleic acid. Paired-end sequencing offers the advantage of improved accuracy and ability to identify indels. There is significantly more information that may be gained from sequencing two stretches each of “N” bases from a single template nucleic acid than from sequencing “N” bases from each of two independent template nucleic acids in a random fashion.
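As a minimal sketch of why two reads from one template are informative, the following Python example takes N bases from the 5' end of each strand of a single hypothetical duplex and shows that the two reads cover opposite ends of the same fragment. It assumes error-free reads and ignores adaptor sequence; the function names and the template are illustrative only.

```python
# Illustrative sketch: paired reads of N bases from the two strands of one duplex.
# Assumptions: error-free reads, no adaptor sequence, hypothetical template.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def paired_end_reads(top_strand: str, n: int):
    """Return N-base reads from the 5' end of the top and bottom strands."""
    top_read = top_strand[:n]
    bottom_read = reverse_complement(top_strand)[:n]
    return top_read, bottom_read


template = "ATGGCGTACCTTAGGACTGA"  # hypothetical top strand, 5'->3'
top_read, bottom_read = paired_end_reads(template, 6)

print(top_read)                         # 5' end of the top strand
print(reverse_complement(bottom_read))  # 3' end of the top strand
```

Because both reads are known to derive from one fragment, their orientation and spacing constrain where the fragment can align, which is the basis for the improved indel detection noted above.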

During library construction, the SPE amplification products (see, e.g., step C of Figures 3 and 4) can be separated into two reactions for the following steps: universal PCR amplification with phosphorylated primers for one specific strand (either top or bottom strand), and separate clonal amplifications to generate top strand and bottom strand rolonies. The top strand and bottom strand rolonies can be seeded on the same flow cell, designed with two separate inlets and outlets (see Figure 6, step 3). Sequencing of the top strand rolonies and bottom strand rolonies is performed in separate areas of the flow cell at the same time in the sequencer.

In some embodiments, the top strand and bottom strand rolonies are seeded on different flow cells, each designed with a single inlet and outlet. Sequencing of the top strand rolonies and bottom strand rolonies is performed in different flow cells at the same time in the sequencer.

H. Sets of 1st Adaptors for Preparing a Library of Rolonies for Sequencing

The present disclosure also provides a set of first adaptors for preparing a library of rolonies for sequencing. The set of first adaptors comprises a plurality of partially double-stranded adaptors, wherein each adaptor of the set comprises:

(i) a single-stranded region comprising a first sequencing primer binding site, a unique molecular identifier (UMI), and a sample index sequence, wherein the first sequencing primer binding site comprises a first universal primer binding site and a first portion of a bridge oligonucleotide binding site; and

(ii) a double stranded linker region of about 20 to about 30 bases for ligation to a double-stranded nucleic acid, wherein the double stranded linker region comprises a second sequencing primer binding site.

Different components of the partially double-stranded first adaptors are discussed above in the “Adaptors” section.
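As a rough illustration of how UMI length bounds the size of such a set, the following Python sketch assumes that the adaptors within one set share the same sample index and differ only in a fully degenerate UMI over the four bases. That assumption, the function names, and the chosen numbers are hypothetical and are used only to show the arithmetic.

```python
# Illustrative sketch: number of distinct adaptors obtainable from a UMI.
# Assumption: adaptors in one set differ only in a fully degenerate A/C/G/T UMI.

from itertools import product


def all_umis(length: int):
    """Enumerate every UMI of the given length over A, C, G, and T."""
    return ["".join(bases) for bases in product("ACGT", repeat=length)]


def min_umi_length(n_adaptors: int) -> int:
    """Return the shortest UMI length giving at least `n_adaptors` distinct UMIs."""
    length = 1
    while 4 ** length < n_adaptors:
        length += 1
    return length


length = min_umi_length(1000)
print(length, len(all_umis(length)))
```

Under this assumption a UMI of length k yields 4^k distinct adaptors, so longer UMIs rapidly increase the diversity of a set.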

The present disclosure also provides a plurality of sets of first adaptors for preparing a library of rolonies for sequencing. Each set of first adaptors is as described above, wherein the first sample index sequences of different sets are different from each other. Different sets of first adaptors are added to one end of target nucleic acids from different samples or sources. A second adaptor is then added to the other end of the target nucleic acids. The resulting nucleic acids comprising the target nucleic acids flanked by the first and second adaptors may be combined together for amplification (optional), circularization, RCA amplification, and sequencing. In a related aspect, the present disclosure provides use of the set or the plurality of sets of partially double-stranded first adaptors in preparing a library of rolonies for sequencing.
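Once nucleic acids from different samples are combined, the sample index allows each read to be assigned back to its source. A minimal Python sketch of that assignment is given below. It assumes that each index read consists of the UMI immediately followed by the sample index, with fixed lengths and no sequencing errors; the read layout, index sequences, and sample names are hypothetical.

```python
# Illustrative sketch: assigning pooled reads to samples by sample index.
# Assumptions: read = UMI followed by sample index, fixed lengths, no errors.

from collections import defaultdict


def demultiplex(reads, index_to_sample, umi_len: int, index_len: int):
    """Group reads by sample index and collect the distinct UMIs per sample."""
    umis_by_sample = defaultdict(set)
    for read in reads:
        umi = read[:umi_len]
        index = read[umi_len:umi_len + index_len]
        sample = index_to_sample.get(index)
        if sample is None:
            continue  # index not assigned to any sample; read is discarded
        umis_by_sample[sample].add(umi)
    return dict(umis_by_sample)


index_to_sample = {"AACC": "sample_A", "GGTT": "sample_B"}  # hypothetical
reads = ["ACGTAACC", "ACGTAACC", "TTGAAACC", "CCATGGTT", "GGGGTTTT"]

result = demultiplex(reads, index_to_sample, umi_len=4, index_len=4)
for sample in sorted(result):
    print(sample, sorted(result[sample]))
```

Reads that share both sample index and UMI collapse to one entry, which reflects the use of the UMI to distinguish original molecules from amplification duplicates.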

I. Kits for Preparing Library of Rolonies for Sequencing

The present disclosure also provides a kit for preparing a library of rolonies for sequencing comprising one or more of the following: (1) a set of first adaptors; (2) a second adaptor; (3) a first universal primer; (4) a second universal primer; (5) a bridge oligonucleotide; (6) an RCA primer; (7) a first sequencing primer; and (8) one or more additional sequencing primers. These components are described in other sections above.

In certain embodiments, the kit may further comprise a DNA ligase; a DNA polymerase with or without proofreading activity; a DNA polymerase with strand displacement activity; a DNA polymerase for sequencing; reaction buffers suitable for ligation, primer extension, or sequencing; or any combination thereof.

The components of the kits are typically contained in separate vessels or compartments. However, when appropriate, some of the components may be provided as a mixture or composition. Additional descriptions of the components are provided in other sections, including the Examples, of the present disclosure.

In a related aspect, the present disclosure provides use of the kits for preparing a library of rolonies for sequencing.

EXAMPLES

EXAMPLE 1: DESIGN OF OLIGONUCLEOTIDES/ADAPTORS FOR PRODUCTION OF BOTTOM-STRAND ROLONY

Table 2 shows exemplary oligonucleotide/adaptor sequences for designing a template nucleic acid and production of a bottom-strand rolony, where the binding site for the sequencing primer for the region of interest (Seq 2) and sequencing primer for the sample index and UMI (Seq 1) are both only present in the first adaptor (Figure 3). Figure 12 shows the corresponding structure of 3D (linear universal amplification product) and 3E (circular nucleic acid template product). In this example, the RCA amplification primer is designed based on the universal sequence of the second adaptor.

Table 2: Oligonucleotide/Adaptor Sequences for Example 1

Underlined sequence = 26 nucleotide long double-stranded region

BC=barcode or unique molecular identifier (UMI)

Index= sample index

EXAMPLE 2: DESIGN OF OLIGONUCLEOTIDES/ADAPTORS FOR PRODUCTION OF BOTTOM-STRAND ROLONY

Table 3 shows exemplary oligonucleotide/adaptor sequences for designing a template nucleic acid and production of a bottom-strand rolony where the binding site for the sequencing primer for the ROI (Seq 2) is present in the first adaptor, and the binding site for the sequencing primer for sample index and barcode (UMI) (Seq 1) is created by the junction of the first adaptor and SPE primer upon circularization. Figure 13 shows the corresponding structure of 3D (linear universal amplification product) and 3E (circular nucleic acid template product). By this design, the sequencing primer binding site for Seq 1 in the first adaptor can be shorter than the one used in Example 1 due to the contribution of additional bases from the second adaptor to the sequencing primer binding site. In this example, the RCA amplification primer is designed based on the linker region of the first adaptor. This RCA amplification primer design can also be applied to Example 1.

Table 3: Oligonucleotide/Adaptor Sequences for Example 2

Index= sample index

EXAMPLE 3: DESIGN OF OLIGONUCLEOTIDES/ADAPTORS FOR PRODUCTION OF TOP-STRAND ROLONY

Table 4 shows exemplary oligonucleotide/adaptor sequences for designing a template nucleic acid and production of a top-strand rolony where the binding site for the sequencing primer for the sample index and barcode (UMI) (Seq 2) is present in the first adaptor, and the binding site for the sequencing primer for the ROI (Seq 1) is created by the junction of the first adaptor and second adaptor upon circularization. Figure 14 shows the corresponding structure of 4D (linear universal amplification product) and 4E (circular template nucleic acid product). By this design, the sequencing primer binding site for Seq 2 in the first adaptor can be shorter than the one used in Example 1 due to the contribution of additional bases from the second adaptor to the Seq 1 sequencing primer binding site. In this example, the RCA amplification primer is designed based on the linker region of the first adaptor. This RCA amplification primer design can also be applied to Example 1.

The oligonucleotide/adaptor sequences in this example, together with the oligonucleotide/adaptor sequences in Example 2, can be used to generate a top-strand rolony and a corresponding bottom-strand rolony in separate tubes, which can then be sequenced in different regions of a flow cell at the same time with different sequencing primers for paired-end sequencing.

Table 4: Oligonucleotide/Adaptor Sequences for Example 3

Underlined sequence = 26 nucleotide long double-stranded region

BC=barcode or unique molecular identifier (UMI)

Index= sample index

The various embodiments described above can be combined to provide further embodiments. All of the U.S. patents, U.S. patent application publications, U.S. patent applications, foreign patents, foreign patent applications and non-patent publications referred to in this specification and/or listed in the Application Data Sheet are incorporated herein by reference, in their entirety. Aspects of the embodiments can be modified, if necessary, to employ concepts of the various patents, applications and publications to provide yet further embodiments.

These and other changes can be made to the embodiments in light of the above-detailed description. In general, in the following claims, the terms used should not be construed to limit the claims to the specific embodiments disclosed in the specification and the claims, but should be construed to include all possible embodiments along with the full scope of equivalents to which such claims are entitled. Accordingly, the claims are not limited by the disclosure.

This application claims the benefit of priority to U.S. Provisional Application No. 62/814,417, filed March 6, 2019, which application is hereby incorporated by reference in its entirety.

Claims

1. A method of producing a library of circular, single-stranded nucleic acid templates, each circular, single-stranded nucleic acid template comprising a strand of a double-stranded target nucleic acid, a strand of a first adaptor or the complement thereof, and a strand of a second adaptor or the complement thereof, the method comprising:

a. providing a plurality of fragments of double-stranded target nucleic acids;

e. denaturing the library of linear, double-stranded nucleic acid templates to produce a library of linear, single-stranded nucleic acid templates; and

f. circularizing the library of linear, single-stranded nucleic acid templates by adding a bridge oligonucleotide and ligating the first adaptor and second adaptor, thereby producing the library of circular, single-stranded nucleic acid templates.

2. The method of claim 1, wherein the plurality of fragments of double-stranded target nucleic acids are derived from genomic DNA.

3. The method of claim 1 or 2, wherein the plurality of fragments of double-stranded target nucleic acids is generated by:

a. isolating genomic DNA from a sample,

b. optionally fragmenting the genomic DNA,

c. optionally selecting fragments of genomic DNA of a desired size range,

d. repairing the ends of the genomic DNA of step a., the fragmented genomic DNA of step b., or the size selected fragments of genomic DNA of step c. to produce blunt-ended fragments of genomic DNA, and

e. adding 3’A-tails to the blunt-ended fragments of genomic DNA of step d., thereby producing the plurality of fragments of double-stranded target nucleic acids.

4. The method of any one of claims 1-3, wherein the first adaptor is added to the plurality of fragments of double-stranded nucleic acids by ligation.

5. The method of any one of claims 1-4, wherein the second adaptor further comprises a target nucleic acid specific sequence, and is added to the plurality of fragments of double-stranded nucleic acids by single primer extension.

6. The method of any one of claims 1-4, wherein the second adaptor is added to the plurality of fragments of double-stranded nucleic acids by ligation.

7. The method of any one of claims 1-6, wherein the second adaptor further comprises a second sample index sequence and optionally, a fourth sequencing primer binding site.

8. The method of any one of claims 1-7, further comprising producing rolonies from the library of circular nucleic acid templates.

9. The method of claim 8, wherein the rolonies are produced by rolling circle amplification.

10. The method of claim 8 or 9, wherein the rolonies comprise top-strand rolonies, bottom-strand rolonies, or both.

11. The method of any one of claims 8-10, further comprising immobilizing the rolonies on a substrate surface.

12. The method of any one of claims 8-11, further comprising sequencing at least a portion of at least one of the rolonies.

13. The method of claim 12, wherein a portion of the target nucleic acid, a portion of the first adaptor, a portion of the second adaptor, or any combination thereof is sequenced.

14. The method of claim 12 or 13, wherein the sequencing is sequencing by synthesis, pyrosequencing, or sequencing by ligation.

15. The method of any one of claims 12-14, wherein the sequencing is single read sequencing or paired-end sequencing.

16. The method of any one of claims 12-15, wherein the sequencing is sequential sequencing.

17. A set of partially double-stranded adaptors for producing a library of circular, single-stranded nucleic acid templates, wherein the set comprises a plurality of partially double-stranded adaptors; wherein each adaptor of the set comprises:

18. The set of partially double-stranded adaptors of claim 17, comprising at least 1,000 different adaptors.

19. A kit for producing a library of circular, single-stranded nucleic acid templates, comprising:

(i) the set of partially double-stranded adaptors of claim 17,

(iii) the bridge oligonucleotide.