WO2024137315A1

WO2024137315A1 - Methods for purifying nucleic acids

Info

Publication number: WO2024137315A1
Application number: PCT/US2023/083910
Authority: WO
Inventors: David Barclay; Benjamin Mcnally
Original assignee: Foundation Medicine, Inc.
Priority date: 2022-12-22
Filing date: 2023-12-13
Publication date: 2024-06-27

Abstract

Provided herein are methods related to purifying DNA fragments from a sample. In particular, the methods allow for recovery of short dsDNA and ssDNA fragments as well as long dsDNA fragments from a sample, e.g., by contacting long dsDNA fragments, short dsDNA fragments, and ssDNA with a solid phase in a solution comprising about 30% to about 50% isopropanol (v/v).

Description

METHODS FOR PURIFYING NUCLEIC ACIDS

CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] This application claims the priority benefit of U.S. Provisional Application No. 63/434,636, filed December 22, 2022, which is hereby incorporated by reference in its entirety.

FIELD

[0002] Provided herein are methods related to purifying DNA fragments from a sample. In particular, the methods allow for recovery of short dsDNA fragments and ssDNA as well as long dsDNA fragments from a sample, e.g., by contacting with a solid phase in a solution comprising isopropanol.

BACKGROUND

[0003] Tremendous strides have been made in genome sequencing technologies since sequencing the human genome in 2003, leading to an increase in the number and diversity of sequenced genomes and a wealth of information related to basic biology and disease. Typically, high-throughput sequencing approaches such as next-generation sequencing (NGS) or methylation sequencing begin with extraction and isolation of nucleic acids from a sample, followed by preparation of one or more libraries using the isolated nucleic acids. However, standard extraction and library preparation (prep) methods only recover double-stranded DNA (dsDNA) fragments greater than ~150 base pairs (bp) in length without recovering shorter dsDNA fragments or any single-stranded DNA (ssDNA).

[0004] Some approaches exist for preparing ssDNA libraries (see, e.g., Troll, C.J. et al. (2019) BMC Genomics 20:1023; SRSLY® single-stranded DNA library prep kit, Claret Bio). This approach uses a small amount of isopropanol (e.g., ~7%) when binding nucleic acids to beads for purification. However, this approach was not designed to recover dsDNA or ssDNA fragments less than 135 nucleotides in length. Other techniques have been developed for generating single-stranded RNA libraries for sequencing, or for size-based separation of specific ssDNA oligos that differ by 20nt in length (see, e.g., Fishman, A. et al. (2018) Genome Biology 19:113). As such, none of these methods is aimed at recovery of short dsDNA fragments and ssDNA in addition to longer dsDNA fragments and subsequent library preparation. Providing methods for recovering all of these types of DNA fragments would be particularly useful for sequencing (including NGS and methylation sequencing), as well as more comprehensive analyses of complex samples (e.g., comprising both dsDNA and ssDNA fragments of various sizes) such as samples containing cell-free DNA (cfDNA).

[0005] Therefore, a need exists for methods that allow the isolation of a wider range of nucleic acids, i.e., short and long dsDNA fragments as well as ssDNA, for approaches such as sequencing analyses. Improved recovery of ssDNA can also provide better cleanup of single-stranded nucleic acids following denaturation as part of cytosine conversion during methylation sequencing library prep. Moreover, the ability to “tune” what sizes of dsDNA and ssDNA are isolated during cleanup steps can result in purification of fragments of interest (e.g., resulting from adapter ligation during library prep) while removing contaminant(s) (e.g., free adapter and/or adapter dimers).

[0006] All references cited herein, including patent applications and publications, are incorporated by reference in their entirety.

SUMMARY OF THE INVENTION

[0007] The present disclosure provides, inter alia, methods of purifying DNA fragments from a sample, including long dsDNA fragments, short dsDNA fragments, and ssDNA. These are based at least in part on the discovery herein that addition of isopropanol at certain higher concentrations during purification improves the recovery of shorter dsDNA and ssDNA fragments while retaining the ability to recover longer dsDNA fragments. Moreover, this discovery can be leveraged during various cleanup steps involved in library preparation, cytosine conversion, PCR, and sequencing to “tune” which type of nucleic acids are recovered. For example, the percentage of isopropanol can be modified before and/or during adapter ligation as compared to after adapter ligation, e.g., to allow recovery of shorter and/or single-stranded DNA fragments before adapter ligation (thereby maximizing recovery of all fragment types and sizes), while “tuning out” adapter dimers after ligation (thereby retaining longer, adapter-ligated fragments while removing free adapter and/or adapter dimer species that can interfere with downstream analyses).

[0008] In one aspect, provided herein is a method of purifying DNA fragments from a sample. In some embodiments, the method comprises a) contacting a sample that comprises short double-stranded DNA (dsDNA) fragments less than 135 base pairs (bp) in length, long dsDNA fragments greater than 135bp in length, and single-stranded DNA (ssDNA; e.g., ssDNA fragments) with a solid phase in a solution comprising about 30% to about 50% isopropanol (v/v) under conditions suitable for the short dsDNA fragments, long dsDNA fragments, and ssDNA to bind the solid phase; b) washing the solid phase and bound nucleic acids (e.g., short dsDNA fragments, long dsDNA fragments, and ssDNA), e.g., to remove excess solution and/or any contaminant(s); and c) eluting the short dsDNA fragments, long dsDNA fragments, and ssDNA from the solid phase.
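By way of non-limiting illustration, the arithmetic for reaching a target isopropanol percentage (v/v) in step a) can be sketched as follows. This is a minimal sketch that assumes volumes are additive on mixing and that the stock is neat (100%) isopropanol; neither assumption is stated in the disclosure, and the function name and parameters are hypothetical.

```python
def isopropanol_volume_ul(other_volume_ul: float, target_fraction: float) -> float:
    """Volume of neat isopropanol to add to reach target_fraction (v/v).

    other_volume_ul is everything else in the binding mix (sample, solid
    phase suspension, buffer). Assumes additive volumes and a 100% stock;
    both are simplifications made for this sketch.
    """
    if other_volume_ul <= 0:
        raise ValueError("other_volume_ul must be positive")
    if not 0 < target_fraction < 1:
        raise ValueError("target_fraction must be between 0 and 1")
    # Solve v_iso / (other_volume + v_iso) = target_fraction for v_iso.
    return other_volume_ul * target_fraction / (1.0 - target_fraction)


if __name__ == "__main__":
    # Span the recited range of about 30% to about 50% (v/v).
    for fraction in (0.30, 0.40, 0.50):
        added = isopropanol_volume_ul(100.0, fraction)
        print(f"{fraction:.0%} v/v: add {added:.1f} uL isopropanol to 100 uL")
```

Under these assumptions, a 50% (v/v) binding solution requires a volume of isopropanol equal to the combined volume of the other components.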

[0009] In some embodiments, the short dsDNA fragments and ssDNA are greater than 25bp in length. In some embodiments, the solution in a) further comprises a crowding agent and a salt. In some embodiments, the solution in a) further comprises about 20% polyethylene glycol (PEG) and about 2.5M NaCl. In some embodiments, the solid phase comprises silica or carboxyl groups. In some embodiments, the solid phase comprises a magnetic material, and wherein the solid phase and bound nucleic acids are washed in b) in the presence of a magnetic field that retains the solid phase and bound nucleic acids. In some embodiments, the solid phase and bound nucleic acids are washed in b) in a solution comprising about 80% ethanol (v/v). In some embodiments, the methods further comprise, e.g., after c), contacting the eluted short dsDNA fragments, long dsDNA fragments, and ssDNA with a solid phase in a solution comprising about 50% isopropanol (v/v) under conditions suitable for the short dsDNA fragments, long dsDNA fragments, and ssDNA to bind the solid phase; washing the solid phase and bound nucleic acids (e.g., to remove excess solution and/or any contaminant(s)); and eluting the short dsDNA fragments, long dsDNA fragments, and ssDNA from the solid phase. In some embodiments, the methods further comprise, e.g., after eluting the short dsDNA fragments, long dsDNA fragments, and ssDNA from the solid phase: ligating one or more polynucleotide adapters to the short dsDNA fragments and long dsDNA fragments to generate a dsDNA library or ssDNA library. 
In some embodiments, the methods further comprise, e.g., after adapter ligation, contacting the dsDNA or ssDNA library with a solid phase in a solution comprising about 25% to about 30% isopropanol (v/v) under conditions suitable for the dsDNA or ssDNA library to bind the solid phase; washing the solid phase and bound dsDNA or ssDNA library to remove adapter dimers and/or free adapters; and eluting the dsDNA or ssDNA library from the solid phase. In some embodiments, the methods further comprise, e.g., after eluting the dsDNA or ssDNA library from the solid phase: denaturing the dsDNA or ssDNA library; and subjecting the denatured dsDNA or ssDNA library to cytosine conversion. In some embodiments, the methods further comprise, e.g., after cytosine conversion, contacting the denatured and cytosine-converted dsDNA or ssDNA library with a solid phase in a solution comprising about 50% isopropanol (v/v) under conditions suitable for the denatured and cytosine-converted dsDNA or ssDNA library to bind the solid phase; washing the solid phase and bound denatured and cytosine-converted dsDNA or ssDNA library (e.g., to remove unreacted cytosine conversion reagents); and eluting the denatured and cytosine-converted dsDNA or ssDNA library from the solid phase. In some embodiments, the methods further comprise, e.g., after eluting the denatured and cytosine-converted dsDNA or ssDNA library from the solid phase, subjecting the denatured and cytosine-converted dsDNA or ssDNA library to PCR amplification. In some embodiments, the methods further comprise, e.g., after PCR amplification, subjecting the denatured and cytosine-converted dsDNA library, ssDNA library, or PCR amplicons thereof to methylation sequencing. In some embodiments, the methods further comprise, e.g., after eluting the dsDNA or ssDNA library from the solid phase: subjecting the dsDNA or ssDNA library to PCR amplification.
In some embodiments, the methods further comprise, e.g., after PCR amplification, subjecting the dsDNA library and/or PCR amplicons thereof, or ssDNA library and/or PCR amplicons thereof, to sequencing. In some embodiments, the sample comprises cell-free DNA (cfDNA), circulating cell-free DNA (ccfDNA), or circulating tumor DNA (ctDNA). In some embodiments, the sample comprises fluid, cells, or tissue. In some embodiments, the sample comprises tumor cells and/or tumor nucleic acids.

[0010] It is to be understood that one, some, or all of the properties of the various embodiments described herein may be combined to form other embodiments of the present invention. These and other aspects of the invention will become apparent to one of skill in the art. These and other embodiments of the invention are further described by the detailed description that follows.

BRIEF DESCRIPTION OF THE DRAWINGS

[0011] FIG. 1 shows the loss of short dsDNA fragments (arrow; left) and all ssDNA fragments (right) using standard nucleic acid isolation techniques.

[0012] FIG. 2 illustrates an exemplary method of the present disclosure for isolating long dsDNA fragments, short dsDNA fragments, and ssDNA from a sample, in accordance with some embodiments.

[0013] FIGS. 3-4 show the effect of isopropanol during nucleic acid purification on recovery of dsDNA (FIG. 3) or ssDNA (FIG. 4) fragments of various sizes. FIG. 3 shows dsDNA recovery per ladder band (upper left), comparing standard purification (SOP; caret) with modified purification using 50% isopropanol (asterisk), as well as the % recovery of dsDNA fragments at each band size using SOP or modified purification using the stated concentration of isopropanol across a range from 0-50% (v/v) (lower right). Very short dsDNA fragments (i.e., 25bp or less) are not well recovered using standard purification, but addition of higher concentrations of isopropanol (e.g., 40% or 50%) led to better recovery. Recovery of larger DNA fragments was high across many isopropanol concentrations (e.g., 20-50%). FIG. 4 shows ssDNA recovery per ladder band (upper left), comparing standard purification (SOP; caret) with modified purification using 50% isopropanol (asterisk). Detailed recovery data are shown in the table at upper right. Also shown are the % recovery of ssDNA fragments at each band size using SOP or modified purification using the stated concentration of isopropanol across a range from 30-50% (v/v) (lower right). ssDNA is not recovered at all using standard purification techniques. ssDNA fragments greater than 30 nucleotides (nt) in length can be recovered using isopropanol. As isopropanol concentration increased, the recovery of small ssDNA fragments increased.

[0014] FIG. 5 illustrates an exemplary method of the present disclosure for isolating long dsDNA fragments, short dsDNA fragments, and ssDNA from a sample and generating a library for methylation sequencing using APOBEC deamination, in accordance with some embodiments. After deamination, nucleic acids in the sample are single-stranded, leading to poor recovery during cleanup. Use of 50% isopropanol in cleanup after deamination (circled step in flow chart) increased recovery by ~80% compared to standard purification (SOP), as shown in lower box plots. Use of isopropanol also led to ~32% increase in recovery after PCR, as compared to SOP.

[0015] FIGS. 6A & 6B illustrate exemplary workflows for methylation sequencing, including (left to right) the steps of isolating nucleic acids (extraction), adapter ligation, methyl conversion, PCR amplification, and sequencing. These workflows illustrate how different conditions and concentrations of isopropanol can be used, depending on the intended type of nucleic acids to be used for library prep. Shown are exemplary workflows for dsDNA library prep (FIG. 6A) and ssDNA library prep (FIG. 6B).

DETAILED DESCRIPTION

[0016] The present disclosure relates generally to methods for purifying DNA, e.g., long and short dsDNA fragments and ssDNA fragments. The present disclosure demonstrates that addition of isopropanol at certain higher concentrations during purification improves the recovery of shorter dsDNA and ssDNA fragments while still allowing for recovery of longer dsDNA fragments. As such, these methods are useful not only during initial nucleic acid extraction from samples to maximize recovery of all types of DNA fragments, but also during various cleanup steps involved in library preparation, cytosine conversion, PCR, and sequencing, e.g., to “tune” which type of DNA fragments are recovered. For example, the percentage of isopropanol can be modified before and/or during adapter ligation as compared to after adapter ligation, e.g., to allow recovery of shorter and/or single-stranded DNA fragments before adapter ligation (thereby maximizing recovery of all fragment types and sizes), while “tuning out” adapter dimers after ligation (thereby retaining adapter-ligated fragments while removing free adapter and/or adapter dimers that can interfere with downstream analyses).

I. General Techniques

[0017] The techniques and procedures described or referenced herein are generally well understood and commonly employed using conventional methodology by those skilled in the art, such as, for example, the widely utilized methodologies described in Sambrook et al., Molecular Cloning: A Laboratory Manual 3d edition (2001) Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.; Current Protocols in Molecular Biology (F.M. Ausubel, et al. eds., (2003)); the series Methods in Enzymology (Academic Press, Inc.): PCR 2: A Practical Approach (M.J. MacPherson, B.D. Hames and G.R. Taylor eds. (1995)), Harlow and Lane, eds. (1988) Antibodies, A Laboratory Manual, and Animal Cell Culture (R.I. Freshney, ed. (1987)); Oligonucleotide Synthesis (M.J. Gait, ed., 1984); Methods in Molecular Biology, Humana Press; Cell Biology: A Laboratory Notebook (J.E. Cellis, ed., 1998) Academic Press; Animal Cell Culture (R.I. Freshney, ed., 1987); Introduction to Cell and Tissue Culture (J.P. Mather and P.E. Roberts, 1998) Plenum Press; Cell and Tissue Culture: Laboratory Procedures (A. Doyle, J.B. Griffiths, and D.G. Newell, eds., 1993-8) J. Wiley and Sons; Handbook of Experimental Immunology (D.M. Weir and C.C. Blackwell, eds.); Gene Transfer Vectors for Mammalian Cells (J.M. Miller and M.P. Calos, eds., 1987); PCR: The Polymerase Chain Reaction, (Mullis et al., eds., 1994); Current Protocols in Immunology (J.E. Coligan et al., eds., 1991); Short Protocols in Molecular Biology (Wiley and Sons, 1999); Immunobiology (C.A. Janeway and P. Travers, 1997); Antibodies (P. Finch, 1997); Antibodies: A Practical Approach (D. Catty., ed., IRL Press, 1988-1989); Monoclonal Antibodies: A Practical Approach (P. Shepherd and C. Dean, eds., Oxford University Press, 2000); Using Antibodies: A Laboratory Manual (E. Harlow and D. Lane (Cold Spring Harbor Laboratory Press, 1999); The Antibodies (M. Zanetti and J. D. Capra, eds., Harwood Academic Publishers, 1995); and Cancer: Principles and Practice of Oncology (V.T. DeVita et al., eds., J.B. Lippincott Company, 1993).

II. Definitions

[0018] As used in this specification and the appended claims, the singular forms “a”, “an” and “the” include plural referents unless the context clearly dictates otherwise. Thus, for example, reference to “a molecule” optionally includes a combination of two or more such molecules, and the like.

[0019] The term “about” as used herein refers to the usual error range for the respective value readily known to the skilled person in this technical field. Reference to “about” a value or parameter herein includes (and describes) embodiments that are directed to that value or parameter per se.

[0020] It is understood that aspects and embodiments of the invention described herein include “comprising,” “consisting,” and “consisting essentially of” aspects and embodiments.

[0021] The terms “cancer” and “cancerous” refer to or describe the physiological condition in mammals that is typically characterized by unregulated cell growth. Included in this definition are benign and malignant cancers.

[0022] The term “tumor,” as used herein, refers to all neoplastic cell growth and proliferation, whether malignant or benign, and all pre-cancerous and cancerous cells and tissues. The terms “cancer,” “cancerous,” and “tumor” are not mutually exclusive as referred to herein.

[0023] “Polynucleotide,” or “nucleic acid,” as used interchangeably herein, refer to polymers of nucleotides of any length, and include DNA and RNA. The nucleotides can be deoxyribonucleotides, ribonucleotides, modified nucleotides or bases, and/or their analogs, or any substrate that can be incorporated into a polymer by DNA or RNA polymerase, or by a synthetic reaction. Thus, for instance, polynucleotides as defined herein include, without limitation, single- and double-stranded DNA, DNA including single- and double-stranded regions, single- and double-stranded RNA, and RNA including single- and double-stranded regions, hybrid molecules comprising DNA and RNA that may be single-stranded or, more typically, double-stranded or include single- and double-stranded regions. In addition, the term “polynucleotide” as used herein refers to triple-stranded regions comprising RNA or DNA or both RNA and DNA. The strands in such regions may be from the same molecule or from different molecules. The regions may include all of one or more of the molecules, but more typically involve only a region of some of the molecules. One of the molecules of a triple-helical region often is an oligonucleotide. The term “polynucleotide” specifically includes cDNAs.

[0024] A polynucleotide may comprise modified nucleotides, such as methylated nucleotides and their analogs. If present, modification to the nucleotide structure may be imparted before or after assembly of the polymer. The sequence of nucleotides may be interrupted by non-nucleotide components. A polynucleotide may be further modified after synthesis, such as by conjugation with a label. Other types of modifications include, for example, “caps,” substitution of one or more of the naturally-occurring nucleotides with an analog, internucleotide modifications such as, for example, those with uncharged linkages (e.g., methyl phosphonates, phosphotriesters, phosphoamidates, carbamates, and the like) and with charged linkages (e.g., phosphorothioates, phosphorodithioates, and the like), those containing pendant moieties, such as, for example, proteins (e.g., nucleases, toxins, antibodies, signal peptides, poly-L-lysine, and the like), those with intercalators (e.g., acridine, psoralen, and the like), those containing chelators (e.g., metals, radioactive metals, boron, oxidative metals, and the like), those containing alkylators, those with modified linkages (e.g., alpha anomeric nucleic acids), as well as unmodified forms of the polynucleotide(s). Further, any of the hydroxyl groups ordinarily present in the sugars may be replaced, for example, by phosphonate groups, phosphate groups, protected by standard protecting groups, or activated to prepare additional linkages to additional nucleotides, or may be conjugated to solid or semi-solid supports. The 5' and 3' terminal OH can be phosphorylated or substituted with amines or organic capping group moieties of from 1 to 20 carbon atoms. Other hydroxyls may also be derivatized to standard protecting groups. 
Polynucleotides can also contain analogous forms of ribose or deoxyribose sugars that are generally known in the art, including, for example, 2'-O-methyl-, 2'-O-allyl-, 2'-fluoro-, or 2'-azido-ribose, carbocyclic sugar analogs, α-anomeric sugars, epimeric sugars such as arabinose, xyloses or lyxoses, pyranose sugars, furanose sugars, sedoheptuloses, acyclic analogs, and abasic nucleoside analogs such as methyl riboside. One or more phosphodiester linkages may be replaced by alternative linking groups. These alternative linking groups include, but are not limited to, embodiments wherein phosphate is replaced by P(O)S ("thioate"), P(S)S ("dithioate"), (O)NR2 ("amidate"), P(O)R, P(O)OR', CO or CH2 ("formacetal"), in which each R or R' is independently H or substituted or unsubstituted alkyl (1-20 C) optionally containing an ether (-O-) linkage, aryl, alkenyl, cycloalkyl, cycloalkenyl or aralkyl. Not all linkages in a polynucleotide need be identical. A polynucleotide can contain one or more different types of modifications as described herein and/or multiple modifications of the same type. The preceding description applies to all polynucleotides referred to herein, including RNA and DNA.

[0025] “Oligonucleotide,” as used herein, generally refers to short, single-stranded polynucleotides that are generally, but not necessarily, less than about 250 nucleotides in length. Oligonucleotides may be synthetic. The terms “oligonucleotide” and “polynucleotide” are not mutually exclusive. The description above for polynucleotides is equally and fully applicable to oligonucleotides.

[0026] The term “detection” includes any means of detecting, including direct and indirect detection.

[0027] “Amplification,” as used herein generally refers to the process of producing multiple copies of a desired sequence. “Multiple copies” mean at least two copies. A “copy” does not necessarily mean perfect sequence complementarity or identity to the template sequence. For example, copies can include nucleotide analogs such as cytosine analogs resistant to cytosine conversion, intentional sequence alterations (such as sequence alterations introduced through a primer comprising a sequence that is hybridizable, but not complementary, to the template), and/or sequence errors that occur during amplification.

[0028] The technique of “polymerase chain reaction” or “PCR” as used herein generally refers to a procedure wherein minute amounts of a specific piece of nucleic acid, RNA and/or DNA, are amplified as described, for example, in U.S. Pat. No. 4,683,195. Generally, sequence information from the ends of the region of interest or beyond needs to be available, such that oligonucleotide primers can be designed; these primers will be identical or similar in sequence to opposite strands of the template to be amplified. The 5' terminal nucleotides of the two primers may coincide with the ends of the amplified material. PCR can be used to amplify specific RNA sequences, specific DNA sequences from total genomic DNA, and cDNA transcribed from total cellular RNA, bacteriophage, or plasmid sequences, etc. See generally Mullis et al., Cold Spring Harbor Symp. Quant. Biol. 51 :263 (1987) and Erlich, ed., PCR Technology (Stockton Press, NY, 1989). As used herein, PCR is considered to be one, but not the only, example of a nucleic acid polymerase reaction method for amplifying a nucleic acid test sample, comprising the use of a known nucleic acid (DNA or RNA) as a primer and utilizes a nucleic acid polymerase to amplify or generate a specific piece of nucleic acid or to amplify or generate a specific piece of nucleic acid which is complementary to a particular nucleic acid.

[0029] The term “sample,” as used herein, refers to a composition that is obtained or derived from a subject and/or individual of interest that contains a cellular and/or other molecular entity that is to be characterized and/or identified, for example, based on physical, biochemical, chemical, and/or physiological characteristics. For example, the phrase “disease sample” and variations thereof refers to any sample obtained from a subject of interest that would be expected or is known to contain the cellular and/or molecular entity that is to be characterized. Samples include, but are not limited to, tissue samples, primary or cultured cells or cell lines, cell supernatants, cell lysates, platelets, serum, plasma, vitreous fluid, lymph fluid, synovial fluid, follicular fluid, seminal fluid, amniotic fluid, milk, whole blood, plasma, serum, blood-derived cells, urine, cerebro-spinal fluid, saliva, sputum, tears, perspiration, mucus, tumor lysates, and tissue culture medium, tissue extracts such as homogenized tissue, tumor tissue, cellular extracts, and combinations thereof. In some instances, the sample is a whole blood sample, a plasma sample, a serum sample, or a combination thereof. In some embodiments, the sample is from a tumor (e.g., a “tumor sample”), such as from a biopsy. In some embodiments, the sample is a formalin-fixed paraffin-embedded (FFPE) sample.

[0030] A “tumor cell” as used herein, refers to any tumor cell present in a tumor or a sample thereof. Tumor cells may be distinguished from other cells that may be present in a tumor sample, for example, stromal cells and tumor-infiltrating immune cells, using methods known in the art and/or described herein.

[0031] A “reference sample,” “reference cell,” “reference tissue,” “control sample,” “control cell,” or “control tissue,” as used herein, refers to a sample, cell, tissue, standard, or level that is used for comparison purposes.

[0032] As used herein, the terms “individual,” “patient,” or “subject” are used interchangeably and refer to any single animal, e.g., a mammal (including such non-human animals as, for example, dogs, cats, horses, rabbits, zoo animals, cows, pigs, sheep, and non-human primates) for which treatment is desired. In particular embodiments, the patient herein is a human.

[0033] An “article of manufacture” is any manufacture (e.g., a package or container) or kit comprising at least one reagent, e.g., a medicament for treatment of a disease or disorder (e.g., cancer), or a probe for specifically detecting a biomarker described herein. In certain embodiments, the manufacture or kit is promoted, distributed, or sold as a unit for performing the methods described herein.

[0034] The term “methylation” is used herein to refer to presence of a methyl group at the C5 position of a cytosine nucleotide within DNA nucleic acids (unless context indicates otherwise). This term includes 5-methylcytosine (5mC) as well as cytosine nucleotides in which the methyl group is further modified, such as 5-hydroxymethylcytosine (5hmC). This term also includes DNA nucleic acids that have been subjected to chemical or enzymatic conversion of nucleotides, such as conversion that deaminates unmodified cytosines to uracil.

[0035] The term “aberrant methylation” is used herein to refer to a pattern of methylation that is not typically present in a normal tissue. For example, the term can refer to increased methylation at a site that is not normally methylated in a normal tissue, or decreased methylation at a site that is normally methylated in a normal tissue. In some embodiments, nucleic acids derived from a cancer cell (e.g., cancer nucleic acids) are characterized by aberrant methylation when their pattern and/or amount of methylation at one or more genomic loci differs from what is normally present at the corresponding locus/loci in a particular type of tissue.

III. Methods

[0036] Certain aspects of the present disclosure relate to methods of purifying DNA fragments from a sample. In some embodiments, the methods comprise contacting a sample (e.g., comprising DNA fragments) with a solid phase in a solution comprising about 30% to about 50% isopropanol (v/v); washing the solid phase and bound nucleic acids, e.g., to remove excess solution and any contaminant(s); and eluting short dsDNA fragments, long dsDNA fragments, and ssDNA from the solid phase. In some embodiments, the sample contacted with the solid phase comprises short dsDNA fragments, long dsDNA fragments, and ssDNA. In some embodiments, the solid phase is contacted with the sample under conditions suitable for short dsDNA fragments, long dsDNA fragments, and ssDNA to bind the solid phase.

[0037] As used herein, unless otherwise specified, short dsDNA fragments refers to dsDNA fragments less than 135bp in length. As used herein, unless otherwise specified, long dsDNA fragments refers to dsDNA fragments greater than or equal to 135bp in length. In some embodiments, short dsDNA fragments and long dsDNA fragments of the present disclosure are greater than 25bp or greater than 30bp in length. In some embodiments, ssDNA of the present disclosure refers to ssDNA fragments greater than 25nt or greater than 30nt in length.
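For illustration only, the size definitions in the preceding paragraph can be expressed as a small classifier. This is a sketch: the function names are hypothetical, and the optional length floor is written for the "greater than 25" embodiment, with the "greater than 30" embodiment available by passing a different floor.

```python
SHORT_LONG_CUTOFF_BP = 135  # short dsDNA: < 135bp; long dsDNA: >= 135bp


def classify_fragment(length: int, double_stranded: bool) -> str:
    """Label a fragment using the default definitions given above."""
    if length <= 0:
        raise ValueError("length must be positive")
    if not double_stranded:
        return "ssDNA"
    return "short dsDNA" if length < SHORT_LONG_CUTOFF_BP else "long dsDNA"


def exceeds_length_floor(length: int, floor: int = 25) -> bool:
    """True if the fragment is longer than the floor (25 or 30 in some embodiments)."""
    return length > floor


if __name__ == "__main__":
    for length, ds in ((40, True), (135, True), (60, False)):
        print(length, classify_fragment(length, ds), exceeds_length_floor(length))
```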

[0038] In some embodiments, a solution of the present disclosure (e.g., used for binding DNA fragments to a solid phase as discussed herein) comprising isopropanol further comprises a crowding agent and/or a salt. In some embodiments, a solution of the present disclosure comprising isopropanol further comprises a crowding agent and a salt. For example, in some embodiments, a sample of the present disclosure is contacted with a solid phase in a solution comprising about 30% to about 50% isopropanol (v/v) as well as a crowding agent and/or a salt. In some embodiments, e.g., in subsequent cleanup and/or library preparation steps as disclosed herein, DNA fragments are contacted with a solid phase in a solution comprising about 50% isopropanol (v/v) or about 25% to about 30% isopropanol (v/v), as well as a crowding agent and/or a salt. It is known in the art that the presence of salt and crowding agent can promote binding of DNA to a solid phase of the present disclosure; see, e.g., AMPure XP Reagent (Beckman Coulter). Suitable salts are known in the art and include, without limitation, sodium chloride. Suitable crowding agents are known in the art and include, without limitation, polyethylene glycol (PEG). In some embodiments, a solution comprising isopropanol of the present disclosure further comprises 20% PEG and 2.5M NaCl.

[0039] Solid phases suitable for binding DNA fragments as discussed herein are known in the art. In some embodiments, the solid phase comprises beads. In some embodiments, the solid phase comprises silica or carboxyl groups.

[0040] In some embodiments, the solid phase is magnetic, e.g., paramagnetic. In some embodiments, the solid phase and any bound nucleic acids as discussed herein are washed in the presence of a magnetic field that retains the solid phase and any bound nucleic acids, while allowing unbound nucleic acids, excess solution, and any contaminant(s), if present, to be removed. In some embodiments, the solid phase is suitable for solid phase reversible immobilization (SPRI; see, e.g., DeAngelis, M.M. et al. (1995) Nucleic Acids Res. 23(22):4742-4743), such as SPRI beads. Suitable solid phases are commercially available, see, e.g., AMPure XP Reagent (Beckman Coulter). In some embodiments, the solid phase and any bound nucleic acids as discussed herein are washed in a solution comprising ethanol, e.g., about 80% ethanol (v/v).

[0041] In some embodiments, after eluting short dsDNA fragments, long dsDNA fragments, and ssDNA from the solid phase (e.g., as discussed supra), thereby extracting short dsDNA fragments, long dsDNA fragments, and ssDNA from a sample of the present disclosure, the methods further comprise preparing a dsDNA or ssDNA library. For example, a dsDNA library can be prepared from the extracted short dsDNA fragments and/or long dsDNA fragments, and a ssDNA library can be prepared from ssDNA. Exemplary library preparation methods and optional cleanup steps are further illustrated in FIGS. 6A & 6B.

[0042] In some embodiments, after eluting short dsDNA fragments, long dsDNA fragments, and ssDNA from the solid phase (e.g., as discussed supra), thereby extracting short dsDNA fragments, long dsDNA fragments, and ssDNA from a sample of the present disclosure, the methods further comprise a cleanup step.

[0043] Cleanup as used herein can refer to purification of nucleic acids of interest (e.g., for sequencing) away from one or more contaminants, including without limitation free adapter, adapter dimers, primers, unincorporated nucleotides, other reaction components, salts, proteins, surfactants, and non-nucleotide contaminants from a sample. Cleanup can be performed once or multiple times during library prep, e.g., during or after isolation of nucleic acids from a sample, prior to adapter ligation, after adapter ligation, prior to PCR, after PCR, and/or prior to sequencing. Methods and products for cleanup are known in the art; see, e.g., AMPure XP Reagent (Beckman Coulter).

[0044] In some embodiments, the cleanup step is an SPRI-based cleanup step. In some embodiments, after eluting short dsDNA fragments, long dsDNA fragments, and ssDNA from the solid phase (e.g., as discussed supra), thereby extracting short dsDNA fragments, long dsDNA fragments, and ssDNA from a sample of the present disclosure, the methods further comprise contacting eluted short dsDNA fragments, long dsDNA fragments, and ssDNA of the present disclosure with a solid phase of the present disclosure in a solution comprising about 50% isopropanol (v/v) of the present disclosure under conditions suitable for the short dsDNA fragments, long dsDNA fragments, and ssDNA to bind the solid phase; washing the solid phase and bound nucleic acids to remove excess solution and any contaminant(s); and eluting the short dsDNA fragments, long dsDNA fragments, and ssDNA from the solid phase. In some embodiments, this step occurs once or twice before adapter ligation. Advantageously, use of a relatively higher isopropanol concentration at this step allows for recovery of short dsDNA fragments and ssDNA in addition to longer dsDNA fragments. See, e.g., PreLC and LC Cleanup as shown in FIGS. 6A & 6B.

[0045] In some embodiments, after eluting short dsDNA fragments, long dsDNA fragments, and ssDNA from the solid phase (e.g., as discussed supra), thereby extracting short dsDNA fragments, long dsDNA fragments, and ssDNA from a sample of the present disclosure, and optionally after cleanup as described supra, the methods further comprise ligating one or more polynucleotide adapters to dsDNA and/or ssDNA fragments. In some embodiments, the methods comprise ligating one or more polynucleotide adapters to short dsDNA fragments and long dsDNA fragments of the present disclosure, e.g., to generate a dsDNA library. In some embodiments, the methods comprise ligating one or more polynucleotide adapters to ssDNA of the present disclosure, e.g., to generate a ssDNA library. In some embodiments, a dsDNA or ssDNA library of the present disclosure refers to a population of nucleic acids generated using dsDNA or ssDNA that comprises a known adapter sequence ligated to 5’ and/or 3’ end(s). In some embodiments, a dsDNA or ssDNA library of the present disclosure refers to a population of similarly-sized nucleic acids generated using dsDNA or ssDNA that comprises a known adapter sequence ligated to 5’ and 3’ ends.

[0046] In some embodiments, short dsDNA fragments and long dsDNA fragments are used to generate a dsDNA library. In some embodiments, ssDNA fragments are used to generate a ssDNA library. In some embodiments, short dsDNA fragments, long dsDNA fragments, and ssDNA are used to generate dsDNA and ssDNA libraries, respectively, or a combined dsDNA/ssDNA library. It will be appreciated by the skilled person that selection of a particular type of adapter and/or particular steps in adapter ligation may be modified in order to generate a dsDNA or ssDNA library.

[0047] As discussed herein, dsDNA and ssDNA libraries of the present disclosure refer to libraries prepared from dsDNA or ssDNA starting materials, respectively, rather than the composition of the libraries themselves. For example, it will be appreciated by the skilled person that some methods of ssDNA library preparation result in a library comprising molecules that are partially or fully double-stranded, e.g., due to primer extension using the starting ssDNA as a template (see, e.g., Gansauge, M.T. et al. (2017) Nucleic Acids Res. 45(10):e79) and/or ligation of adapters such as splint adapters that are double-stranded with single-stranded overhangs (see, e.g., Troll, C.J. et al. (2019) BMC Genomics 20:1023; SRSLY® single-stranded DNA library prep kit, Claret Bio), and that some uses of dsDNA libraries (e.g., cytosine conversion for methylation sequencing) require a denaturation step that produces single-stranded fragments from the dsDNA library.

[0048] Suitable adapters and methods for generating dsDNA or ssDNA libraries are known in the art. Adapters can include, for example, binding sites for index primers, barcoding sequences, etc. Exemplary adapters are known in the art; see, e.g., NEBNext® Ultra™ II Y adapters (NEB) and xGen™ Stubby Adapters (IDT) for dsDNA libraries; and splint adapters that are double-stranded with single-stranded overhangs (see, e.g., Gansauge, M.T. et al. (2017) Nucleic Acids Res. 45(10):e79; Troll, C.J. et al. (2019) BMC Genomics 20:1023; and SRSLY® single-stranded DNA library prep kit, Claret Bio) or sets of adapters that ligate to 3’ ends of ssDNA with complementary extension primers (see, e.g., xGen™ ssDNA & Low-Input DNA Library Preparation Kit; IDT) for ssDNA libraries. In some embodiments, dsDNA libraries are constructed using blunt-ended or Y adapters; see, e.g., NEBNext® Ultra™ II Y adapters (NEB) and xGen™ Stubby Adapters (IDT). In some embodiments, ssDNA libraries are constructed using splint adapters; see, e.g., Troll, C.J. et al. (2019) BMC Genomics 20:1023; SRSLY® single-stranded DNA library prep kit, Claret Bio; Gansauge, M.T. et al. (2017) Nucleic Acids Res. 45(10):e79.

[0049] In some embodiments, library preparation further comprises, in addition to adapter ligation, one or more of end-repair and A-tailing. End-repair (also known as end-polishing) as used herein can refer to generation of blunt ends, e.g., by filling in or degrading overhangs, and/or addition of 5’ phosphates. Methods and products for end-repair (e.g., DNA polymerase and PNK) are known in the art; see, e.g., NEBNext® Ultra™ II (NEB). 3’ adenylation, also known as A-tailing or dA-tailing, as used herein can refer to adding or retaining a 3’ terminal adenine onto a nucleic acid. Methods and products for 3’ adenylation (e.g., Taq DNA polymerase) are known in the art; see, e.g., NEBNext® Ultra™ II (NEB). Some steps can differ between dsDNA and ssDNA library prep. For example, in some embodiments, dsDNA library prep can comprise end repair and A-tailing, then adapter ligation, whereas ssDNA library prep can comprise, for example, combined phosphorylation and adapter ligation (see, e.g., Troll, C.J. et al. (2019) BMC Genomics 20:1023; SRSLY® single-stranded DNA library prep kit, Claret Bio), or combined tailing and adapter ligation (see, e.g., xGen™ ssDNA & Low-Input DNA Library Preparation Kit; IDT).

[0050] In some embodiments, e.g., after adapter ligation, the methods further comprise subjecting a library of the present disclosure to one or more cleanup steps. In some embodiments, the cleanup step is an SPRI-based cleanup step. In some embodiments, e.g., after adapter ligation, the methods further comprise contacting the dsDNA or ssDNA library with a solid phase of the present disclosure in a solution comprising about 25% to about 30% isopropanol (v/v) of the present disclosure under conditions suitable for the dsDNA or ssDNA library to bind the solid phase; washing the solid phase and bound dsDNA or ssDNA library (e.g., to remove free adapters and/or adapter dimers); and eluting the dsDNA or ssDNA library from the solid phase. Advantageously, use of a relatively lower isopropanol concentration at this step allows for recovery of dsDNA fragments (now longer due to adapter ligation) while removing shorter adapter dimers and single-stranded free adapter (since in some embodiments, the ssDNA library now comprises at least partially double-stranded molecules). See, e.g., PostLC Cleanup as shown in FIGS. 6A & 6B.

[0051] In some embodiments, e.g., after adapter ligation and optional cleanup, the methods further comprise subjecting the dsDNA or ssDNA library to cytosine conversion. In some embodiments, the methods further comprise denaturing the dsDNA or ssDNA library; and subjecting the denatured dsDNA or ssDNA library to cytosine conversion. Generally, methyl sequencing methods (e.g., bisulfite sequencing, EM-Seq, etc.) involve cytosine conversion followed by sequence analysis. Cytosine conversion is typically used to mark cytosines based on methylation status. For example, bisulfite treatment converts cytosine to uracil but does not alter 5mC. Subsequent analysis can reveal methylation state by identifying which base pairs were converted and which were not.
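The bisulfite example above can be sketched in a few lines of code. The following is a toy model only: it assumes complete (100%) conversion, considers a single strand, and represents 5mC simply as a set of methylated positions. All function and variable names are hypothetical and introduced here for illustration.

```python
def bisulfite_convert(strand: str, methylated: set[int]) -> str:
    """Deaminate unmethylated C to U; positions listed in `methylated`
    (0-based indices of 5mC) are left as C. Idealized full conversion."""
    return "".join(
        "U" if base == "C" and i not in methylated else base
        for i, base in enumerate(strand)
    )


def pcr_readout(converted: str) -> str:
    """After amplification, uracil is read as thymine."""
    return converted.replace("U", "T")


def call_methylation(reference: str, read: str) -> dict[int, bool]:
    """For each reference cytosine, report True if the read still shows C
    (not converted, i.e. methylated) and False if it shows T (converted)."""
    calls = {}
    for i, (ref, obs) in enumerate(zip(reference, read)):
        if ref == "C" and obs in ("C", "T"):
            calls[i] = obs == "C"
    return calls


if __name__ == "__main__":
    reference = "ACGTCCGA"   # toy sequence
    methylated = {1}         # assume only the C at index 1 is 5mC
    read = pcr_readout(bisulfite_convert(reference, methylated))
    print(read)                               # ACGTTTGA
    print(call_methylation(reference, read))  # {1: True, 4: False, 5: False}
```

Comparing the read against the reference recovers the methylation state at each cytosine, which is the "subsequent analysis" step described in the paragraph above.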

[0052] Methods for methyl sequencing are known in the art, including whole-genome methyl sequencing. Generally, these methods combine cytosine conversion with sequencing techniques. For example, in some embodiments, the methyl sequencing comprises bisulfite sequencing, whole genome bisulfite sequencing (WGBS), APOBEC-seq, methyl-CpG-binding domain (MBD) protein capture, methyl-DNA immunoprecipitation (MeDIP-seq), methylation sensitive restriction enzyme sequencing (MSRE/MRE-Seq or Methyl-Seq), enzymatic methylation sequencing, oxidative bisulfite sequencing (oxBS-Seq), reduced representation bisulfite sequencing (RRBS), or Tet-assisted bisulfite sequencing (TAB-Seq).

[0053] A commonly-used method of determining the methylation level and/or pattern of DNA requires methylation status-dependent conversion of cytosine in order to distinguish between methylated and non-methylated CpG dinucleotide sequences. For example, methylation of CpG dinucleotide sequences can be measured by employing cytosine conversion based technologies, which rely on methylation status-dependent chemical modification of CpG sequences within isolated genomic DNA, or fragments thereof, followed by DNA sequence analysis. Chemical reagents that are able to distinguish between methylated and non-methylated CpG dinucleotide sequences include hydrazine, which cleaves the nucleic acid, and bisulfite treatment. Bisulfite treatment followed by alkaline hydrolysis specifically converts non-methylated cytosine to uracil, leaving 5-methylcytosine unmodified as described by Olek A., Nucleic Acids Res. 24:5064-6, 1996 or Frommer et al., Proc. Natl. Acad. Sci. USA 89:1827-1831 (1992). The bisulfite-treated DNA can subsequently be analyzed by conventional molecular techniques, such as PCR amplification, sequencing, and detection comprising oligonucleotide hybridization. See, e.g., U.S. Pat. No. 10,174,372.

[0054] Various methodologies for cytosine conversion are known in the art. In some embodiments, a plurality of nucleic acids or nucleic acid fragments of the present disclosure has undergone cytosine conversion by bisulfite treatment, TET-assisted bisulfite treatment, TET-assisted pyridine borane treatment, oxidative bisulfite treatment, or APOBEC treatment, e.g., prior to detection.

[0055] As such, in some embodiments, the methods of the present disclosure comprise treating a plurality of nucleic acids or nucleic acid fragments of the present disclosure with bisulfite. Bisulfite sequencing is a commonly used method in the art for generating methylation data at single-base resolution. Bisulfite conversion or treatment refers to a biochemical process for converting unmethylated cytosine residues to uracil or thymine residues (e.g., deamination to uracil, followed by amplification as thymine during PCR), whereby methylated cytosine residues (e.g., 5-methylcytosine, 5mC; or 5-hydroxymethylcytosine, 5hmC) are preserved. Reagents to convert cytosine to uracil are known to those of skill in the art and include bisulfite reagents such as sodium bisulfite, potassium bisulfite, ammonium bisulfite, magnesium bisulfite, sodium metabisulfite, potassium metabisulfite, ammonium metabisulfite, magnesium metabisulfite and the like.

[0056] In some embodiments, the methods of the present disclosure comprise treating a plurality of nucleic acids or nucleic acid fragments of the present disclosure with enzymatic digestion and bisulfite treatment. The principle of the method is that the fragmentation of DNA is not achieved by ultrasound but achieved by combined enzymatic digestion by multiple endonucleases (MseI, Tsp509I, NlaIII and HpyCH4V), wherein the restriction enzyme cutting sites of MseI, Tsp509I, NlaIII and HpyCH4V are TTAA, AATT, CATG and TGCA, respectively. See, e.g., Smiraglia D J, et al. Oncogene 2002; 21: 5414-5426. This is followed by bisulfite treatment, e.g., as described herein.

[0057] Enzymatic methods for cytosine conversion are also known, e.g., enzymatic methyl sequencing. Such approaches can be advantageous because they employ enzymes instead of bisulfite, which can damage and fragment DNA, leading to DNA loss and potentially biased sequencing. For example, TET2 (the Ten-eleven translocation (Tet) family 2 methylcytosine dioxygenase) and T4-BGT (T4 phage beta-glucosyltransferase) can be used to convert 5mC and 5hmC into products that cannot be deaminated by APOBEC3A (apolipoprotein B mRNA editing enzyme, catalytic polypeptide-like 3A), then APOBEC3A is used to deaminate unmodified cytosines by converting them into uracils. See, e.g., Vaisvila, R. et al. (2021) Genome Res. 31:1-10.

[0058] In some embodiments, the methods of the present disclosure comprise treating a plurality of nucleic acids or nucleic acid fragments of the present disclosure with TET-assisted bisulfite (e.g., TAB-seq). In the TAB-seq approach, beta-glucosyltransferase (βGT) is used to convert 5hmC into β-glucosyl-5-hydroxymethylcytosine (5gmC), and a Tet enzyme (e.g., mTet1) is used to oxidize 5mC into 5-carboxylcytosine (5caC). Subsequently, nucleic acids can be treated with bisulfite. See, e.g., Yu, M. et al. (2018) Methods Mol. Biol. 1708:645-663.

[0059] In some embodiments, the methods of the present disclosure comprise treating a plurality of nucleic acids or nucleic acid fragments of the present disclosure with TET-assisted pyridine borane (e.g., TAPS). In the TAPS approach, a TET methylcytosine dioxygenase is used to oxidize 5mC and 5hmC into 5caC, then 5caC is reduced into dihydrouracil (DHU) via pyridine borane. DHU is converted to thymine during subsequent PCR. See, e.g., Liu, Y. et al. (2019) Nat. Biotechnol. 37:424-429.

[0060] In some embodiments, the methods of the present disclosure comprise treating a plurality of nucleic acids or nucleic acid fragments of the present disclosure with oxidative bisulfite (e.g., oxBS). In the oxBS approach, 5hmC is oxidized into 5-formylcytosine (5fC), which can be converted to uracil under bisulfite. Sequencing results from bisulfite vs. oxidative bisulfite treatment can then be used to infer 5hmC levels from 5mC. See, e.g., Booth, M.J. et al. (2013) Nat. Protocols 8:1841-1851. This approach can be scaled on a genome-wide level in oxBS-seq; see, e.g., Kirschner, K. et al. (2018) Methods Mol. Biol. 1708:665-678.
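The inference of 5hmC from paired bisulfite and oxidative bisulfite data can be illustrated with a short sketch. It relies on the standard reading of the two assays: after bisulfite, both 5mC and 5hmC are read as C, whereas after oxidative bisulfite only 5mC is read as C, so the difference at a given site estimates 5hmC. The function name and the clamping of negative differences to zero (which can arise from sampling noise) are assumptions made here for illustration.

```python
def infer_modification_levels(bs_c_fraction: float,
                              oxbs_c_fraction: float) -> tuple[float, float]:
    """Estimate (5mC, 5hmC) levels at one cytosine site.

    bs_c_fraction   -- fraction of reads showing C after bisulfite
                       (reflects 5mC + 5hmC)
    oxbs_c_fraction -- fraction of reads showing C after oxidative
                       bisulfite (reflects 5mC only)
    Negative differences are clamped to 0.0 (an illustrative choice).
    """
    for value in (bs_c_fraction, oxbs_c_fraction):
        if not 0.0 <= value <= 1.0:
            raise ValueError("fractions must lie in [0, 1]")
    level_5mc = oxbs_c_fraction
    level_5hmc = max(0.0, bs_c_fraction - oxbs_c_fraction)
    return level_5mc, level_5hmc


if __name__ == "__main__":
    # Hypothetical site: 75% C after bisulfite, 50% C after oxBS.
    print(infer_modification_levels(0.75, 0.5))  # (0.5, 0.25)
```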

[0061] In some embodiments, the methods of the present disclosure comprise treating a plurality of nucleic acids or nucleic acid fragments of the present disclosure with APOBEC. Enzymatic reagents to convert cytosine to uracil, i.e. cytosine deaminases, include those of the APOBEC family, such as APOBEC-seq or APOBEC3A. The APOBEC family members are cytidine deaminases that convert cytosine to uracil while maintaining 5-methyl cytosine, i.e. without altering 5-methyl cytosine. Such enzymes are described in US2013/0244237 and WO2018165366 and are commercially available (see, e.g., the NEBNext® Enzymatic Methyl-seq Kit, New England Biolabs). Non-limiting examples of APOBEC family proteins include APOBEC1, APOBEC2, APOBEC3A, APOBEC3B, APOBEC3C, APOBEC3D, APOBEC3F, APOBEC3G, APOBEC3H, APOBEC4, and Activation-induced (cytidine) deaminase.
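The conversion approaches described in the preceding paragraphs differ in which cytosine states are ultimately read as C or T after amplification and sequencing. The lookup table below summarizes those outcomes as a sketch, assuming idealized, complete conversion; the dictionary keys, the `"enzymatic"` label (standing for the TET2/T4-BGT/APOBEC3A workflow), and the function name are hypothetical labels chosen for illustration.

```python
# Base read after conversion + PCR for each cytosine state, per method.
# Idealized: assumes complete conversion and ignores 5fC/5caC in the input.
READOUT = {
    "bisulfite": {"C": "T", "5mC": "C", "5hmC": "C"},
    "oxBS":      {"C": "T", "5mC": "C", "5hmC": "T"},
    "TAB-seq":   {"C": "T", "5mC": "T", "5hmC": "C"},
    "TAPS":      {"C": "C", "5mC": "T", "5hmC": "T"},
    "enzymatic": {"C": "T", "5mC": "C", "5hmC": "C"},
}


def sequenced_bases(method: str, states: list[str]) -> str:
    """Return the bases that would be read for a strand given as a list of
    per-position states ("A", "G", "T", "C", "5mC", "5hmC")."""
    table = READOUT[method]
    # Non-cytosine bases pass through unchanged.
    return "".join(table.get(state, state) for state in states)


if __name__ == "__main__":
    strand = ["A", "C", "5mC", "5hmC", "G"]
    for method in READOUT:
        print(f"{method:10s} {sequenced_bases(method, strand)}")
```

Note that TAPS is the only entry in which unmodified cytosine is left as C, whereas the other approaches convert unmodified cytosine and protect one or both modified forms.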

[0062] Some methyl sequencing methods rely upon library construction and adapter ligation, followed by standard bisulfite conversion and sequencing (e.g., WGBS). Alternatively, bisulfite treatment can be carried out prior to adaptor ligation (see, e.g., Miura, F. et al. (2012) Nucleic Acids Res. 40:e136). More recent techniques use other cytosine conversion methods such as enzymatic approaches in order to reduce damage to DNA caused by bisulfite, e.g., as in the commercially available NEBNext® Enzymatic Methyl-seq Kit (New England Biolabs). Steps of library amplification, quantification, and sequencing generally follow bisulfite conversion. In some embodiments, prior to WGMS, nucleic acids are extracted from a sample. In some embodiments, prior to WGMS, nucleic acids are subjected to fragmentation, repair, and adaptor ligation. As noted previously, cytosine conversion can be carried out before or after adaptor ligation. In some embodiments, DNA repair is performed after cytosine conversion. PCR amplification (generally at least two cycles) is performed after cytosine conversion to convert uracils (generated by formerly unmethylated cytosines) into thymine, and is accomplished using a polymerase that is able to read uracil (excluding polymerases with proofreading and repair activities). In some embodiments, prior to sequencing, fragments are enriched for desired length. In some embodiments, prior to sequencing, nucleic acids are enriched for methylated sequences, such as by immunoprecipitation using an antibody specific for 5mC as in the MeDIP approach (see, e.g., Pomraning, K.R. et al. (2009) Methods 47:142-150).

[0063] In some embodiments, e.g., after cytosine conversion, the methods further comprise subjecting the cytosine-converted dsDNA or ssDNA library to cleanup. In some embodiments, e.g., after cytosine conversion, the methods further comprise contacting the cytosine-converted dsDNA or ssDNA library with a solid phase in a solution of the present disclosure comprising about 50% isopropanol (v/v) of the present disclosure under conditions suitable for the denatured and cytosine-converted dsDNA or ssDNA library to bind the solid phase, washing the solid phase and bound denatured and cytosine-converted dsDNA or ssDNA library, and eluting the denatured and cytosine-converted dsDNA or ssDNA library from the solid phase. Advantageously, use of a relatively higher isopropanol concentration at this step allows for greater recovery of ssDNA, since the dsDNA or ssDNA library has been denatured at this point as part of the cytosine conversion. See, e.g., Methyl Conversion Cleanup as shown in FIGS. 6A & 6B.

[0064] In some embodiments, e.g., after cytosine conversion and optional cleanup, the methods further comprise subjecting the cytosine-converted dsDNA or ssDNA library or amplicons thereof to PCR amplification.

[0065] In some embodiments, e.g., after PCR amplification, the methods further comprise subjecting the cytosine-converted dsDNA or ssDNA library to methylation sequencing, e.g., according to any of the methods described supra.

[0066] In some embodiments, e.g., after adapter ligation and optional cleanup, the methods further comprise subjecting the dsDNA or ssDNA library to PCR amplification.

[0067] In some embodiments, e.g., after PCR amplification, the methods further comprise subjecting the dsDNA or ssDNA library to sequencing, e.g., according to any of the methods described infra.

[0068] Various sequencing methods are known in the art, including without limitation next-generation sequencing (NGS). Next-generation sequencing generally includes any sequencing method that determines the nucleotide sequence of either individual nucleic acid molecules or clonally expanded proxies for individual nucleic acid molecules in a highly parallel fashion (e.g., greater than 10⁵ molecules are sequenced simultaneously). In one embodiment, the relative abundance of the nucleic acid species in the library can be estimated by counting the relative number of occurrences of their cognate sequences in the data generated by the sequencing experiment.
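The counting-based abundance estimate mentioned above can be sketched as follows. This is a deliberately simplified illustration: it treats each read as an exact sequence string and ignores alignment, sequencing errors, and amplification bias. The function name and the toy reads are hypothetical.

```python
from collections import Counter


def relative_abundance(reads: list[str]) -> dict[str, float]:
    """Estimate the relative abundance of each sequence species as its
    share of the total number of reads observed."""
    counts = Counter(reads)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {sequence: n / total for sequence, n in counts.items()}


if __name__ == "__main__":
    toy_reads = ["ACGT", "ACGT", "TTGA", "ACGT"]
    print(relative_abundance(toy_reads))  # {'ACGT': 0.75, 'TTGA': 0.25}
```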

[0069] NGS methods are known in the art, and are described, e.g., in Metzker, M. (2010) Nature Reviews Genetics 11:31-46. Platforms for next-generation sequencing include, e.g., Roche/454’s Genome Sequencer (GS) FLX System, Illumina/Solexa’s Genome Analyzer (GA), Illumina’s HiSeq 2500, HiSeq 3000, HiSeq 4000 and NovaSeq 6000 Sequencing Systems, Life/APG’s Support Oligonucleotide Ligation Detection (SOLiD) system, Polonator’s G.007 system, Helicos BioSciences’ HeliScope Gene Sequencing system, and Pacific Biosciences’ PacBio RS system. In one embodiment, the next-generation sequencing allows for the determination of the nucleotide sequence of an individual nucleic acid molecule (e.g., Helicos BioSciences’ HeliScope Gene Sequencing system, and Pacific Biosciences’ PacBio RS system). In other embodiments, the sequencing method determines the nucleotide sequence of clonally expanded proxies for individual nucleic acid molecules (e.g., the Solexa sequencer, Illumina Inc., San Diego, Calif.; 454 Life Sciences (Branford, Conn.), and Ion Torrent), e.g., massively parallel short-read sequencing (e.g., the Solexa sequencer, Illumina Inc., San Diego, Calif.), which generates more bases of sequence per sequencing unit than other sequencing methods that generate fewer but longer reads.

[0070] NGS technologies can include one or more steps, e.g., template preparation, sequencing and imaging, and data analysis. Methods for template preparation can include steps such as randomly breaking nucleic acids (e.g., genomic DNA) into smaller sizes and generating sequencing templates (e.g., fragment templates or mate-pair templates). The spatially separated templates can be attached or immobilized to a solid surface or support, allowing massive amounts of sequencing reactions to be performed simultaneously. Types of templates that can be used for NGS reactions include, e.g., clonally amplified templates originating from single DNA molecules, and single DNA molecule templates. Exemplary sequencing and imaging steps for NGS include, e.g., cyclic reversible termination (CRT), sequencing by ligation (SBL), single-nucleotide addition (pyrosequencing), and real-time sequencing. After NGS reads have been generated, they can be aligned to a known reference sequence or assembled de novo. For example, identifying genetic variations such as single-nucleotide polymorphism and structural variants in a sample (e.g., a tumor sample) can be accomplished by aligning NGS reads to a reference sequence (e.g., a wild type sequence). Methods of sequence alignment for NGS are described, e.g., in Trapnell C. and Salzberg S.L. Nature Biotech., 2009, 27:455-457. Examples of de novo assemblies are described, e.g., in Warren R. et al., Bioinformatics, 2007, 23:500-501; Butler J. et al., Genome Res., 2008, 18:810-820; and Zerbino D.R. and Birney E., Genome Res., 2008, 18:821-829. Sequence alignment or assembly can be performed using read data from one or more NGS platforms, e.g., mixing Roche/454 and Illumina/Solexa read data. In some embodiments, NGS is performed according to the methods described in, e.g., Frampton, G.M. et al. (2013) Nat. Biotech. 31:1023-1031; and/or Montesion, M., et al., Cancer Discovery (2021) 11(2):282-92.

[0071] In some embodiments, sequencing includes paired-end sequencing or unpaired sequencing. Generally, paired-end sequencing methodologies are described, e.g., in WO2007/010252, WO2007/091077, and WO03/74734. This approach utilizes pairwise sequencing of a double-stranded polynucleotide template, which results in the sequential determination of nucleotide sequences in two distinct and separate regions of the polynucleotide template. The paired-end methodology makes it possible to obtain two linked or paired reads of sequence information from each double-stranded template on a clustered array, rather than just a single sequencing read as can be obtained with other methods. Paired end sequencing technology can make special use of clustered arrays, generally formed by solid-phase amplification, for example as set forth in WO03/74734. Target polynucleotide duplexes, fitted with adapters, are immobilized to a solid support at the 5' ends of each strand of each duplex, for example, via bridge amplification as described above, forming dense clusters of double stranded DNA. Because both strands are immobilized at their 5' ends, sequencing primers are then hybridized to the free 3' end and sequencing by synthesis is performed. Adapter sequences can be inserted in between target sequences to allow for up to four reads from each duplex, as described in WO2007/091077. In a further adaptation of this methodology, specific strands can be cleaved in a controlled fashion as set forth in WO2007/010252. As a result, the timing of the sequencing read for each strand can be controlled, permitting sequential determination of the nucleotide sequences in two distinct and separate regions on complementary strands of the double-stranded template. See, e.g., US Pat. No. 10,174,372.

[0072] In some embodiments, a hybrid capture approach is used. Further details about this and other hybrid capture processes can be found in U.S. Pat. No. 9,340,830; Frampton, G.M. et al. (2013) Nat. Biotech. 31:1023-1031; and Montesion, M., et al., Cancer Discovery (2021) 11(2):282-92. In some embodiments, the methods further comprise, prior to contacting the mixture of polynucleotides with the bait molecule: obtaining a sample from an individual, wherein the sample comprises tumor cells and/or tumor nucleic acids; and extracting the mixture of polynucleotides from the sample, wherein the mixture of polynucleotides is from the tumor cells and/or tumor nucleic acids. In some embodiments, the sample further comprises non-tumor cells.

[0073] In some embodiments, the methods further comprise selectively enriching for a plurality of nucleic acids or nucleic acid fragments. For example, one or more baits or probes can be used to hybridize with a genomic locus of interest or fragment thereof, e.g., comprising a cluster of two or more CpG dinucleotides or comprising a genetic variant/mutation of interest. See, e.g., Graham, B.I. et al. Twist Fast Hybridization targeted methylation sequencing: a tunable target enrichment solution for methylation detection [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2021; 2021 Apr 10-15 and May 17-21. Philadelphia (PA): AACR; Cancer Res 2021;81(13_Suppl): Abstract nr 2098.

Samples

[0074] In certain aspects, the present disclosure relates to purification of DNA fragments of the present disclosure (e.g., long dsDNA fragments, short dsDNA fragments, and/or ssDNA fragments) from a sample. In some embodiments, nucleic acids are obtained from a sample, e.g., comprising tumor cells and/or tumor nucleic acids. For example, the sample can comprise tumor cell(s), circulating tumor cell(s), tumor nucleic acids (e.g., circulating tumor DNA, cfDNA, or cfRNA), part or all of a tumor biopsy, fluid, cells, tissue, mRNA, cDNA, DNA, RNA, cell-free DNA, and/or cell-free RNA. In some embodiments, the sample is from a tumor biopsy or tumor specimen. In some embodiments, the sample further comprises non-tumor cells and/or non-tumor nucleic acids. In some embodiments, the fluid comprises blood, serum, plasma, saliva, semen, cerebral spinal fluid, amniotic fluid, peritoneal fluid, interstitial fluid, etc.

[0075] In some embodiments, a sample comprises tissue, cells, and/or nucleic acids from a cancer and/or tissue, cells, and/or nucleic acids from normal tissue. In some embodiments, the sample comprises a tissue biopsy sample, a liquid biopsy sample, or a normal control. In some embodiments, the sample is from a tumor biopsy, tumor specimen, or circulating tumor cell. In some embodiments, the sample is a liquid biopsy sample and comprises blood, plasma, serum, cerebrospinal fluid, sputum, stool, urine, or saliva.

[0076] In some embodiments, the sample comprises a fraction of tumor nucleic acids that is less than 1% of total nucleic acids, less than 0.5% of total nucleic acids, less than 0.1% of total nucleic acids, or less than 0.05% of total nucleic acids. In some embodiments, the sample comprises a fraction of tumor nucleic acids that is at least 0.01%, at least 0.05%, or at least 0.1% of total nucleic acids. In some embodiments, the sample comprises a fraction of tumor nucleic acids having an upper limit of 10%, 9%, 8%, 7%, 6%, 5%, 4%, 3%, 2%, 1%, 0.9%, 0.8%, 0.7%, 0.6%, 0.5%, 0.4%, 0.3%, 0.2%, 0.1%, 0.09%, 0.08%, 0.07%, 0.06%, 0.05%, 0.04%, 0.03%, or 0.02% of total nucleic acids and an independently selected lower limit of 0.0001%, 0.0002%, 0.0003%, 0.0004%, 0.0005%, 0.0006%, 0.0007%, 0.0008%, 0.0009%, 0.001%, 0.002%, 0.003%, 0.004%, 0.005%, 0.006%, 0.007%, 0.008%, 0.009%, 0.01%, 0.02%, 0.03%, 0.04%, 0.05%, 0.06%, 0.07%, 0.08%, 0.09%, 0.1%, 0.2%, 0.3%, 0.4%, 0.5%, 0.6%, 0.7%, 0.8%, 0.9%, or 1% of total nucleic acids, wherein the upper limit is greater than the lower limit.

[0077] In some embodiments, the sample is or comprises biological tissue or fluid. The sample can contain compounds that are not naturally intermixed with the tissue in nature such as preservatives, anticoagulants, buffers, fixatives, nutrients, antibiotics or the like. In one embodiment, the sample is preserved as a frozen sample or as a formaldehyde- or paraformaldehyde-fixed paraffin-embedded (FFPE) tissue preparation. For example, the sample can be embedded in a matrix, e.g., an FFPE block or a frozen sample. In another embodiment, the sample is a blood or blood constituent sample. In yet another embodiment, the sample is a bone marrow aspirate sample. In another embodiment, the sample comprises cell-free DNA (cfDNA) or circulating cell-free DNA (ccfDNA), e.g., tumor cfDNA or tumor ccfDNA. Without wishing to be bound by theory, it is believed that in some embodiments, cfDNA is DNA from apoptosed or necrotic cells. Typically, cfDNA is bound by protein (e.g., histone) and protected from nucleases. CfDNA can be used as a biomarker, for example, for non-invasive prenatal testing (NIPT), organ transplant, cardiomyopathy, microbiome, and cancer. In another embodiment, the sample comprises circulating tumor DNA (ctDNA). Without wishing to be bound by theory, it is believed that in some embodiments, ctDNA is cfDNA with a genetic or epigenetic alteration (e.g., a somatic alteration or a methylation signature) that can discriminate it as originating from a tumor cell versus a non-tumor cell. In another embodiment, the sample comprises circulating tumor cells (CTCs). Without wishing to be bound by theory, it is believed that in some embodiments, CTCs are cells shed from a primary or metastatic tumor into the circulation. In some embodiments, CTCs apoptose and are a source of ctDNA in the blood/lymph.

IV. Exemplary Embodiments

[0078] The following exemplary embodiments are representative of some aspects of the invention:

Embodiment 1. A method of purifying DNA fragments from a sample, comprising: a) contacting a sample that comprises short double-stranded DNA (dsDNA) fragments less than 135 base pairs (bp) in length, long dsDNA fragments greater than 135bp in length, and single-stranded DNA (ssDNA) with a solid phase in a solution comprising about 30% to about 50% isopropanol (v/v) under conditions suitable for the short dsDNA fragments, long dsDNA fragments, and ssDNA to bind the solid phase; b) washing the solid phase and bound nucleic acids (e.g., to remove excess solution and/or any contaminant(s)); and c) eluting the short dsDNA fragments, long dsDNA fragments, and ssDNA from the solid phase.

Embodiment 2. The method of embodiment 1, wherein the short dsDNA fragments and ssDNA are greater than 25bp in length.

Embodiment 3. The method of embodiment 1 or embodiment 2, wherein the solution in a) further comprises a crowding agent and a salt.

Embodiment 4. The method of embodiment 3, wherein the solution in a) further comprises about 20% polyethylene glycol (PEG) and about 2.5M NaCl.

Embodiment 5. The method of any one of embodiments 1-4, wherein the solid phase comprises silica or carboxyl groups.

Embodiment 6. The method of any one of embodiments 1-5, wherein the solid phase comprises a magnetic material, and wherein the solid phase and bound nucleic acids are washed in b) in the presence of a magnetic field that retains the solid phase and bound nucleic acids.

Embodiment 7. The method of any one of embodiments 1-6, wherein the solid phase and bound nucleic acids are washed in b) in a solution comprising about 80% ethanol (v/v).

Embodiment 8. The method of any one of embodiments 1-7, further comprising, after c), contacting the eluted short dsDNA fragments, long dsDNA fragments, and ssDNA with a solid phase in a solution comprising about 50% isopropanol (v/v) under conditions suitable for the short dsDNA fragments, long dsDNA fragments, and ssDNA to bind the solid phase; washing the solid phase and bound nucleic acids to remove excess solution and/or any contaminant(s); and eluting the short dsDNA fragments, long dsDNA fragments, and ssDNA from the solid phase.

Embodiment 9. The method of any one of embodiments 1-8, further comprising, after eluting the short dsDNA fragments, long dsDNA fragments, and ssDNA from the solid phase: ligating one or more polynucleotide adapters to the short dsDNA fragments and long dsDNA fragments to generate a dsDNA library.

Embodiment 10. The method of any one of embodiments 1-8, further comprising, after eluting the short dsDNA fragments, long dsDNA fragments, and ssDNA from the solid phase: ligating one or more polynucleotide adapters to the ssDNA to generate a ssDNA library.

Embodiment 11. The method of embodiment 9 or embodiment 10, further comprising, after adapter ligation: contacting the dsDNA or ssDNA library with a solid phase in a solution comprising about 25% to about 30% isopropanol (v/v) under conditions suitable for the dsDNA or ssDNA library to bind the solid phase; washing the solid phase and bound dsDNA or ssDNA library to remove adapter dimers; and eluting the dsDNA or ssDNA library from the solid phase.

Embodiment 12. The method of embodiment 11, further comprising, after eluting the dsDNA or ssDNA library from the solid phase: denaturing the dsDNA or ssDNA library; and subjecting the denatured dsDNA or ssDNA library to cytosine conversion.

Embodiment 13. The method of embodiment 12, further comprising, after cytosine conversion, contacting the denatured and cytosine-converted dsDNA or ssDNA library with a solid phase in a solution comprising about 50% isopropanol (v/v) under conditions suitable for the denatured and cytosine-converted dsDNA or ssDNA library to bind the solid phase; washing the solid phase and bound denatured and cytosine-converted dsDNA or ssDNA library; and eluting the denatured and cytosine-converted dsDNA or ssDNA library from the solid phase.

Embodiment 14. The method of embodiment 13, further comprising, after eluting the denatured and cytosine-converted dsDNA or ssDNA library from the solid phase, subjecting the denatured and cytosine-converted dsDNA or ssDNA library to PCR amplification.

Embodiment 15. The method of embodiment 14, further comprising, after PCR amplification, subjecting the denatured and cytosine-converted dsDNA library and/or PCR amplicons thereof, or ssDNA library and/or PCR amplicons thereof, to methylation sequencing.

Embodiment 16. The method of embodiment 11, further comprising, after eluting the dsDNA or ssDNA library from the solid phase: subjecting the dsDNA or ssDNA library to PCR amplification.

Embodiment 17. The method of embodiment 16, further comprising, after PCR amplification, subjecting the dsDNA library and/or PCR amplicons thereof, or ssDNA library and/or PCR amplicons thereof, to sequencing.

Embodiment 18. The method of any one of embodiments 1-17, wherein the sample comprises cell-free DNA (cfDNA), circulating cell-free DNA (ccfDNA), or circulating tumor DNA (ctDNA).

Embodiment 19. The method of any one of embodiments 1-17, wherein the sample comprises fluid, cells, or tissue.

Embodiment 20. The method of any one of embodiments 1-17, wherein the sample comprises tumor cells and/or tumor nucleic acids.

[0079] The disclosures of all publications, patents, and patent applications referred to herein are each hereby incorporated by reference in their entireties. To the extent that any reference incorporated by reference conflicts with the instant disclosure, the instant disclosure shall control.

EXAMPLES

[0080] The invention will be more fully understood by reference to the following examples. They should not, however, be construed as limiting the scope of the invention. It is understood that the examples and embodiments described herein are for illustrative purposes only and that various modifications or changes in light thereof will be suggested to persons skilled in the art and are to be included within the spirit and purview of this application and scope of the appended claims.

Example 1: Improved nucleic acid purification methods for recovery of long and short dsDNA fragments as well as ssDNA

[0081] This Example describes methods for purification of nucleic acids that allow the recovery of short dsDNA fragments and ssDNA, as well as the long dsDNA fragments typically captured during standard nucleic acid isolation and library prep.

[0082] A mixture of nucleic acids including a dsDNA ladder and an ssDNA ladder was extracted and purified according to the manufacturer’s instructions using the AMPure XP Reagent (Beckman Coulter). As shown in FIG. 1, this standard operating procedure (SOP) method resulted in near-complete loss of dsDNA fragments less than 25bp in length (left), as well as ssDNA fragments of any length (right).

[0083] This protocol was modified by the addition of 30-50% isopropanol (v/v) along with sample and beads during extraction, followed by standard magnetic separation and washing steps (FIG. 2). dsDNA and ssDNA fragments of various sizes were purified using the modified protocol in the presence of varying concentrations of isopropanol. As shown in FIG. 3, recovery of shorter dsDNA fragments (particularly less than 25bp in length) was only observed with higher concentrations of isopropanol. Addition of 50% isopropanol was most effective at recovering dsDNA fragments less than 25bp in length, while recovery of larger dsDNA fragments was high across 20%-50% isopropanol concentrations. As shown in FIG. 4, ssDNA was not recovered at all by standard SPRI purification, while ssDNA fragments greater than 30nt in length were recovered in the presence of isopropanol. The recovery of dsDNA and ssDNA fragments was tunable using the concentration of isopropanol; as isopropanol concentration increased, the recovery of smaller ssDNA fragments also increased.

[0084] This tunable recovery of fragments by size can improve not only initial nucleic acid extraction, but also various cleanup steps during library preparation. For example, SOP and isopropanol SPRI cleanups were compared after APOBEC deamination during library prep for methylation sequencing. SOP cleanup used 100 µL of sample and 100 µL of 20% PEG/2.5M NaCl during SPRI cleanup after APOBEC deamination, whereas isopropanol cleanup used 100 µL of sample, 300 µL of 20% PEG/2.5M NaCl, and 400 µL of isopropanol (for 50% isopropanol, v/v). As shown in FIG. 5, the use of 50% isopropanol increased recovery by ~80%. Isopropanol was also found to improve cleanup recovery after post-APOBEC PCR.
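By way of illustration only, the volume arithmetic recited in paragraph [0084] can be checked with a short calculation. The following Python sketch is not part of the disclosed methods. The function names are hypothetical, volumes are assumed to be simply additive (volume contraction on mixing isopropanol with aqueous solutions is ignored), and the sample is assumed to contribute no PEG or NaCl.

```python
"""Illustrative volume arithmetic for an isopropanol SPRI cleanup.

Assumptions (not from the source): volumes are additive, and the sample
itself contributes no PEG, NaCl, or isopropanol.
"""


def cleanup_composition(sample_ul, peg_nacl_ul, isopropanol_ul,
                        peg_stock_pct=20.0, nacl_stock_molar=2.5):
    """Return the final composition of the binding mixture."""
    total_ul = sample_ul + peg_nacl_ul + isopropanol_ul
    if total_ul <= 0:
        raise ValueError("total volume must be positive")
    # Fraction of the final volume contributed by the PEG/NaCl stock.
    stock_fraction = peg_nacl_ul / total_ul
    return {
        "total_ul": total_ul,
        "isopropanol_pct": 100.0 * isopropanol_ul / total_ul,
        "peg_pct": peg_stock_pct * stock_fraction,
        "nacl_molar": nacl_stock_molar * stock_fraction,
    }


def isopropanol_volume_for(target_pct, other_ul):
    """Isopropanol volume (µL) to add to `other_ul` of non-isopropanol
    liquid so that isopropanol makes up `target_pct` percent (v/v)."""
    if not 0 <= target_pct < 100:
        raise ValueError("target_pct must be in [0, 100)")
    return other_ul * target_pct / (100.0 - target_pct)


if __name__ == "__main__":
    # Isopropanol cleanup from [0084]: 100 µL sample, 300 µL PEG/NaCl, 400 µL isopropanol.
    print(cleanup_composition(100, 300, 400))
    # SOP cleanup from [0084]: 100 µL sample, 100 µL PEG/NaCl, no isopropanol.
    print(cleanup_composition(100, 100, 0))
    # Isopropanol needed for 30% (v/v) with the same 400 µL of sample plus PEG/NaCl.
    print(round(isopropanol_volume_for(30, 400), 1))
```

Under these assumptions, the isopropanol cleanup volumes give 400 µL of isopropanol in 800 µL total, i.e., 50% (v/v), consistent with the figure stated above. The PEG and NaCl values returned by the sketch are derived arithmetic only and are not reported in this Example.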

Claims

What is claimed is:

1. A method of purifying DNA fragments from a sample, comprising: a) contacting a sample that comprises short double-stranded DNA (dsDNA) fragments less than 135 base pairs (bp) in length, long dsDNA fragments greater than 135bp in length, and single-stranded DNA (ssDNA) with a solid phase in a solution comprising about 30% to about 50% isopropanol (v/v) under conditions suitable for the short dsDNA fragments, long dsDNA fragments, and ssDNA to bind the solid phase; b) washing the solid phase and bound nucleic acids; and c) eluting the short dsDNA fragments, long dsDNA fragments, and ssDNA from the solid phase.

2. The method of claim 1, wherein the short dsDNA fragments and ssDNA are greater than 25bp in length.

3. The method of claim 1, wherein the solution in a) further comprises a crowding agent and a salt.

4. The method of claim 3, wherein the crowding agent comprises about 20% polyethylene glycol (PEG), and the salt comprises about 2.5M NaCl.

5. The method of claim 1, wherein the solid phase comprises silica or carboxyl groups.

6. The method of claim 1, wherein the solid phase comprises a magnetic material, and wherein the solid phase and bound nucleic acids are washed in b) in the presence of a magnetic field that retains the solid phase and bound nucleic acids.

7. The method of claim 1, wherein the solid phase and bound nucleic acids are washed in b) in a solution comprising about 80% ethanol (v/v).

8. The method of claim 1, further comprising, after c), contacting the eluted short dsDNA fragments, long dsDNA fragments, and ssDNA with a solid phase in a solution comprising about 50% isopropanol (v/v) under conditions suitable for the short dsDNA fragments, long dsDNA fragments, and ssDNA to bind the solid phase; washing the solid phase and bound nucleic acids; and eluting the short dsDNA fragments, long dsDNA fragments, and ssDNA from the solid phase.

9. The method of claim 1, further comprising, after eluting the short dsDNA fragments, long dsDNA fragments, and ssDNA from the solid phase:

(a) ligating one or more polynucleotide adapters to the short dsDNA fragments and long dsDNA fragments to generate a dsDNA library; or

(b) ligating one or more polynucleotide adapters to the ssDNA to generate a ssDNA library.

10. The method of claim 9, further comprising, after adapter ligation: contacting the dsDNA or ssDNA library with a solid phase in a solution comprising about 25% to about 30% isopropanol (v/v) under conditions suitable for the dsDNA or ssDNA library to bind the solid phase; washing the solid phase and bound dsDNA or ssDNA library to remove adapter dimers and/or free adapters; and eluting the dsDNA or ssDNA library from the solid phase.

11. The method of claim 10, further comprising, after eluting the dsDNA or ssDNA library from the solid phase: denaturing the dsDNA or ssDNA library; and subjecting the denatured dsDNA or ssDNA library to cytosine conversion.

12. The method of claim 11, further comprising, after cytosine conversion, contacting the denatured and cytosine-converted dsDNA or ssDNA library with a solid phase in a solution comprising about 50% isopropanol (v/v) under conditions suitable for the denatured and cytosine-converted dsDNA or ssDNA library to bind the solid phase; washing the solid phase and bound denatured and cytosine-converted dsDNA or ssDNA library; and eluting the denatured and cytosine-converted dsDNA or ssDNA library from the solid phase.

13. The method of claim 12, further comprising, after eluting the denatured and cytosine-converted dsDNA or ssDNA library from the solid phase, subjecting the denatured and cytosine-converted dsDNA or ssDNA library to PCR amplification.

14. The method of claim 13, further comprising, after PCR amplification, subjecting the denatured and cytosine-converted dsDNA library and/or PCR amplicons thereof, or ssDNA library and/or PCR amplicons thereof to methylation sequencing.

15. The method of claim 10, further comprising, after eluting the dsDNA or ssDNA library from the solid phase: subjecting the dsDNA or ssDNA library to PCR amplification.

16. The method of claim 15, further comprising, after PCR amplification, subjecting the dsDNA library and/or PCR amplicons thereof, or ssDNA library and/or PCR amplicons thereof, to sequencing.

17. The method of claim 1, wherein the sample comprises cell-free DNA (cfDNA), circulating cell-free DNA (ccfDNA), or circulating tumor DNA (ctDNA).

18. The method of claim 1, wherein the sample comprises fluid, cells, or tissue.

19. The method of claim 1, wherein the sample comprises tumor cells and/or tumor nucleic acids.