WO2016061517A2 - Contiguity preserving transposition - Google Patents


Info

Publication number
WO2016061517A2
WO2016061517A2 PCT/US2015/056040 US2015056040W WO2016061517A2 WO 2016061517 A2 WO2016061517 A2 WO 2016061517A2 US 2015056040 W US2015056040 W US 2015056040W WO 2016061517 A2 WO2016061517 A2 WO 2016061517A2
Authority
WO
WIPO (PCT)
Prior art keywords
nucleic acid
target nucleic
sequence
fragments
immobilized
Prior art date
Application number
PCT/US2015/056040
Other languages
French (fr)
Other versions
WO2016061517A3 (en)
Inventor
Frank J. Steemers
Kevin L. Gunderson
Fan Zhang
Jason Richard Betley
Niall Anthony Gormley
Wouter Meuleman
Jacqueline WEIR
Avgousta IOANNOU
Gareth Jenkins
Rosamond Jackson
Natalie MORRELL
Dmitry K. Pokholok
Steven J. NORBERG
Molly He
Amirali Kia
Igor GORYSHIN
Rigo Pantoja
Original Assignee
Illumina Cambridge Limited
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority to US15/519,482 (US11873480B2)
Priority to BR122021026779-0A (BR122021026779B1)
Priority to IL299976A (IL299976B1)
Priority to JP2017520884A (JP6808617B2)
Priority to AU2015331739A (AU2015331739B2)
Priority to EP15797490.8A (EP3207134B1)
Priority to ES15797490T (ES2745814T3)
Priority to CN201580068116.1A (CN107969137B)
Priority to RU2017116989A (RU2709655C2)
Priority to LTEP15797490.8T (LT3207134T)
Priority to BR112017007912-7A (BR112017007912B1)
Priority to EP19183798.8A (EP3636757A1)
Application filed by Illumina Cambridge Limited
Priority to BR122021026781-2A (BR122021026781B1)
Priority to CA2964799A (CA2964799A1)
Priority to KR1020177013242A (KR102472027B1)
Priority to DK15797490.8T (DK3207134T3)
Priority to KR1020227041250A (KR102643955B1)
Priority to SG11201703139VA (SG11201703139VA)
Publication of WO2016061517A2
Publication of WO2016061517A3
Priority to IL251737A (IL251737B)
Priority to US16/173,202 (US20190048332A1)
Priority to IL287853A (IL287853B2)
Priority to AU2022201205A (AU2022201205A1)
Priority to US17/719,276 (US20220282242A1)

Links

Classifications

    • C CHEMISTRY; METALLURGY
      • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
        • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
          • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
            • C12N15/09 Recombinant DNA-technology
              • C12N15/10 Processes for the isolation, preparation or purification of DNA or RNA
                • C12N15/1034 Isolating an individual clone by screening libraries
                  • C12N15/1065 Preparation or screening of tagged libraries, e.g. tagged microorganisms by STM-mutagenesis, tagged polynucleotides, gene tags
                  • C12N15/1082 Preparation or screening gene libraries by chromosomal integration of polynucleotide sequences, HR-, site-specific-recombination, transposons, viral vectors
                  • C12N15/1093 General methods of preparing gene libraries, not provided for in other subgroups
        • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
          • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
            • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
              • C12Q1/6806 Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay
              • C12Q1/6813 Hybridisation assays
                • C12Q1/6834 Enzymatic or biochemical coupling of nucleic acids to a solid phase
              • C12Q1/6869 Methods for sequencing
                • C12Q1/6874 Methods for sequencing involving nucleic acid arrays, e.g. sequencing by hybridisation
              • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
          • C12Q2525/00 Reactions involving modified oligonucleotides, nucleic acids, or nucleotides
            • C12Q2525/10 Modifications characterised by
              • C12Q2525/191 Modifications characterised by incorporating an adaptor
          • C12Q2563/00 Nucleic acid detection characterized by the use of physical, structural and functional properties
            • C12Q2563/179 Nucleic acid detection characterized by the use of physical, structural and functional properties the label being a nucleic acid
          • C12Q2565/00 Nucleic acid analysis characterised by mode or means of detection
            • C12Q2565/50 Detection characterised by immobilisation to a surface
              • C12Q2565/514 Detection characterised by immobilisation to a surface characterised by the use of the arrayed oligonucleotides as identifier tags, e.g. universal addressable array, anti-tag or tag complement array
          • C12Q2600/00 Oligonucleotides characterized by their use
            • C12Q2600/154 Methylation markers
            • C12Q2600/156 Polymorphic or mutational markers
            • C12Q2600/172 Haplotypes
      • C40 COMBINATORIAL TECHNOLOGY
        • C40B COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
          • C40B50/00 Methods of creating libraries, e.g. combinatorial synthesis
            • C40B50/06 Biochemical methods, e.g. using enzymes or whole viable microorganisms
          • C40B70/00 Tags or labels specially adapted for combinatorial chemistry or libraries, e.g. fluorescent tags or bar codes

Definitions

  • Embodiments of the present invention relate to sequencing nucleic acids.
  • embodiments of the methods and compositions provided herein relate to preparing nucleic acid templates and obtaining sequence data therefrom.
  • the detection of specific nucleic acid sequences present in a biological sample has been used, for example, as a method for identifying and classifying microorganisms, diagnosing infectious diseases, detecting and characterizing genetic abnormalities, identifying genetic changes associated with cancer, studying genetic susceptibility to disease, and measuring response to various types of treatment.
  • a common technique for detecting specific nucleic acid sequences in a biological sample is nucleic acid sequencing.
  • Nucleic acid sequencing methodology has evolved significantly from the chemical degradation methods used by Maxam and Gilbert and the strand elongation methods used by Sanger. Today several sequencing methodologies are in use which allow for the parallel processing of nucleic acids all in a single sequencing run. As such, the information generated from a single sequencing run can be enormous.
  • each transposome complex includes: transposons and transposases, in which the transposons comprise transferred strands and non-transferred strands.
  • At least one of the transposons of the transposome complex comprises an adaptor sequence capable of hybridizing to a complementary capture sequence.
  • the target nucleic acid is fragmented into a plurality of fragments, and a plurality of transferred strands is inserted at the 5' end of at least one strand of the fragments while maintaining the contiguity of the target nucleic acid.
  • the plurality of fragments of the target nucleic acid are contacted with a plurality of solid supports, each of the solid supports in the plurality comprising a plurality of immobilized oligonucleotides, each of the oligonucleotides comprising a complementary capture sequence and a first barcode sequence, and wherein the first barcode sequence from each solid support in the plurality of the solid supports differs from the first barcode sequence from other solid supports in the plurality of solid supports.
  • the barcode sequence information is transferred to the target nucleic acid fragments, thereby producing an immobilized library of double-stranded fragments wherein at least one strand is 5'-tagged with the first barcode such that at least two fragments of the same target nucleic acid receive identical barcode information.
  • a target nucleic acid sequence includes contacting the target nucleic acid with a plurality of transposome complexes, each transposome complex comprising: transposons and transposases, in which the transposons comprise transferred strands and non-transferred strands, in which at least one of the transposons of the transposome complex comprises an adaptor sequence capable of hybridizing to a complementary capture sequence.
  • the target nucleic acid is fragmented into a plurality of fragments and a plurality of transferred strands is inserted into the plurality of fragments while maintaining the contiguity of the target nucleic acid.
  • the plurality of fragments of the target nucleic acid is contacted with a plurality of solid supports.
  • Each of the solid supports in the plurality comprises a plurality of immobilized oligonucleotides, each of the oligonucleotides comprising a complementary capture sequence and a first barcode sequence, wherein the first barcode sequence from each solid support in the plurality of the solid supports differs from the first barcode sequence from other solid supports in the plurality of solid supports.
  • the barcode sequence information is transferred to the target nucleic acid fragments such that at least two fragments of the same target nucleic acid receive identical barcode information.
  • the sequence of the target nucleic acid fragments and the barcode sequences are determined.
  • the contiguity information of the target nucleic acid is determined by identifying the barcode sequences.
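As an illustration of how shared barcodes can carry contiguity information, the following minimal Python sketch groups sequenced fragments by barcode. It assumes each read has already been split into a (barcode, fragment) pair; the function name `group_fragments_by_barcode` and the example reads are hypothetical and are not taken from the source.

```python
from collections import defaultdict


def group_fragments_by_barcode(reads):
    """Group (barcode, fragment) pairs by barcode.

    Fragments that share a barcode are treated as candidates for having
    originated from the same target nucleic acid molecule.
    """
    groups = defaultdict(list)
    for barcode, fragment in reads:
        groups[barcode].append(fragment)
    return dict(groups)


# Hypothetical reads: two fragments carry barcode "ACGT", one carries "GGCA".
reads = [
    ("ACGT", "TTGACC"),
    ("GGCA", "CCATGA"),
    ("ACGT", "GATTAC"),
]
linked = group_fragments_by_barcode(reads)
```

In this sketch the two fragments tagged "ACGT" end up in the same group, which is the property the barcode transfer step is described as providing.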
  • the transposases of transposome complexes are removed after transposition and subsequent hybridization of the adaptor sequences of the transposon to the complementary capture sequence.
  • the transposases are removed by SDS treatment.
  • the transposases are removed by proteinase treatment.
  • each transposome complex includes transposons and transposases, in which the transposons comprise transferred strands and non-transferred strands, wherein at least one of the transposons of the transposome complex comprises an adaptor sequence capable of hybridizing to a complementary capture sequence.
  • the target nucleic acid is fragmented into a plurality of fragments and a plurality of transferred strands is inserted into the target nucleic acid fragments while maintaining the contiguity of the target nucleic acid.
  • the plurality of fragments of the target nucleic acid are contacted with a plurality of solid supports, each of the solid supports in the plurality comprising a plurality of immobilized oligonucleotides, each of the oligonucleotides comprising a complementary capture sequence and a first barcode sequence, and wherein the first barcode sequence from each solid support in the plurality of the solid supports differs from the first barcode sequence from other solid supports in the plurality of solid supports.
  • the barcode sequence information is transferred to the target nucleic acid fragments such that at least two fragments of the same target nucleic acid receive identical barcode information.
  • the target nucleic acid fragments comprising barcodes are subjected to bisulfite treatment, thereby generating bisulfite treated target nucleic acid fragments comprising barcodes.
  • the sequence of the bisulfite treated target nucleic acid fragments and the barcode sequences are determined.
  • the contiguity information of the target nucleic acid is determined by identifying the barcode sequences.
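The bisulfite step can be illustrated in silico. Bisulfite treatment converts unmethylated cytosine to uracil, which is read as thymine after amplification, while methylated cytosine is left unchanged. The sketch below is a simplified single-strand model under that assumption; the function names, the example sequence, and the methylated position are hypothetical, and the model ignores incomplete conversion and sequencing errors.

```python
def bisulfite_convert(sequence, methylated_positions=frozenset()):
    """Simulate bisulfite conversion on one strand.

    Unmethylated C is reported as T; C at a methylated position stays C.
    """
    converted = []
    for i, base in enumerate(sequence):
        if base == "C" and i not in methylated_positions:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


def call_methylation(reference, converted_read):
    """Return reference positions where a C is still read as C."""
    return [
        i
        for i, (ref_base, read_base) in enumerate(zip(reference, converted_read))
        if ref_base == "C" and read_base == "C"
    ]


# Hypothetical example: only the cytosine at index 1 is methylated.
reference = "ACGTCCGA"
read = bisulfite_convert(reference, methylated_positions={1})
methylated = call_methylation(reference, read)
```

Comparing the converted read against the untreated reference recovers the protected cytosines, which is the basis for reading out methylation status from bisulfite treated fragments.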
  • the methods include providing a plurality of solid supports having transposome complexes immobilized thereon, in which the transposome complexes are multimeric and the transposome monomeric units of the same transposome complex are linked to each other, and wherein said transposome monomeric units comprise a transposase bound to a first polynucleotide, said first polynucleotide comprising (i) a 3' portion comprising a transposon end sequence, and (ii) a first adaptor comprising a first barcode.
  • a target DNA is applied to the plurality of solid supports under conditions whereby the target DNA is fragmented by the transposome complexes, and the 3' transposon end sequence of the first polynucleotide is transferred to a 5' end of at least one strand of the fragments, thereby producing an immobilized library of double-stranded fragments wherein at least one strand is 5'-tagged with the first barcode.
  • a sequencing library for determining the methylation status of a target nucleic acid.
  • the methods include fragmenting the target nucleic acid into two or more fragments.
  • a first common adaptor sequence is incorporated into the 5'-end of the fragments of the target nucleic acid, wherein the adaptor sequence comprises a first primer binding sequence and an affinity moiety, wherein the affinity moiety is one member of a binding pair.
  • the target nucleic acid fragments are denatured.
  • the target nucleic acid fragments are immobilized on a solid support, in which the solid support comprises the other member of the binding pair and the immobilization of the target nucleic acid is by binding of the binding pair.
  • the immobilized target nucleic acid fragments are subjected to bisulfite treatment.
  • a second common adaptor sequence is incorporated to the bisulfite treated immobilized target nucleic acid fragments, wherein the second common adaptor comprises a second primer binding site.
  • the bisulfite treated target nucleic acid fragments immobilized on the solid support are amplified, thereby producing a sequencing library for determining the methylation status of a target nucleic acid.
  • the methods include providing a plurality of solid supports comprising transposome complexes immobilized thereon.
  • the transposome complexes comprise transposons and transposases, in which the transposons comprise transferred strands and non-transferred strands.
  • the transferred strand comprises (i) a first portion at the 3'-end comprising the transposase recognition sequence, and (ii) a second portion located 5' to the first portion comprising a first adaptor sequence and a first member of a binding pair.
  • the first member of the binding pair binds to a second member of the binding pair on the solid support, thereby immobilizing the transposon to the solid support.
  • the first adaptor also comprises a first primer binding sequence.
  • the non-transferred strand comprises (i) a first portion at the 5'-end comprising the transposase recognition sequence and (ii) a second portion located 3' to the first portion comprising a second adaptor sequence, in which the terminal nucleotide at the 3'-end is blocked.
  • the second adaptor also comprises a second primer binding sequence.
  • the target nucleic acid is contacted with the plurality of solid supports comprising immobilized transposome complexes.
  • the target nucleic acid is fragmented into a plurality of fragments and a plurality of transferred strands is inserted at the 5' end of at least one strand of the fragments, thereby immobilizing the target nucleic acid fragments to the solid support.
  • the 3'-end of the fragmented target nucleic acid is extended with a DNA polymerase.
  • the non-transferred strand is ligated to the 3'-end of the fragmented target nucleic acid.
  • the immobilized target nucleic acid fragments are subjected to bisulfite treatment.
  • the 3'-end of the immobilized target nucleic acid fragments damaged during the bisulfite treatment is extended by using a DNA polymerase such that the 3'-end of the immobilized target nucleic acid fragments comprises a homopolymeric tail.
  • a second adaptor sequence is introduced to the 3'-end of the immobilized target nucleic acid fragments damaged during the bisulfite treatment.
  • the bisulfite treated target nucleic acid fragments immobilized on solid support are amplified using a first and a second primer, thereby producing a sequencing library for determining the methylation status of a target nucleic acid.
  • the methods include contacting the target nucleic acid with transposome complexes, in which the transposome complexes comprise transposons and transposases.
  • the transposons comprise transferred strands and non-transferred strands.
  • the transferred strand includes (i) a first portion at the 3'-end comprising the transposase recognition sequence, and (ii) a second portion located 5' to the first portion comprising a first adaptor sequence and a first member of a binding pair, wherein the first member of the binding pair binds to a second member of the binding pair.
  • the non-transferred strand includes (i) a first portion at the 5'-end comprising the transposase recognition sequence and (ii) a second portion located 3' to the first portion comprising a second adaptor sequence, in which the terminal nucleotide at the 3'-end is blocked, and wherein the second adaptor comprises a second primer binding sequence.
  • the target nucleic acid is fragmented into a plurality of fragments and a plurality of transferred strands is inserted at the 5' end of at least one strand of the fragments, thereby immobilizing the target nucleic acid fragments to the solid support.
  • the target nucleic acid fragments comprising the transposon end are contacted with the plurality of solid supports comprising the second member of the binding pair, wherein binding of the first member of the binding pair to the second member of the binding pair immobilizes the target nucleic acid to the solid support.
  • the 3'-end of the fragmented target nucleic acid is extended with a DNA polymerase.
  • the non-transferred strand is ligated to the 3'-end of the fragmented target nucleic acid.
  • the immobilized target nucleic acid fragments are subjected to bisulfite treatment.
  • the 3'-end of the immobilized target nucleic acid fragments damaged during the bisulfite treatment is extended by using a DNA polymerase such that the 3'-end of the immobilized target nucleic acid fragments comprises a homopolymeric tail.
  • a second adaptor sequence is introduced to the 3'-end of the immobilized target nucleic acid fragments damaged during the bisulfite treatment.
  • the bisulfite treated target nucleic acid fragments immobilized on solid support are amplified using a first and a second primer, thereby producing a sequencing library for determining the methylation status of a target nucleic acid.
  • the terminal nucleotide at the 3'-end of the second adaptor is blocked by a member selected from the group consisting of a dideoxy nucleotide, a phosphate group, a thiophosphate group, and an azido group.
  • affinity moieties can be members of a binding pair.
  • the modified nucleic acids may comprise a first member of a binding pair and the capture probe may comprise a second member of the binding pair.
  • capture probes may be immobilized to a solid surface and the modified nucleic acid may comprise a first member of a binding pair and the capture probe may comprise a second member of the binding pair. In such cases, binding the first and second members of the binding pair immobilizes the modified nucleic acid to the solid surface.
  • binding pair examples include, but are not limited to biotin-avidin, biotin-streptavidin, biotin-neutravidin, ligand-receptor, hormone-receptor, lectin-glycoprotein, oligonucleotide-complementary oligonucleotide, and antigen-antibody.
  • the first common adaptor sequence is incorporated to the 5'-end fragments of the target nucleic acid by one-sided transposition. In some embodiments, the first common adaptor sequence is incorporated to the 5'-end fragments of the target nucleic acid by ligation.
  • incorporating the second common adaptor sequence into the bisulfite treated immobilized target nucleic acid fragments includes (i) extending the 3'-end of the immobilized target nucleic acid fragments using terminal transferase to comprise a homopolymeric tail; (ii) hybridizing an oligonucleotide comprising a single stranded homopolymeric portion and a double stranded portion comprising the second common adaptor sequence, wherein the single stranded homopolymeric portion is complementary to the homopolymeric tail; and (iii) ligating the second common adaptor sequence to the immobilized target nucleic acid fragments, thereby incorporating the second common adaptor sequence into the bisulfite treated immobilized target nucleic acid fragments.
  • the target nucleic acid is from a single cell. In some embodiments, the target nucleic acid is from a single organelle. In some embodiments, the target nucleic acid is genomic DNA. In some embodiments, the target nucleic acid is cross-linked to other nucleic acids. In some embodiments, the target nucleic acid is from a formalin fixed paraffin embedded (FFPE) sample. In some embodiments, the target nucleic acid is cross-linked with proteins. In some embodiments, the target nucleic acid is cross-linked with DNA. In some embodiments, the target nucleic acid is histone protected DNA. In some embodiments, histones are removed from the target nucleic acid. In some embodiments, the target nucleic acid is cell free tumor DNA.
  • FFPE formalin fixed paraffin embedded
  • the cell free tumor DNA is obtained from placental fluid. In some embodiments, the cell free tumor DNA is obtained from plasma. In some embodiments, the plasma is collected from whole blood using a membrane separator comprising a collection zone for the plasma. In some embodiments, the collection zone for the plasma comprises transposome complexes immobilized on a solid support. In some embodiments, the target nucleic acid is cDNA. In some embodiments, the solid support is a bead. In some embodiments, the plurality of solid supports is a plurality of beads, wherein the beads are of different sizes.
  • a single barcode sequence is present in the plurality of immobilized oligonucleotides on each individual solid support. In some embodiments, different barcode sequences are present in the plurality of immobilized oligonucleotides on each individual solid support.
  • the transferring of the barcode sequence information to the target nucleic acid fragments is by ligation. In some embodiments, transferring of the barcode sequence information to the target nucleic acid fragments is by polymerase extension. In some embodiments, the transferring of the barcode sequence information to the target nucleic acid fragments is by both ligation and polymerase extension.
  • the polymerase extension is by extending the 3'-end of the non-ligated transposon strand with a DNA polymerase using the ligated immobilized oligonucleotide as a template.
  • at least a portion of the adaptor sequences further comprise a second barcode sequence.
  • the transposome complexes are multimeric, and the adaptor sequences of the transposons of each monomeric unit are different from those of the other monomeric unit in the same transposome complex.
  • the adaptor sequence further comprises a first primer binding sequence.
  • the first primer binding site has no sequence homology to the capture sequence or to the complement of the capture sequence.
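One simple way to screen a candidate primer binding site against a capture sequence is a shared k-mer check, sketched below. This is a deliberately crude proxy for "no sequence homology", not a method described in the source: the window length `k=6`, the use of the reverse complement as the comparison strand, the function names, and the example sequences are all assumptions made for illustration.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq))


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shares_kmer(primer_site, capture, k=6):
    """True if primer_site shares any k-mer with capture or its reverse complement."""
    primer_kmers = kmers(primer_site, k)
    return bool(
        primer_kmers & kmers(capture, k)
        or primer_kmers & kmers(reverse_complement(capture), k)
    )


# Hypothetical sequences for illustration only.
capture = "ACGTACGTAA"
unrelated_site = "TTTTGGGGCC"
overlapping_site = "GGTTACGTCC"  # contains TTACGT, found in the reverse complement

unrelated_hit = shares_kmer(unrelated_site, capture)
overlapping_hit = shares_kmer(overlapping_site, capture)
```

A real design workflow would more likely use alignment or melting-temperature predictions; the k-mer test only flags exact shared stretches.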
  • the immobilized oligonucleotides on the solid support further comprise a second primer binding sequence.
  • the transposome complexes are multimeric, and the transposome monomeric units are linked to each other in the same transposome complex.
  • the transposase of a transposome monomeric unit is linked to the transposase of another transposome monomeric unit of the same transposome complex.
  • the transposons of a transposome monomeric unit are linked to transposons of another transposome monomeric unit of the same transposome complex.
  • the transposase of a transposome monomeric unit is linked to the transposase of another transposome monomeric unit of the same transposome complex by covalent bond.
  • the transposase of one transposome monomeric unit is linked to the transposase of another transposome monomeric unit of the same transposome complex by a disulfide bond.
  • the transposons of a transposome monomeric unit are linked to transposons of another transposome monomeric unit of the same transposome complex by covalent bond.
  • the contiguity information of a target nucleic acid sequence is indicative of haplotype information.
  • the contiguity information of a target nucleic acid sequence is indicative of genomic variants.
  • the genomic variants are selected from the group consisting of deletions, translocations, interchromosomal gene fusions, duplications, and paralogs.
  • the oligonucleotides immobilized on the solid support comprise a partially double stranded region and a partially single stranded region.
  • the partially single stranded region of the oligonucleotide comprises the second barcode sequence and the second primer binding sequence.
  • the target nucleic acid fragments comprising the barcodes are amplified prior to determining the sequence of the target nucleic acid fragments.
  • subsequent amplification are carried out in a single reaction compartment prior to determining the sequence of the target nucleic acid fragments.
  • a third barcode sequence is introduced to the target nucleic acid fragments during the amplification.
  • the methods may further include combining the target nucleic acid fragments comprising the barcodes from a plurality of reaction compartments of a first set into a pool of target nucleic acid fragments comprising the barcodes; redistributing the pool of target nucleic acid fragments comprising the barcodes to a plurality of reaction compartments of a second set; and introducing a third barcode into the target nucleic acid fragments by amplifying the target nucleic acid fragments in the second set of reaction compartments prior to sequencing.
  • the methods may further include pre-fragmenting the target nucleic acid prior to contacting the target nucleic acid with transposome complexes.
  • the pre-fragmenting the target nucleic acid is by a method selected from the group consisting of sonication and restriction digestion.
  • Figure 1 illustrates a flow diagram of an example of a method of binding transposomes to a bead surface.
  • Figure 2 shows pictorially the steps of the method of Figure 1.
  • Figure 3 illustrates a schematic diagram of an example of a tagmentation process on a bead surface.
  • Figure 4 shows a data table of an example of the DNA yield in terms of cluster number from the bead-based tagmentation process of Figure 3.
  • Figure 5 shows a data table of another example of the reproducibility of the bead-based tagmentation process of Figure 3 in terms of uniform size.
  • Figures 6A and 6B show a plot of the insert size of pool 1 and a plot of the insert size of pool 2, respectively, of the indexed samples of Figure 5.
  • Figure 7 shows a bar graph of the reproducibility of total number of reads and percent reads aligned for the experiment described in Figure 5.
  • Figures 8A, 8B, and 8C show a plot of insert size in a control library, a plot of insert size in a bead-based tagmented library, and a summary data table, respectively, in the exome enrichment assay.
  • Figures 9A, 9B, and 9C show a bar graph of the fraction of dups PF, a bar graph of the fraction of selected bases, and bar graph of PCT usable bases on target, respectively, in the exome enrichment assay.
  • Figure 10 illustrates a flow diagram of an example of a method of forming transposome complexes on a bead surface.
  • Figures 11, 12, and 13 show pictorially the steps of the method of Figure 10.
  • Figure 14 shows a schematic diagram of a tagmentation process using the transposome coated bead shown in Figure 13.
  • Figure 15 shows an exemplary scheme of forming transposomes on a solid support.
  • Figure 16 shows an exemplary scheme of making contiguously-linked libraries with unique indexes.
  • Figure 17 shows an exemplary scheme of making contiguously-linked libraries with unique indexes.
  • Figures 18 and 19 depict the capture of a single CPT-DNA on a single clonal indexed bead where the CPT-DNA wraps around the bead.
  • Figure 20 shows an exemplary scheme of linking a Y-adaptor immobilized on the solid surface to the tagmented DNA by ligation and gap filling.
  • Figure 21 shows an exemplary scheme of making such Y-adapters during the ligation of CPT-DNA to the immobilized oligonucleotides on the solid support.
  • Figure 22 depicts an agarose gel electrophoresis showing the removal of free transposome from contiguously-linked libraries by size exclusion chromatography.
  • Figure 23 shows an exemplary scheme of generating a shotgun sequence library of a specific DNA fragment.
  • Figure 24 shows an exemplary scheme of assembling the sequence information from a clonal indexed sequencing library.
  • Figure 25 shows the results of optimization of capture probe density on beads.
  • Figure 26 shows the results of testing the feasibility of preparing indexed sequencing libraries of CPT-DNA on beads by intra-molecular hybridization.
  • Figure 27 shows the results of testing the feasibility of clonal indexing.
  • Figure 28 depicts a graph showing the frequencies of sequencing reads for particular distances within (intra) and also between (inter) neighboring aligned islands of reads for template nucleic acid following tagmentation.
  • Figures 29A and 29B show exemplary approaches to derive contiguity information on solid support.
  • Figures 30 and 31 show the schematics of indexed clonal bead transposition in a single reaction vessel (one pot) and the results of the transposition.
  • Figure 32 shows the schematics of creating clonal transposomes on beads using 5'- or 3'- biotinylated oligonucleotides.
  • Figure 33 shows the library sizes for transposomes on beads.
  • Figure 34 shows the effect of transposome surface density on insertion size.
  • Figure 35 shows the effect of input DNA on the size distribution.
  • Figure 36 shows the island size and distribution using bead based and solution based tagmentation reactions.
  • Figure 37 shows clonal indexing of several individual DNA molecules, each receiving unique indexes.
  • Figure 38 shows a diagram of a device for separating plasma from whole blood.
  • Figures 39 and 40 show a diagram of a device for separating plasma and subsequent use of the separated plasma.
  • Figure 41 shows an exemplary scheme of targeted phasing by enriching specific regions of a genome.
  • Figure 42 shows an exemplary scheme of exome phasing using the SNPs between the exons.
  • Figure 43 shows an exemplary scheme of simultaneous phasing and methylation detection.
  • Figure 44 shows an alternative exemplary scheme of simultaneous phasing and methylation detection.
  • Fig. 45 shows an exemplary scheme to generate various sized libraries using various sized clonally indexed beads in a single assay.
  • Fig. 46 shows an exemplary scheme of determining genetic variants with different length scale libraries.
  • Figs. 47A and 47B show the results of detection of a 60 kb heterozygous deletion in chromosome 1.
  • Fig. 48 shows results of detection of gene fusion using the methods of the present application.
  • Fig. 49 shows results of detection of genetic deletions using the methods of the present application.
  • Fig. 50 shows ME sequences before and after bisulfite conversion.
  • Fig. 51 shows the results of bisulfite conversion efficiency optimization.
  • Fig. 52 shows the results after bisulfite conversion in IVC plot (intensity versus cycles per individual base).
  • Fig. 53 shows an image of agarose gel electrophoresis of indexed-linked libraries after PCR after BSC.
  • Fig. 54 shows the bioanalyzer trace of whole-genome indexed linked CPT-seq libraries before enrichment without size-selection.
  • Fig. 55 shows the agarose gel analysis of libraries after enrichment.
  • Fig. 56 shows the results of application of targeted haplotyping to the HLA region in the chromosome.
  • Fig. 57 shows some possible mechanisms of ME swapping.
  • Fig. 58 shows some possible mechanisms of ME swapping.
  • Fig. 59 shows a portion of Tn5 transposase with exemplary amino acid residues Asp468, Tyr407, Asp461, Lys459, Ser458, Gly462, Ala466, Met470 that can be substituted with Cys.
  • Fig. 60 shows a portion of Tn5 transposase with amino acid substitution of S458C, K459C and A466C, such that cysteine residues can form disulfide bond between two monomeric units.
  • Fig. 61 shows an exemplary scheme of making and using a dimer transposase (dTnp)- nanoparticle (NP) bioconjugate (dTnp-NP) using amine coated nanoparticle.
  • Fig. 62 shows an exemplary scheme of conjugation of transposome dimer to an amine coated solid support.
  • Fig. 63 shows a Mu transposome complex where transposon ends are linked.
  • Fig. 64 shows a diagram of indexed linked reads for assembly/phasing of pseudogenes and the advantage of identifying variants in pseudogenes using shorter fragments.
  • Fig. 65 shows a plot of index exchange from 4 separate experiments and shown as % of indexes swapped.
  • Fig. 66 shows Agilent BioAnalyzer analysis of fragment sizes of Ts-Tn5 titration.
  • Fig. 67 shows an exemplary scheme to improve DNA yield of the Epi-CPTSeq protocol using enzymatic methods for recovery of broken library elements after bisulfite treatment.
  • Fig. 68 A-C shows several exemplary schemes to improve DNA yield of the Epi-CPTSeq protocol using enzymatic methods for recovery of broken library elements after bisulfite treatment.
  • Fig. 69 shows an exemplary scheme for template rescue using random primer extension.
  • Fig. 70 shows the fragmentation of a DNA library during sodium bisulfite conversion.
  • Left panel illustrates fragmentation during bisulfite conversion of a portion of DNA tagmented on magnetic beads.
  • Right panel shows the BioAnalyzer traces of CPTSeq and Epi-CPTSeq (Me-CPTSeq) libraries.
  • Fig. 71 shows an exemplary scheme and the results of TdT mediated ssDNA ligation reaction.
  • Fig. 72 shows a scheme and the results of TdT mediated recovery of a sodium bisulfite converted, bead bound library.
  • Left panel illustrates the rescue workflow of damaged bisulfite converted DNA library using TdT mediated ligation reaction. Results of DNA library rescue experiment are shown in the right panel.
  • Fig. 73 shows the results of Methyl-CPTSeq assay.
  • Fig. 74 shows an exemplary scheme of bead based bisulfite conversion of DNA.
  • Fig. 75 A-B shows the results of bisulfite conversion efficiency optimization.
  • Embodiments of the present invention relate to sequencing nucleic acids.
  • embodiments of the methods and compositions provided herein relate to preparing nucleic acid templates and obtaining sequence data therefrom.
  • the present invention relates to methods of tagmenting (fragmenting and tagging) a target nucleic acid on a solid support for the construction of a tagmented target nucleic acid library.
  • the solid support is a bead.
  • the target nucleic acid is DNA.
  • the present invention relates to solid-support, transposase-based methods and compositions that can derive contiguity information of a target nucleic acid.
  • the compositions and the methods can derive assembly/phasing information.
  • the present invention relates to methods and compositions to derive contiguity information by means of capturing contiguously-linked, transposed, target nucleic acid onto a solid support.
  • compositions and methods disclosed herein relate to analysis of genomic variants.
  • genomic variants include, but are not limited to, deletions, interchromosomal translocations, duplications, paralogs, and interchromosomal gene fusions.
  • the compositions and methods disclosed herein relate to determining phasing information of the genomic variants.
  • the compositions and methods disclosed herein relate to phasing specific regions of the target nucleic acid.
  • the target nucleic acid is DNA.
  • the target nucleic acid is genomic DNA.
  • the target nucleic acid is RNA.
  • the RNA is mRNA.
  • the target nucleic acid is complementary DNA (cDNA).
  • target nucleic acid is from a single cell.
  • target nucleic acid is from circulating tumor cells.
  • target nucleic acid is cell free DNA.
  • target nucleic acid is cell free tumor DNA.
  • target nucleic acid is from formalin fixed paraffin embedded tissue samples.
  • target nucleic acid is cross-linked target nucleic acid. In some embodiments, target nucleic acid is cross-linked to proteins. In some embodiments, target nucleic acid is cross-linked to nucleic acid. In some embodiments, target nucleic acid is histone-protected DNA. In some embodiments, histone-protected DNA is precipitated from a cell lysate using antibodies to histones and the histones are removed.
  • indexed libraries are created from the target nucleic acid using the clonally indexed beads.
  • the tagmented target nucleic acid, while the transposase is still bound to the target DNA, can be captured using the clonally indexed beads.
  • specific capture probes are used to capture the specific region of interest in the target nucleic acid. The captured regions of the target nucleic acid can be washed at various stringencies and optionally amplified, followed by sequencing.
  • the capture probe may be biotinylated. The complex of the biotinylated capture probes hybridized to the specific regions of the indexed target nucleic acids can be separated by using streptavidin beads. An exemplary scheme of targeted phasing is shown in Fig. 41.
  • compositions and methods disclosed herein can be used for phasing exomes.
  • exons and promoters can be enriched. Markers, for example, heterozygous SNPs between exonic regions, can aid in phasing the exons, especially when the distance between exons is large. An example of exome phasing is shown in Fig. 42.
  • indexed linked reads cannot span (cover) heterozygous SNPs of neighboring exons simultaneously. As such, it is challenging to phase the two or more exons.
  • the compositions and methods disclosed herein also enrich heterozygous SNPs between exons, for example, phasing exon 1 to SNP 1 and SNP 2 to exon 2. As such, through the use of SNP 1, exon 1 and exon 2 can be phased as shown in Fig. 42.
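  • By way of a non-limiting illustration, the linking of exons through intervening SNPs can be sketched as a union-find over features that are observed under the same bead index. The feature labels, the `observations` input, and the simplification that co-occurrence under one index implies membership in one phase block are assumptions made for this sketch only; real phasing would also compare the alleles observed at each SNP.

```python
from collections import defaultdict

def phase_blocks(observations):
    """Group features (exons, SNPs) into phase blocks.

    observations: iterable of (index, feature) pairs, where `index` is the
    bead index carried by a read and `feature` is a hypothetical label such
    as "exon1" or "SNP1". Features seen under the same index are linked.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    # Collect the features observed under each bead index.
    by_index = defaultdict(list)
    for index, feature in observations:
        by_index[index].append(feature)

    # Link every feature under an index to the first feature under it.
    for features in by_index.values():
        find(features[0])
        for other in features[1:]:
            union(features[0], other)

    # Gather features by their representative.
    blocks = defaultdict(set)
    for feature in parent:
        blocks[find(feature)].add(feature)
    return sorted(sorted(block) for block in blocks.values())

# Toy input: exon1 and exon2 never share an index, but both share one with SNP1.
obs = [
    ("idxA", "exon1"), ("idxA", "SNP1"),
    ("idxB", "SNP1"), ("idxB", "exon2"),
    ("idxC", "exon3"),
]
blocks = phase_blocks(obs)
print(blocks)
```

  • In this toy input, exon1 and exon2 fall into one block only because each is linked to SNP1, while exon3 remains unphased relative to them.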
  • compositions and methods disclosed herein can be used for phasing and simultaneous methylation detection.
  • Methylation detection through bisulfite conversion (BSC) is challenging as the BSC reaction is harsh on DNA, fragmenting the DNA and therefore removing contiguity/phasing information.
  • methods disclosed in the present application have an additional advantage because no additional purification steps are required in contrast to those required in traditional BSC approaches, thereby improving the yield.
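  • As a non-limiting illustration of how methylation is read out after bisulfite conversion, the following Python sketch applies an idealized conversion in which every unmethylated cytosine is read as thymine and every methylated cytosine is retained. The toy sequence, the function names, and the assumption of complete conversion are hypothetical and serve only to show the principle.

```python
def bisulfite_convert(seq, methylated_positions=()):
    """Return the sequence as read after idealized bisulfite conversion.

    Unmethylated C is read as T after conversion and amplification;
    C at any 0-based position in `methylated_positions` is kept as C.
    """
    protected = set(methylated_positions)
    out = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in protected:
            out.append("T")
        else:
            out.append(base)
    return "".join(out)

def methylation_calls(reference, converted_read):
    """Positions where a reference C is still C in the converted read."""
    return [i for i, (r, c) in enumerate(zip(reference, converted_read))
            if r == "C" and c == "C"]

reference = "ACGTCCGATCG"          # toy sequence, not from the source
read = bisulfite_convert(reference, methylated_positions=[1, 9])
calls = methylation_calls(reference, read)
print(read, calls)
```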
  • compositions and methods disclosed herein can be used to prepare different size libraries in a single assay.
  • different sizes of clonally indexed beads can be used to prepare different size libraries.
  • Figure 1 illustrates a flow diagram of an example of a method 100 of binding transposomes to a bead surface.
  • Transposomes may be bound to a bead surface using any chemistry that may be added on the transposon oligonucleotide, transposase, and solid-phase.
  • transposomes are bound to a bead surface via a biotin-streptavidin binding complex.
  • Method 100 includes, but is not limited to, the following steps.
  • transposons may comprise sequencing primer binding sites.
  • Exemplary sequences of sequencing primer binding sites include, but are not limited to, AATGATACGGCGACCACCGAGATCTACAC (P5 sequence) and CAAGCAGAAGACGGCATACGAGAT (P7 sequence).
  • the transposons may be biotinylated.
  • the transposons may also include one or more index sequences (unique identifiers).
  • index sequences include, but are not limited to TAGATCGC, CTCTCTAT, TATCCTCT, AGAGTAGA, GTAAGGAG, ACTGCATA, AAGGAGTA, CTAAGCCT.
  • the transposons comprise only the mosaic end (ME) sequences or the ME sequences plus additional sequences that are not P5 and P7 sequences.
  • P5 and P7 sequences are added in a subsequent PCR amplification step.
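  • By way of a non-limiting illustration, index sequences such as those listed above can be used to assign sequencing reads to their source. The following Python sketch matches an observed index read against the exemplary index sequences while tolerating a single mismatch. The function names `hamming` and `assign_index` and the `max_mismatches` parameter are hypothetical and are not part of any method described herein.

```python
# Exemplary index sequences as listed in the text.
INDEXES = ["TAGATCGC", "CTCTCTAT", "TATCCTCT", "AGAGTAGA",
           "GTAAGGAG", "ACTGCATA", "AAGGAGTA", "CTAAGCCT"]

def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))

def assign_index(observed, indexes=INDEXES, max_mismatches=1):
    """Return the single index within `max_mismatches` of `observed`.

    Returns None when no index, or more than one index, is close enough,
    so that ambiguous reads are discarded rather than misassigned.
    """
    hits = [idx for idx in indexes if hamming(observed, idx) <= max_mismatches]
    return hits[0] if len(hits) == 1 else None

print(assign_index("TAGATCGC"))   # exact match
print(assign_index("TAGATCGA"))   # one mismatch in the last base
print(assign_index("GGGGGGGG"))   # no index is close enough
```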
  • the transposomes are assembled.
  • the assembled transposomes are a mixture of P5 and P7 transposomes.
  • a mixture of P5 and P7 transposomes is described in more detail with reference to Figures 11 and 12.
  • P5/P7 transposome mixtures are bound to a bead surface.
  • the beads are streptavidin coated beads and the transposomes are bound to the bead surface via a biotin-streptavidin binding complex.
  • Beads can be of various sizes. In one example, the beads may be 2.8 µm beads. In another example, the beads may be 1 µm beads.
  • a suspension (e.g., 1 µL) of 1 µm beads provides a large surface area per volume for transposome binding. Because of the available surface area for transposome binding, the number of tagmentation products per reaction is increased.
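  • As a non-limiting, worked illustration of why smaller beads offer more binding surface, the following Python sketch compares the surface area per unit bead volume for ideal spheres, which equals 6/d for a bead of diameter d. The sketch assumes that the two exemplary bead sizes are diameters in micrometres, that the beads are smooth spheres, and that the suspensions being compared contain the same total bead volume; none of these assumptions is stated in the text.

```python
import math

def surface_area_per_bead_volume(diameter_um):
    """Surface area (um^2) per unit bead volume (um^3) of a sphere, i.e. 6/d."""
    area = math.pi * diameter_um ** 2
    volume = math.pi * diameter_um ** 3 / 6
    return area / volume

small = surface_area_per_bead_volume(1.0)   # 1 um beads
large = surface_area_per_bead_volume(2.8)   # 2.8 um beads
ratio = small / large
print(small, large, ratio)
```

  • Under these assumptions, the smaller beads present about 2.8 times the surface area of the larger beads for the same total bead volume.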
  • Figure 2 shows pictorially the steps 110, 115, and 120 of method 100 of Figure 1.
  • the transposons are shown as duplexes.
  • another structure such as a hairpin, i.e., a single oligonucleotide with regions of self-complementarity capable of forming a duplex, may be used.
  • a plurality of biotinylated P5 transposons 210a and a plurality of P7 transposons 210b are generated.
  • P5 transposons 210a and P7 transposons 210b are biotinylated.
  • transposomes 220 are bound to a bead 225.
  • Bead 225 is a streptavidin coated bead.
  • Transposomes 220 are bound to bead 225 via a biotin-streptavidin binding complex.
  • a mixture of transposomes may be formed on a solid support such as bead surface as shown in Figures 10, 11, 12, and 13.
  • P5 and P7 oligonucleotides are first bound to a bead surface prior to assembly of transposome complexes.
  • Figure 3 illustrates a schematic diagram of an example of a tagmentation process 300 on a bead surface. Shown in process 300 is bead 225 of Figure 2 with transposomes 220 bound thereon. A solution of DNA 310 is added to a suspension of beads 225. As DNA 310 contacts transposomes 220, the DNA is tagmented (fragmented and tagged) and is bound to beads 225 via transposomes 220. Bound and tagmented DNA 310 may be PCR amplified to generate a pool of amplicons 315 in solution (bead-free). Amplicons 315 may be transferred to the surface of a flow cell 320.
  • a cluster generation protocol (e.g., a bridge amplification protocol or any other amplification protocol that may be used for cluster generation) may be used to generate a plurality of clusters 325 on the surface of flow cell 320.
  • Clusters 325 are clonal amplification products of tagmented DNA 310.
  • Clusters 325 are now ready for the next step in a sequencing protocol.
  • the transposomes may be bound to any solid surface, such as the walls of a microfuge tube.
  • oligonucleotides are first bound to a bead surface prior to transposome assembly.
  • Figure 10 illustrates a flow diagram of an example of a method 1000 of forming transposome complexes on a bead surface. Method 1000 includes, but is not limited to, the following steps.
  • P5 and P7 oligonucleotides are bound to a bead surface.
  • the P5 and P7 oligonucleotides are biotinylated and the bead is a streptavidin coated bead.
  • This step is also shown pictorially in schematic diagram 1100 of Figure 11. Referring now to Figure 11, a P5 oligonucleotide 1110 and a P7 oligonucleotide 1115 are bound to the surface of a bead 1120.
  • a single P5 oligonucleotide 1110 and a single P7 oligonucleotide 1115 are bound to the surface of bead 1120, but any number of P5 oligonucleotides 1110 and/or P7 oligonucleotides 1115 may be bound to the surface of a plurality of beads 1120.
  • P5 oligonucleotide 1110 comprises a P5 primer sequence, an index sequence (unique identifier), a read 1 sequencing primer sequence and a mosaic end (ME) sequence.
  • P7 oligonucleotide 1115 comprises a P7 primer sequence, an index sequence (unique identifier), a read 2 sequencing primer sequence and an ME sequence.
  • an index sequence is present in only P5 oligonucleotide 1110. In yet another example (not shown), an index sequence is present in only the P7 oligonucleotide 1115. In yet another example (not shown), an index sequence is absent in both P5 oligonucleotide 1110 and P7 oligonucleotide 1115.
  • complementary mosaic end (ME') oligonucleotides are hybridized to the bead-bound P5 and P7 oligonucleotides.
  • This step is also shown pictorially in schematic diagram 1200 of Figure 12.
  • complementary ME sequences (ME') 1125 are hybridized to P5 oligonucleotide 1110 and P7 oligonucleotide 1115.
  • Complementary ME sequences (ME') 1125 e.g., complementary ME sequences (ME') 1125a and complementary ME sequences (ME') 1125b
  • Complementary ME sequence (ME') 1125 is typically about 15 bases in length and phosphorylated at its 5' end.
  • transposase enzyme is added to the bead-bound oligonucleotides to form a mixture of bead-bound transposome complexes.
  • This step is also shown pictorially in schematic diagram 1300 of Figure 13.
  • transposase enzyme is added to form a plurality of transposome complexes 1310.
  • transposome complex 1310 is a duplex structure that comprises transposase enzyme, two surface-bound oligonucleotide sequences, and their hybridized complementary ME sequences (ME') 1125.
  • transposome complex 1310a comprises P5 oligonucleotide 1110 hybridized to complementary ME sequence (ME') 1125 and P7 oligonucleotide 1115 hybridized to complementary ME sequence (ME') 1125 (i.e., P5:P7);
  • transposome complex 1310b comprises two P5 oligonucleotides 1110 hybridized to complementary ME sequences (ME') 1125 (i.e., P5:P5);
  • transposome complex 1310c comprises two P7 oligonucleotides 1115 hybridized to complementary ME sequences (ME') 1125 (i.e., P7:P7).
  • the ratio of P5:P5, P7:P7, and P5:P7 transposome complexes may be, for example, 25:25:50, respectively.
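  • By way of a non-limiting illustration, the exemplary 25:25:50 ratio is what would be expected if two surface-bound oligonucleotides were paired at random from an equimolar mixture of P5 and P7 oligonucleotides. The following Python sketch computes the expected fractions for an arbitrary P5 fraction. The random, independent pairing model and the function name `complex_fractions` are assumptions of this sketch and are not stated in the text.

```python
def complex_fractions(p5_fraction):
    """Expected fractions of complex types under random, independent pairing.

    p5_fraction: fraction of bead-bound oligonucleotides that are P5
    (the remainder are assumed to be P7).
    """
    p = p5_fraction
    q = 1.0 - p
    return {"P5:P5": p * p, "P7:P7": q * q, "P5:P7": 2 * p * q}

equimolar = complex_fractions(0.5)
print(equimolar)
```

  • Under this model, shifting the P5 fraction away from one half lowers the fraction of P5:P7 complexes.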
  • Figure 14 shows an exemplary schematic diagram 1400 of a tagmentation process using the transposome coated bead 1120 of Figure 13.
  • when bead 1120 with transposome complexes 1310 thereon is added to a solution of DNA 1410 in a tagmentation buffer, tagmentation occurs and the DNA is linked to the surface of bead 1120 via transposomes 1310.
  • Successive tagmentation of DNA 1410 results in a plurality of bridged molecules 1415 between transposomes 1310.
  • the length of bridged molecules 1415 may be dependent on the density of transposome complexes 1310 on the surface of bead 1120.
  • the density of transposome complexes 1310 on the surface of bead 1120 may be tuned by varying the amount of P5 and P7 oligonucleotides bound to the surface of bead 1120 in step 1010 of method 1000 of Figure 10. In another example, the density of transposome complexes 1310 on the surface of bead 1120 may be tuned by varying the amount of complementary ME sequence (ME') hybridized to P5 and P7 oligonucleotides in step 1015 of method 1000 of Figure 10. In yet another example, the density of transposome complexes 1310 on the surface of bead 1120 may be tuned by varying the amount of transposase enzyme added in step 1020 of method 1000 of Figure 10.
  • the length of bridged molecules 1415 is independent of the quantity of beads 1120 with transposome complexes 1310 bound thereon used in a tagmentation reaction. Similarly, adding more or less DNA 1410 in a tagmentation reaction does not alter the size of the final tagmented product, but may affect the yield of the reaction.
  • bead 1120 is a paramagnetic bead.
  • purification of the tagmentation reaction is readily achieved by immobilizing beads 1120 with a magnet and washing. Therefore, tagmentation and subsequent PCR amplification may be performed in a single reaction compartment ("one-pot" reaction).
  • the present invention relates to transposase-based methods and compositions that can derive contiguity information of a target nucleic acid on a solid support.
  • the compositions and the methods can derive assembly/phasing information.
  • the solid support is a bead.
  • the target nucleic acid is DNA.
  • the target nucleic acid is genomic DNA.
  • the target nucleic acid is RNA.
  • the RNA is mRNA.
  • the target nucleic acid is complementary DNA (cDNA).
  • transposons may be immobilized as dimers to a solid support such as beads, followed by the binding of transposase to the transposons to form transposomes.
  • two transposons may be immobilized in close proximity (preferably at a fixed distance) to one another on a solid support.
  • the two transposons will always be immobilized simultaneously, with preferably an optimum linker length and orientation of the two transposons to form transposomes efficiently.
  • transposome formation efficiency will not be a function of transposon density.
  • Two transposons will always be available with the right orientation and distance between them to form transposomes.
  • transposomes may be prepared on solid support, which can subsequently be used to derive contiguity information through tagmentation and sequencing.
  • An exemplary scheme is illustrated in Figure 15.
  • the transposons may be immobilized to the solid support by means other than chemical coupling.
  • Exemplary methods of immobilizing transposons on the solid support may include, but are not limited to affinity binding such as streptavidin-biotin, maltose-maltose binding protein, antigen-antibody, DNA-DNA or DNA-RNA hybridization.
  • transposomes can be pre-assembled and then immobilized on a solid-support.
  • the transposons comprise unique indexes, barcodes, and amplification primer binding sites.
  • Transposase can be added in solution comprising transposons to form transposome dimers, which can be immobilized on a solid support.
  • multiple bead sets can be generated in which each set has the same index derived from the immobilized transposons thus generating indexed beads.
  • Target nucleic acid can be added to each set of indexed beads as shown in Figure 29A.
  • target nucleic acid can be added to each set of indexed beads and tagmented, and subsequent PCR amplification may be performed separately for each set.
  • target nucleic acid, indexed beads, and transposomes can be combined in droplets such that a number of droplets contain a single bead with one or more DNA molecules and adequate transposomes.
  • the indexed beads can be pooled, target nucleic acid can be added to the pool and tagmented, and subsequent PCR amplification may be performed in a single reaction compartment ("one-pot").
  • the present invention relates to methods and compositions to derive contiguity information by means of capturing contiguously-linked, transposed, target nucleic acid onto a solid support.
  • contiguity preserving transposition (CPT)
  • DNA that has undergone contiguity preserving transposition (CPT-DNA)
  • Contiguity information can be preserved by the use of transposase to maintain the association of template nucleic acid fragments adjacent in the target nucleic acid.
  • the CPT-DNA can be captured by hybridization of complementary oligonucleotides having unique indexes or barcodes and immobilized on a solid support, e.g., beads (Figure 29B).
  • the oligonucleotide immobilized on the solid support may further comprise primer binding sites and unique molecular indices (UMI) in addition to barcodes.
  • the use of transposomes to maintain physical proximity of fragmented nucleic acids increases the likelihood that fragmented nucleic acids from the same original molecule, e.g., chromosome, will receive the same unique barcode and index information from the oligonucleotides immobilized on a solid support. This will result in a contiguously-linked sequencing library with unique barcodes.
  • the contiguously-linked sequencing library can be sequenced to derive contiguous sequence information.
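  • By way of a non-limiting illustration, contiguity information can be recovered from such a library by grouping aligned reads by barcode and merging nearby reads that share a barcode into islands. In the following Python sketch, the read tuple layout, the `max_gap` threshold, and the function name `islands_by_barcode` are hypothetical choices made for illustration and are not taken from the text.

```python
from collections import defaultdict

def islands_by_barcode(reads, max_gap=1000):
    """Merge aligned reads that share a barcode into islands.

    reads: iterable of (barcode, chrom, start, end) tuples.
    Returns {barcode: [(chrom, island_start, island_end, n_reads), ...]}.
    Reads on the same chromosome whose start lies within `max_gap` bases
    of the current island end are merged into that island.
    """
    grouped = defaultdict(list)
    for barcode, chrom, start, end in reads:
        grouped[barcode].append((chrom, start, end))

    result = {}
    for barcode, hits in grouped.items():
        hits.sort()  # by chromosome, then start coordinate
        islands = []
        for chrom, start, end in hits:
            last = islands[-1] if islands else None
            if last and last[0] == chrom and start - last[2] <= max_gap:
                islands[-1] = (chrom, last[1], max(last[2], end), last[3] + 1)
            else:
                islands.append((chrom, start, end, 1))
        result[barcode] = islands
    return result

reads = [
    ("BC1", "chr1", 100, 250),
    ("BC1", "chr1", 900, 1050),
    ("BC1", "chr1", 50000, 50150),
    ("BC2", "chr1", 400, 550),
]
islands = islands_by_barcode(reads)
print(islands)
```

  • Reads that share a barcode and fall in one island are candidates for having originated from the same original molecule; reads with a different barcode are kept apart even when they align nearby.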
  • Figures 16 and 17 show schematic representations of an exemplary embodiment of the above aspect of the invention of making contiguously-linked libraries with unique barcodes or indices.
  • the exemplary method leverages ligation of the CPT-DNA with the immobilized oligonucleotides on the solid support comprising unique indexes and barcodes, and strand-replacement PCR, to generate a sequencing library.
  • clonal indexed beads may be generated with immobilized DNA sequences such as random or specific primer and index.
  • Contiguously-linked libraries can be captured onto clonal-indexed beads by hybridization to the immobilized oligonucleotides followed by ligation.
  • Figures 18 and 19 depict the capture of the CPT-DNA on clonal indexed beads and the preservation of the contiguity information. Strand-replacement PCR can transfer the clonal bead index information to the individual molecule. Thus, each contiguously-linked library will be uniquely indexed.
  • the oligonucleotide immobilized on a solid support can comprise a partially double stranded structure such that one strand is immobilized to the solid support and the other strand is partially complementary to the immobilized strand resulting in a Y-adaptor.
  • the Y-adaptor immobilized on the solid surface is linked to the contiguously linked tagmented DNA by ligation and gap filling, as shown in Figure 20.
  • Y-adaptor is formed through hybridization capture of CPT-DNA with the probe/index on the solid support such as beads.
  • Figure 21 shows an exemplary scheme of making such Y-adapters. The use of these Y-adapters ensures that potentially every fragment can become a sequencing library. This increases the coverage per sequencing.
  • free transposomes may be separated from CPT-DNA.
  • the separation of the free transposomes is by size exclusion chromatography. In one embodiment, the separation may be achieved by MicroSpin S-400 HR Columns (GE Healthcare Life Sciences, Pittsburgh, PA).
  • Figure 22 shows an agarose gel electrophoresis of the separation of CPT-DNA from the free transposomes.
  • Capturing contiguously-linked, transposed, target nucleic acid onto a solid support through hybridization has several unique advantages.
  • First, the method is based on hybridization and not transposition. The intramolecular hybridization rate is much greater than the intermolecular hybridization rate.
  • Thus, the chance that contiguously-transposed libraries on a single target DNA molecule wrap around a uniquely indexed bead is much higher than the chance that two or more different target DNA molecules wrap around the same uniquely indexed bead.
  • DNA transposition and barcoding of the transposed DNA occur in two separate steps.
  • the challenges associated with active transposome assembly on beads and surface density optimization of transposons on solid-surfaces can be avoided.
  • Fourth, self-transposition products can be removed by column purification.
  • DNA is more flexible and therefore puts less of a burden on transposition density (insert size) compared to methods that immobilize transposomes on beads.
  • the method can be used with combinatorial barcoding schemes.
  • it is easy to covalently-link indexed oligos to the beads. Thus, there is less chance for index exchange.
  • the tagmentation and subsequent PCR amplification may be multiplexed and can be performed in a single reaction compartment ("one-pot" reaction), eliminating the need to carry out individual reactions for each index sequence.
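  • As a non-limiting illustration of the combinatorial barcoding mentioned among the advantages above, the following Python sketch enumerates the composite labels obtained when a bead index is combined with an index introduced in a later amplification step. The bead indexes reuse the exemplary index sequences given earlier in the text; the amplification indexes `PCR_INDEXES` and the function name `combined_barcodes` are hypothetical.

```python
from itertools import product

# Exemplary index sequences from the text, standing in for bead indexes.
BEAD_INDEXES = ["TAGATCGC", "CTCTCTAT", "TATCCTCT", "AGAGTAGA",
                "GTAAGGAG", "ACTGCATA", "AAGGAGTA", "CTAAGCCT"]
# Hypothetical indexes introduced during amplification (not from the source).
PCR_INDEXES = ["AAAC", "CCCA", "GGGT"]

def combined_barcodes(first_round, second_round):
    """Every (first, second) index combination a fragment could carry."""
    return [f"{a}+{b}" for a, b in product(first_round, second_round)]

labels = combined_barcodes(BEAD_INDEXES, PCR_INDEXES)
print(len(labels), labels[0])
```

  • The number of distinguishable labels is the product of the sizes of the two index sets, so a modest number of indexes per round can label many compartments.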
  • each barcode includes a first barcode sequence and a second barcode sequence, having a fragmentation site disposed therebetween.
  • the first barcode sequence and second barcode sequence can be identified or designated to be paired with one another.
  • the pairing can be informative so that a first barcode is associated with a second barcode.
  • the paired barcode sequences can be used to assemble sequencing data from the library of template nucleic acids.
  • identifying a first template nucleic acid comprising a first barcode sequence and a second template nucleic acid comprising a second barcode sequence that is paired with the first indicates that the first and second template nucleic acids represent sequences adjacent to one another in a sequence representation of the target nucleic acid.
  • Such methods can be used to assemble a sequence representation of a target nucleic acid de novo, without the requirement of a reference genome.
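The paired-barcode assembly described above can be sketched as follows. This is a minimal illustration, not the method of the application: it assumes each fragment carries one barcode half at each end, and that a lookup table records which first barcode sequence was paired with which second barcode sequence at each fragmentation site. The function name `order_fragments`, the barcode labels, and the toy data are all hypothetical.

```python
def order_fragments(fragments, pairs):
    """Order fragments by following paired barcodes from one molecule end.

    fragments: dict of name -> (left_barcode, right_barcode); None marks an
               end of the original molecule (no neighbour on that side).
    pairs:     dict mapping the barcode at a fragment's right end to the
               barcode it was paired with at the same fragmentation site.
    """
    by_left = {left: name for name, (left, _) in fragments.items() if left is not None}
    starts = [name for name, (left, _) in fragments.items() if left is None]
    if len(starts) != 1:
        raise ValueError("expected exactly one fragment with no left barcode")
    order = [starts[0]]
    while True:
        right = fragments[order[-1]][1]
        nxt = by_left.get(pairs.get(right))
        if nxt is None or nxt in order:  # molecule end reached, or cycle guard
            break
        order.append(nxt)
    return order


# Hypothetical toy data: two fragmentation sites, S1 and S2, each flanked by
# a first barcode sequence ("a") and a second barcode sequence ("b").
toy_pairs = {"S1a": "S1b", "S2a": "S2b"}
toy_fragments = {
    "f_mid": ("S1b", "S2a"),
    "f_right": ("S2b", None),
    "f_left": (None, "S1a"),
}
toy_order = order_fragments(toy_fragments, toy_pairs)
print(toy_order)
```

No reference genome is consulted at any point; adjacency is inferred from the barcode pairing alone, which is the property the text relies on for de novo assembly.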
  • the present invention relates to methods and compositions for generating a shotgun sequence library of a specific DNA fragment.
  • clonal indexed beads are generated with immobilized oligonucleotide sequences: random or specific primer and unique indexes.
  • Target nucleic acid is added to the clonal indexed beads.
  • the target nucleic acid is DNA.
  • the target DNA is denatured.
  • the target DNA hybridizes with primers comprising unique indexes immobilized on the solid surface (e.g., bead) and subsequently with other primers with the same index.
  • the primers on the bead amplify the DNA.
  • One or more further rounds of amplification may be carried out.
  • the amplification may be carried out by whole genome amplification using bead immobilized primers with a 3 ' random n-mer sequence.
  • the random n-mer contains pseudocomplementary bases (2-thiothymine, 2-amino dA, N4-ethyl cytosine, etc.) to prevent primer-primer interaction during amplification (Hoshika, S; Chen, F; Leal, NA; Benner, SA, Angew. Chem. Int. Ed. 49(32) 5554-5557 (2010)).
  • Figure 23 shows an exemplary scheme of generating shotgun sequence library of a specific DNA fragment.
  • a clonal indexed sequencing library of the amplified product can be generated.
  • such library can be generated by transposition. Sequence information of the clonal indexed library can be used to assemble the contiguous information using the index information as a guide.
  • Figure 24 shows an exemplary scheme of assembling the sequence information from clonal indexed sequencing library.
  • Intra-molecular amplification on a bead is much faster than inter-bead amplification.
  • the products on a bead will have the same index.
  • a shotgun library of a specific DNA fragment can be created. Random primers amplify the template at random locations and therefore a shotgun library with the same index can be generated from a specific molecule and the sequence information can be assembled using the indexed sequence.
  • a significant advantage of the methods of the above embodiments is that the reactions can be multiplexed in a single reaction (one pot reaction) and will not require using many individual wells.
  • index clonal beads can be prepared so many different fragments can be uniquely labeled, and discrimination can be made to the parental alleles for same genomic regions. With a high number of indexes, the chance that the DNA copy of the father and copy of the mother will receive the same index for the same genomic region is low.
  • the method takes advantage of the fact that intra-bead reactions are much faster than inter-bead reactions; the beads effectively generate a virtual partition within a larger physical compartment.
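The claim that a high number of indexes makes it unlikely for the paternal and maternal copies of a region to share an index can be made concrete with a short calculation. The sketch below assumes, purely for illustration, that each molecule is assigned one of the available indexes independently and uniformly at random; the function name `collision_probability` is hypothetical.

```python
def collision_probability(num_indexes, num_molecules=2):
    """Probability that at least two molecules covering the same region
    receive the same index, assuming independent, uniform index assignment."""
    p_all_distinct = 1.0
    for i in range(num_molecules):
        # The i-th molecule must avoid the i indexes already taken.
        p_all_distinct *= max(num_indexes - i, 0) / num_indexes
    return 1.0 - p_all_distinct


# Two parental copies of one region, for increasing index counts.
for n in (96, 1_000, 100_000):
    print(n, collision_probability(n))
```

Under this assumption the probability for two copies is simply 1/N, so moving from 96 to 100,000 indexes lowers the chance of a shared index by about three orders of magnitude.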
  • the method may be used for cell free DNA (cfDNA) in cfDNA assays.
  • the cfDNA is obtained from plasma or placental fluids.
  • the plasma can be obtained from undiluted whole blood using membrane based, sedimentation assisted plasma separator (Liu et al. Anal Chem. 2013 Nov 5;85(21): 10463-70).
  • the plasma collection zone of the plasma separator may comprise a solid support comprising transposomes.
  • the solid support comprising transposomes may capture the cfDNA from the isolated plasma as it is separated from the whole blood and can concentrate the cfDNA and/or tagment the DNA.
  • the tagmentation will further introduce unique barcodes to allow subsequent demultiplexing after sequencing of the pool of libraries.
  • the collection zone of the separator may comprise PCR master mix (primers, nucleotides, buffers, metals) and polymerase.
  • the master mix can be in dry form such that it will be reconstituted as the plasma comes out of the separator.
  • the primers are random primers.
  • the primers can be specific primers for a particular gene. PCR amplification of the cfDNA will result in the generation of library directly from the separated plasma.
  • the collection zone of the separator may comprise RT-PCR master mix (primers, nucleotides, buffers, metals), reverse transcriptase and polymerase.
  • the primers are random primers or oligo dT primers.
  • the primers can be specific primers for a particular gene.
  • the resulting cDNA can be used for sequencing. Alternatively, the cDNA can be treated with transposomes immobilized on a solid support for sequence library preparation.
  • the plasma separator may comprise barcodes (1D or 2D barcodes).
  • the separation device may comprise a blood collection device. This would result in direct delivery of the blood to the plasma separator and library prep device.
  • the device may comprise a downstream sequence analyzer.
  • sequence analyzer is a single use sequencer.
  • the sequencer is capable of queuing samples before sequencing in a batch. Alternatively, the sequencer may have random access capability, where samples are delivered to their sequencing area.
  • the collection zone for plasma may comprise silica substrates, such that the cell free DNA is concentrated.
  • Inventors of the present application have surprisingly and unexpectedly found that phasing and simultaneous methylation detection are possible using the methods and compositions of the present application.
  • the present methods allow CPT-seq on beads (indexed contiguity-linked libraries) to be combined with DNA methylation detection. For example, individual libraries generated on beads can be treated with bisulfite, which converts non-methylated Cs, but not methylated Cs, to Us, allowing the detection of 5-Me-C.
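The bisulfite read-out described above can be illustrated in silico. The sketch below is a simplified model under stated assumptions: conversion is treated as complete and error-free, only one strand is considered, and a reference sequence is available for comparison. The function names and the example sequence are hypothetical.

```python
def bisulfite_convert(seq, methylated_positions):
    """Convert every non-methylated C to U; methylated Cs are left as C."""
    return "".join(
        "U" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(seq)
    )


def call_methylation(reference, converted_read):
    """For each reference C, report True if it survived conversion (methylated).

    A converted base may appear as U (directly after treatment) or as T
    (after PCR amplification); both are read as non-methylated.
    """
    return {
        i: obs == "C"
        for i, (ref, obs) in enumerate(zip(reference, converted_read))
        if ref == "C"
    }


reference = "ACGTCCGA"
converted = bisulfite_convert(reference, methylated_positions={1})
calls = call_methylation(reference, converted)
print(converted, calls)
```

Because every library on a bead carries the same index, methylation calls made this way remain associated with the original long molecule, which is what allows phasing and methylation detection to be combined.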
  • epi-medication-phasing blocks can be established in the multi-megabase range.
  • the size of the DNA analyzed can be about hundred bases to about multi mega bases. In some embodiments, the size of the DNA analyzed can be about 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1200, 1300, 1500, 2000, 3000, 3500, 4000, 4500, 5000, 5500, 6000, 6500, 7000, 7,500, 8000, 8500, 9000, 9500, 10,000, 10,500, 11,000, 11,500, 12,000, 12500, 13000, 14000, 14500, 15000, 15500, 16000, 16500, 17000, 17,500, 18,000, 18,500, 19,000, 19,500, 20,000, 20,500, 21,000, 21,500, 22,000, 22,500, 23,000, 23,500, 24,000, 24,500, 25,000, 25,500, 26,000, 26,500, 27,000, 27,500, 28,000, 28,500, 29,500, 30,000, 30,500, 31,000, 31,500, 32,000, 33,000, 34,000, 35,000, 36,000, 37,000, 3
  • DNA is first transformed into indexed-linked libraries on a solid support.
  • Individual indexed libraries are much smaller than the original DNA and are therefore less prone to fragmentation. Even if a small fraction of indexed libraries is lost, phasing information is still maintained across the long span of the indexed DNA molecule. For example, if a 100 kb molecule in traditional bisulfite conversion (BSC) is fragmented in half, the contiguity is restricted to 50 kb.
  • a 100 kb library is first indexed and, even if a fraction of individual libraries is lost, contiguity is still at ~100 kb (except in the unlikely event that all lost libraries are from one end of the DNA molecule).
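The contiguity argument above can be checked with a small worked example. The sketch assumes, for illustration only, a 100 kb molecule carrying one indexed library per kilobase; the spacing, the function name `surviving_span`, and the choice of lost libraries are hypothetical.

```python
def surviving_span(library_positions, lost):
    """Span (bp) covered by the libraries that remain after some are lost.

    library_positions: start coordinate of each indexed library on the molecule.
    lost:              set of list indices of libraries lost during processing.
    """
    kept = [pos for i, pos in enumerate(library_positions) if i not in lost]
    return max(kept) - min(kept) if kept else 0


# One indexed library every 1 kb along a 100 kb molecule (101 libraries).
positions = list(range(0, 100_001, 1_000))

interior_loss = surviving_span(positions, lost={10, 37, 58})
end_loss = surviving_span(positions, lost={0, 1, 2})
print(interior_loss, end_loss)
```

Losing libraries from the interior leaves the span at 100 kb, and even losing three libraries from one end only trims it to 97 kb, in contrast to the 50 kb limit that results when an unindexed molecule breaks in half.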
  • methods disclosed in the present application have an additional advantage: no additional purification steps are required, in contrast to traditional bisulfite conversion approaches, thereby improving the yield.
  • the beads are simply washed after bisulfite conversion.
  • buffer exchanges can be readily performed with minimal loss of DNA (indexed libraries) and reduced hands on time.
  • the workflow consists of tagmentation of DNA on beads, gap-fill ligation of the 9-bp repeat regions, removal of Tn5 with SDS, and bisulfite conversion of the individual libraries on the beads.
  • the bisulfite conversion is performed under denaturing conditions to ensure that neighboring complementary libraries do not re-anneal, which would reduce the bisulfite conversion efficiency.
  • BSC converts non-methylated C's to U's, whereas methylated C's are not converted.
  • Figure 44 shows an alternative exemplary scheme of simultaneous phasing and methylation detection.
  • a fraction of gap-filled-ligated libraries are degraded in order to prepare single-stranded templates.
  • Single-stranded templates need milder conditions for bisulfite conversion since the templates are already single-stranded, which could reduce library loss or improve bisulfite conversion efficiency.
  • a mixture of 3' thio-protected transposons (Exo resistant) and non-protected transposons are used on the same bead. Enzymes, for example, Exo I, can be used to digest the non-thio-protected libraries, converting them to single stranded libraries.
  • with an equal (50:50) mixture of thio-protected and non-protected transposons, 50% of the libraries will be converted to single-stranded libraries (one transposon of the library is protected and the other, on the complementary strand, is not), 25% will not be converted (both transposons are thio-protected), and 25% will have both strands digested, removing the whole library (neither transposon is protected).
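The 50% / 25% / 25% split stated above follows from treating the two transposons of a library as independent draws from the mixture on the bead. The sketch below generalizes this to an arbitrary protected fraction; the independence assumption and the function name `outcome_fractions` are illustrative, not taken from the application.

```python
def outcome_fractions(p_protected):
    """Expected library outcomes after exonuclease digestion.

    p_protected: fraction of transposons on the bead that are thio-protected.
    Each library is flanked by two transposons, assumed to be drawn
    independently from the mixture.
    """
    q = 1.0 - p_protected
    return {
        "single_stranded": 2 * p_protected * q,  # exactly one strand protected
        "unchanged": p_protected ** 2,           # both strands protected
        "removed": q ** 2,                       # neither strand protected
    }


equal_mix = outcome_fractions(0.5)
print(equal_mix)
```

Under this model the single-stranded yield 2p(1 - p) is maximal at p = 0.5, so an equal mixture gives the most single-stranded template that a random mixture can provide.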
  • the first strategy relies on decreasing library insert size by more densely populating transposome complexes to the streptavidin beads. By decreasing library size, a smaller proportion of library elements are degraded by bisulfite treatment.
  • the second strategy to improve DNA yield of the Epi-CPTSeq protocol is enzymatic recovery of broken library elements.
  • the purpose of the recovery strategy is to add the 3' common sequence necessary for library amplification back to the bead bound library elements that became digested and lost their 3' portion during bisulfite treatment. After the addition of the 3' common sequence these elements can now be PCR amplified and sequenced.
  • Figures 67 and 68 show an exemplary scheme of this strategy. Double stranded CPTSeq library elements have been denatured and bisulfite converted (top panel). During bisulfite conversion, one of the DNA strands has been damaged (middle panel), leading to loss of the PCR common sequence on the 3' end.
  • Template rescue strategies restore the 3' common sequence (green) necessary for PCR amplification (bottom panel).
  • a terminal transferase (TdT) is used in the presence of a 3' phosphorylated attenuator oligo, a sequence containing a sequencing adapter followed by an oligo dT stretch (Figure 68A).
  • TdT adds a stretch of 10 to 15 dAs to the 3' end of a broken library element, which anneals to the oligo dT portion of the attenuator oligo. Formation of this DNA hybrid stops the TdT reaction and provides a template for subsequent extension of the 3' end of the broken library element by DNA polymerase.
  • the TdT tailing reaction is performed in the presence of a partially double stranded attenuator oligo, containing a single stranded oligo dT portion and a 5' phosphorylated double stranded sequencing adapter portion.
  • the nick between the last added dA and the 5' phosphorylated attenuator oligo is sealed by DNA ligase.
  • a common sequencing adapter can also be added to the 3' end of broken library elements by a recently introduced ssDNA template switching activity of MMLV RT.
  • MMLV RT and a template switch oligo are added to damaged DNA (Figure 68C).
  • reverse transcriptase adds a few additional nucleotides to 3' ends of a single-stranded DNA fragment, and these bases pair with an oligo (N) sequence presented at the 3' end of one of the TS oligos.
  • reverse transcriptase template switching activity adds the sequences of the annealed common primers to the 3' end of BSC broken library element, restoring its ability to get amplified in PCR with common sequencing primers.
  • an Epicentre's EpiGenome kit "post-bisulfite conversion" library construction method can be used to rescue library elements which lost their common sequences at the 3' end during bisulfite conversion.
  • this library rescue method utilizes 3' phosphorylated oligos with common sequences followed by a short stretch of random sequence. These short random sequences hybridize to the bisulfite-treated single-stranded DNA and common sequences are subsequently copied to the broken library strand by DNA polymerase.
  • Figure 74 shows the fourth strategy to improve the bisulfite sequencing methods on beads.
  • a first common sequence comprising a capture tag is covalently attached to the 5' ends of DNA.
  • the first common sequence can be attached to DNA using various methods, including single-sided transposition (as pictured), adapter ligation, or terminal transferase (TdT) adapter ligation as described in US Patent Application Publication 20150087027.
  • DNA is denatured (e.g. incubation at high heat) and bound to a solid support.
  • biotin is used as a capture tag on CS1.
  • DNA can be bound using streptavidin magnetic beads (as pictured). Once bound to the solid support buffer exchanges can be readily made.
  • PCR is performed to amplify the library and remove it from the solid support.
  • PCR primers can be designed to add additional common sequences, such as sequencing adapters, to the ends of the MethylSeq library.
  • the accuracy of the assembly of genomes is contingent on the use of different length scale technologies.
  • shotgun (100's of bp)
  • ~3 Kb
  • Hi-C (Mb-scale)
  • the challenge is that multiple assays are required to accomplish this, making the multi-layered approach cumbersome and costly.
  • the compositions and methods disclosed herein can address multiple length scales in a single assay.
  • library preparation can be achieved in a single assay using differentially sized solid support, for example, beads. Each bead size will generate a specific library size or range of sizes, with the physical size of the bead determining the library size.
  • the various sized beads all have unique clonal indices that are transferred to the library. As such, libraries of different sizes are generated, with each library length scale uniquely indexed.
  • the various length-scale libraries are prepared simultaneously in the same physical compartment, reducing cost and improving overall work flow.
  • each specific solid support size (for example, bead size) receives a unique index.
  • multiple different indexes of the same solid support size (for example, bead size) are also prepared, so multiple DNA molecules can be index-partitioned for that size range.
  • Fig. 45 shows an exemplary scheme to generate various sized libraries using various sized clonally indexed beads in a single assay.
  • the size of the libraries generated are about 50, 75, 100, 150, 200, 250, 300, 350, 400, 500, 600, 700, 800, 900, 1000, 1200, 1300, 1500, 2000, 3000, 3500, 4000, 4500, 5000, 5500, 6000, 6500, 7000, 7,500, 8000, 8500, 9000, 9500, 10,000, 10,500, 11,000, 11,500, 12,000, 12500, 13000, 14000, 14500, 15000, 15500, 16000, 16500, 17000, 17,500, 18,000, 18,500, 19,000, 19,500, 20,000, 20,500, 21,000, 21,500, 22,000, 22,500, 23,000, 23,500, 24,000, 24,500, 25,000, 25,500, 26,000, 26,500, 27,000, 27,500, 28,000, 28,500, 29,500, 30,000, 30,500, 31,000, 31,500, 32,000, 33,000, 34,000, 35,000, 36,000, 37,000, 38,000, 39,000, 40,000, 42,000, 45,000,
  • multiple length scale libraries discussed above can be used in the assembly of pseudogenes, paralogs etc. instead of having one large length scale.
  • multiple length scale libraries are prepared simultaneously in a single assay. The advantage is that at least one length scale will link a unique region with only the pseudogene or only the gene, but not both. As such, a variant detected at this length scale can be uniquely assigned to either the gene or the pseudogene. The same holds true for copy number variants, paralogs etc.
  • the strength of assembly is the use of different length scales. Using the methods disclosed herein different length scale indexed linked libraries can be generated in a single assay instead of individual, different library preparations for different length scales.
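The benefit of having several length scales available can be sketched with a simple selection rule. The model below is a deliberate simplification and an assumption of this illustration: a library length scale is treated as informative for a variant when it is long enough to reach the nearest unique flanking sequence yet shorter than the separation between the gene and its pseudogene. The function name and all distances are hypothetical.

```python
def informative_scales(scales_bp, dist_to_unique_bp, gene_pseudogene_gap_bp):
    """Return the length scales that link a variant to unique sequence
    without also spanning the homologous copy (simplified model)."""
    return [
        scale
        for scale in scales_bp
        if dist_to_unique_bp <= scale < gene_pseudogene_gap_bp
    ]


# Hypothetical length scales prepared in one assay, and hypothetical distances.
available = [300, 3_000, 30_000, 1_000_000]
usable = informative_scales(available, dist_to_unique_bp=2_000,
                            gene_pseudogene_gap_bp=20_000)
print(usable)
```

With a single length scale the list may well be empty; preparing several scales in one assay raises the chance that at least one falls in the informative window.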
  • Fig. 46 shows an exemplary scheme of determining genetic variants with different length scale libraries.
  • compositions and methods disclosed herein relate to analysis of genomic variants.
  • Exemplary genomic variants include, but are not limited to, deletions, interchromosomal translocations, duplications, paralogs, and interchromosomal gene fusions.
  • the compositions and methods disclosed herein relate to determining phasing information of the genomic variants.
  • the table below shows exemplary interchromosomal gene fusions.
  • Table 2 shows exemplary deletions in chromosome 1.
  • target nucleic acid can be fragmented prior to exposing it to transposomes.
  • Exemplary fragmentation methods include, but are not limited to, sonication, mechanical shearing, and restriction digestion. Fragmentation of target nucleic acid prior to tagmentation (fragmentation and tagging) is advantageous for assembly/phasing of pseudogenes (e.g., CYP2D6). Long islands (>30 kb) of indexed linked reads will span the pseudogenes A and A' as shown in Figure 64. Due to high sequence homology, it will be challenging to determine which variant belongs to Gene A and which to Gene A'. Shorter islands will link one variant of the pseudogenes with unique surrounding sequences. Such shorter islands can be achieved by fragmenting the target nucleic acid prior to tagmentation.
  • transposases are multimeric in a transposome complex, e.g., they form dimers, tetramers etc. in a transposome complex.
  • Inventors of the present application have surprisingly and unexpectedly found that linking the monomer transposases in multimeric transposome complex or linking the transposon ends of a transposome monomer in multimeric transposome complex has several advantages.
  • Second, lower concentrations of transposomes can potentially be used in the fragmentation by transposition reaction.
  • the linking leads to lower exchange of the mosaic ends (ME) of transposome complexes, thus less mixing of barcodes or adaptor molecules.
  • Such swapping of ME ends is possible if the complexes fall apart and reform; if, where transposomes are immobilized on a solid support by streptavidin/biotin, the streptavidin/biotin interaction breaks and reforms; or if there is contamination.
  • Inventors of the present application noted that there is a significant swap or exchange of ME ends under various reaction conditions. In some embodiments, the exchange can be as high as 15%. The exchange is pronounced in high salt buffer and the exchange is reduced in glutamate buffer.
  • Figures 57 and 58 show some possible mechanisms of ME swapping.
  • the transposase subunits in the transposome complex can be linked to each other by covalent and non-covalent means.
  • transposase monomers can be linked before making the transposome complex (before addition of the transposons).
  • transposase monomers can be linked after transposome formation.
  • native amino acid residues may be substituted with Cysteine (Cys) amino acids at the multimeric interface to promote disulfide bond formation.
  • For example, in Tn5 transposase, Asp468, Tyr407, Asp461, Lys459, Ser458, Gly462, Ala466, and Met470 may be substituted with Cys to promote disulfide bond formation between the monomer subunits, as shown in Figures 59 and 60.
  • exemplary amino acids that can be substituted with cysteine include, but are not limited to, Leu21, Leu32, Ala35, His20, Phe17, Phe36, Ile16, Thr13, Arg12, Gln10, and Glu9, as shown in Fig. 61.
  • the modified transposase with amino acid residues substituted with cysteine can be chemically cross-linked to each other using a chemical cross-linker with maleimide or pyridyldithiol reactive groups.
  • exemplary chemical cross-linkers are available commercially from Pierce Protein Biology/ThermoFisher Scientific (Grand Island, NY, USA).
  • transposome multimer complexes can be covalently linked to solid support.
  • solid supports include but are not limited to nanoparticles, beads, flow cell surfaces, column matrices.
  • solid surfaces may be coated with amine groups.
  • Modified transposase with amino acid residues substituted with cysteine can be chemically cross-linked to such amine groups using an amine-to-sulfhydryl crosslinker (i.e., succinimidyl-4-(N-maleimidomethyl)cyclohexane-1-carboxylate (SMCC)).
  • Figure 62 a
  • transposase gene can be modified to express multimeric protein in a single polypeptide.
  • Tn5 or Mos-1 genes can be modified to express two Tn5 or Mos-1 proteins in a single polypeptide.
  • Mu transposase gene can be modified to encode four mu transposase units in a single polypeptide.
  • the transposon ends of a transposome monomer unit can be linked to form a linked transposome multimer complex.
  • Linking the transposon ends allows insertion of primer sites, sequencing primers, amplification primers, or any other functionality that DNA can provide, into gDNA without fragmenting the target DNA. Insertion of such functionality is advantageous in
  • transposon ends of Mu transposomes can be linked to a "looped" Mu transposase/transposon configuration. Since Mu is a tetramer, various configurations are possible but not limited by linking either R2UJ and/or R1UJ with R2J and/or R1J. In these configurations R2UJ and R1UJ can/are not connected with R2J and R1J, respectively.
  • Figure 63 shows a Mu transposome complex where transposon ends are linked.
  • transposon ends of Tn5 or transposon ends of Mos-1 transposomes can be linked.
  • transposon means a double-stranded DNA that exhibits only the nucleotide sequences (the "transposon end sequences") that are necessary to form the complex with the transposase or integrase enzyme that is functional in an in vitro transposition reaction.
  • a transposon forms a "complex" or a "synaptic complex" or a "transposome complex" or a "transposome composition" with a transposase or integrase that recognizes and binds to the transposon, and which complex is capable of inserting or transposing the transposon into target DNA with which it is incubated in an in vitro transposition reaction.
  • a transposon exhibits two complementary sequences consisting of a "transferred transposon sequence" or "transferred strand" and a "non-transferred transposon sequence," or "non-transferred strand".
  • a transposon that forms a complex with a hyperactive Tn5 transposase (e.g., EZ-Tn5™ Transposase, EPICENTRE Biotechnologies, Madison, Wis., USA)
  • a transferred strand that exhibits a "transferred transposon sequence" as follows:
  • non-transferred strand that exhibits a "non-transferred transposon sequence" as follows:
  • the 3'-end of a transferred strand is joined or transferred to target DNA in an in vitro transposition reaction.
  • the non-transferred strand which exhibits a transposon sequence that is complementary to the transferred transposon end sequence, is not joined or transferred to the target DNA in an in vitro transposition reaction.
  • the transposon sequences may comprise one or more of the following: a barcode, an adaptor sequence, a tag sequence, a primer binding sequence, a capture sequence, and a unique molecular identifier (UMI) sequence.
  • adaptor means a nucleic acid sequence that can comprise a barcode, a primer binding sequence, a capture sequence, a sequence complementary to a capture sequence, a unique molecular identifier (UMI) sequence, an affinity moiety, or a restriction site.
  • the term "contiguity information" refers to a spatial relationship between two or more DNA fragments based on shared information.
  • the shared aspect of the information can be with respect to adjacent, compartmental and distance spatial relationships. Information regarding these relationships in turn facilitates hierarchical assembly or mapping of sequence reads derived from the DNA fragments. This contiguity information improves the efficiency and accuracy of such assembly or mapping because traditional assembly or mapping methods used in association with conventional shotgun sequencing do not take into account the relative genomic origins or coordinates of the individual sequence reads as they relate to the spatial relationship between the two or more DNA fragments from which the individual sequence reads were derived.
  • methods of capturing contiguity information may be accomplished by short range contiguity methods to determine adjacent spatial relationships, mid-range contiguity methods to determine compartmental spatial relationships, or long range contiguity methods to determine distance spatial relationships. These methods facilitate the accuracy and quality of DNA sequence assembly or mapping, and may be used with any sequencing method, such as those described above.
  • Contiguity information includes the relative genomic origins or coordinates of the individual sequence reads as they relate to the spatial relationship between the two or more DNA fragments from which the individual sequence reads were derived.
  • contiguity information includes sequence information from non-overlapping sequence reads.
  • the contiguity information of a target nucleic acid sequence is indicative of haplotype information. In some embodiments, the contiguity information of a target nucleic acid sequence is indicative of genomic variants.
  • maintaining the contiguity of the target nucleic acid in the context of fragmenting a nucleic acid means maintaining the order of the nucleic acid sequence of the fragments from the same target nucleic acid.
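One way to picture how shared index information yields contiguity information is to group reads by index and merge nearby reads into "islands". The sketch below assumes reads have already been mapped to coordinates and that reads sharing an index and lying within a chosen gap derive from one original molecule; the gap threshold, the function name `build_islands`, and the toy reads are hypothetical.

```python
from collections import defaultdict


def build_islands(reads, max_gap):
    """Group reads into islands per index.

    reads:   iterable of (index, chrom, position) tuples.
    max_gap: reads with the same index on the same chromosome are merged
             into one island while consecutive positions are within max_gap.
    Returns a dict: index -> list of (chrom, start, end, read_count).
    """
    by_key = defaultdict(list)
    for index, chrom, pos in reads:
        by_key[(index, chrom)].append(pos)

    islands = defaultdict(list)
    for (index, chrom), positions in sorted(by_key.items()):
        positions.sort()
        start = prev = positions[0]
        count = 1
        for pos in positions[1:]:
            if pos - prev > max_gap:  # gap too large: close the current island
                islands[index].append((chrom, start, prev, count))
                start, count = pos, 0
            prev = pos
            count += 1
        islands[index].append((chrom, start, prev, count))
    return dict(islands)


toy_reads = [
    ("idx1", "chr1", 100),
    ("idx1", "chr1", 5_100),
    ("idx2", "chr1", 3_000),
    ("idx1", "chr1", 900_000),
    ("idx1", "chr1", 9_000),
]
islands = build_islands(toy_reads, max_gap=50_000)
print(islands)
```

Note that the reads within an island need not overlap one another; the shared index alone places them on the same molecule, which is why contiguity information can include non-overlapping sequence reads.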
  • the term "at least a portion" and/or grammatical equivalents thereof can refer to any fraction of a whole amount.
  • "at least a portion" can refer to at least about 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 99%, 99.9% or 100% of a whole amount.
  • the term "sequencing read" and/or grammatical equivalents thereof can refer to a repetitive process of physical or chemical steps that is carried out to obtain signals indicative of the order of monomers in a polymer.
  • the signals can be indicative of an order of monomers at single monomer resolution or lower resolution.
  • the steps can be initiated on a nucleic acid target and carried out to obtain signals indicative of the order of bases in the nucleic acid target.
  • the process can be carried out to its typical completion, which is usually defined by the point at which signals from the process can no longer distinguish bases of the target with a reasonable level of certainty. If desired, completion can occur earlier, for example, once a desired amount of sequence information has been obtained.
  • a sequencing read can be carried out on a single target nucleic acid molecule or simultaneously on a population of target nucleic acid molecules having the same sequence, or simultaneously on a population of target nucleic acids having different sequences.
  • a sequencing read is terminated when signals are no longer obtained from one or more target nucleic acid molecules from which signal acquisition was initiated.
  • a sequencing read can be initiated for one or more target nucleic acid molecules that are present on a solid phase substrate and terminated upon removal of the one or more target nucleic acid molecules from the substrate. Sequencing can be terminated by otherwise ceasing detection of the target nucleic acids that were present on the substrate when the sequencing run was initiated. Exemplary methods of sequencing are described in U.S. Patent No. 9,029,103, which is incorporated herein by reference in its entirety.
  • the term "sequencing representation" and/or grammatical equivalents thereof can refer to information that signifies the order and type of monomeric units in the polymer.
  • the information can indicate the order and type of nucleotides in a nucleic acid.
  • the information can be in any of a variety of formats including, for example, a depiction, image, electronic medium, series of symbols, series of numbers, series of letters, series of colors, etc.
  • the information can be at single monomer resolution or at lower resolution.
  • An exemplary polymer is a nucleic acid, such as DNA or RNA, having nucleotide units.
  • a series of "A,” “T,” “G,” and “C” letters is a well-known sequence representation for DNA that can be correlated, at single nucleotide resolution, with the actual sequence of a DNA molecule.
  • Other exemplary polymers are proteins having amino acid units and polysaccharides having saccharide units.
  • solid support and solid surface are used interchangeably.
  • the solid support or its surface is non-planar, such as the inner or outer surface of a tube or vessel.
  • the solid support comprises microspheres or beads.
  • by "microspheres" or "beads" or "particles" or grammatical equivalents herein is meant small discrete particles.
  • Suitable bead compositions include, but are not limited to, plastics, ceramics, glass, polystyrene, methylstyrene, acrylic polymers, paramagnetic materials, thoria sol, carbon graphite, titanium dioxide, latex or cross-linked dextrans such as Sepharose, cellulose, nylon, cross-linked micelles and Teflon; any other materials outlined herein for solid supports may also be used.
  • "Microsphere Detection Guide" from Bangs Laboratories, Fishers Ind. is a helpful guide.
  • the microspheres are magnetic microspheres or beads.
  • the beads can be color coded.
  • MicroPlex® Microspheres from Luminex, Austin, TX may be used.
  • the beads need not be spherical; irregular particles may be used. Alternatively or additionally, the beads may be porous.
  • the bead sizes range from nanometers, i.e. about 10 nm, to millimeters in diameter, i.e. 1 mm, with beads from about 0.2 micron to about 200 microns being preferred, and from about 0.5 to about 5 micron being particularly preferred, although in some embodiments smaller or larger beads may be used. In some embodiments, beads can be about 0.1 , 0.2, 0.3, 0.4, 0.5.
  • a “transposome” comprises an integration enzyme such as an integrase or transposase, and a nucleic acid comprising an integration recognition site, such as a transposase recognition site.
  • the transposase can form a functional complex with a transposase recognition site that is capable of catalyzing a transposition reaction.
  • the transposase may bind to the transposase recognition site and insert the transposase recognition site into a target nucleic acid in a process sometimes termed "tagmentation". In some such insertion events, one strand of the transposase recognition site may be transferred into the target nucleic acid.
  • a transposome comprises a dimeric transposase comprising two subunits, and two non-contiguous transposon sequences.
  • a transposome comprises a dimeric transposase comprising two subunits, and a contiguous transposon sequence.
  • Some embodiments can include the use of a hyperactive Tn5 transposase and a Tn5-type transposase recognition site (Goryshin and Reznikoff, J. Biol. Chem., 273:7367 (1998)), or MuA transposase and a Mu transposase recognition site comprising R1 and R2 end sequences (Mizuuchi, K., Cell, 35: 785, 1983; Savilahti, H, et al, EMBO J., 14: 4893, 1995).
  • An exemplary transposase recognition site that forms a complex with a hyperactive Tn5 transposase comprises the following 19-base transferred strand (sometimes "M" or "ME") and non-transferred strand: 5' AGATGTGTATAAGAGACAG 3' and 5' CTGTCTCTTATACACATCT 3', respectively.
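Since the transferred and non-transferred strands are described as complementary, the two ME sequences quoted above should be reverse complements of each other. A short check, using only the sequences given in the text (the helper name `reverse_complement` is illustrative):

```python
# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


ME_TRANSFERRED = "AGATGTGTATAAGAGACAG"
ME_NON_TRANSFERRED = "CTGTCTCTTATACACATCT"

strands_are_complementary = reverse_complement(ME_TRANSFERRED) == ME_NON_TRANSFERRED
print(len(ME_TRANSFERRED), strands_are_complementary)
```

The check confirms that both strands are 19 bases long and that the non-transferred strand is the exact reverse complement of the transferred strand.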
  • ME sequences can also be used as optimized by a skilled artisan.
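As an illustrative check (a minimal sketch, not part of the disclosure), the two ME strands quoted above can be confirmed to be reverse complements of one another. The helper name `reverse_complement` and the constant names are assumptions introduced here for illustration.

```python
# Complement table for unambiguous upper-case DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


# Both strands written 5'->3', as quoted in the text (internal spaces removed).
ME_TRANSFERRED = "AGATGTGTATAAGAGACAG"
ME_NON_TRANSFERRED = "CTGTCTCTTATACACATCT"

# True when the two strands can anneal into the double-stranded ME site.
strands_anneal = reverse_complement(ME_TRANSFERRED) == ME_NON_TRANSFERRED
print(len(ME_TRANSFERRED), strands_anneal)
```

The same helper could be applied to any modified ME sequence to derive its matching non-transferred strand.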
  • transposition systems that can be used with certain embodiments of the compositions and methods provided herein include Staphylococcus aureus Tn552 (Colegio et al., J. Bacteriol., 183:2384-8, 2001; Kirby C. et al., Mol. Microbiol., 43:173-86, 2002), Ty1 (Devine & Boeke, Nucleic Acids Res., 22:3765-72, 1994 and International Publication WO 95/23875), Transposon Tn7 (Craig, N L, Science.
  • More examples include IS5, Tn10, Tn903, IS911, Sleeping Beauty, SPIN, hAT, PiggyBac, Hermes, TcBuster, AeBuster1, Tol2, and engineered versions of transposase family enzymes (Zhang et al., (2009) PLoS Genet. 5:e1000689. Epub 2009 Oct 16; Wilson C. et al. (2007) J. Microbiol. Methods 71:332-5).
  • integrases that may be used with the methods and compositions provided herein include retroviral integrases and integrase recognition sequences for such retroviral integrases, such as integrases from HIV-1, HIV-2, SIV, PFV-1, RSV.
  • a barcode can include one or more nucleotide sequences that can be used to identify one or more particular nucleic acids.
  • the barcode can be an artificial sequence, or can be a naturally occurring sequence generated during transposition, such as identical flanking genomic DNA sequences (g-codes) at the end of formerly juxtaposed DNA fragments.
  • the barcodes are artificial sequences that are absent in the target nucleic acid sequence and can be used to identify one or more target nucleic acid sequences.
  • a barcode can comprise at least about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 or more consecutive nucleotides.
  • a barcode comprises at least about 10, 20, 30, 40, 50, 60, 70, 80, 90, 100 or more consecutive nucleotides.
  • at least a portion of the barcodes in a population of nucleic acids comprising barcodes is different.
  • at least about 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, or 99% of the barcodes are different.
  • all of the barcodes are different.
  • the diversity of different barcodes in a population of nucleic acids comprising barcodes can be randomly generated or non-randomly generated.
  • a transposon sequence comprises at least one barcode.
  • the first transposon sequence comprises a first barcode
  • the second transposon sequence comprises a second barcode.
  • a transposon sequence comprises a barcode comprising a first barcode sequence and a second barcode sequence.
  • the first barcode sequence can be identified or designated to be paired with the second barcode sequence. For example, a known first barcode sequence can be known to be paired with a known second barcode sequence using a reference table comprising a plurality of first and second barcode sequences known to be paired to one another.
  • the first barcode sequence can comprise the same sequence as the second barcode sequence.
  • the first barcode sequence can comprise the reverse complement of the second barcode sequence.
  • the first barcode sequence and the second barcode sequence are different.
  • the first and second barcode sequences may comprise a bi-code.
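The pairing relationships listed above (identical sequences, reverse complements, or pairs designated in a reference table) can be sketched as a small classifier. This is a hedged illustration only: the function name `pairing_mode`, the returned labels, and the example reference table are hypothetical and are not taken from the disclosure.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def pairing_mode(first: str, second: str, known_pairs: dict) -> str:
    """Classify how a first and second barcode sequence relate to each other.

    known_pairs is a hypothetical reference table mapping each first barcode
    to the second barcode it is designated to be paired with.
    """
    if first == second:
        return "identical"
    if first == reverse_complement(second):
        return "reverse_complement"
    if known_pairs.get(first) == second:
        return "table"
    return "unpaired"


# Hypothetical reference table; the pairing itself is assumed for illustration.
REFERENCE_TABLE = {"TATAGCCT": "ATAGAGGC"}

print(pairing_mode("TATAGCCT", "TATAGCCT", REFERENCE_TABLE))
print(pairing_mode("TATAGCCT", "AGGCTATA", REFERENCE_TABLE))
print(pairing_mode("TATAGCCT", "ATAGAGGC", REFERENCE_TABLE))
```

A lookup table is the most general of the three options, because it allows arbitrary first and second barcode sequences to be treated as one bi-code.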
  • barcodes are used in the preparation of template nucleic acids.
  • the vast number of available barcodes permits each template nucleic acid molecule to comprise a unique identification.
  • Unique identification of each molecule in a mixture of template nucleic acids can be used in several applications. For example, uniquely identified molecules can be applied to identify individual nucleic acid molecules, in samples having multiple chromosomes, in genomes, in cells, in cell types, in cell disease states, and in species, for example, in haplotype sequencing, in parental allele discrimination, in metagenomic sequencing, and in sample sequencing of a genome.
  • Exemplary barcode sequences include, but are not limited to, TATAGCCT, ATAGAGGC, CCTATCCT, GGCTCTGA, AGGCGAAG, TAATCTTA, CAGGACGT, and GTACTGAC.
Primer sites
  • a transposon sequence can include a "sequencing adaptor” or “sequencing adaptor site”, that is to say a region that comprises one or more sites that can hybridize to a primer.
  • a transposon sequence can include at least a first primer site useful for amplification, sequencing, and the like. Exemplary sequences of sequence binding sites include, but are not limited to, AATGATACGGCGACCACCGAGATCTACAC (P5 sequence) and CAAGCAGAAGACGGCATACGAGAT (P7 sequence).
  • a target nucleic acid can include any nucleic acid of interest.
  • Target nucleic acids can include DNA, RNA, peptide nucleic acid, morpholino nucleic acid, locked nucleic acid, glycol nucleic acid, threose nucleic acid, mixed samples of nucleic acids, polyploidy DNA (i.e., plant DNA), mixtures thereof, and hybrids thereof.
  • genomic DNA or amplified copies thereof are used as the target nucleic acid.
  • cDNA, mitochondrial DNA or chloroplast DNA is used.
  • the target nucleic acid is mRNA.
  • target nucleic acid is from a single cell or from fractions of a single cell.
  • the target nucleic acid is from a single organelle. Exemplary single organelles include, but are not limited to, a single nucleus, a single mitochondrion, and a single ribosome.
  • target nucleic acid is from a formalin-fixed paraffin-embedded (FFPE) sample.
  • target nucleic acid is cross-linked nucleic acid.
  • the target nucleic acid is cross-linked with protein.
  • the target nucleic acid is cross-linked DNA.
  • the target nucleic acid is histone protected DNA. In some embodiments, histones are removed from the target nucleic acid.
  • target nucleic acid is from nucleosomes. In some embodiments, target nucleic acid is from nucleosomes from which nuclear proteins are removed.
  • a target nucleic acid can comprise any nucleotide sequence.
  • the target nucleic acid comprises homopolymer sequences.
  • a target nucleic acid can also include repeat sequences. Repeat sequences can be any of a variety of lengths including, for example, 2, 5, 10, 20, 30, 40, 50, 100, 250, 500 or 1000 nucleotides or more. Repeat sequences can be repeated, either contiguously or non-contiguously, any of a variety of times including, for example, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15 or 20 times or more. Some embodiments described herein can utilize a single target nucleic acid. Other embodiments can utilize a plurality of target nucleic acids.
  • a plurality of target nucleic acids can include a plurality of the same target nucleic acids, a plurality of different target nucleic acids where some target nucleic acids are the same, or a plurality of target nucleic acids where all target nucleic acids are different.
  • Embodiments that utilize a plurality of target nucleic acids can be carried out in multiplex formats so that reagents are delivered simultaneously to the target nucleic acids, for example, in one or more chambers or on an array surface.
  • the plurality of target nucleic acids can include substantially all of a particular organism's genome.
  • the plurality of target nucleic acids can include at least a portion of a particular organism's genome including, for example, at least about 1%, 5%, 10%, 25%, 50%, 75%, 80%, 85%, 90%, 95%, or 99% of the genome.
  • the portion can have an upper limit that is at most about 1%, 5%, 10%, 25%, 50%, 75%, 80%, 85%, 90%, 95%, or 99% of the genome.
  • Target nucleic acids can be obtained from any source.
  • target nucleic acids may be prepared from nucleic acid molecules obtained from a single organism or from populations of nucleic acid molecules obtained from natural sources that include one or more organisms.
  • Sources of nucleic acid molecules include, but are not limited to, organelles, cells, tissues, organs, or organisms.
  • Cells that may be used as sources of target nucleic acid molecules may be prokaryotic (bacterial cells, for example, Escherichia, Bacillus, Serratia, Salmonella, Staphylococcus, Streptococcus, Clostridium, Chlamydia, Neisseria, Treponema, Mycoplasma, Borrelia, Legionella, Pseudomonas, Mycobacterium, Helicobacter, Erwinia, Agrobacterium, Rhizobium, and Streptomyces genera); archaeal, such as crenarchaeota, nanoarchaeota or euryarchaeota; or eukaryotic such as fungi, (for example, yeasts), plants, protozoans and other parasites, and animals (including insects (for example, Drosophila spp.), nematodes (e.g., Caenorhabditis elegans), and mammals (for example, rat, mouse, monkey
  • Target nucleic acids and template nucleic acids can be enriched for certain sequences of interest using various methods well known in the art. Examples of such methods are provided in Int. Pub. No. WO/2012/108864, which is incorporated herein by reference in its entirety.
  • nucleic acids may be further enriched during methods of preparing template libraries. For example, nucleic acids may be enriched for certain sequences before insertion of transposomes, after insertion of transposomes, and/or after amplification of nucleic acids.
  • target nucleic acids and/or template nucleic acids can be highly purified, for example, nucleic acids can be at least about 70%, 80%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% free from contaminants before use with the methods provided herein.
  • it is beneficial to use methods known in the art that maintain the quality and size of the target nucleic acid; for example, isolation and/or direct transposition of target DNA may be performed using agarose plugs. Transposition can also be performed directly in cells, with populations of cells, lysates, and non-purified DNA.
  • target nucleic acid may be obtained from a biological sample or a patient sample.
  • biological sample or “patient sample” as used herein includes samples such as tissues and bodily fluids.
  • Bodily fluids may include, but are not limited to, blood, serum, plasma, saliva, cerebral spinal fluid, pleural fluid, tears, lactal duct fluid, lymph, sputum, urine, amniotic fluid, and semen.
  • a sample may include a bodily fluid that is "acellular.”
  • An “acellular bodily fluid” includes less than about 1% (w/w) whole cellular material. Plasma or serum are examples of acellular bodily fluids.
  • a sample may include a specimen of natural or synthetic origin (i.e., a cellular sample made to be acellular).
  • target nucleic acid can be fragmented (e.g., by sonication, by restriction digestion, other mechanical means) prior to exposing the target nucleic acid to the transposomes.
  • Plasma refers to acellular fluid found in blood. “Plasma” may be obtained from blood by removing whole cellular material from blood by methods known in the art (e.g., centrifugation, filtration, and the like).
  • Example 1 DNA cluster yield from the bead-based tagmentation process
  • DNA cluster yield from the bead-based tagmentation process of Figure 3 was evaluated and is shown in the table of Figure 4.
  • 50, 250, and 1000 ng of human NA12878 DNA were tagmented using the same batch of tagmentation beads (2.8 µm beads).
  • a second 50 ng aliquot of NA12878 DNA was tagmented using a second batch of tagmentation beads (full repeat; 2.8 µm beads).
  • the bead-bound tagmented DNA samples were PCR amplified and purified.
  • An aliquot (5.4 µL) of each purified PCR product (unquantified) was diluted 270 fold to make stock sample solutions of about 50 pM. For each sample, the 50 pM stock solution was diluted to 15, 19, 21, and 24 pM.
  • the diluted samples were loaded onto a flow cell for cluster generation and sequencing.
  • the data show that starting from the same dilution (~50 pM), cluster numbers are between 100 - 114% for the three different input levels (i.e., 50, 250, and 1000 ng) using the same set of beads.
  • the cluster number for the 50 ng full repeat was 81%.
  • Different dilutions (15, 19, 21, and 24 pM) yield the same number of clusters within about 10%.
  • the data indicate that the beads largely control the yield and that the yield is reproducible for different DNA inputs and different repeats.
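The dilution series in Example 1 follows the usual C1·V1 = C2·V2 relationship. The sketch below is illustrative only: the final loading volume of 1000 µL is an assumption made here (the text does not state it), and the function name `stock_volume_ul` is hypothetical.

```python
def stock_volume_ul(target_pm: float, stock_pm: float, final_volume_ul: float) -> float:
    """Volume of stock needed so that the final mix is at target_pm (C1*V1 = C2*V2)."""
    if target_pm > stock_pm:
        raise ValueError("target concentration cannot exceed stock concentration")
    return target_pm * final_volume_ul / stock_pm


STOCK_PM = 50.0           # approximate stock concentration from the text
DILUTION_FACTOR = 270     # fold dilution from the text
FINAL_VOLUME_UL = 1000.0  # hypothetical final volume, assumed for illustration

# Concentration of the undiluted PCR product implied by a 270-fold dilution to ~50 pM.
implied_undiluted_pm = STOCK_PM * DILUTION_FACTOR

for target in (15, 19, 21, 24):
    stock = stock_volume_ul(target, STOCK_PM, FINAL_VOLUME_UL)
    print(f"{target} pM: {stock:.0f} uL stock + {FINAL_VOLUME_UL - stock:.0f} uL diluent")
```

Because the purified PCR products were not quantified, the roughly 50 pM stock concentration is itself an estimate, so the computed volumes are nominal.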
  • Example 2 Reproducibility of the bead-based tagmentation process
  • The reproducibility of the bead-based tagmentation process of Figure 3 is shown in Figure 5.
  • six different preparations of indexed beads (indexes 1 through 6; 2.8 µm beads) made at the "same" transposome density were used to prepare tagmented DNA using 50 and 500 ng of input NA12878 DNA.
  • the tagmented DNA was PCR amplified and purified.
  • the 12 purified PCR products were pooled into two mixtures (pool 1 and pool 2) of six for two HiSeq lanes. Each pool includes three 50 ng samples and three 500 ng samples per lane.
  • Data table 500 shows the median insert size and the mean insert size for each indexed sample.
  • the insert size of pool 1 and the insert size of pool 2 are shown in Figure 6A (Plot 600) and Figure 6B (Plot 650), respectively, for the indexed samples of Figure 5.
  • the data also show that the insert size is uniform between the six different preparations of indexed beads. Bead-based tagmentation provides a mechanism to control the size of the inserts and DNA yield.
  • the bead-based tagmentation process may be used in an exome enrichment assay which includes a tagmentation step, e.g., Illumina's Nextera® Rapid Capture Enrichment protocol.
  • solution-based tagmentation Nextera
  • Gene specific primers are then used to pull down specific gene fragments of interest. Two enrichment cycles are performed and fragments pulled down are then enriched by PCR and sequenced.
  • human NA12878 DNA was tagmented using 25, 50, 100, 150, 200, and 500 ng of input DNA.
  • a control library (NA00536) was prepared from 50 ng input DNA according to the standard protocol. Each DNA input had a different index (unique identifier).
  • EPM enhanced polymerase mastermix
  • the amplification protocol was 3 minutes at 72 °C, 30 seconds at 98 °C, followed by 10 cycles of 10 seconds at 98 °C, 30 seconds at 65 °C, and 1 minute at 72 °C. The samples were then held at 10 °C. The samples were then processed through the exome enrichment pulldown process and sequenced.
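The amplification protocol above can be written as a small data structure, which also makes the programmed run time easy to total. This is a minimal sketch: the names `INITIAL_STEPS`, `CYCLE_STEPS`, and `programmed_seconds` are assumptions introduced here, and instrument ramp times and the final 10 °C hold are deliberately ignored.

```python
# (temperature in degrees C, duration in seconds), as stated in the text.
INITIAL_STEPS = [(72, 180), (98, 30)]
CYCLE_STEPS = [(98, 10), (65, 30), (72, 60)]
N_CYCLES = 10


def programmed_seconds(initial, cycle, n_cycles):
    """Sum the programmed step durations, excluding ramp times and the final hold."""
    initial_total = sum(duration for _, duration in initial)
    cycle_total = sum(duration for _, duration in cycle)
    return initial_total + n_cycles * cycle_total


total = programmed_seconds(INITIAL_STEPS, CYCLE_STEPS, N_CYCLES)
print(f"{total} s = {total // 60} min {total % 60} s")
```

The actual time on a thermal cycler would be longer, since temperature ramps between steps are not included in this total.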
  • Example 5 Insert size in a control and bead-based tagmented library in the exome enrichment assay
  • Figures 8A, 8B, and 8C show a plot 800 of insert size in a control library, a plot 820 of insert size in a bead-based tagmented library, and a summary data table 840, respectively, in the exome enrichment assay.
  • the data show that the bead-based tagmentation libraries have a wider insert size spread compared to the control library, but the insert size is very similar irrespective of the DNA input for the samples.
  • Figures 9A, 9B, and 9C show a bar graph 900 of percent duplicates passing filters (dups
  • the percent dups PF is a measure of how many reads are duplicated elsewhere on the flow cell. This number will ideally be low (as here) to ensure that all clusters are bringing useful data to the results.
  • Figure 9B shows PCT selected bases, which is a measure of the ratio of reads that sequence at or near the site of interest which should have been enriched during the enrichment process. Ideally this number will be close to 1 to reflect the success of the enrichment process and show that reads that should not be enriched do not get through the process.
  • Figure 9C shows the PCT usable bases on target, which is a measure of the ratio of reads that actually sequence over the particular base of interest within the enriched region. Ideally all enriched reads would sequence over the base of interest within the enriched read, but due to the random nature of the tagmentation and the variable length of the inserts, reads may be enriched that do not end up being sequenced over the area of interest.
  • Two techniques may be used to optimize the insert size distribution.
  • SPRI clean-up may be used to remove fragments that are too small or too large.
  • SPRI clean-up is a process of removing fragments that are larger or smaller than the desired size, by selective DNA precipitation based on size and either retention of the precipitated or non-precipitated DNA as desired (i.e., a first step is to precipitate only DNA that is larger than the desired size and retain the soluble smaller fragments). The smaller fragments are then further precipitated and this time the very small fragments that are not wanted (still in solution) are removed and the precipitated DNA is retained, washed and then resolubilized to give a desired size range of DNA.
  • the spacing of active transposomes on the bead surface may be used to control the insert size distribution.
  • gaps on the bead surface may be filled with inactive transposomes (e.g., transposomes with inactive transposons). Contiguity of the bead-based tagmentation process was assessed.
  • Table 3 shows the number of times 0, 1, 2, or 3 reads occur within a 1000 bp window sharing an index. Beads were generated with 9 different indexed transposomes and used to tagment a small amount of human DNA. Reads were generated, aligned, and analyzed for the number of reads within a 1000 bp or 10 Kb window that shared the same index. Some reads within a small window sharing an index may be generated by chance, and a prediction of how many times this is likely to occur is given in the "Random" row of Table 3 and Table 4.
  • the numbers in the "Bead" row show the actual number of 1000 bp (Table 3) or 10 Kb (Table 4) windows that share an index. As shown in Table 3 and Table 4, the actual number of times the same index was found within a 1000 bp or 10 Kb window is significantly greater than expected in the random case. "0" windows show all the times a particular 1000 bp window had no indexed reads mapping to it. The number is largest here because only a very small amount of the human genome was sequenced and most windows have no reads aligning to them. "1" is the number of times just one read maps to a 1000 bp (or 10 Kb) window; "2" is the number of times 2 reads share an index within a 1000 bp (or 10 Kb) window, etc.
  • Table 4 shows the number of reads (up to 5) within a 10 Kb window sharing an index.
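The window analysis behind Tables 3 and 4 can be sketched as a simple tally: each aligned read is assigned to a fixed-size window, and the number of reads per (window, index) combination is counted. The sketch below is illustrative only. The function name, the input layout of `(index, position)` tuples, and the toy reads are assumptions; it tallies only combinations with at least one read, and it does not reproduce the "Random" expectation.

```python
from collections import Counter


def shared_index_histogram(reads, window_size):
    """Histogram of how many reads share an index within one window.

    reads: iterable of (index_name, aligned_position) tuples.
    Returns a Counter mapping k -> number of (window, index) combinations
    that contain exactly k reads (k >= 1).
    """
    per_window_index = Counter(
        (position // window_size, index) for index, position in reads
    )
    return Counter(per_window_index.values())


# Toy reads, invented for illustration.
toy_reads = [
    ("i1", 100), ("i1", 900), ("i1", 950),
    ("i2", 500), ("i2", 1500),
    ("i3", 10500),
]

print(dict(shared_index_histogram(toy_reads, 1000)))
print(dict(shared_index_histogram(toy_reads, 10000)))
```

Widening the window from 1000 bp to 10 Kb merges neighboring reads with the same index into one combination, which is why counts shift toward larger k in the second call.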
  • a reaction mixture comprising CPT-DNA and free transposomes was subjected to size exclusion column chromatography using Sephacryl S-400 and Sephacryl S-200, and the results are shown in Figure 22.
  • CPT-DNA is indicated as NCP DNA.
  • Densities of capture probes A7 and B7 were optimized on 1 µm beads and the results are shown in Figure 25.
  • the ligation product of the capture probe to the target molecule was evaluated in an agarose gel. Probe densities of approximately 10,000-100,000 per bead gave better ligation efficiency than higher probe densities.
  • Example 9 Testing the feasibility of preparing indexed sequencing libraries of CPT-DNA on beads by intra-molecular hybridization
  • Transposomes were prepared by mixing transposons having A7' and B7' capture sequences, that are complementary to A7 and B7 capture sequences on beads, with hyperactive Tn5 transposase. High molecular weight genomic DNA is mixed with the transposomes to generate CPT-DNA. Separately, beads are prepared with immobilized oligonucleotides: P5-A7, P7-B7, or P5-A7+ P7-B7, where P5 and P7 are primer binding sequences and A7 and B7 are capture sequences complementary to A7' and B7' sequences respectively.
  • Beads comprising P5-A7 alone, P7-B7 alone, P5-A7 + P7-B7, or a mixture of P5-A7 and P7-B7 beads are treated with CPT-DNA, and ligase is added to the reaction mixture to determine the efficiency of the hybridization of the immobilized oligos to the transposed DNA.
  • the results are shown in Figure 26. Sequencing libraries are only made when P5-A7 & P7-B7 are immobilized together on one bead (lane 4) as shown by high molecular weight bands on an agarose gel. The results indicate a high efficiency of intra-molecular hybridization and prove the feasibility of preparing indexed sequencing libraries of CPT-DNA on beads by intra-molecular hybridization.
  • Example 10 Testing the feasibility of clonal indexing.
  • transposomes were prepared.
  • hyperactive Tn5 transposase is mixed with transposon sequences Tnp1 with 5' biotin to prepare transposome 1.
  • Tnp2 having unique index 2 with 5' biotin to prepare a transposome 2.
  • hyperactive Tn5 transposase is mixed with transposon sequences Tnp3 with 5' biotin to form transposome 3.
  • Tnp4 having unique index 4 and 5'-biotin to prepare a transposome 4.
  • Each of transposome 1&2 and transposome 3&4 are mixed separately with streptavidin beads to generate bead set 1 and bead set 2.
  • the two sets of beads are then mixed together and incubated with genomic DNA and tagmentation buffer to promote tagmentation of the genomic DNA. This is then followed by PCR amplification of the tagmented sequences.
  • the amplified DNA is sequenced to analyze the insertion of the index sequences. If tagmentation is confined to the beads, the majority of fragments will be coded with Tnp1/Tnp2 and Tnp3/Tnp4 indexes. If there is intra-molecular hybridization, the fragments may be coded with Tnp1/Tnp4, Tnp2/Tnp3, Tnp1/Tnp3, and Tnp2/Tnp4 indexes. Sequencing results after 5 and 10 cycles of PCR are shown in Figure 27.
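The readout of Example 10 amounts to classifying each sequenced fragment by the pair of indexes at its two ends. The sketch below is a hedged illustration of that bookkeeping: the bead-set assignments follow the text, while the function names and the toy fragment list are assumptions and do not represent data from Figure 27.

```python
# Bead set membership as described in the text.
BEAD_SETS = {
    "Tnp1": "bead_set_1", "Tnp2": "bead_set_1",
    "Tnp3": "bead_set_2", "Tnp4": "bead_set_2",
}


def classify_fragment(index_a: str, index_b: str) -> str:
    """Label a fragment by whether both end indexes come from the same bead set."""
    return "same_bead_set" if BEAD_SETS[index_a] == BEAD_SETS[index_b] else "cross_bead_set"


def same_bead_fraction(fragments) -> float:
    """Fraction of fragments whose two indexes come from one bead set."""
    labels = [classify_fragment(a, b) for a, b in fragments]
    return labels.count("same_bead_set") / len(labels)


# Toy fragments, invented for illustration only.
toy_fragments = [
    ("Tnp1", "Tnp2"), ("Tnp2", "Tnp1"), ("Tnp3", "Tnp4"),
    ("Tnp1", "Tnp4"),
]
print(same_bead_fraction(toy_fragments))
```

A fraction close to 1 would be consistent with tagmentation being confined to individual beads.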
  • Individual indexed transposomes were prepared by mixing transposon comprising an oligonucleotide comprising a Tn5 mosaic end sequence (ME) at the 5'-end and an index sequence. Individually indexed transposomes were immobilized on beads through streptavidin-biotin interaction. Transposomes on beads were washed and all 96 individually indexed transposomes on beads were pooled. Oligonucleotides complementary to the ME sequence and comprising an index sequence are annealed to the immobilized oligonucleotide, creating transposons with unique indexes. The ninety-six clonal indexed transposome bead sets are combined and incubated with high molecular weight (HMW) genomic DNA in the presence of Nextera tagmentation buffer in a single tube.
  • HMW high molecular weight
  • the beads are washed and the transposase is removed by treating the reaction mixture with 0.1% SDS.
  • the tagmented DNA is amplified with indexed primers and sequenced on a PE HiSeq flow cell v2 using the TruSeq v3 cluster kit, and the sequencing data are analyzed.
  • Clusters or islands of reads are observed.
  • a plot of the nearest neighbor distances between the reads for each sequence shows essentially two major peaks, one from within the cluster (proximal) and another from between clusters (distal).
  • a schematic of the method and the results are shown in Figures 30 and 31.
  • the island sizes ranged from approximately 3 to 10 kb. The percentage of bases covered is approximately 5% to 10%.
  • the insert sizes of the genomic DNA are approximately 200-300 bases.
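Because nearest-neighbor distances fall into a proximal and a distal peak, reads sharing one index can be grouped into islands by splitting wherever the gap to the next read exceeds a cutoff placed between the two peaks. The following is a minimal sketch under that assumption. The function name `call_islands`, the cutoff of 20,000 bp, and the toy positions are hypothetical and are not values from the disclosure.

```python
def call_islands(positions, max_gap):
    """Group read positions for a single index into islands.

    A new island starts whenever the distance to the previous read exceeds
    max_gap. Returns a list of (start, end, read_count) tuples.
    """
    islands = []
    current = []
    for pos in sorted(positions):
        if current and pos - current[-1] > max_gap:
            islands.append((current[0], current[-1], len(current)))
            current = []
        current.append(pos)
    if current:
        islands.append((current[0], current[-1], len(current)))
    return islands


# Toy read start positions for one index, invented for illustration.
toy_positions = [1000, 1800, 3500, 6000, 250000, 251200, 254000]

for start, end, n_reads in call_islands(toy_positions, max_gap=20000):
    print(f"island {start}-{end}: size {end - start} bp, {n_reads} reads")
```

In practice the cutoff would be chosen from the observed distance distribution rather than fixed in advance.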
  • Transposomes are first assembled in solution by mixing a first oligonucleotide having ME' sequence, a second oligonucleotide having ME-barcode-P5/P7 sequence, and Tn5 transposase.
  • first oligonucleotide having ME' sequence is biotinylated at the 3'-end.
  • the oligonucleotide having ME-barcode-P5/P7 sequence is biotinylated at the 5'-end.
  • streptavidin beads are added such that the transposomes are immobilized on the streptavidin beads.
  • the beads are washed and HMW genomic DNA is added and tagmentation is carried out.
  • in some cases the tagmented DNA is treated with 0.1% SDS, and in other cases the tagmented DNA is left untreated.
  • the tagmented DNA is PCR amplified for 5-8 cycles and sequenced. The schematic is shown in Figure 32.
  • Figure 34 shows the effect of transposome surface density on the insertion size.
  • Transposomes with 5'-biotin show a smaller-sized library and more self-insertion by-products.
  • Various amounts of target HMW DNA were added to clonally indexed beads with 50 mM Tn5:Transposon density and incubated for 15 or 60 min at 37 °C or for 60 min at room temperature.
  • the transposomes comprised oligonucleotides with 3'-biotin.
  • the tagmentation was carried out, the reaction mixture was treated with 0.1% SDS, and PCR amplified.
  • the amplified DNA was sequenced.
  • Figure 35 shows the effect of input DNA on the size distribution. Reactions with 10 pg of input DNA showed the least signal. The size distribution pattern was similar for DNA inputs of 20, 40, and 200 pg.
  • Example 14 Island size and distribution using solution-based and bead-based methods
  • Island size and distribution using solution-based and bead-based methods are compared.
  • 96 transposomes each with unique index in the transposons are assembled in a 96 well plate.
  • HMW genomic DNA is added, and the tagmentation reaction is carried out.
  • the reaction product is treated with 0.1% SDS and PCR amplified. The amplified products were sequenced.
  • transposomes each with unique index in the transposons are assembled in a 96 well plate.
  • the oligonucleotides comprised 3'-end biotin.
  • Streptavidin beads are added to each well of the 96-well plate and incubated such that the transposomes are immobilized on the streptavidin beads.
  • the beads are individually washed and pooled, HMW genomic DNA is added, and the tagmentation reaction is carried out in a single reaction vessel (one pot).
  • the reaction product is treated with 0.1% SDS and PCR amplified. The amplified products were sequenced.
  • In the negative control, all 96 transposon sequences, each with a unique index, are mixed together first.
  • the oligonucleotides comprised 3'-end biotin.
  • Transposomes are prepared from the individually mixed indexed transposons. Streptavidin beads are added to the mixture. HMW genomic DNA is added, and the tagmentation reaction is carried out. The reaction product is treated with 0.1% SDS and PCR amplified. The amplified products were sequenced.
  • the number of intra island reads is plotted versus the island size.
  • the results as shown in Figure 36 indicate that islands (proximity reads) are observed with the one-pot clonal indexed beads, similar to the solution-based method. When indexed transposons were mixed before transposome formation, no islands (proximity reads) were observed. Mixing transposons before transposome formation gives beads with different indexes/transposomes per bead, i.e. not clonal.
  • the sequencing data are extracted as fastq files and go through the demultiplexing process to generate an individual fastq file for each barcode.
  • the fastq files from the CPT sequencing are demultiplexed according to their indexes and aligned to the reference genome with the duplicates removed.
  • the chromosomes are scanned by a 5 kb/1 kb window, in which the number of indexes showing any reads within the scanning window is recorded.
  • Statistically, for a heterozygous deletion region only half the amount of DNA is available for library generation compared to its neighboring regions; therefore the number of indexes should be roughly half that of its neighbors as well.
  • the NA12878 chr1 60 kb heterozygous deletion is shown in Fig. 47A and 47B by scanning in a 5 kb window from 9216 indexed CPT sequencing data.
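The scanning step can be sketched in two parts: counting the distinct indexes that contribute reads to each window, and then flagging windows whose count falls well below that of typical windows. This is a simplified illustration, not the analysis used for Fig. 47. The function names, the use of the median as the baseline, the 0.75 ratio threshold, and the toy numbers are all assumptions made here.

```python
from statistics import median


def index_counts(reads, window_size, n_windows):
    """Count the distinct indexes with at least one read in each window.

    reads: iterable of (index_name, aligned_position) tuples on one chromosome.
    """
    seen = [set() for _ in range(n_windows)]
    for index, position in reads:
        window = position // window_size
        if window < n_windows:
            seen[window].add(index)
    return [len(s) for s in seen]


def flag_low_windows(counts, ratio_threshold=0.75):
    """Return window numbers whose index count is below ratio_threshold * median."""
    baseline = median(counts)
    return [i for i, c in enumerate(counts) if c < ratio_threshold * baseline]


# Toy per-window index counts, invented for illustration: windows 3 and 4
# carry about half as many indexes as their neighbors.
toy_counts = [40, 42, 38, 20, 21, 41, 39]
print(flag_low_windows(toy_counts))
```

A run of adjacent flagged windows with roughly half the neighboring count would be the pattern expected for a heterozygous deletion spanning several windows.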
  • the fastq files from the CPT sequencing are demultiplexed according to their index and aligned to the reference genome with the duplicates removed. Chromosomes are scanned in a 2 kb window. Each 2 kb window is a 36,864-element vector in which each element records how many reads from a unique index have been found in this 2 kb window. For every 2 kb window pair (X,Y) across the genome, the weighted-Jaccard index is calculated. This index indicates the de facto distance between (X,Y) in the sample.
  • the result is displayed as the heatmap shown in Fig. 48, each data point representing a pair of 2 kb scanning windows; the top left square is for X,Y both from region 1, the bottom right is for X,Y both from region 2, and the top right is for X,Y from region 1 crossed with region 2.
  • the gene fusion signal is revealed as the horizontal line in the middle in this case.
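The text does not spell out the formula for the weighted-Jaccard index. A common definition, assumed here, is the sum of element-wise minima divided by the sum of element-wise maxima of the two per-index read-count vectors. The sketch below uses that definition with short toy vectors in place of the 36,864-element vectors; the function name and the example values are hypothetical.

```python
def weighted_jaccard(x, y):
    """Weighted Jaccard index of two equal-length non-negative count vectors.

    Assumed definition: sum(min(x_i, y_i)) / sum(max(x_i, y_i)).
    Returns 0.0 when both vectors are all zeros.
    """
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    numerator = sum(min(a, b) for a, b in zip(x, y))
    denominator = sum(max(a, b) for a, b in zip(x, y))
    return numerator / denominator if denominator else 0.0


# Toy per-index read counts for two windows, invented for illustration.
window_x = [2, 0, 1, 3]
window_y = [1, 0, 1, 0]
print(weighted_jaccard(window_x, window_y))
```

Under this definition, two windows that share many indexes score close to 1, which is the signal expected when the windows lie on the same original DNA molecule, including across a fusion breakpoint.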
  • the fastq files from the CPT sequencing are demultiplexed according to their index and aligned to the reference genome with the duplicates removed. Chromosomes are scanned in a 1 kb window. Fig. 49 shows results of detection of genetic deletions.
  • Fig. 53 shows an image of agarose gel electrophoresis of indexed-linked libraries after PCR after bisulfite conversion. The expected size range of 200-500 bp libraries was observed. A reaction without DNA does not yield indexed-linked libraries.
  • Fig. 54 shows the bioanalyzer trace of whole-genome indexed linked CPT-seq libraries before enrichment without size-selection.
  • Fig. 55 shows the agarose gel analysis of libraries after enrichment.
  • Figure 56 shows the results of application of targeted haplotyping to the HLA region in the chromosome. An illustration of the enrichment of the whole-genome indexed linked read library is shown on the left. Each small bar represents an indexed short library. Clusters of indexed libraries are "islands", regions that were clonally indexed on a single bead with the same index, hence the proximity of the reads ("island" character) on a genomic scale. Enrichment (see Selective enrichment of nucleic acids, WO 2012108864 A1) of libraries in the targeted region is displayed on the right. Reads are enriched for the HLA region. Additionally, when reads are sorted by index and aligned to the genome they again display the "island" structure, indicating that contiguity information is maintained from the indexed linked reads.
  • Streptavidin magnetic beads were loaded with lx, 6x, and 12x concentrations of TsTn5 transposome complex.
  • the Epi-CPT seq protocol was performed for each bead type.
  • the final PCR product was loaded on the Agilent BioAnalyzer for analysis and shown in Figure .
  • Epi-CPT seq library fragments are smaller and have greater yield when more TsTn5 is loaded on beads.
  • Example 20 Fragmentation of DNA library during sodium bisulfite conversion.
  • After bisulfite conversion, DNA becomes damaged, resulting in loss of the common sequences (CS2) needed for PCR amplification.
  • DNA fragments of CPTSeq and Epi-CPTSeq (Me-CPTSeq) libraries were analyzed by BioAnalyzer. Due to DNA damage during bisulfite conversion, the Epi-CPTSeq library has 5-fold lower yield and a smaller library size distribution compared to the CPTSeq library as shown in Figure 70.
  • TdT Terminal transferase

Abstract

Embodiments provided herein relate to methods and compositions for preparing an immobilized library of barcoded DNA fragments of a target nucleic acid, identifying genomic variants, determining the contiguity information, phasing information, and methylation status of the target nucleic acid.

Description

CONTIGUITY PRESERVING TRANSPOSITION
RELATED APPLICATIONS
This application claims priority to U.S. provisional application nos. 62/065,544, filed on October 17, 2014, and 62/157,396, filed on May 5, 2015, each of which is hereby incorporated by reference in its entirety.
FIELD OF THE INVENTION
Embodiments of the present invention relate to sequencing nucleic acids. In particular, embodiments of the methods and compositions provided herein relate to preparing nucleic acid templates and obtaining sequence data therefrom.
BACKGROUND OF THE INVENTION
The detection of specific nucleic acid sequences present in a biological sample has been used, for example, as a method for identifying and classifying microorganisms, diagnosing infectious diseases, detecting and characterizing genetic abnormalities, identifying genetic changes associated with cancer, studying genetic susceptibility to disease, and measuring response to various types of treatment. A common technique for detecting specific nucleic acid sequences in a biological sample is nucleic acid sequencing.
Nucleic acid sequencing methodology has evolved significantly from the chemical degradation methods used by Maxam and Gilbert and the strand elongation methods used by Sanger. Today several sequencing methodologies are in use which allow for the parallel processing of nucleic acids all in a single sequencing run. As such, the information generated from a single sequencing run can be enormous.
SUMMARY OF THE INVENTION
In one aspect, described herein are methods of preparing a library of barcoded DNA fragments of a target nucleic acid. The methods include contacting a target nucleic acid with a plurality of transposome complexes, each transposome complex including transposons and transposases, in which the transposons comprise transferred strands and non-transferred strands. At least one of the transposons of the transposome complex comprises an adaptor sequence capable of hybridizing to a complementary capture sequence. The target nucleic acid is fragmented into a plurality of fragments, and a plurality of transferred strands is inserted at the 5' end of at least one strand of the fragments while maintaining the contiguity of the target nucleic acid. The plurality of fragments of the target nucleic acid is contacted with a plurality of solid supports, each of the solid supports in the plurality comprising a plurality of immobilized oligonucleotides, each of the oligonucleotides comprising a complementary capture sequence and a first barcode sequence, wherein the first barcode sequence from each solid support in the plurality of the solid supports differs from the first barcode sequence from other solid supports in the plurality of solid supports. The barcode sequence information is transferred to the target nucleic acid fragments, thereby producing an immobilized library of double-stranded fragments wherein at least one strand is 5'-tagged with the first barcode such that at least two fragments of the same target nucleic acid receive identical barcode information.
In one aspect, described herein are methods for determining contiguity information of a target nucleic acid sequence. The methods include contacting the target nucleic acid with a plurality of transposome complexes, each transposome complex comprising transposons and transposases, in which the transposons comprise transferred strands and non-transferred strands, and in which at least one of the transposons of the transposome complex comprises an adaptor sequence capable of hybridizing to a complementary capture sequence. The target nucleic acid is fragmented into a plurality of fragments, and a plurality of transferred strands is inserted into the plurality of fragments while maintaining the contiguity of the target nucleic acid. The plurality of fragments of the target nucleic acid is contacted with a plurality of solid supports. Each of the solid supports in the plurality comprises a plurality of immobilized oligonucleotides, each of the oligonucleotides comprising a complementary capture sequence and a first barcode sequence, wherein the first barcode sequence from each solid support in the plurality of the solid supports differs from the first barcode sequence from other solid supports in the plurality of solid supports. The barcode sequence information is transferred to the target nucleic acid fragments such that at least two fragments of the same target nucleic acid receive identical barcode information. The sequence of the target nucleic acid fragments and the barcode sequences are determined. The contiguity information of the target nucleic acid is determined by identifying the barcode sequences. In some embodiments, the transposases of transposome complexes are removed after transposition and subsequent hybridization of the adaptor sequences of the transposon to the complementary capture sequence. In some embodiments, the transposases are removed by SDS treatment. In some embodiments, the transposases are removed by proteinase treatment.
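The final step of the preceding aspect, in which contiguity is inferred from shared barcodes, can be illustrated with a minimal sketch. It is not part of the disclosed methods. The read layout (barcode, chromosome, position), the helper names group_reads_by_barcode and islands, and the max_gap threshold are hypothetical and chosen only for illustration; the sketch assumes reads have already been aligned.

```python
from collections import defaultdict


def group_reads_by_barcode(reads):
    """Group aligned reads, given as (barcode, chrom, pos), by barcode.

    Reads sharing a first barcode are assumed to derive from the same
    target molecule captured on one solid support.
    """
    groups = defaultdict(list)
    for barcode, chrom, pos in reads:
        groups[barcode].append((chrom, pos))
    # Sort each group so neighboring reads are adjacent.
    return {bc: sorted(hits) for bc, hits in groups.items()}


def islands(positions, max_gap):
    """Merge sorted (chrom, pos) hits into (chrom, start, end) islands.

    A hit extends the current island when it lies on the same chromosome
    and within max_gap (hypothetical threshold) of the island's end.
    """
    result = []
    for chrom, pos in positions:
        if result and result[-1][0] == chrom and pos - result[-1][2] <= max_gap:
            result[-1] = (chrom, result[-1][1], pos)
        else:
            result.append((chrom, pos, pos))
    return result


if __name__ == "__main__":
    reads = [
        ("BC1", "chr1", 100),
        ("BC2", "chr1", 150),
        ("BC1", "chr1", 900),
        ("BC1", "chr1", 50000),
    ]
    grouped = group_reads_by_barcode(reads)
    print(islands(grouped["BC1"], max_gap=1000))
```

In this toy input, the two nearby BC1 reads merge into one island and the distant BC1 read forms a second island that carries the same barcode.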
In one aspect, described herein are methods for simultaneously determining phasing information and methylation status of a target nucleic acid sequence. The methods include contacting the target nucleic acid with a plurality of transposome complexes, each transposome complex including transposons and transposases, in which the transposons comprise transferred strands and non-transferred strands, wherein at least one of the transposons of the transposome complex comprises an adaptor sequence capable of hybridizing to a complementary capture sequence. The target nucleic acid is fragmented into a plurality of fragments, and a plurality of transferred strands is inserted into the target nucleic acid fragments while maintaining the contiguity of the target nucleic acid. The plurality of fragments of the target nucleic acid is contacted with a plurality of solid supports, each of the solid supports in the plurality comprising a plurality of immobilized oligonucleotides, each of the oligonucleotides comprising a complementary capture sequence and a first barcode sequence, wherein the first barcode sequence from each solid support in the plurality of the solid supports differs from the first barcode sequence from other solid supports in the plurality of solid supports. The barcode sequence information is transferred to the target nucleic acid fragments such that at least two fragments of the same target nucleic acid receive identical barcode information. The target nucleic acid fragments comprising barcodes are subjected to bisulfite treatment, thereby generating bisulfite treated target nucleic acid fragments comprising barcodes. The sequence of the bisulfite treated target nucleic acid fragments and the barcode sequences are determined. The contiguity information of the target nucleic acid is determined by identifying the barcode sequences.
In one aspect, described herein are methods of preparing an immobilized library of tagged DNA fragments. The methods include providing a plurality of solid supports having transposome complexes immobilized thereon, in which the transposome complexes are multimeric and the transposome monomeric units of the same transposome complex are linked to each other, and wherein said transposome monomeric units comprise a transposase bound to a first polynucleotide, said first polynucleotide comprising (i) a 3' portion comprising a transposon end sequence, and (ii) a first adaptor comprising a first barcode. A target DNA is applied to the plurality of solid supports under conditions whereby the target DNA is fragmented by the transposome complexes, and the 3' transposon end sequence of the first polynucleotide is transferred to a 5' end of at least one strand of the fragments, thereby producing an immobilized library of double-stranded fragments wherein at least one strand is 5'-tagged with the first barcode.
In one aspect, described herein are methods of preparing a sequencing library for determining the methylation status of a target nucleic acid. The methods include fragmenting the target nucleic acid into two or more fragments. A first common adaptor sequence is incorporated into the 5'-end of the fragments of the target nucleic acid, wherein the adaptor sequence comprises a first primer binding sequence and an affinity moiety, wherein the affinity moiety is one member of a binding pair. The target nucleic acid fragments are denatured. The target nucleic acid fragments are immobilized on a solid support, in which the solid support comprises the other member of the binding pair and the immobilization of the target nucleic acid is by binding of the binding pair. The immobilized target nucleic acid fragments are subjected to bisulfite treatment. A second common adaptor sequence is incorporated into the bisulfite treated immobilized target nucleic acid fragments, wherein the second common adaptor comprises a second primer binding site. The bisulfite treated target nucleic acid fragments immobilized on the solid support are amplified, thereby producing a sequencing library for determining the methylation status of a target nucleic acid.
In one aspect, described herein are methods of preparing a sequencing library for determining the methylation status of a target nucleic acid. The methods include providing a plurality of solid supports comprising transposome complexes immobilized thereon. The transposome complexes comprise transposons and transposases, in which the transposons comprise transferred strands and non-transferred strands. The transferred strand comprises (i) a first portion at the 3'-end comprising the transposase recognition sequence, and (ii) a second portion located 5' to the first portion comprising a first adaptor sequence and a first member of a binding pair. The first member of the binding pair binds to a second member of the binding pair on the solid support, thereby immobilizing the transposon to the solid support. The first adaptor also comprises a first primer binding sequence. The non-transferred strand comprises (i) a first portion at the 5'-end comprising the transposase recognition sequence and (ii) a second portion located 3' to the first portion comprising a second adaptor sequence, in which the terminal nucleotide at the 3'-end is blocked. The second adaptor also comprises a second primer binding sequence. The target nucleic acid is contacted with the plurality of solid supports comprising immobilized transposome complexes. The target nucleic acid is fragmented into a plurality of fragments, and a plurality of transferred strands is inserted at the 5' end of at least one strand of the fragments, thereby immobilizing the target nucleic acid fragments to the solid support. The 3'-end of the fragmented target nucleic acid is extended with a DNA polymerase. The non-transferred strand is ligated to the 3'-end of the fragmented target nucleic acid. The immobilized target nucleic acid fragments are subjected to bisulfite treatment. The 3'-end of the immobilized target nucleic acid fragments damaged during the bisulfite treatment is extended by using a DNA polymerase such that the 3'-end of the immobilized target nucleic acid fragments comprises a homopolymeric tail. A second adaptor sequence is introduced to the 3'-end of the immobilized target nucleic acid fragments damaged during the bisulfite treatment. The bisulfite treated target nucleic acid fragments immobilized on the solid support are amplified using a first and a second primer, thereby producing a sequencing library for determining the methylation status of a target nucleic acid.
In one aspect, disclosed herein are methods of preparing a sequencing library for determining the methylation status of a target nucleic acid. The methods include contacting the target nucleic acid with transposome complexes, in which the transposome complexes comprise transposons and transposases. The transposons comprise transferred strands and non-transferred strands. The transferred strand includes (i) a first portion at the 3'-end comprising the transposase recognition sequence, and (ii) a second portion located 5' to the first portion comprising a first adaptor sequence and a first member of a binding pair, wherein the first member of the binding pair binds to a second member of the binding pair. The non-transferred strand includes (i) a first portion at the 5'-end comprising the transposase recognition sequence and (ii) a second portion located 3' to the first portion comprising a second adaptor sequence, in which the terminal nucleotide at the 3'-end is blocked, and wherein the second adaptor comprises a second primer binding sequence. The target nucleic acid is fragmented into a plurality of fragments, and a plurality of transferred strands is inserted at the 5' end of at least one strand of the fragments, thereby immobilizing the target nucleic acid fragments to the solid support. The target nucleic acid fragments comprising the transposon end are contacted with the plurality of solid supports comprising the second member of the binding pair, wherein binding of the first member of the binding pair to the second member of the binding pair immobilizes the target nucleic acid to the solid support. The 3'-end of the fragmented target nucleic acid is extended with a DNA polymerase. The non-transferred strand is ligated to the 3'-end of the fragmented target nucleic acid. The immobilized target nucleic acid fragments are subjected to bisulfite treatment. The 3'-end of the immobilized target nucleic acid fragments damaged during the bisulfite treatment is extended by using a DNA polymerase such that the 3'-end of the immobilized target nucleic acid fragments comprises a homopolymeric tail. A second adaptor sequence is introduced to the 3'-end of the immobilized target nucleic acid fragments damaged during the bisulfite treatment. The bisulfite treated target nucleic acid fragments immobilized on the solid support are amplified using a first and a second primer, thereby producing a sequencing library for determining the methylation status of a target nucleic acid.
In some embodiments, the terminal nucleotide at the 3'-end of the second adaptor is blocked by a member selected from the group consisting of a dideoxy nucleotide, a phosphate group, a thiophosphate group, and an azido group.
In some embodiments, affinity moieties can be members of a binding pair. In some cases, the modified nucleic acids may comprise a first member of a binding pair and the capture probe may comprise a second member of the binding pair. In some cases, capture probes may be immobilized to a solid surface and the modified nucleic acid may comprise a first member of a binding pair and the capture probe may comprise a second member of the binding pair. In such cases, binding the first and second members of the binding pair immobilizes the modified nucleic acid to the solid surface. Examples of binding pairs include, but are not limited to, biotin-avidin, biotin-streptavidin, biotin-neutravidin, ligand-receptor, hormone-receptor, lectin-glycoprotein, oligonucleotide-complementary oligonucleotide, and antigen-antibody.
In some embodiments, the first common adaptor sequence is incorporated at the 5'-end of the fragments of the target nucleic acid by one-sided transposition. In some embodiments, the first common adaptor sequence is incorporated at the 5'-end of the fragments of the target nucleic acid by ligation. In some embodiments, incorporating the second common adaptor sequence into the bisulfite treated immobilized target nucleic acid fragments includes (i) extending the 3'-end of the immobilized target nucleic acid fragments using terminal transferase to comprise a homopolymeric tail; (ii) hybridizing an oligonucleotide comprising a single stranded homopolymeric portion and a double stranded portion comprising the second common adaptor sequence, wherein the single stranded homopolymeric portion is complementary to the homopolymeric tail; and (iii) ligating the second common adaptor sequence to the immobilized target nucleic acid fragments, thereby incorporating the second common adaptor sequence into the bisulfite treated immobilized target nucleic acid fragments.
In some embodiments, the target nucleic acid is from a single cell. In some embodiments, the target nucleic acid is from a single organelle. In some embodiments, the target nucleic acid is genomic DNA. In some embodiments, the target nucleic acid is cross-linked to other nucleic acids. In some embodiments, the target nucleic acid is from a formalin fixed paraffin embedded (FFPE) sample. In some embodiments, the target nucleic acid is cross-linked with proteins. In some embodiments, the target nucleic acid is cross-linked with DNA. In some embodiments, the target nucleic acid is histone protected DNA. In some embodiments, histones are removed from the target nucleic acid. In some embodiments, the target nucleic acid is cell free tumor DNA. In some embodiments, the cell free tumor DNA is obtained from placental fluid. In some embodiments, the cell free tumor DNA is obtained from plasma. In some embodiments, the plasma is collected from whole blood using a membrane separator comprising a collection zone for the plasma. In some embodiments, the collection zone for the plasma comprises transposome complexes immobilized on a solid support. In some embodiments, the target nucleic acid is cDNA. In some embodiments, the solid support is a bead. In some embodiments, the plurality of solid supports is a plurality of beads, wherein the beads are of different sizes.
In some embodiments, a single barcode sequence is present in the plurality of immobilized oligonucleotides on each individual solid support. In some embodiments, different barcode sequences are present in the plurality of immobilized oligonucleotides on each individual solid support. In some embodiments, the transferring of the barcode sequence information to the target nucleic acid fragments is by ligation. In some embodiments, the transferring of the barcode sequence information to the target nucleic acid fragments is by polymerase extension. In some embodiments, the transferring of the barcode sequence information to the target nucleic acid fragments is by both ligation and polymerase extension. In some embodiments, the polymerase extension is by extending the 3'-end of the non-ligated transposon strand with a DNA polymerase using the ligated immobilized oligonucleotide as a template. In some embodiments, at least a portion of the adaptor sequences further comprise a second barcode sequence. In some embodiments, the transposome complexes are multimeric, and the adaptor sequences of the transposons of each monomeric unit are different from those of the other monomeric unit in the same transposome complex. In some embodiments, the adaptor sequence further comprises a first primer binding sequence. In some embodiments, the first primer binding site has no sequence homology to the capture sequence or to the complement of the capture sequence. In some embodiments, the immobilized oligonucleotides on the solid support further comprise a second primer binding sequence.
In some embodiments, the transposome complexes are multimeric, and the transposome monomeric units are linked to each other in the same transposome complex. In some embodiments, the transposase of a transposome monomeric unit is linked to the transposase of another transposome monomeric unit of the same transposome complex. In some embodiments, the transposons of a transposome monomeric unit are linked to transposons of another transposome monomeric unit of the same transposome complex. In some embodiments, the transposase of a transposome monomeric unit is linked to the transposase of another transposome monomeric unit of the same transposome complex by a covalent bond. In some embodiments, the transposase of one monomeric unit is linked to the transposase of another transposome monomeric unit of the same transposome complex by a disulfide bond. In some embodiments, the transposons of a transposome monomeric unit are linked to transposons of another transposome monomeric unit of the same transposome complex by a covalent bond. In some embodiments, the contiguity information of a target nucleic acid sequence is indicative of haplotype information. In some embodiments, the contiguity information of a target nucleic acid sequence is indicative of genomic variants. In some embodiments, the genomic variants are selected from the group consisting of deletions, translocations, interchromosomal gene fusions, duplications, and paralogs. In some embodiments, the oligonucleotides immobilized on the solid support comprise a partially double stranded region and a partially single stranded region. In some embodiments, the partially single stranded region of the oligonucleotide comprises the second barcode sequence and the second primer binding sequence. In some embodiments, the target nucleic acid fragments comprising the barcodes are amplified prior to determining the sequence of the target nucleic acid fragments. In some embodiments, subsequent amplification is carried out in a single reaction compartment prior to determining the sequence of the target nucleic acid fragments. In some embodiments, a third barcode sequence is introduced to the target nucleic acid fragments during the amplification.
In some embodiments, the methods may further include combining the target nucleic acid fragments comprising the barcodes from a plurality of a first set of reaction compartments into a pool of target nucleic acid fragments comprising the barcodes; redistributing the pool of target nucleic acid fragments comprising the barcodes to a plurality of a second set of reaction compartments; and introducing a third barcode into the target nucleic acid fragments by amplifying the target nucleic acid fragments in the second set of reaction compartments prior to sequencing. In some embodiments, the methods may further include pre-fragmenting the target nucleic acid prior to contacting the target nucleic acid with transposome complexes. In some embodiments, the pre-fragmenting of the target nucleic acid is by a method selected from the group consisting of sonication and restriction digestion.
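The pooling and redistribution step above implies that each fragment ends up carrying a composite index formed from a barcode introduced in the first set of compartments and the third barcode introduced in the second set. The following minimal sketch only illustrates how such composite indexes multiply the available label space. The function names and the 96-by-96 compartment counts are hypothetical and are not taken from the disclosure.

```python
from itertools import product


def composite_indexes(first_barcodes, third_barcodes):
    """Return every (first, third) barcode pair reachable after two rounds.

    Each fragment is labeled once in a first-set compartment and once in a
    second-set compartment, so the label space is the Cartesian product.
    """
    return list(product(first_barcodes, third_barcodes))


def tag_fragment(first_barcode, third_barcode, insert):
    """Represent a fragment by its composite index and insert sequence."""
    return {"index": (first_barcode, third_barcode), "insert": insert}


if __name__ == "__main__":
    # Hypothetical plate sizes: 96 first-round and 96 second-round compartments.
    pairs = composite_indexes(range(96), range(96))
    print(len(pairs))
    print(tag_fragment("F07", "T42", "ACGTACGT"))
```

With these assumed plate sizes, two rounds of 96 barcodes each yield 96 x 96 = 9216 distinguishable composite indexes.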
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 illustrates a flow diagram of an example of a method of binding transposomes to a bead surface.
Figure 2 shows pictorially the steps of the method of Figure 1.
Figure 3 illustrates a schematic diagram of an example of a tagmentation process on a bead surface.
Figure 4 shows a data table of an example of the DNA yield in terms of cluster number from the bead-based tagmentation process of Figure 3.
Figure 5 shows a data table of another example of the reproducibility of the bead-based tagmentation process of Figure 3 in terms of uniform size.
Figures 6A and 6B show a plot of the insert size of pool 1 and a plot of the insert size of pool 2, respectively, of the indexed samples of Figure 5.
Figure 7 shows a bar graph of the reproducibility of total number of reads and percent reads aligned for the experiment described in Figure 5.
Figures 8A, 8B, and 8C show a plot of insert size in a control library, a plot of insert size in a bead-based tagmented library, and a summary data table, respectively, in the exome enrichment assay.
Figures 9A, 9B, and 9C show a bar graph of the fraction of dups PF, a bar graph of the fraction of selected bases, and a bar graph of PCT usable bases on target, respectively, in the exome enrichment assay.
Figure 10 illustrates a flow diagram of an example of a method of forming transposome complexes on a bead surface.
Figures 11, 12, and 13 show pictorially the steps of the method of Figure 10.
Figure 14 shows a schematic diagram of a tagmentation process using the transposome coated bead shown in Figure 13.
Figure 15 shows an exemplary scheme of forming transposomes on a solid support.
Figure 16 shows an exemplary scheme of making contiguously-linked libraries with unique indexes.
Figure 17 shows an exemplary scheme of making contiguously-linked libraries with unique indexes.
Figures 18 and 19 depict the capture of a single CPT-DNA on a single clonal indexed bead where the CPT-DNA wraps around the bead.
Figure 20 shows an exemplary scheme of linking a Y-adaptor immobilized on the solid surface to the tagmented DNA by ligation and gap filling.
Figure 21 shows an exemplary scheme of making such Y-adapters during the ligation of CPT-DNA to the immobilized oligonucleotides on the solid support.
Figure 22 depicts an agarose gel electrophoresis showing the removal of free transposome from contiguously-linked libraries by size exclusion chromatography.
Figure 23 shows an exemplary scheme of generating shotgun sequence library of a specific DNA fragment.
Figure 24 shows an exemplary scheme of assembling the sequence information from clonal indexed sequencing library.
Figure 25 shows the results of optimization of capture probe density on beads.
Figure 26 shows the results of testing the feasibility of preparing indexed sequencing libraries of CPT-DNA on beads by intra-molecular hybridization.
Figure 27 shows the results of testing the feasibility of clonal indexing.
Figure 28 depicts a graph showing the frequencies of sequencing reads for particular distances within (intra) and also between (inter) neighboring aligned islands of reads for template nucleic acid following tagmentation.
Figures 29A and 29B show exemplary approaches to derive contiguity information on solid support.
Figures 30 and 31 show the schematics of indexed clonal bead transposition in a single reaction vessel (one pot) and the results of the transposition.
Figure 32 shows the schematics of creating clonal transposomes on beads using 5'- or 3'-biotinylated oligonucleotides.
Figure 33 shows the library sizes for transposomes on beads.
Figure 34 shows the effect of transposome surface density on insertion size.
Figure 35 shows the effect of input DNA on the size distribution.
Figure 36 shows the island size and distribution using bead based and solution based tagmentation reactions.
Figure 37 shows clonal indexing of several individual DNA molecules, each receiving unique indexes.
Figure 38 shows a diagram of a device for separating plasma from whole blood.
Figures 39 and 40 show a diagram of a device for separating plasma and subsequent use of the separated plasma.
Figure 41 shows an exemplary scheme of targeted phasing by enriching specific regions of a genome.
Figure 42 shows an exemplary scheme of exome phasing using the SNPs between the exons.
Figure 43 shows an exemplary scheme of simultaneous phasing and methylation detection.
Figure 44 shows an alternative exemplary scheme of simultaneous phasing and methylation detection.
Fig. 45 shows an exemplary scheme to generate various sized libraries using various sized clonally indexed beads in a single assay.
Fig. 46 shows an exemplary scheme of determining genetic variants with different length scale libraries.
Figs. 47A and 47B show the results of detection of a 60 kb heterozygous deletion in chromosome 1.
Fig. 48 shows results of detection of gene fusion using the methods of the present application.
Fig. 49 shows results of detection of genetic deletions using the methods of the present application.
Fig. 50 shows ME sequences before and after bisulfite conversion.
Fig. 51 shows the results of bisulfite conversion efficiency optimization.
Fig. 52 shows the results after bisulfite conversion in IVC plot (intensity versus cycles per individual base).
Fig. 53 shows an image of agarose gel electrophoresis of indexed-linked libraries after PCR after BSC.
Fig. 54 shows the bioanalyzer trace of whole-genome indexed linked CPT-seq libraries before enrichment without size-selection.
Fig. 55 shows the agarose gel analysis of libraries after enrichment.
Fig. 56 shows the results of application of targeted haplotyping to the HLA region in the chromosome.
Fig. 57 shows some possible mechanisms of ME swapping.
Fig. 58 shows some possible mechanisms of ME swapping.
Fig. 59 shows a portion of Tn5 transposase with exemplary amino acid residues Asp468, Tyr407, Asp461, Lys459, Ser458, Gly462, Ala466, Met470 that can be substituted with Cys.
Fig. 60 shows a portion of Tn5 transposase with amino acid substitution of S458C, K459C and A466C, such that cysteine residues can form disulfide bond between two monomeric units.
Fig. 61 shows an exemplary scheme of making and using a dimer transposase (dTnp)-nanoparticle (NP) bioconjugate (dTnp-NP) using amine coated nanoparticle.
Fig. 62 shows an exemplary scheme of conjugation of transposome dimer to an amine coated solid support.
Fig. 63 shows a Mu transposome complex where transposon ends are linked.
Fig. 64 shows a diagram of indexed linked reads for assembly/phasing of pseudogenes and the advantage of identifying variants in pseudogenes using shorter fragments.
Fig. 65 shows a plot of index exchange from 4 separate experiments and shown as % of indexes swapped.
Fig. 66 shows Agilent BioAnalyzer analysis of fragment sizes of Ts-Tn5 titration.
Fig. 67 shows an exemplary scheme to improve DNA yield of the Epi-CPTSeq protocol using enzymatic methods for recovery of broken library elements after bisulfite treatment.
Figs. 68A-C show several exemplary schemes to improve DNA yield of the Epi-CPTSeq protocol using enzymatic methods for recovery of broken library elements after bisulfite treatment.
Fig. 69 shows an exemplary scheme for template rescue using random primer extension.
Fig. 70 shows the fragmentation of a DNA library during sodium bisulfite conversion. Left panel illustrates fragmentation during bisulfite conversion of a portion of DNA tagmented on magnetic beads. Right panel shows the BioAnalyzer traces of CPTSeq and Epi-CPTSeq (Me-CPTSeq) libraries.
Fig. 71 shows an exemplary scheme and the results of TdT mediated ssDNA ligation reaction.
Fig. 72 shows a scheme and the results of TdT mediated recovery of a sodium bisulfite converted bead bound library. Left panel illustrates the rescue workflow of a damaged bisulfite converted DNA library using a TdT mediated ligation reaction. Results of the DNA library rescue experiment are shown in the right panel.
Fig. 73 shows the results of Methyl-CPTSeq assay.
Fig. 74 shows an exemplary scheme of bead based bisulfite conversion of DNA.
Figs. 75A-B show the results of bisulfite conversion efficiency optimization.
DETAILED DESCRIPTION
Embodiments of the present invention relate to sequencing nucleic acids. In particular, embodiments of the methods and compositions provided herein relate to preparing nucleic acid templates and obtaining sequence data therefrom.
In one aspect, the present invention relates to methods of tagmenting (fragmenting and tagging) target nucleic acid on a solid support for the construction of a tagmented target nucleic acid library. In one embodiment, the solid support is a bead. In one embodiment, the target nucleic acid is DNA.
In one aspect, the present invention relates to methods and compositions of solid-support, transposase-based methods that can derive contiguity information of a target nucleic acid. In some embodiments, the compositions and the methods can derive assembly/phasing information.
In one aspect, the present invention relates to methods and compositions to derive contiguity information by means of capturing contiguously-linked, transposed, target nucleic acid onto a solid support.
In one aspect, the compositions and methods disclosed herein relate to analysis of genomic variants. Exemplary genomic variants include, but are not limited to, deletions, interchromosomal translocations, duplications, paralogs, and interchromosomal gene fusions. In some embodiments, the compositions and methods disclosed herein relate to determining phasing information of the genomic variants.
In one aspect, the compositions and methods disclosed herein relate to phasing specific regions of the target nucleic acid. In one embodiment, the target nucleic acid is DNA. In one embodiment, the target nucleic acid is genomic DNA. In some embodiments, the target nucleic acid is RNA. In some embodiments, the RNA is mRNA. In some embodiments, the target nucleic acid is complementary DNA (cDNA). In some embodiments, target nucleic acid is from a single cell. In some embodiments, target nucleic acid is from circulating tumor cells. In some embodiments, target nucleic acid is cell free DNA. In some embodiments, target nucleic acid is cell free tumor DNA. In some embodiments, target nucleic acid is from formalin fixed paraffin embedded tissue samples. In some embodiments, target nucleic acid is cross-linked target nucleic acid. In some embodiments, target nucleic acid is cross-linked to proteins. In some embodiments, target nucleic acid is cross-linked to nucleic acid. In some embodiments, target nucleic acid is histone-protected DNA. In some embodiments, histone-protected DNA is precipitated from a cell lysate using antibodies to histones and the histones are removed.
In some aspects, indexed libraries are created from the target nucleic acid using the clonally indexed beads. In some embodiments, the tagmented target nucleic acid, while the transposase is still bound to the target DNA can be captured using the clonally indexed beads. In some embodiments, specific capture probes are used to capture the specific region of interest in the target nucleic acid. The captured regions of the target nucleic acid can be washed at various stringencies and optionally amplified, followed by sequencing. In some embodiments, the capture probe may be biotinylated. The complex of the biotinylated capture probes hybridized to the specific regions of the indexed target nucleic acids can be separated by using streptavidin beads. Exemplary scheme of targeted phasing is shown in Fig. 41.
In some aspects, the compositions and methods disclosed herein can be used phasing exomes. In some embodiments, exons, promoters can be enriched. Markers, for example, heterozygous SNPs between exonic regions, can aid in phasing the exons, especially when the distance between exons is large. Exemplary exome phasing is shown in Fig. 42. In some embodiments, indexed linked reads cannot span (cover) heterozygous SNPs of neighboring exons simultaneously. As such, it is challenging to phase the two or more exons. The compositions and methods disclosed herein also enriches heterozygous SNPs between exons for example, phasing exons 1 to SNP1 and SNP2 to Exon 2. As such, through the use of SNP 1, exon 1 and exon 2 can be phased as shown in Fig. 42.
In one aspect, the compositions and methods disclosed herein can be used for phasing and simultaneous methylation detection. Methylation detection through bisulfite conversion (BSC) is challenging as the BSC reaction is harsh on DNA, fragmenting the DNA and therefore removing contiguity/phasing information. Also, methods disclosed in the present application has an additional advantage because no additional purification steps are required in contrast to those required in traditional BSC approaches, thereby improving the yield.
In one aspect, the compositions and methods disclosed herein can be used to prepare different size libraries in single assay. In some embodiment, different sizes of clonally indexed beads can be used to prepare different size libraries. Figure 1 illustrates a flow diagram of an example of a method 100 of binding transposomes to a bead surface. Transposomes may be bound to a bead surface using any chemistry that may be added on the transposon oligonucleotide, transposase, and solid-phase. In one example, transposomes are bound to a bead surface via a biotin-streptavidin binding complex. Method 100 includes, but is not limited to, the following steps.
In one embodiment, transposons may comprise sequencing primer binding sites. Exemplary sequences of the binding sites include, but are not limited to, AATGATACGGCGACCACCGAGATCTACAC (P5 sequence) and CAAGCAGAAGACGGCATACGAGAT (P7 sequence). In some embodiments, the transposons may be biotinylated.
At a step 110 of Figure 1, P5 and P7 biotinylated transposons are generated. The transposons may also include one or more index sequences (unique identifiers). Exemplary index sequences include, but are not limited to, TAGATCGC, CTCTCTAT, TATCCTCT, AGAGTAGA, GTAAGGAG, ACTGCATA, AAGGAGTA, and CTAAGCCT. In another example, only the P5 or only the P7 transposons are biotinylated. In yet another example, the transposons comprise only the mosaic end (ME) sequences or the ME sequences plus additional sequences that are not P5 and P7 sequences. In this example, P5 and P7 sequences are added in a subsequent PCR amplification step.
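As an illustration only, the exemplary index sequences listed above can be checked for mutual distinguishability. The following minimal Python sketch computes pairwise Hamming distances between the indexes; the choice of Hamming distance as a quality measure and the function names are assumptions made for this example and are not part of the described method.

```python
from itertools import combinations

# Exemplary index sequences quoted in the text above.
INDEXES = ["TAGATCGC", "CTCTCTAT", "TATCCTCT", "AGAGTAGA",
           "GTAAGGAG", "ACTGCATA", "AAGGAGTA", "CTAAGCCT"]


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_distance(indexes):
    """Smallest Hamming distance over all pairs of indexes."""
    return min(hamming(a, b) for a, b in combinations(indexes, 2))


if __name__ == "__main__":
    print("minimum pairwise distance:", min_pairwise_distance(INDEXES))
```

A larger minimum distance means that more sequencing errors are needed before one index can be mistaken for another.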
At a step 115 of Figure 1, the transposomes are assembled. The assembled transposomes are a mixture of P5 and P7 transposomes. A mixture of P5 and P7 transposomes is described in more detail with reference to Figures 11 and 12.
At a step 120 of Figure 1, P5/P7 transposome mixtures are bound to a bead surface. In this example, the beads are streptavidin coated beads and the transposomes are bound to the bead surface via a biotin-streptavidin binding complex. Beads can be of various sizes. In one example, the beads may be 2.8 μm beads. In another example, the beads may be 1 μm beads. A suspension (e.g., 1 μL) of 1 μm beads provides a large surface area per volume for transposome binding. Because of the available surface area for transposome binding, the number of tagmentation products per reaction is increased.
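The surface-area argument can be made concrete with simple geometry. The sketch below assumes ideal, smooth spherical beads and compares equal total bead volumes; both are simplifying assumptions for illustration and not measured properties of any particular bead product. For a sphere of diameter d, the surface area per unit bead volume is 6/d, so smaller beads expose more surface for the same amount of bead material.

```python
import math


def sphere_area_um2(diameter_um: float) -> float:
    """Surface area of an ideal sphere, in square micrometers."""
    return math.pi * diameter_um ** 2


def sphere_volume_um3(diameter_um: float) -> float:
    """Volume of an ideal sphere, in cubic micrometers."""
    return math.pi * diameter_um ** 3 / 6


def area_per_volume(diameter_um: float) -> float:
    """Surface area per unit bead volume (1/um); equals 6 / diameter."""
    return sphere_area_um2(diameter_um) / sphere_volume_um3(diameter_um)


if __name__ == "__main__":
    small, large = 1.0, 2.8  # bead diameters from the examples above
    ratio = area_per_volume(small) / area_per_volume(large)
    print(f"1 um beads offer {ratio:.1f}x the area of 2.8 um beads "
          "per unit bead volume")
```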
Figure 2 shows pictorially the steps 110, 115, and 120 of method 100 of Figure 1. In this example, the transposons are shown as duplexes. In another example (not shown), another structure such as a hairpin, i.e., a single oligonucleotide with regions of self-complementarity capable of forming a duplex, may be used.
At step 110 of method 100, a plurality of biotinylated P5 transposons 210a and a plurality of P7 transposons 210b are generated. P5 transposons 210a and P7 transposons 210b are biotinylated.
At step 115 of method 100, P5 transposons 210a and P7 transposons 210b are mixed with transposase Tn5 215 to form a plurality of assembled transposomes 220. At step 120 of method 100, transposomes 220 are bound to a bead 225. Bead 225 is a streptavidin coated bead. Transposomes 220 are bound to bead 225 via a biotin-streptavidin binding complex.
In one embodiment, a mixture of transposomes may be formed on a solid support such as bead surface as shown in Figures 10, 11, 12, and 13. In this example, P5 and P7 oligonucleotides are first bound to a bead surface prior to assembly of transposome complexes.
Figure 3 illustrates a schematic diagram of an example of a tagmentation process 300 on a bead surface. Shown in process 300 is bead 225 of Figure 2 with transposomes 220 bound thereon. A solution of DNA 310 is added to a suspension of beads 225. As DNA 310 contacts transposomes 220, the DNA is tagmented (fragmented and tagged) and is bound to beads 225 via transposomes 220. Bound and tagmented DNA 310 may be PCR amplified to generate a pool of amplicons 315 in solution (bead-free). Amplicons 315 may be transferred to the surface of a flow cell 320. A cluster generation protocol (e.g., a bridge amplification protocol or any other amplification protocol that may be used for cluster generation) may be used to generate a plurality of clusters 325 on the surface of flow cell 320. Clusters 325 are clonal amplification products of tagmented DNA 310. Clusters 325 are now ready for the next step in a sequencing protocol.
In another embodiment, the transposomes may be bound to any solid surface, such as the walls of a microfuge tube.
In another embodiment of forming a mixture of transposome complexes on a bead surface, oligonucleotides are first bound to a bead surface prior to transposome assembly. Figure 10 illustrates a flow diagram of an example of a method 1000 of forming transposome complexes on a bead surface. Method 1000 includes, but is not limited to, the following steps.
At a step 1010, P5 and P7 oligonucleotides are bound to a bead surface. In one example, the P5 and P7 oligonucleotides are biotinylated and the bead is a streptavidin coated bead. This step is also shown pictorially in schematic diagram 1100 of Figure 11. Referring now to Figure 11, a P5 oligonucleotide 1110 and a P7 oligonucleotide 1115 are bound to the surface of a bead 1120. In this example, a single P5 oligonucleotide 1110 and a single P7 oligonucleotide 1115 are bound to the surface of bead 1120, but any number of P5 oligonucleotides 1110 and/or P7 oligonucleotides 1115 may be bound to the surface of a plurality of beads 1120. In one example, P5 oligonucleotide 1110 comprises a P5 primer sequence, an index sequence (unique identifier), a read 1 sequencing primer sequence and a mosaic end (ME) sequence. In this example, P7 oligonucleotide 1115 comprises a P7 primer sequence, an index sequence (unique identifier), a read 2 sequencing primer sequence and an ME sequence. In another example (not shown), an index sequence is present in only P5 oligonucleotide 1110. In yet another example (not shown), an index sequence is present in only the P7 oligonucleotide 1115. In yet another example (not shown), an index sequence is absent in both P5 oligonucleotide 1110 and P7 oligonucleotide 1115.
At a step 1015, complementary mosaic end (ME') oligonucleotides are hybridized to the bead-bound P5 and P7 oligonucleotides. This step is also shown pictorially in schematic diagram 1200 of Figure 12. Referring now to Figure 12, complementary ME sequences (ME') 1125 are hybridized to P5 oligonucleotide 1110 and P7 oligonucleotide 1115. Complementary ME sequences (ME') 1125 (e.g., complementary ME sequences (ME') 1125a and complementary ME sequences (ME') 1125b) hybridize to the ME sequences in P5 oligonucleotide 1110 and P7 oligonucleotide 1115, respectively. Complementary ME sequence (ME') 1125 is typically about 15 bases in length and phosphorylated at its 5' end.
At a step 1020, transposase enzyme is added to the bead-bound oligonucleotides to form a mixture of bead-bound transposome complexes. This step is also shown pictorially in schematic diagram 1300 of Figure 13. Referring now to Figure 13, transposase enzyme is added to form a plurality of transposome complexes 1310. In this example, transposome complex 1310 is a duplex structure that comprises transposase enzyme, two surface-bound oligonucleotide sequences, and their hybridized complementary ME sequences (ME') 1125. For example, transposome complex 1310a comprises P5 oligonucleotide 1110 hybridized to complementary ME sequence (ME') 1125 and P7 oligonucleotide 1115 hybridized to complementary ME sequence (ME') 1125 (i.e., P5:P7); transposome complex 1310b comprises two P5 oligonucleotides 1110 hybridized to complementary ME sequences (ME') 1125 (i.e., P5:P5); and transposome complex 1310c comprises two P7 oligonucleotides 1115 hybridized to complementary ME sequences (ME') 1125 (i.e., P7:P7). The ratio of P5:P5, P7:P7, and P5:P7 transposome complexes may be, for example, 25:25:50, respectively.
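The exemplary 25:25:50 ratio is what would be expected if the two oligonucleotides in each complex are drawn independently from an equimolar pool of P5 and P7 oligonucleotides. The following sketch makes that assumption explicit (independent, unbiased pairing is an assumption of this example, not a statement about the actual assembly kinetics) and shows how the expected ratio shifts for other P5 fractions.

```python
def dimer_fractions(p5_fraction: float) -> dict:
    """Expected fractions of P5:P5, P7:P7 and P5:P7 complexes.

    Assumes each complex takes two oligonucleotides independently from
    a pool in which a fraction `p5_fraction` is P5 and the rest is P7.
    """
    if not 0.0 <= p5_fraction <= 1.0:
        raise ValueError("p5_fraction must be between 0 and 1")
    p = p5_fraction
    q = 1.0 - p
    return {"P5:P5": p * p, "P7:P7": q * q, "P5:P7": 2 * p * q}


if __name__ == "__main__":
    for name, value in dimer_fractions(0.5).items():
        print(f"{name}: {value:.0%}")
```

Under this model, only the P5:P7 complexes yield fragments carrying both adapter types, and their fraction is highest for an equimolar pool.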
Figure 14 shows an exemplary schematic diagram 1400 of a tagmentation process using the transposome coated bead 1120 of Figure 13. In this example, when bead 1120 with transposome complexes 1310 thereon is added to a solution of DNA 1410 in a tagmentation buffer, tagmentation occurs and the DNA is linked to the surface of bead 1120 via transposomes 1310. Successive tagmentation of DNA 1410 results in a plurality of bridged molecules 1415 between transposomes 1310. The length of bridged molecules 1415 may be dependent on the density of transposome complexes 1310 on the surface of bead 1120. In one example, the density of transposome complexes 1310 on the surface of bead 1120 may be tuned by varying the amount of P5 and P7 oligonucleotides bound to the surface of bead 1120 in step 1010 of method 1000 of Figure 10. In another example, the density of transposome complexes 1310 on the surface of bead 1120 may be tuned by varying the amount of complementary ME sequence (ME') hybridized to P5 and P7 oligonucleotides in step 1015 of method 1000 of Figure 10. In yet another example, the density of transposome complexes 1310 on the surface of bead 1120 may be tuned by varying the amount of transposase enzyme added in step 1020 of method 1000 of Figure 10.
The length of bridged molecules 1415 is independent of the quantity of beads 1120 with transposome complexes 1310 bound thereon used in a tagmentation reaction. Similarly, adding more or less DNA 1410 in a tagmentation reaction does not alter the size of the final tagmented product, but may affect the yield of the reaction.
In one example, bead 1120 is a paramagnetic bead. In this example, purification of the tagmentation reaction is readily achieved by immobilizing beads 1120 with a magnet and washing. Therefore, tagmentation and subsequent PCR amplification may be performed in a single reaction compartment ("one-pot" reaction).
In one aspect, the present invention relates to transposase-based methods and compositions that can derive contiguity information of a target nucleic acid on a solid support. In some embodiments, the compositions and the methods can derive assembly/phasing information. In one embodiment, the solid support is a bead. In one embodiment, the target nucleic acid is DNA. In one embodiment, the target nucleic acid is genomic DNA. In some embodiments, the target nucleic acid is RNA. In some embodiments, the RNA is mRNA. In some embodiments, the target nucleic acid is complementary DNA (cDNA).
In some embodiments, transposons may be immobilized as dimers to a solid support such as beads, followed by the binding of transposase to the transposons to form transposomes.
In some embodiments, particularly related to formation of transposomes on solid phases by solid-phase immobilized transposons and addition of transposase, two transposons may be immobilized in close proximity (preferably at a fixed distance) to one another on a solid support. There are several advantages to this approach. First, the two transposons will always be immobilized simultaneously, preferably with an optimum linker length and orientation of the two transposons to form transposomes efficiently. Second, transposome formation efficiency will not be a function of transposon density. Two transposons will always be available with the right orientation and distance between them to form transposomes. Third, with randomly immobilized transposons on surfaces, various distances are created between transposons, and therefore only a fraction has the optimum orientation and distance to form transposomes efficiently. As a consequence, not all transposons are converted into transposomes, and solid-phase immobilized non-complexed transposons will be present. These transposons are susceptible as a target to transposition because the ME part is double-stranded DNA. This could result in a reduction of transposition efficiency or create undesired side products. Thus, transposomes may be prepared on a solid support, which can subsequently be used to derive contiguity information through tagmentation and sequencing. An exemplary scheme is illustrated in Figure 15. In some embodiments, the transposons may be immobilized to the solid support by means other than chemical coupling. Exemplary methods of immobilizing transposons on the solid support may include, but are not limited to, affinity binding such as streptavidin-biotin, maltose-maltose binding protein, antigen-antibody, DNA-DNA or DNA-RNA hybridization.
In some embodiments, transposomes can be pre-assembled and then immobilized on a solid-support. In some embodiments, the transposons comprise unique indexes, barcodes, and amplification primer binding sites. Transposase can be added in solution comprising transposons to form transposome dimers, which can be immobilized on a solid support. In one embodiment, multiple bead sets can be generated in which each set has the same index derived from the immobilized transposons thus generating indexed beads. Target nucleic acid can be added to each set of indexed beads as shown in Figure 29A.
In some embodiments, target nucleic acid can be added to each set of indexed beads, and tagmentation and subsequent PCR amplification may be performed separately.
In some embodiments, target nucleic acid, indexed beads, and transposomes can be combined in droplets such that a number of droplets contain a single bead with one or more DNA molecules and adequate transposomes. In some embodiments, the indexed beads can be pooled, target nucleic acid can be added to the pool, and tagmentation and subsequent PCR amplification may be performed in a single reaction compartment ("one-pot").
In one aspect, the present invention relates to methods and compositions to derive contiguity information by means of capturing contiguously-linked, transposed, target nucleic acid onto a solid support. In some embodiments, contiguity preserving transposition (CPT) is carried out on the DNA, but the DNA is kept intact (CPT-DNA), thus making contiguously-linked libraries. Contiguity information can be preserved by the use of transposase to maintain the association of template nucleic acid fragments adjacent in the target nucleic acid. The CPT-DNA can be captured by hybridization of complementary oligonucleotides having unique indexes or barcodes and immobilized on a solid support, e.g., beads (Figure 29B). In some embodiments, the oligonucleotide immobilized on the solid support may further comprise primer binding sites and unique molecular indices (UMI), in addition to barcodes.
Advantageously, such use of transposomes to maintain physical proximity of fragmented nucleic acids increases the likelihood that fragmented nucleic acids from the same original molecule, e.g., chromosome, will receive the same unique barcode and index information from the oligonucleotides immobilized on a solid support. This will result in a contiguously-linked sequencing library with unique barcodes. The contiguously-linked sequencing library can be sequenced to derive contiguous sequence information.
Figures 16 and 17 show schematic representations of an exemplary embodiment of the above aspect of the invention of making contiguously-linked libraries with unique barcodes or indices. The exemplary method leverages ligation of the CPT-DNA with the immobilized oligonucleotides on the solid support comprising unique indexes and barcodes, and strand-replacement PCR, to generate a sequencing library. In one embodiment, clonal indexed beads may be generated with immobilized DNA sequences such as random or specific primer and index. Contiguously-linked libraries can be captured onto clonal-indexed beads by hybridization to the immobilized oligonucleotides followed by ligation. As intramolecular hybridization capture is much faster than intermolecular hybridization, contiguously-transposed libraries will "wrap" around the bead. Figures 18 and 19 depict the capture of the CPT-DNA on clonal indexed beads and the preservation of the contiguity information. Strand-replacement PCR can transfer the clonal bead index information to the individual molecule. Thus, each contiguously-linked library will be uniquely indexed.
In some embodiments, the oligonucleotide immobilized on a solid support can comprise a partially double stranded structure such that one strand is immobilized to the solid support and the other strand is partially complementary to the immobilized strand, resulting in a Y-adaptor. In some embodiments, the Y-adaptor immobilized on the solid surface is linked to the contiguously linked tagmented DNA by ligation and gap filling, as shown in Figure 20.
In some embodiments, the Y-adaptor is formed through hybridization capture of CPT-DNA with the probe/index on the solid support such as beads. Figure 21 shows an exemplary scheme of making such Y-adaptors. The use of these Y-adaptors ensures that potentially every fragment can become a sequencing library. This increases the coverage per sequencing.
In some embodiments, free transposomes may be separated from CPT-DNA. In some embodiments, the separation of the free transposomes is by size exclusion chromatography. In one embodiment, the separation may be achieved by MicroSpin S-400 HR Columns (GE Healthcare Life Sciences, Pittsburgh, PA). Figure 22 shows an agarose gel electrophoresis of the separation of CPT-DNA from the free transposomes.
Capturing contiguously-linked, transposed, target nucleic acid onto a solid support through hybridization has several unique advantages. First, the method is based on hybridization and not transposition. The intramolecular hybridization rate is much greater than the intermolecular hybridization rate. Thus, the chance that contiguously-transposed libraries on a single target DNA molecule wrap around a uniquely indexed bead is much higher than the chance that two or more different target DNA molecules wrap around a uniquely indexed bead. Second, DNA transposition and barcoding of the transposed DNA occur in two separate steps. Third, the challenges associated with active transposome assembly on beads and surface density optimization of transposons on solid surfaces can be avoided. Fourth, self-transposition products can be removed by column purification. Fifth, as contiguously linked, transposed DNA contains gaps, the DNA is more flexible and therefore puts less of a burden on transposition density (insert size) compared to methods that immobilize transposomes on beads. Sixth, the method can be used with combinatorial barcoding schemes. Seventh, it is easy to covalently link indexed oligos to the beads. Thus, there is less chance for index exchange. Eighth, the tagmentation and subsequent PCR amplification may be multiplexed and can be performed in a single reaction compartment ("one-pot" reaction), eliminating the need to carry out individual reactions for each index sequence.
In some embodiments, a plurality of unique barcodes throughout the target nucleic acid may be inserted during transposition. In some embodiments, each barcode includes a first barcode sequence and a second barcode sequence, having a fragmentation site disposed therebetween. The first barcode sequence and second barcode sequence can be identified or designated to be paired with one another. The pairing can be informative so that a first barcode is associated with a second barcode. Advantageously, the paired barcode sequences can be used to assemble sequencing data from the library of template nucleic acids. For example, identifying a first template nucleic acid comprising a first barcode sequence and a second template nucleic acid comprising a second barcode sequence that is paired with the first indicates that the first and second template nucleic acids represent sequences adjacent to one another in a sequence representation of the target nucleic acid. Such methods can be used to assemble a sequence representation of a target nucleic acid de novo, without the requirement of a reference genome.
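The way paired barcodes indicate adjacency can be sketched in a few lines of Python. In this illustrative model, each fragment records the barcode found at its upstream and downstream ends, and a lookup table gives, for each first barcode sequence, its designated second barcode sequence. The `Fragment` structure, the barcode names, and the lookup table are hypothetical constructs for this example only.

```python
from typing import Dict, List, NamedTuple, Optional


class Fragment(NamedTuple):
    name: str
    left: Optional[str]   # barcode at the upstream end (None at a molecule end)
    right: Optional[str]  # barcode at the downstream end (None at a molecule end)


def order_fragments(fragments: List[Fragment],
                    pairs: Dict[str, str]) -> List[str]:
    """Order fragments of one molecule using paired barcodes.

    `pairs` maps a first barcode (right end of the upstream fragment)
    to its partner second barcode (left end of the downstream fragment).
    """
    by_left = {f.left: f for f in fragments if f.left is not None}
    partner_lefts = set(pairs.values())
    # The first fragment is the one whose left end is not a paired barcode.
    starts = [f for f in fragments if f.left not in partner_lefts]
    if len(starts) != 1:
        raise ValueError("expected exactly one starting fragment")
    ordered = [starts[0]]
    while True:
        nxt = by_left.get(pairs.get(ordered[-1].right))
        if nxt is None or nxt in ordered:
            break
        ordered.append(nxt)
    return [f.name for f in ordered]


if __name__ == "__main__":
    frags = [Fragment("f2", "b1'", "b2"),
             Fragment("f3", "b2'", None),
             Fragment("f1", None, "b1")]
    print(order_fragments(frags, {"b1": "b1'", "b2": "b2'"}))
```

The ordering is recovered from barcode pairing alone, without reference to the fragment sequences themselves, which is why no reference genome is needed.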
In one aspect, the present invention relates to methods and compositions to generate a shotgun sequence library of a specific DNA fragment.
In one embodiment, clonal indexed beads are generated with immobilized oligonucleotide sequences: random or specific primer and unique indexes. Target nucleic acid is added to the clonal indexed beads. In some embodiments, the target nucleic acid is DNA. In one embodiment, the target DNA is denatured. The target DNA hybridizes with primers comprising unique indexes immobilized on the solid surface (e.g., bead) and subsequently with other primers with the same index. The primers on the bead amplify the DNA. One or more further rounds of amplification may be carried out. In one embodiment, the amplification may be carried out by whole genome amplification using bead immobilized primers with a 3' random n-mer sequence. In a preferred embodiment, the random n-mer contains pseudocomplementary bases (2-thiothymine, 2-amino dA, N4-ethyl cytosine, etc.) to prevent primer-primer interaction during amplification (Hoshika, S; Chen, F; Leal, NA; Benner, SA, Angew. Chem. Int. Ed. 49(32) 5554-5557 (2010)). Figure 23 shows an exemplary scheme of generating a shotgun sequence library of a specific DNA fragment. A clonal indexed sequencing library of the amplified product can be generated. In one embodiment, such a library can be generated by transposition. Sequence information of the clonal indexed library can be used to assemble the contiguous information using the index information as a guide. Figure 24 shows an exemplary scheme of assembling the sequence information from a clonal indexed sequencing library.
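Using the index as a guide for assembly amounts to partitioning the reads by bead index before ordering them. A minimal sketch follows; the read tuple layout (index, mapped start position, sequence) and the example values are hypothetical and serve only to illustrate the grouping step.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# (bead index, hypothetical mapped start position, read sequence)
Read = Tuple[str, int, str]


def group_reads_by_index(reads: Iterable[Read]) -> Dict[str, List[Tuple[int, str]]]:
    """Collect reads sharing an index and sort each group by position."""
    groups = defaultdict(list)
    for index, start, seq in reads:
        groups[index].append((start, seq))
    return {index: sorted(members) for index, members in groups.items()}


example_reads = [
    ("TAGATCGC", 300, "GGA"),
    ("CTCTCTAT", 120, "TTC"),
    ("TAGATCGC", 100, "ACG"),
]

if __name__ == "__main__":
    for index, members in group_reads_by_index(example_reads).items():
        print(index, members)
```

Reads in the same group are candidates for having originated from the same template molecule on one bead.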
The methods of the above embodiments have several advantages. Intra-molecular amplification on a bead is much faster than inter-bead amplification. Thus, the products on a bead will have the same index. A shotgun library of a specific DNA fragment can be created. Random primers amplify the template at random locations, and therefore a shotgun library with the same index can be generated from a specific molecule and the sequence information can be assembled using the indexed sequence. A significant advantage of the methods of the above embodiments is that the reactions can be multiplexed in a single reaction (one-pot reaction) and will not require using many individual wells. Many indexed clonal beads can be prepared so that many different fragments can be uniquely labeled, and the parental alleles for the same genomic regions can be discriminated. With a high number of indexes, the chance that the DNA copy of the father and the copy of the mother will receive the same index for the same genomic region is low. The method takes advantage of the fact that intramolecular reactions are much faster than intermolecular reactions; the beads effectively generate a virtual partition in a larger physical compartment.
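The statement about parental copies can be quantified under a simple model. The sketch below assumes that each molecule receives one of N indexes uniformly and independently at random; this uniform-assignment model and the example index counts are assumptions for illustration. Under the model, the maternal and paternal copies of a region share an index with probability 1/N, and the general case for k molecules covering the same region follows the familiar birthday-problem calculation.

```python
def same_index_probability(n_indexes: int) -> float:
    """Probability that two given molecules receive the same index."""
    if n_indexes < 1:
        raise ValueError("n_indexes must be positive")
    return 1.0 / n_indexes


def any_collision_probability(n_molecules: int, n_indexes: int) -> float:
    """Probability that at least two of `n_molecules` share an index."""
    p_all_unique = 1.0
    for i in range(n_molecules):
        p_all_unique *= max(n_indexes - i, 0) / n_indexes
    return 1.0 - p_all_unique


if __name__ == "__main__":
    for n in (96, 1536, 150000):  # hypothetical index counts
        print(n, same_index_probability(n))
```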
In some embodiments of all of the above aspects of the invention, the method may be used for cell free DNA (cfDNA) in cfDNA assays. In some embodiments, the cfDNA is obtained from plasma or placental fluids.
In one embodiment, the plasma can be obtained from undiluted whole blood using a membrane based, sedimentation assisted plasma separator (Liu et al. Anal Chem. 2013 Nov 5;85(21):10463-70). In one embodiment, the plasma collection zone of the plasma separator may comprise a solid support comprising transposomes. The solid support comprising transposomes may capture the cfDNA from the isolated plasma as it is separated from the whole blood and can concentrate the cfDNA and/or tagment the DNA. In some embodiments, the tagmentation will further introduce unique barcodes to allow subsequent demultiplexing after sequencing of the pool of libraries.
In some embodiments, the collection zone of the separator may comprise PCR master mix (primers, nucleotides, buffers, metals) and polymerase. In one embodiment, the master mix can be in dry form such that it will be reconstituted as the plasma comes out of the separator. In some embodiments, the primers are random primers. In some embodiments, the primers can be specific primers for a particular gene. PCR amplification of the cfDNA will result in the generation of a library directly from the separated plasma.
In some embodiments, the collection zone of the separator may comprise RT-PCR master mix (primers, nucleotides, buffers, metals), reverse transcriptase and polymerase. In some embodiments the primers are random primers or oligo dT primers. In some embodiments, the primers can be specific primers for a particular gene. The resulting cDNA can be used for sequencing. Alternatively, the cDNA can be treated with transposomes immobilized on a solid support for sequence library preparation.
In some embodiments, the plasma separator may comprise barcodes (1D or 2D barcodes). In some embodiments, the separation device may comprise a blood collection device. This would result in direct delivery of the blood to the plasma separator and library prep device. In some embodiments, the device may comprise a downstream sequence analyzer. In some embodiments, the sequence analyzer is a single use sequencer. In some embodiments, the sequencer is capable of queuing samples before sequencing in a batch. Alternatively, the sequencer may have random access capability, where samples are delivered to their sequencing area.
In some embodiments, the collection zone for plasma may comprise silica substrates, such that the cell free DNA is concentrated.
Simultaneous phasing and methylation detection
The 5-methyl Cytosine (5-Me-C) and 5-hydroxymethyl Cytosine (5-hydroxy-C), also known as epi-modifications, play an important role in cellular metabolism, differentiation and cancer development. The inventors of the present application have surprisingly and unexpectedly found that phasing and simultaneous methylation detection is possible using the methods and compositions of the present application. The present methods allow CPT-seq on beads (indexed contiguity-linked libraries) to be combined with DNA methylation detection. For example, individual libraries generated on beads can be treated with bisulfite, converting non-methylated Cs, but not methylated Cs, to Us, allowing the detection of 5-Me-C. Through additional phasing analysis using heterozygous SNPs, epi-modification phasing blocks can be established in the multi-megabase range.
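The logic of bisulfite-based 5-Me-C detection can be illustrated with a toy string model. The sketch assumes complete conversion of every non-methylated C, no conversion of methylated C, and that U is read as T after amplification; the function names and the example sequence are hypothetical and the model ignores incomplete conversion and sequencing errors.

```python
from typing import Set


def bisulfite_convert(seq: str, methylated_positions: Set[int]) -> str:
    """Convert non-methylated C to U; methylated C is left unchanged."""
    return "".join(
        "U" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(seq)
    )


def as_sequenced(converted: str) -> str:
    """After amplification, U is read as T."""
    return converted.replace("U", "T")


def call_methylation(reference: str, read: str) -> Set[int]:
    """Positions where the reference has C and the read still shows C."""
    return {i for i, (r, c) in enumerate(zip(reference, read))
            if r == "C" and c == "C"}


if __name__ == "__main__":
    ref = "ACGTCCGA"
    read = as_sequenced(bisulfite_convert(ref, {1}))
    print(read, call_methylation(ref, read))
```

Because every read from a bead also carries the bead index, methylation calls made this way can be assigned to the same phasing block as the SNPs observed under that index.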
In some embodiments, the size of the DNA analyzed can be from about one hundred bases to multiple megabases. In some embodiments, the size of the DNA analyzed can be about 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1200, 1300, 1500, 2000, 3000, 3500, 4000, 4500, 5000, 5500, 6000, 6500, 7000, 7500, 8000, 8500, 9000, 9500, 10,000, 10,500, 11,000, 11,500, 12,000, 12,500, 13,000, 14,000, 14,500, 15,000, 15,500, 16,000, 16,500, 17,000, 17,500, 18,000, 18,500, 19,000, 19,500, 20,000, 20,500, 21,000, 21,500, 22,000, 22,500, 23,000, 23,500, 24,000, 24,500, 25,000, 25,500, 26,000, 26,500, 27,000, 27,500, 28,000, 28,500, 29,500, 30,000, 30,500, 31,000, 31,500, 32,000, 33,000, 34,000, 35,000, 36,000, 37,000, 38,000, 39,000, 40,000, 42,000, 45,000, 50,000, 55,000, 60,000, 65,000, 70,000, 75,000, 80,000, 85,000, 90,000, 95,000, 100,000, 110,000, 120,000, 130,000, 140,000, 150,000, 160,000, 170,000, 180,000, 200,000, 225,000, 250,000, 300,000, 350,000, 400,000, 450,000, 500,000, 550,000, 600,000, 650,000, 700,000, 750,000, 800,000, 850,000, 900,000, 1,000,000, 1,250,000, 1,500,000, 2,000,000, 2,500,000, 3,000,000, 4,000,000, 5,000,000, 6,000,000, 7,000,000, 8,000,000, 9,000,000, 10,000,000, 15,000,000, 20,000,000, 30,000,000, 40,000,000, 50,000,000, 75,000,000, 100,000,000 or more bases.
Other epi-modifications such as 5-hydroxy-C, DNA oxidation products, DNA alkylation products, histone footprinting, etc. can also be analyzed in the context of phasing using the disclosed methods and compositions of the present application.
In some embodiments, DNA is first transformed into indexed-linked libraries on a solid support. Individual indexed libraries are much smaller than the original DNA and are therefore less prone to fragmentation. Even if a small fraction of indexed libraries are lost, phasing information is still maintained across the long span of the indexed DNA molecule. For example, if a 100 kb molecule in traditional bisulfite conversion (BSC) is fragmented in half, the contiguity is now restricted to 50 kb. In the methods disclosed herein, a 100 kb library is first indexed, and even if a fraction of individual libraries are lost, contiguity is still at ~100 kb (except in the unlikely event that all libraries lost are from one end of the DNA molecule). The methods disclosed in the present application also have an additional advantage because no additional purification steps are required, in contrast to those required in traditional bisulfite conversion approaches, thereby improving the yield. In the methods of the present application, the beads are simply washed after bisulfite conversion. Additionally, while DNA is bound to a solid phase, buffer exchanges can be readily performed with minimal loss of DNA (indexed libraries) and reduced hands-on time.
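The 100 kb example can be worked through numerically. In the sketch below, an indexed molecule is represented by the start positions of its library elements, and the contiguity span is the distance between the outermost surviving elements. The 1 kb element spacing and the particular elements marked as lost are hypothetical values chosen only to illustrate the argument.

```python
from typing import Iterable, Sequence


def surviving_span(positions: Sequence[int], lost: Iterable[int]) -> int:
    """Span (in bases) between the outermost surviving library elements.

    `positions` are start coordinates of indexed library elements on one
    molecule; `lost` holds the list indices of elements that were lost.
    """
    lost_set = set(lost)
    kept = [p for i, p in enumerate(positions) if i not in lost_set]
    return max(kept) - min(kept) if kept else 0


# Hypothetical 100 kb molecule with one library element every 1 kb.
POSITIONS = list(range(0, 100001, 1000))

if __name__ == "__main__":
    # Scattered losses leave the contiguity span unchanged.
    print(surviving_span(POSITIONS, {10, 11, 50, 75}))
    # Losses clustered at one end shorten the span only slightly.
    print(surviving_span(POSITIONS, {0, 1, 2}))
```

By contrast, a single break in the middle of a non-indexed 100 kb molecule limits contiguity to 50 kb, as noted above.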
An exemplary scheme of simultaneous phasing and methylation detection is shown in Fig. 43. The workflow consists of tagmentation of DNA on beads, gap-fill ligation of the 9-bp repeat regions, removal of Tn5 with SDS, and bisulfite conversion of the individual libraries on the beads. The bisulfite conversion is performed under denaturing conditions to ensure that neighboring complementary libraries do not re-anneal, which would reduce the bisulfite conversion efficiency. BSC converts non-methylated C's to U's, whereas methylated C's are not converted.
Figure 44 shows an alternative exemplary scheme of simultaneous phasing and methylation detection. After sequencing libraries are prepared by transposition, a fraction of the gap-filled-ligated libraries are degraded in order to prepare single-stranded templates. Single-stranded templates need milder conditions for bisulfite conversion since the templates are already single-stranded, which could reduce library loss or improve bisulfite conversion efficiency. In one embodiment, a mixture of 3' thio-protected transposons (Exo resistant) and non-protected transposons is used on the same bead. Enzymes, for example, Exo I, can be used to digest the non-thio-protected libraries, converting them to single-stranded libraries. Using a 50:50 mixture of thio-protected transposons to non-protected transposons, 50% of the libraries will be converted to single-stranded libraries (one transposon of the library is protected and the one on the complementary strand is not), 25% will not be converted (both transposons are thio-protected), and 25% will have both strands digested, removing the whole library (neither transposon is protected).
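The 50%/25%/25% split can be reproduced by enumerating the protection state of the two strands of a library element. The sketch below assumes that each strand independently carries a thio-protected transposon with a probability equal to the protected fraction in the mixture; independence is an assumption of this illustration. It also shows how the outcome would shift for mixtures other than 50:50.

```python
from itertools import product


def library_fates(protected_fraction: float) -> dict:
    """Expected fractions of library fates after exonuclease digestion.

    Each of the two strands is protected independently with probability
    `protected_fraction`. Two protected strands stay double-stranded,
    one protected strand yields a single-stranded library, and no
    protected strand means the library is removed.
    """
    if not 0.0 <= protected_fraction <= 1.0:
        raise ValueError("protected_fraction must be between 0 and 1")
    weight = {True: protected_fraction, False: 1.0 - protected_fraction}
    names = {2: "double-stranded", 1: "single-stranded", 0: "removed"}
    fates = {name: 0.0 for name in names.values()}
    for strand_a, strand_b in product((True, False), repeat=2):
        n_protected = int(strand_a) + int(strand_b)
        fates[names[n_protected]] += weight[strand_a] * weight[strand_b]
    return fates


if __name__ == "__main__":
    print(library_fates(0.5))
```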
One challenge to performing bisulfite conversion of DNA bound to a solid phase, such as streptavidin magnetic beads, is that extended treatment of bead-bound DNA with sodium bisulfite at high temperatures damages both the DNA and the beads. To help ameliorate DNA damage, carrier DNA (e.g., Lambda DNA) is added to the reaction mixture prior to bisulfite treatment. Even in the presence of carrier DNA, it has been estimated that approximately 80% of starting DNA is lost. As a result, the resulting contiguity blocks have fewer members than those in the traditional CPTSeq protocol.
Therefore, several strategies are proposed herein to improve DNA yield of the Epi-CPTSeq protocol. The first strategy relies on decreasing library insert size by more densely populating the streptavidin beads with transposome complexes. By decreasing library size, a smaller proportion of library elements is degraded by bisulfite treatment.
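One way to see why shorter inserts fare better is a toy damage model in which each base is independently hit with a small probability during bisulfite treatment. This model and its per-base rate are assumptions introduced here for illustration only; the application does not specify a damage model or rate.

```python
def intact_fraction(insert_length_bp, per_base_damage=0.001):
    """Fraction of library elements with no damaged base under the toy model.

    Assumes independent damage events with probability per_base_damage
    per base (a hypothetical value, not a measured one).
    """
    return (1.0 - per_base_damage) ** insert_length_bp


for length in (100, 300, 1000):
    print(length, round(intact_fraction(length), 3))
```

Under this assumption the intact fraction falls geometrically with insert length, so a denser transposome loading that shortens inserts leaves a larger share of elements amplifiable.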
The second strategy to improve DNA yield of the Epi-CPTSeq protocol is enzymatic recovery of broken library elements. The purpose of the recovery strategy is to add the 3' common sequence necessary for library amplification back to the bead-bound library elements that became digested and lost their 3' portion during bisulfite treatment. After the addition of the 3' common sequence, these elements can be PCR amplified and sequenced. Figures 67 and 68 show an exemplary scheme of this strategy. Double-stranded CPTSeq library elements have been denatured and bisulfite converted (top panel). During bisulfite conversion, one of the DNA strands has been damaged (middle panel), leading to loss of the PCR common sequence on the 3' end. Template rescue strategies restore the 3' common sequence (green) necessary for PCR amplification (bottom panel). In one example, a terminal transferase (TdT) is used in the presence of a 3' phosphorylated attenuator oligo, a sequence containing a sequencing adapter followed by an oligo dT stretch (Figure 68A). Briefly, TdT adds a stretch of 10 to 15 dAs to the 3' end of a broken library element, which anneals to the oligo dT portion of the attenuator oligo. Formation of this DNA hybrid stops the TdT reaction and provides a template for subsequent extension of the 3' end of the broken library element by DNA polymerase.
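At the sequence level, the TdT/attenuator rescue described above can be sketched as two string operations: add a dA tail, then copy the adapter portion of the attenuator oligo onto the 3' end. The function names and example sequences are hypothetical, and the sketch ignores reaction kinetics entirely.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def rescue_broken_element(broken, attenuator_adapter, tail_length=12):
    """Model TdT tailing followed by polymerase extension.

    broken: broken library element, 5' to 3'.
    attenuator_adapter: adapter portion of the attenuator oligo, 5' to 3'
        (the oligo dT stretch that follows it is implied).
    tail_length: number of dAs added by TdT (10 to 15 per the description).
    """
    if not 10 <= tail_length <= 15:
        raise ValueError("tail_length outside the 10-15 dA range")
    tailed = broken + "A" * tail_length
    # The polymerase copies the adapter, so its reverse complement is appended.
    return tailed + reverse_complement(attenuator_adapter)


print(rescue_broken_element("GATTACA", "AAGC"))
```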
In an alternative workflow (Figure 68B), the TdT tailing reaction is performed in the presence of a partially double-stranded attenuator oligo, containing a single-stranded oligo dT portion and a 5' phosphorylated double-stranded sequencing adapter portion. Upon termination of the TdT reaction, the nick between the last added dA and the 5' phosphorylated attenuator oligo is sealed by DNA ligase.
Both of the described workflows rely on a controllable TdT tailing reaction recently developed and described in US Patent Application Publication 20150087027. A common sequencing adapter can also be added to the 3' end of broken library elements by a recently introduced ssDNA template switching activity of MMLV RT. In short, MMLV RT and a template switch oligo (TS oligo) are added to damaged DNA (Figure 68C). In the first step of this reaction, reverse transcriptase adds a few additional nucleotides to the 3' ends of a single-stranded DNA fragment, and these bases pair with an oligo (N) sequence presented at the 3' end of one of the TS oligos. Then, the reverse transcriptase template switching activity adds the sequences of the annealed common primers to the 3' end of the BSC-broken library element, restoring its ability to be amplified in PCR with common sequencing primers.
As a part of the third strategy, the "post-bisulfite conversion" library construction method of Epicentre's EpiGenome kit can be used to rescue library elements which lost their common sequences at the 3' end during bisulfite conversion. As shown in Figure 69, this library rescue method utilizes 3' phosphorylated oligos with common sequences followed by a short stretch of random sequence. These short random sequences hybridize to the bisulfite-treated single-stranded DNA, and the common sequences are subsequently copied to the broken library strand by DNA polymerase.
Figure 74 shows the fourth strategy to improve the bisulfite sequencing methods on beads.
A first common sequence comprising a capture tag is covalently attached to the 5' ends of DNA. The first common sequence can be attached to DNA using various methods, including single-sided transposition (as pictured), adapter ligation, or terminal transferase (TdT) adapter ligation as described in US Patent Application Publication 20150087027.
Next, DNA is denatured (e.g., by incubation at high heat) and bound to a solid support. If biotin is used as a capture tag on CS1, for example, DNA can be bound using streptavidin magnetic beads (as pictured). Once the DNA is bound to the solid support, buffer exchanges can be readily made.
In the next step, bisulfite conversion of ssDNA is performed. In the single stranded form, DNA should be readily accessible for bisulfite conversion; up to 95% conversion efficiencies have been observed using a modified version of Promega's Methyl Edge BSC kit (Figure 75).
After bisulfite conversion, a second common sequence is covalently attached to the 3' end of ssDNA attached to the solid support. Several methods have been described above to covalently attach oligos to ssDNA. Using the TdT attenuator/adapter ligation method, ligation efficiencies of >95% have been achieved. As a result, final library yields using the proposed MethylSeq workflow should be greater than existing methods.
In the final step, PCR is performed to amplify the library and remove it from the solid support. PCR primers can be designed to add additional common sequences, such as sequencing adapters, to the ends of the MethylSeq library.
Preparation of different size libraries in a single assay
The accuracy of the assembly of genomes is contingent on the use of technologies spanning different length scales. For example, shotgun (100s of bp), mate-pair (~3 kb), and Hi-C (Mb-scale) are all methods that sequentially improve assemblies and contig lengths. The challenge is that multiple assays are required to accomplish this, making the multi-layered approach cumbersome and costly. The compositions and methods disclosed herein can address multiple length scales in a single assay. In some embodiments, library preparation can be achieved in a single assay using differentially sized solid supports, for example, beads. Each bead size will generate a specific library size or range of sizes, with the physical size of the bead determining the library size. The various sized beads all have unique clonal indices that are transferred to the library. As such, libraries of different sizes are generated, with each library length scale uniquely indexed. The various length-scale libraries are prepared simultaneously in the same physical compartment, reducing cost and improving overall workflow. In some embodiments, each specific solid support size, for example, bead size, receives a unique index. In some other embodiments, multiple different indexes of the same solid support size, for example, bead size, are also prepared so that multiple DNA molecules can be index partitioned for that size range. Fig. 45 shows an exemplary scheme to generate various sized libraries using various sized clonally indexed beads in a single assay.
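Because each bead size carries its own clonal indices, reads from a pooled assay can in principle be sorted back into length scales by index alone. The following is a minimal sketch of that bookkeeping; the index-to-scale table, the scale labels, and the read names are hypothetical.

```python
from collections import defaultdict

# Hypothetical table: clonal bead index -> length scale of the bead that carried it.
INDEX_TO_SCALE = {
    "TATAGCCT": "short",
    "ATAGAGGC": "short",
    "CCTATCCT": "long",
}


def partition_reads(reads):
    """Group (index, read) pairs by length scale, then by index.

    Reads whose index is not in the table are skipped.
    """
    grouped = defaultdict(lambda: defaultdict(list))
    for index, read in reads:
        scale = INDEX_TO_SCALE.get(index)
        if scale is not None:
            grouped[scale][index].append(read)
    return {scale: dict(by_index) for scale, by_index in grouped.items()}


reads = [("TATAGCCT", "r1"), ("CCTATCCT", "r2"), ("TATAGCCT", "r3"), ("NNNNNNNN", "r4")]
print(partition_reads(reads))
```

Several indexes mapping to the same scale, as in the table above, corresponds to the embodiment in which multiple DNA molecules are index partitioned within one size range.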
In some embodiments, the size of the libraries generated are about 50, 75, 100, 150, 200, 250, 300, 350, 400, 500, 600, 700, 800, 900, 1000, 1200, 1300, 1500, 2000, 3000, 3500, 4000, 4500, 5000, 5500, 6000, 6500, 7000, 7,500, 8000, 8500, 9000, 9500, 10,000, 10,500, 11,000, 11,500, 12,000, 12500, 13000, 14000, 14500, 15000, 15500, 16000, 16500, 17000, 17,500, 18,000, 18,500, 19,000, 19,500, 20,000, 20,500, 21,000, 21,500, 22,000, 22,500, 23,000, 23,500, 24,000, 24,500, 25,000, 25,500, 26,000, 26,500, 27,000, 27,500, 28,000, 28,500, 29,500, 30,000, 30,500, 31,000, 31,500, 32,000, 33,000, 34,000, 35,000, 36,000, 37,000, 38,000, 39,000, 40,000, 42,000, 45,000, 50,000, 55,000, 60,000, 65,000, 70,000, 75,000, 80,000, 85,000, 90,000, 95,000, 100,000, 110,000, 120,000, 130,000, 140,000, 150,000, 160,000, 170,000, 180,000, 200,000, 225,000, 250,000, 300,000, 350,000, 400,000, 450,000, 500,000, 550,000, 600,000, 650,000, 700,000, 750,000, 800,000, 850,000, 900,000, 1,000,000, 1,250,000, 1,500,000, 2,000,000, 2,500,000, 3,000,000, 4,000,000, 5,000,000, 6,000,000, 7,000,000, 8,000,000, 9,000,000, 10,000,000, 15,000,000, 20,000,000, 30,000,000, 40,000,000, 50,000,000, 75,000,000, 100,000,000 or more bases.
In some embodiments, the multiple length scale libraries discussed above can be used in the assembly of pseudogenes, paralogs, etc., instead of having one large length scale. In some embodiments, multiple length scale libraries are prepared simultaneously in a single assay. The advantage is that at least one length scale will link a unique region with only the pseudogene or only the gene, but not both. As such, variants detected with this length scale can be uniquely assigned to either the gene or the pseudogene. The same holds true for copy number variants, paralogs, etc. The strength of assembly is the use of different length scales. Using the methods disclosed herein, different length scale indexed linked libraries can be generated in a single assay instead of individual, different library preparations for different length scales. Fig. 46 shows an exemplary scheme of determining genetic variants with different length scale libraries.
Analysis of Genomic Variants
The compositions and methods disclosed herein relate to analysis of genomic variants. Exemplary genomic variants include, but are not limited to, deletions, interchromosomal translocations, duplications, paralogs, and interchromosomal gene fusions. In some embodiments, the compositions and methods disclosed herein relate to determining phasing information of the genomic variants. The table below shows exemplary interchromosomal gene fusions.
Table 1: Interchromosomal Gene Fusions
[Table 1 is provided as an image (imgf000033_0001) in the published application.]
Table 2 shows exemplary deletions in chromosome 1.
Table 2: Exemplary deletions in chromosome 1
[Table 2 is provided as an image (imgf000034_0001) in the published application.]
In some embodiments, target nucleic acid can be fragmented prior to exposing it to transposomes. Exemplary fragmentation methods include, but are not limited to, sonication, mechanical shearing, and restriction digestion. Fragmentation of target nucleic acid prior to tagmentation (fragmentation and tagging) is advantageous for assembly/phasing of pseudogenes (e.g., CYP2D6). Long islands (>30kb) of indexed linked reads will span the pseudogenes A and A' as shown in Figure 64. Due to high sequence homology, it will be challenging to determine which variant belongs to Gene A and which to Gene A'. Shorter islands will link one variant of the pseudogenes with unique surrounding sequences. Such shorter islands can be achieved by fragmenting the target nucleic acid prior to tagmentation.
Linked Transposomes
In some embodiments, transposases are multimeric in a transposome complex, e.g., they form dimers, tetramers, etc. in a transposome complex. Inventors of the present application have surprisingly and unexpectedly found that linking the monomer transposases in a multimeric transposome complex, or linking the transposon ends of a transposome monomer in a multimeric transposome complex, has several advantages. First, the linking of the transposases or the transposons leads to complexes that are more stable and of which a large fraction is in an active state. Second, lower concentrations of transposomes can potentially be used in the fragmentation by transposition reaction. Third, the linking leads to lower exchange of the mosaic ends (ME) of transposome complexes, and thus less mixing of barcodes or adaptor molecules. Such swapping of ME ends is possible if the complexes fall apart and reform; or, where transposomes are immobilized on a solid support by streptavidin/biotin, if the streptavidin/biotin interaction breaks and reforms; or when there is possible contamination. Inventors of the present application noted that there is a significant swap or exchange of ME ends under various reaction conditions. In some embodiments, the exchange can be as high as 15%. The exchange is pronounced in high salt buffer and is reduced in glutamate buffer. Figures 57 and 58 show some possible mechanisms of ME swapping.
In some embodiments, the transposase subunits in the transposome complex can be linked to each other by covalent and non-covalent means. In some embodiments, transposase monomers can be linked before making the transposome complex (before addition of the transposons). In some embodiments, transposase monomers can be linked after transposome formation.
In some embodiments, native amino acid residues may be substituted with cysteine (Cys) amino acids at the multimeric interface to promote disulfide bond formation. For example, in Tn5 transposase, Asp468, Tyr407, Asp461, Lys459, Ser458, Gly462, Ala466, and Met470 may be substituted with Cys to promote disulfide bonds between the monomer subunits, as shown in Figures 59 and 60. For Mos-1 transposase, exemplary amino acids that can be substituted with cysteine include, but are not limited to, Leu21, Leu32, Ala35, His20, Phe17, Phe36, Ile16, Thr13, Arg12, Gln10, and Glu9, as shown in Fig. 61. In some embodiments, modified transposases with amino acid residues substituted with cysteine can be chemically cross-linked to each other using a chemical cross-linker having maleimide or pyridyldithiol reactive groups. Exemplary chemical cross-linkers are available commercially from Pierce Protein Biology/ThermoFisher Scientific (Grand Island, NY, USA).
In some embodiments, transposome multimer complexes can be covalently linked to a solid support. Exemplary solid supports include, but are not limited to, nanoparticles, beads, flow cell surfaces, and column matrices. In some embodiments, solid surfaces may be coated with amine groups. Modified transposase with amino acid residues substituted with cysteine can be chemically cross-linked to such amine groups using an amine-to-sulfhydryl crosslinker (e.g., succinimidyl-4-(N-maleimidomethyl)cyclohexane-1-carboxylate (SMCC)). An exemplary scheme is shown in Figure 62. In some embodiments, a maleimide-PEG-biotin crosslinker may be used to couple dTnp to a streptavidin coated solid surface.
In some embodiments, a transposase gene can be modified to express a multimeric protein in a single polypeptide. For example, Tn5 or Mos-1 genes can be modified to express two Tn5 or Mos-1 proteins in a single polypeptide. Similarly, the Mu transposase gene can be modified to encode four Mu transposase units in a single polypeptide.
In some embodiments, the transposon ends of a transposome monomer unit can be linked to form a linked transposome multimer complex. Linking the transposon ends allows insertion of primer sites, sequencing primers, amplification primers, or any other functionality that DNA can provide into gDNA without fragmenting the target DNA. Insertion of such functionality is advantageous in haplotyping assays or junction tagging assays in which information needs to be extracted from intact molecules or in which sub-sampling is important. In some embodiments, transposon ends of Mu transposomes can be linked to a "looped" Mu transposase/transposon configuration. Since Mu is a tetramer, various configurations are possible, including but not limited to linking either R2UJ and/or R1UJ with R2J and/or R1J. In these configurations R2UJ and R1UJ can/are not connected with R2J and R1J, respectively. Figure 63 shows a Mu transposome complex where transposon ends are linked. In some embodiments, transposon ends of Tn5 or transposon ends of Mos-1 transposomes can be linked.
As used herein the term "transposon" means a double-stranded DNA that exhibits only the nucleotide sequences (the "transposon end sequences") that are necessary to form the complex with the transposase or integrase enzyme that is functional in an in vitro transposition reaction. A transposon forms a "complex" or a "synaptic complex" or a "transposome complex" or a "transposome composition" with a transposase or integrase that recognizes and binds to the transposon, and which complex is capable of inserting or transposing the transposon into target DNA with which it is incubated in an in vitro transposition reaction. A transposon exhibits two complementary sequences consisting of a "transferred transposon sequence" or "transferred strand" and a "non-transferred transposon sequence," or "non-transferred strand". For example, one transposon that forms a complex with a hyperactive Tn5 transposase (e.g., EZ-Tn5™ Transposase, EPICENTRE Biotechnologies, Madison, Wis., USA) that is active in an in vitro transposition reaction comprises a transferred strand that exhibits a "transferred transposon sequence" as follows:
5' AGATGTGTATAAGAGACAG 3'
and a non-transferred strand that exhibits a "non-transferred transposon sequence" as follows:
5' CTGTCTCTTATACACATCT 3'.
The 3'-end of a transferred strand is joined or transferred to target DNA in an in vitro transposition reaction. The non-transferred strand, which exhibits a transposon sequence that is complementary to the transferred transposon end sequence, is not joined or transferred to the target DNA in an in vitro transposition reaction. In some embodiments, the transposon sequences may comprise one or more of the following: a barcode, an adaptor sequence, a tag sequence, a primer binding sequence, a capture sequence, unique molecular identifier (UMI) sequence.
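The stated complementarity of the two 19-base strands can be checked directly. This is a minimal sketch; the sequences are those given above (with whitespace removed), and the helper name is an assumption.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


transferred = "AGATGTGTATAAGAGACAG"
non_transferred = "CTGTCTCTTATACACATCT"

# Both strands are 19 bases and are exact reverse complements of each other.
print(len(transferred), len(non_transferred))
print(reverse_complement(transferred) == non_transferred)  # True
```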
As used herein the term "adaptor" means a nucleic acid sequence that can comprise a barcode, a primer binding sequence, a capture sequence, a sequence complementary to a capture sequence, a unique molecular identifier (UMI) sequence, an affinity moiety, or a restriction site.
As used herein the term "contiguity information" refers to a spatial relationship between two or more DNA fragments based on shared information. The shared aspect of the information can be with respect to adjacent, compartmental and distance spatial relationships. Information regarding these relationships in turn facilitates hierarchical assembly or mapping of sequence reads derived from the DNA fragments. This contiguity information improves the efficiency and accuracy of such assembly or mapping because traditional assembly or mapping methods used in association with conventional shotgun sequencing do not take into account the relative genomic origins or coordinates of the individual sequence reads as they relate to the spatial relationship between the two or more DNA fragments from which the individual sequence reads were derived. Therefore, according to the embodiments described herein, methods of capturing contiguity information may be accomplished by short range contiguity methods to determine adjacent spatial relationships, mid-range contiguity methods to determine compartmental spatial relationships, or long range contiguity methods to determine distance spatial relationships. These methods facilitate the accuracy and quality of DNA sequence assembly or mapping, and may be used with any sequencing method, such as those described above.
Contiguity information includes the relative genomic origins or coordinates of the individual sequence reads as they relate to the spatial relationship between the two or more DNA fragments from which the individual sequence reads were derived. In some embodiments, contiguity information includes sequence information from non-overlapping sequence reads.
In some embodiments, the contiguity information of a target nucleic acid sequence is indicative of haplotype information. In some embodiments, the contiguity information of a target nucleic acid sequence is indicative of genomic variants.
As used herein the term "maintaining the contiguity of the target nucleic acid" in the context of fragmenting a nucleic acid means maintaining the order of the nucleic acid sequence of the fragments from the same target nucleic acid.
As used herein the term "at least a portion" and/or grammatical equivalents thereof can refer to any fraction of a whole amount. For example, "at least a portion" can refer to at least about 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 99%, 99.9% or 100% of a whole amount.
As used herein the term "about" means +/- 10%.
As used herein, the term "sequencing read" and/or grammatical equivalents thereof can refer to a repetitive process of physical or chemical steps that is carried out to obtain signals indicative of the order of monomers in a polymer. The signals can be indicative of an order of monomers at single monomer resolution or lower resolution. In particular embodiments, the steps can be initiated on a nucleic acid target and carried out to obtain signals indicative of the order of bases in the nucleic acid target. The process can be carried out to its typical completion, which is usually defined by the point at which signals from the process can no longer distinguish bases of the target with a reasonable level of certainty. If desired, completion can occur earlier, for example, once a desired amount of sequence information has been obtained. A sequencing read can be carried out on a single target nucleic acid molecule or simultaneously on a population of target nucleic acid molecules having the same sequence, or simultaneously on a population of target nucleic acids having different sequences. In some embodiments, a sequencing read is terminated when signals are no longer obtained from one or more target nucleic acid molecules from which signal acquisition was initiated. For example, a sequencing read can be initiated for one or more target nucleic acid molecules that are present on a solid phase substrate and terminated upon removal of the one or more target nucleic acid molecules from the substrate. Sequencing can be terminated by otherwise ceasing detection of the target nucleic acids that were present on the substrate when the sequencing run was initiated. Exemplary methods of sequencing are described in U.S. Patent No. 9,029,103, which is incorporated herein by reference in its entirety.
As used herein, the term "sequencing representation" and/or grammatical equivalents thereof can refer to information that signifies the order and type of monomeric units in the polymer. For example, the information can indicate the order and type of nucleotides in a nucleic acid. The information can be in any of a variety of formats including, for example, a depiction, image, electronic medium, series of symbols, series of numbers, series of letters, series of colors, etc. The information can be at single monomer resolution or at lower resolution. An exemplary polymer is a nucleic acid, such as DNA or RNA, having nucleotide units. A series of "A," "T," "G," and "C" letters is a well-known sequence representation for DNA that can be correlated, at single nucleotide resolution, with the actual sequence of a DNA molecule. Other exemplary polymers are proteins having amino acid units and polysaccharides having saccharide units.
Solid Support
Throughout this application, solid support and solid surface are used interchangeably. In some embodiments, the solid support or its surface is non-planar, such as the inner or outer surface of a tube or vessel. In some embodiments, the solid support comprises microspheres or beads. By "microspheres" or "beads" or "particles" or grammatical equivalents herein is meant small discrete particles. Suitable bead compositions include, but are not limited to, plastics, ceramics, glass, polystyrene, methylstyrene, acrylic polymers, paramagnetic materials, thoria sol, carbon graphite, titanium dioxide, latex or cross-linked dextrans such as Sepharose, cellulose, nylon, cross-linked micelles and Teflon; any other materials outlined herein for solid supports may also be used. "Microsphere Detection Guide" from Bangs Laboratories, Fishers, Ind. is a helpful guide. In certain embodiments, the microspheres are magnetic microspheres or beads. In some embodiments, the beads can be color coded. For example, MicroPlex® Microspheres from Luminex, Austin, TX may be used.
The beads need not be spherical; irregular particles may be used. Alternatively or additionally, the beads may be porous. The bead sizes range from nanometers, e.g., about 10 nm, to millimeters in diameter, e.g., 1 mm, with beads from about 0.2 micron to about 200 microns being preferred, and from about 0.5 to about 5 micron being particularly preferred, although in some embodiments smaller or larger beads may be used. In some embodiments, beads can be about 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1, 1.5, 2, 2.5, 2.8, 3, 3.5, 4, 4.5, 5, 5.5, 6, 6.5, 7, 7.5, 8, 8.5, 9, 9.5, 10, 10.5, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 150, or 200 μm in diameter.
Transposomes
A "transposome" comprises an integration enzyme such as an integrase or transposase, and a nucleic acid comprising an integration recognition site, such as a transposase recognition site. In embodiments provided herein, the transposase can form a functional complex with a transposase recognition site that is capable of catalyzing a transposition reaction. The transposase may bind to the transposase recognition site and insert the transposase recognition site into a target nucleic acid in a process sometimes termed "tagmentation". In some such insertion events, one strand of the transposase recognition site may be transferred into the target nucleic acid. In one example, a transposome comprises a dimeric transposase comprising two subunits, and two non-contiguous transposon sequences. In another example, a transposome comprises a dimeric transposase comprising two subunits, and a contiguous transposon sequence.
Some embodiments can include the use of a hyperactive Tn5 transposase and a Tn5-type transposase recognition site (Goryshin and Reznikoff, J. Biol. Chem., 273:7367 (1998)), or MuA transposase and a Mu transposase recognition site comprising R1 and R2 end sequences (Mizuuchi, K., Cell, 35: 785, 1983; Savilahti, H, et al., EMBO J., 14: 4893, 1995). An exemplary transposase recognition site that forms a complex with a hyperactive Tn5 transposase (e.g., EZ-Tn5™ Transposase, Epicentre Biotechnologies, Madison, Wisconsin) comprises the following 19-base transferred strand (sometimes "M" or "ME") and non-transferred strand: 5' AGATGTGTATAAGAGACAG 3' and 5' CTGTCTCTTATACACATCT 3', respectively. ME sequences can also be used as optimized by a skilled artisan.
More examples of transposition systems that can be used with certain embodiments of the compositions and methods provided herein include Staphylococcus aureus Tn552 (Colegio et al., J. Bacteriol., 183: 2384-8, 2001; Kirby C et al., Mol. Microbiol., 43: 173-86, 2002), Ty1 (Devine & Boeke, Nucleic Acids Res., 22: 3765-72, 1994 and International Publication WO 95/23875), Transposon Tn7 (Craig, N L, Science, 271: 1512, 1996; Craig, N L, Review in: Curr Top Microbiol Immunol, 204:27-48, 1996), Tn10 and IS10 (Kleckner N, et al., Curr Top Microbiol Immunol, 204:49-82, 1996), Mariner transposase (Lampe D J, et al., EMBO J., 15: 5470-9, 1996), Tc1 (Plasterk R H, Curr. Topics Microbiol. Immunol., 204: 125-43, 1996), P Element (Gloor, G B, Methods Mol. Biol., 260: 97-114, 2004), Tn3 (Ichikawa & Ohtsubo, J Biol. Chem. 265: 18829-32, 1990), bacterial insertion sequences (Ohtsubo & Sekine, Curr. Top. Microbiol. Immunol. 204: 1-26, 1996), retroviruses (Brown, et al., Proc Natl Acad Sci USA, 86:2525-9, 1989), and retrotransposon of yeast (Boeke & Corces, Annu Rev Microbiol. 43:403-34, 1989). More examples include IS5, Tn10, Tn903, IS911, Sleeping Beauty, SPIN, hAT, PiggyBac, Hermes, TcBuster, AeBuster1, Tol2, and engineered versions of transposase family enzymes (Zhang et al., (2009) PLoS Genet. 5:e1000689. Epub 2009 Oct 16; Wilson C. et al. (2007) J. Microbiol. Methods 71:332-5).
More examples of integrases that may be used with the methods and compositions provided herein include retroviral integrases and integrase recognition sequences for such retroviral integrases, such as integrases from HIV-1, HIV-2, SIV, PFV-1, RSV.
Barcodes
Generally, a barcode can include one or more nucleotide sequences that can be used to identify one or more particular nucleic acids. The barcode can be an artificial sequence, or can be a naturally occurring sequence generated during transposition, such as identical flanking genomic DNA sequences (g-codes) at the end of formerly juxtaposed DNA fragments. In some embodiments, the barcodes are artificial sequences that are absent in the target nucleic acid sequence and can be used to identify one or more target nucleic acid sequences.
A barcode can comprise at least about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 or more consecutive nucleotides. In some embodiments, a barcode comprises at least about 10, 20, 30, 40, 50, 60, 70, 80, 90, 100 or more consecutive nucleotides. In some embodiments, at least a portion of the barcodes in a population of nucleic acids comprising barcodes is different. In some embodiments, at least about 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, or 99% of the barcodes are different. In more such embodiments, all of the barcodes are different. The diversity of different barcodes in a population of nucleic acids comprising barcodes can be randomly generated or non-randomly generated.
In some embodiments, a transposon sequence comprises at least one barcode. In some embodiments, such as transposomes comprising two non-contiguous transposon sequences, the first transposon sequence comprises a first barcode, and the second transposon sequence comprises a second barcode. In some embodiments, a transposon sequence comprises a barcode comprising a first barcode sequence and a second barcode sequence. In some of the foregoing embodiments, the first barcode sequence can be identified or designated to be paired with the second barcode sequence. For example, a known first barcode sequence can be known to be paired with a known second barcode sequence using a reference table comprising a plurality of first and second bar code sequences known to be paired to one another.
In another example, the first barcode sequence can comprise the same sequence as the second barcode sequence. In another example, the first barcode sequence can comprise the reverse complement of the second barcode sequence. In some embodiments, the first barcode sequence and the second barcode sequence are different. The first and second barcode sequences may comprise a bi-code.
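The three pairing conventions described above (reference table, identical sequence, reverse complement) can be sketched as a single check. The pairing table below is hypothetical; its entries reuse barcode strings purely as placeholders and do not represent pairings disclosed in the application.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


# Hypothetical reference table: first barcode -> its designated second barcode.
PAIR_TABLE = {"TATAGCCT": "GGCTCTGA", "ATAGAGGC": "AGGCGAAG"}


def is_paired(first, second, mode="table"):
    """Return True if the two barcode sequences are paired under the given convention."""
    if mode == "table":
        return PAIR_TABLE.get(first) == second
    if mode == "identical":
        return first == second
    if mode == "reverse_complement":
        return reverse_complement(first) == second
    raise ValueError(f"unknown mode: {mode}")


print(is_paired("TATAGCCT", "GGCTCTGA"))                        # True
print(is_paired("TATAGCCT", "AGGCTATA", "reverse_complement"))  # True
```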
In some embodiments of compositions and methods described herein, barcodes are used in the preparation of template nucleic acids. As will be understood, the vast number of available barcodes permits each template nucleic acid molecule to comprise a unique identification. Unique identification of each molecule in a mixture of template nucleic acids can be used in several applications. For example, uniquely identified molecules can be applied to identify individual nucleic acid molecules, in samples having multiple chromosomes, in genomes, in cells, in cell types, in cell disease states, and in species, for example, in haplotype sequencing, in parental allele discrimination, in metagenomic sequencing, and in sample sequencing of a genome.
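The "vast number of available barcodes" can be made concrete by counting: with four nucleotides, a barcode of n consecutive nucleotides has 4^n possible sequences. The sketch below simply tabulates this; the function name and the chosen lengths are illustrative.

```python
def barcode_space(length):
    """Number of distinct DNA barcodes of the given length (4 choices per position)."""
    return 4 ** length


for n in (8, 12, 20):
    print(n, barcode_space(n))
# 8 65536
# 12 16777216
# 20 1099511627776
```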
Exemplary barcode sequences include, but are not limited to, TATAGCCT, ATAGAGGC, CCTATCCT, GGCTCTGA, AGGCGAAG, TAATCTTA, CAGGACGT, and GTACTGAC.
Primer sites
In some embodiments, a transposon sequence can include a "sequencing adaptor" or "sequencing adaptor site", that is to say a region that comprises one or more sites that can hybridize to a primer. In some embodiments, a transposon sequence can include at least a first primer site useful for amplification, sequencing, and the like. Exemplary sequences of primer binding sites include, but are not limited to, AATGATACGGCGACCACCGAGATCTACAC (P5 sequence) and CAAGCAGAAGACGGCATACGAGAT (P7 sequence).
Target nucleic acids
A target nucleic acid can include any nucleic acid of interest. Target nucleic acids can include DNA, RNA, peptide nucleic acid, morpholino nucleic acid, locked nucleic acid, glycol nucleic acid, threose nucleic acid, mixed samples of nucleic acids, polyploid DNA (e.g., plant DNA), mixtures thereof, and hybrids thereof. In a preferred embodiment, genomic DNA or amplified copies thereof are used as the target nucleic acid. In another preferred embodiment, cDNA, mitochondrial DNA or chloroplast DNA is used. In some embodiments, the target nucleic acid is mRNA.
In some embodiments, target nucleic acid is from a single cell or from fractions of a single cell. In some embodiments, the target nucleic acid is from a single organelle. Exemplary single organelles include, but are not limited to, a single nucleus, a single mitochondrion, and a single ribosome. In some embodiments, target nucleic acid is from a formalin fixed paraffin embedded (FFPE) sample. In some embodiments, target nucleic acid is cross-linked nucleic acid. In some embodiments, the target nucleic acid is cross-linked with protein. In some embodiments, the target nucleic acid is cross-linked DNA. In some embodiments, the target nucleic acid is histone protected DNA. In some embodiments, histones are removed from the target nucleic acid. In some embodiments, target nucleic acid is from nucleosomes. In some embodiments, target nucleic acid is from nucleosomes from which nuclear proteins are removed.
A target nucleic acid can comprise any nucleotide sequence. In some embodiments, the target nucleic acid comprises homopolymer sequences. A target nucleic acid can also include repeat sequences. Repeat sequences can be any of a variety of lengths including, for example, 2, 5, 10, 20, 30, 40, 50, 100, 250, 500 or 1000 nucleotides or more. Repeat sequences can be repeated, either contiguously or non-contiguously, any of a variety of times including, for example, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15 or 20 times or more. Some embodiments described herein can utilize a single target nucleic acid. Other embodiments can utilize a plurality of target nucleic acids. In such embodiments, a plurality of target nucleic acids can include a plurality of the same target nucleic acids, a plurality of different target nucleic acids where some target nucleic acids are the same, or a plurality of target nucleic acids where all target nucleic acids are different. Embodiments that utilize a plurality of target nucleic acids can be carried out in multiplex formats so that reagents are delivered simultaneously to the target nucleic acids, for example, in one or more chambers or on an array surface. In some embodiments, the plurality of target nucleic acids can include substantially all of a particular organism's genome. The plurality of target nucleic acids can include at least a portion of a particular organism's genome including, for example, at least about 1%, 5%, 10%, 25%, 50%, 75%, 80%, 85%, 90%, 95%, or 99% of the genome. In particular embodiments the portion can have an upper limit that is at most about 1%, 5%, 10%, 25%, 50%, 75%, 80%, 85%, 90%, 95%, or 99% of the genome.
Target nucleic acids can be obtained from any source. For example, target nucleic acids may be prepared from nucleic acid molecules obtained from a single organism or from populations of nucleic acid molecules obtained from natural sources that include one or more organisms. Sources of nucleic acid molecules include, but are not limited to, organelles, cells, tissues, organs, or organisms. Cells that may be used as sources of target nucleic acid molecules may be prokaryotic (bacterial cells, for example, Escherichia, Bacillus, Serratia, Salmonella, Staphylococcus, Streptococcus, Clostridium, Chlamydia, Neisseria, Treponema, Mycoplasma, Borrelia, Legionella, Pseudomonas, Mycobacterium, Helicobacter, Erwinia, Agrobacterium, Rhizobium, and Streptomyces genera); archaeal, such as Crenarchaeota, Nanoarchaeota or Euryarchaeota; or eukaryotic, such as fungi (for example, yeasts), plants, protozoans and other parasites, and animals (including insects (for example, Drosophila spp.), nematodes (e.g., Caenorhabditis elegans), and mammals (for example, rat, mouse, monkey, non-human primate and human)). Target nucleic acids and template nucleic acids can be enriched for certain sequences of interest using various methods well known in the art. Examples of such methods are provided in Int. Pub. No. WO/2012/108864, which is incorporated herein by reference in its entirety. In some embodiments, nucleic acids may be further enriched during methods of preparing template libraries. For example, nucleic acids may be enriched for certain sequences before insertion of transposomes, after insertion of transposomes, and/or after amplification of nucleic acids.
In addition, in some embodiments, target nucleic acids and/or template nucleic acids can be highly purified, for example, nucleic acids can be at least about 70%, 80%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% free from contaminants before use with the methods provided herein. In some embodiments, it is beneficial to use methods known in the art that maintain the quality and size of the target nucleic acid, for example isolation and/or direct transposition of target DNA may be performed using agarose plugs. Transposition can also be performed directly in cells, with populations of cells, lysates, and non-purified DNA.
In some embodiments, target nucleic acid may be obtained from a biological sample or a patient sample. The term "biological sample" or "patient sample" as used herein includes samples such as tissues and bodily fluids. "Bodily fluids" may include, but are not limited to, blood, serum, plasma, saliva, cerebral spinal fluid, pleural fluid, tears, lactal duct fluid, lymph, sputum, urine, amniotic fluid, and semen. A sample may include a bodily fluid that is "acellular." An "acellular bodily fluid" includes less than about 1% (w/w) whole cellular material. Plasma or serum are examples of acellular bodily fluids. A sample may include a specimen of natural or synthetic origin (i.e., a cellular sample made to be acellular).
In some embodiments of the above disclosed methods, target nucleic acid can be fragmented (e.g., by sonication, by restriction digestion, other mechanical means) prior to exposing the target nucleic acid to the transposomes.
The term "Plasma" as used herein refers to acellular fluid found in blood. "Plasma" may be obtained from blood by removing whole cellular material from blood by methods known in the art (e.g., centrifugation, filtration, and the like).
Unless otherwise specified, the terms "a" or "an" mean "one or more" throughout this application.
When the terms "for example", "e.g.", "such as", "include", "including" or variations thereof are used herein, these terms will not be deemed to be terms of limitation, and will be interpreted to mean "but not limited to" or "without limitation." The following Examples provide illustrative embodiments and do not in any way limit the inventions provided herein.
EXAMPLES
Example 1— DNA cluster yield from the bead-based tagmentation process
DNA cluster yield from the bead-based tagmentation process of Figure 3 was evaluated and is shown in the table of Figure 4. In this example, 50, 250, and 1000 ng of human NA12878 DNA were tagmented using the same batch of tagmentation beads (2.8 μm beads). A second 50 ng aliquot of NA12878 DNA was tagmented using a second batch of tagmentation beads (full repeat; 2.8 μm beads). The bead-bound tagmented DNA samples were PCR amplified and purified. An aliquot (5.4 μL) of each purified PCR product (unquantified) was diluted 270 fold to make stock sample solutions of about 50 pM. For each sample, the 50 pM stock solution was diluted to 15, 19, 21, and 24 pM. The diluted samples were loaded onto a flow cell for cluster generation and sequencing. The data show that starting from the same dilution (~50 pM), cluster numbers are between 100% and 114% for the three different input levels (i.e., 50, 250, and 1000 ng) using the same set of beads. The cluster number for the 50 ng full repeat (with a different batch of beads) was 81%. Different dilutions (15, 19, 21, and 24 pM) yield the same number of clusters within about 10%. The data indicate that the beads largely control the yield and that the yield is reproducible for different DNA inputs and different repeats.
Example 2— Reproducibility of the bead-based tagmentation process
The reproducibility of the bead-based tagmentation process of Figure 3 is shown in Figure 5. In this example, six different preparations of indexed beads (indexes 1 through 6; 2.8 μm beads) made at the "same" transposome density were used to prepare tagmented DNA using 50 and 500 ng of input NA12878 DNA. The tagmented DNA was PCR amplified and purified. The 12 purified PCR products were pooled into two mixtures (pool 1 and pool 2) of six for two HiSeq lanes. Each pool includes three 50 ng samples and three 500 ng samples per lane. Data table 500 shows the median insert size and the mean insert size for each indexed sample.
Example 3— Insert size of pool 1 and the insert size of pool 2
The insert size of pool 1 and the insert size of pool 2 of the indexed samples of Figure 5 are shown in Figure 6A (Plot 600) and Figure 6B (Plot 650), respectively. The data also show that the insert size is uniform between the six different preparations of indexed beads. Bead-based tagmentation provides a mechanism to control the size of the inserts and DNA yield.
Example 4— Reproducibility of total number of reads
The reproducibility of the total number of reads and percent reads aligned for the experiment described in Figure 5 is shown in Figure 7 (Bar graph 700). At both inputs (50 ng and 500 ng) the total number of reads is similar for the same indexed bead preparation. Four of the six indexed bead preparations (index 1, 2, 3, and 6) have very similar yields; indexed bead preparations 4 and 5 show some variability, which may be due to the index sequence.
In one application, the bead-based tagmentation process may be used in an exome enrichment assay which includes a tagmentation step, e.g., Illumina's Nextera® Rapid Capture Enrichment protocol. In the current exome enrichment assay (i.e., Illumina's Nextera® Rapid Capture Enrichment protocol), solution-based tagmentation (Nextera) is used to fragment the genomic DNA. Gene specific primers are then used to pull down specific gene fragments of interest. Two enrichment cycles are performed and fragments pulled down are then enriched by PCR and sequenced.
To evaluate the use of the bead-based tagmentation process in the exome enrichment assay, human NA12878 DNA was tagmented using 25, 50, 100, 150, 200, and 500 ng of input DNA. A control library (NA00536) was prepared from 50 ng input DNA according to the standard protocol. Each DNA input had a different index (unique identifier). Ten cycles of PCR using enhanced polymerase mastermix (EPM) were used to match standard methods and to ensure a sufficient amount of fragments were present for pulldown. The amplification protocol was 3 minutes at 72 °C, 30 seconds at 98 °C, followed by 10 cycles of 10 seconds at 98 °C, 30 seconds at 65 °C, and 1 minute at 72 °C. The samples were then held at 10 °C. The samples were then processed through the exome enrichment pulldown process and sequenced.
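As a purely illustrative aid, the amplification protocol recited above can be written down as data and its total programmed hold time computed. The variable and function names below are hypothetical; the calculation ignores instrument ramp times and the final 10 °C hold, neither of which is specified in the text.

```python
# Illustrative sketch only; names are hypothetical.
# Each step is (temperature in degrees C, duration in seconds).
INITIAL_STEPS = [(72, 180), (98, 30)]         # 3 min at 72, 30 s at 98
CYCLE_STEPS = [(98, 10), (65, 30), (72, 60)]  # one PCR cycle
N_CYCLES = 10


def programmed_hold_seconds(initial, cycle, n_cycles):
    """Sum the programmed hold times, excluding ramping and the final hold."""
    initial_total = sum(seconds for _, seconds in initial)
    cycle_total = sum(seconds for _, seconds in cycle)
    return initial_total + n_cycles * cycle_total


total = programmed_hold_seconds(INITIAL_STEPS, CYCLE_STEPS, N_CYCLES)
print(f"{total} s (about {total / 60:.1f} min)")
```

Under these assumptions the programmed holds sum to 1210 seconds, that is, roughly 20 minutes before ramping is taken into account.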
Example 5— Insert size in a control and bead-based tagmented library in the exome enrichment assay
Figures 8A, 8B, and 8C show a plot 800 of insert size in a control library, a plot 820 of insert size in a bead-based tagmented library, and a summary data table 840, respectively, in the exome enrichment assay. The data show that the bead-based tagmentation libraries have a wider insert size spread compared to the control library, but the insert size is very similar irrespective of the DNA input for the samples.
Example 6— Quality of the read sequences
Figures 9A, 9B, and 9C show a bar graph 900 of percent duplicates passing filters (dups PF), a bar graph 920 of PCT selected bases, and a bar graph 940 of PCT usable bases on target, respectively, in the exome enrichment assay of Figures 8A, 8B, and 8C. Referring to Figure 9A, the percent dups PF is a measure of how many reads are duplicated elsewhere on the flow cell. This number will ideally be low (as here) to ensure that all clusters are bringing useful data to the results.
Figure 9B shows PCT selected bases, which is a measure of the ratio of reads that sequence at or near the site of interest which should have been enriched during the enrichment process. Ideally this number will be close to 1 to reflect the success of the enrichment process and show that reads that should not be enriched do not get through the process.
Figure 9C shows the PCT usable bases on target, which is a measure of the ratio of reads that actually sequence over the particular base of interest within the enriched region. Ideally all enriched reads would sequence over the base of interest within the enriched read, but due to the random nature of the tagmentation and the variable length of the inserts, reads may be enriched that do not end up being sequenced over the area of interest.
Two techniques may be used to optimize the insert size distribution. In one example, an SPRI clean-up may be used to remove fragments that are too small or too large. SPRI clean-up is a process of removing fragments that are larger or smaller than the desired size, by selective DNA precipitation based on size and retention of either the precipitated or the non-precipitated DNA as desired (i.e., a first step is to precipitate only DNA that is larger than the desired size and retain the soluble smaller fragments). The smaller fragments are then further precipitated, and this time the very small fragments that are not wanted (still in solution) are removed and the precipitated DNA is retained, washed and then resolubilized to give a desired size range of DNA. In another example, the spacing of active transposomes on the bead surface may be used to control the insert size distribution. For example, gaps on the bead surface may be filled with inactive transposomes (e.g., transposomes with inactive transposons).
Contiguity of the bead-based tagmentation process was assessed. Table 3 shows the number of times 0, 1, 2, or 3 reads occur within a 1000 bp window sharing an index. Beads were generated with 9 different indexed transposomes and used to tagment a small amount of human DNA. Reads were generated, aligned, and analyzed for the number of reads within a 1000 bp or 10 kb window that shared the same index. Some reads within a small window sharing an index may be generated by chance, and a prediction of how many times this is likely to occur is given in the "Random" row of Table 3 and Table 4. The numbers in the "Bead" row show the actual number of 1000 bp (Table 3) or 10 kb (Table 4) windows that share an index. As shown in Table 3 and Table 4, the actual number of times the same index was found within a 1000 bp or 10 kb window is significantly greater than expected in the random case. "0" windows show all the times a particular 1000 bp window had no indexed reads mapping to it. The number is largest here because only a very small amount of the human genome was sequenced and most windows have no reads aligning to them. "1" is the number of times just one read maps to a 1000 bp (or 10 kb) window; "2" is the number of times 2 reads share an index within a 1000 bp (or 10 kb) window, etc. These data suggest that in over 1400 cases the same piece of DNA (over 10 kb) is being tagmented by the same bead at least twice and up to 5 times, out of about 15000 tagmentation events. Since the fragments share an index, they are unlikely to be there by chance, but are coming from the same bead.
Figure imgf000049_0001
Table 4 shows the number of reads (up to 5) within a 10 kb window sharing an index.
Figure imgf000049_0002
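The window tallies reported in the "Bead" rows of Table 3 and Table 4 can be reproduced in outline by the following Python sketch. It is a minimal illustration under stated assumptions, not the analysis actually used: reads are assumed to be supplied as (chromosome, position, index) tuples, windows are assumed to be fixed and non-overlapping, and the function name is hypothetical. The "0" column (windows with no reads) and the "Random" expectation are not computed here.

```python
# Illustrative sketch only; names and input layout are assumptions.
from collections import Counter


def shared_index_window_histogram(reads, window=1000):
    """Tally how many (window, index) bins contain 1, 2, 3, ... reads.

    reads  -- iterable of (chromosome, position, index) tuples
    window -- window size in bp (e.g. 1000 or 10000)
    Returns a Counter mapping reads-per-bin -> number of bins.
    """
    per_bin = Counter()
    for chrom, pos, index in reads:
        # Reads with the same index in the same fixed window share a bin.
        per_bin[(chrom, pos // window, index)] += 1
    return Counter(per_bin.values())
```

Calling the function once with `window=1000` and once with `window=10000` gives counts analogous to the two tables.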
Example 7— Separation of free transposomes from CPT-DNA
Following transposition, the reaction mixture comprising CPT-DNA and free transposomes was subjected to size exclusion column chromatography using Sephacryl S-400 and Sephacryl S-200, and the results are shown in Figure 22. CPT-DNA is indicated as NCP DNA.
Example 8— Optimization of Capture Probe Density on Beads
Densities of capture probes A7 and B7 were optimized on 1 μm beads and the results are shown in Figure 25. Lanes 1 (A7) and 3 (B7) had higher probe densities, and lanes 2 (A7) and 4 (B7) had an estimated probe density of 10,000-100,000 per 1 μm bead. The ligation product of the capture probe to the target molecule was evaluated on an agarose gel. Probe densities of approximately 10,000-100,000 per bead gave better ligation efficiency than higher probe densities.
Example 9— Testing the feasibility of preparing indexed sequencing libraries of CPT-DNA on beads by intra-molecular hybridization
Transposomes were prepared by mixing transposons having A7' and B7' capture sequences, which are complementary to the A7 and B7 capture sequences on beads, with hyperactive Tn5 transposase. High molecular weight genomic DNA is mixed with the transposomes to generate CPT-DNA. Separately, beads are prepared with immobilized oligonucleotides: P5-A7, P7-B7, or P5-A7 + P7-B7, where P5 and P7 are primer binding sequences and A7 and B7 are capture sequences complementary to the A7' and B7' sequences, respectively. Beads comprising P5-A7 alone, P7-B7 alone, P5-A7 + P7-B7, or a mixture of P5-A7 beads and P7-B7 beads are treated with CPT-DNA, and ligase is added to the reaction mixture to determine the efficiency of the hybridization of the immobilized oligos to the transposed DNA. The results are shown in Figure 26. Sequencing libraries are only made when P5-A7 and P7-B7 are immobilized together on one bead (lane 4), as shown by high molecular weight bands on an agarose gel. The results indicate a high efficiency of intra-molecular hybridization and prove the feasibility of preparing indexed sequencing libraries of CPT-DNA on beads by intra-molecular hybridization.
Example 10— Testing the feasibility of clonal indexing.
Several sets of transposomes were prepared. In one set, hyperactive Tn5 transposase is mixed with transposon sequences Tnp1 with 5' biotin to prepare transposome 1. In another set, hyperactive Tn5 transposase is mixed with Tnp2 having unique index 2 and 5' biotin to prepare transposome 2. In another set, hyperactive Tn5 transposase is mixed with transposon sequences Tnp3 with 5' biotin to form transposome 3. In another set, hyperactive Tn5 transposase is mixed with Tnp4 having unique index 4 and 5' biotin to prepare transposome 4. Transposomes 1 and 2, and transposomes 3 and 4, are mixed separately with streptavidin beads to generate bead set 1 and bead set 2, respectively. The two sets of beads are then mixed together and incubated with genomic DNA and tagmentation buffer to promote tagmentation of the genomic DNA. This is then followed by PCR amplification of the tagmented sequences. The amplified DNA is sequenced to analyze the insertion of the index sequences. If tagmentation is confined to the beads, the majority of fragments will be coded with Tnp1/Tnp2 and Tnp3/Tnp4 indexes. If there is intra-molecular hybridization, the fragments may be coded with Tnp1/Tnp4, Tnp2/Tnp3, Tnp1/Tnp3, and Tnp2/Tnp4 indexes. Sequencing results after 5 and 10 cycles of PCR are shown in Figure 27. The control has all four transposons mixed together and immobilized on a bead. The results indicate that the majority of the sequences had Tnp1/Tnp2 or Tnp3/Tnp4 indexes, indicating that clonal indexing is feasible. The control shows no distinction between the indexes.
Example 11— Indexed clonal bead transposition in a single reaction
Ninety six indexed transposome bead sets are prepared. Individual indexed transposomes were prepared by mixing transposase with a transposon comprising an oligonucleotide comprising a Tn5 mosaic end sequence (ME) at the 5'-end and an index sequence. Individually indexed transposomes were immobilized on beads through streptavidin-biotin interaction. Transposomes on beads were washed and all 96 individually indexed transposomes on beads were pooled. Oligonucleotides complementary to the ME sequence and comprising an index sequence are annealed to the immobilized oligonucleotide, creating transposons with unique indexes. The ninety six clonal indexed transposome bead sets are combined and incubated with high molecular weight (HMW) genomic DNA in the presence of Nextera tagmentation buffer in a single tube.
The beads are washed and the transposase is removed by treating the reaction mixture with 0.1% SDS. The tagmented DNA is amplified with indexed primers and sequenced on a PE HiSeq flow cell v2 using a TruSeq v3 cluster kit, and the sequencing data are analyzed.
Clusters or islands of reads are observed. A plot of the nearest neighbor distances between the reads for each sequence shows essentially two major peaks, one from within the cluster (proximal) and another from between clusters (distal). A schematic of the method and the results are shown in Figures 30 and 31. The island sizes ranged from approximately 3 kb to 10 kb. The percent of bases covered is approximately 5% to 10%. The insert sizes of the genomic DNA are approximately 200-300 bases.
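The proximal and distal distance populations described above can be illustrated with a short Python sketch. It assumes, for illustration only, that the "nearest neighbor distance" is taken as the gap between adjacent aligned reads carrying the same index on one chromosome, and that reads are grouped into an island whenever that gap does not exceed a cut-off. The function names and the default cut-off `max_gap` are hypothetical and are not values taken from the experiment.

```python
# Illustrative sketch only; names and the max_gap default are assumptions.
def adjacent_gaps(positions):
    """Gaps (bp) between adjacent reads of one index on one chromosome."""
    pos = sorted(positions)
    return [b - a for a, b in zip(pos, pos[1:])]


def split_into_islands(positions, max_gap=15000):
    """Group sorted read positions into islands separated by > max_gap bp."""
    islands = []
    for p in sorted(positions):
        if islands and p - islands[-1][-1] <= max_gap:
            islands[-1].append(p)   # proximal: same island
        else:
            islands.append([p])     # distal: start a new island
    return islands


def island_sizes(islands):
    """Span (bp) from the first to the last read of each island."""
    return [isl[-1] - isl[0] for isl in islands]
```

A histogram of `adjacent_gaps` over all indexes would be expected to show the two peaks, with the valley between them suggesting a reasonable value for the cut-off.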
Example 12— Library sizes for transposomes on beads
Transposomes are first assembled in solution by mixing a first oligonucleotide having an ME' sequence, a second oligonucleotide having an ME-barcode-P5/P7 sequence, and Tn5 transposase. In a first set, the first oligonucleotide having the ME' sequence is biotinylated at the 3'-end. In a second set, the oligonucleotide having the ME-barcode-P5/P7 sequence is biotinylated at the 5'-end. Streptavidin beads are added to various concentrations (10 nM, 50 nM, and 200 nM) of each of the resulting transposome sets such that the transposomes are immobilized on the streptavidin beads. The beads are washed, HMW genomic DNA is added, and tagmentation is carried out. In some cases, the tagmented DNA is treated with 0.1% SDS and in other cases the tagmented DNA is untreated. The tagmented DNA is PCR amplified for 5-8 cycles and sequenced. The schematic is shown in Figure 32.
As shown in Figure 33, treatment with SDS improves the amplification efficiency and sequencing quality. Transposomes formed with 3'-biotin oligonucleotides give better library sizes.
Figure 34 shows the effect of transposome surface density on the insertion size. Transposomes with 5'-biotin show smaller sized libraries and more self-insertion by-products.
Example 13— Titration of input DNA
Various amounts of target HMW DNA were added to clonally indexed beads with 50mM Tn5: Transposon density and incubated for 15 or 60 min at 37 °C or for 60 min at room temperature. The transposomes comprised oligonucleotides with 3'-biotin. The tagmentation was carried out, and the reaction mixture was treated with 0.1% SDS and PCR amplified. The amplified DNA was sequenced. Figure 35 shows the effect of input DNA on the size distribution. Reactions with 10 pg of input DNA showed the least signal. The size distribution pattern was similar for DNA inputs of 20, 40, and 200 pg.
Example 14— Island size and distribution using solution based and bead based methods
Island size and distribution using solution based and bead based methods are compared. In a solution based approach, 96 transposomes each with unique index in the transposons are assembled in a 96 well plate. HMW genomic DNA is added, and the tagmentation reaction is carried out. The reaction product is treated with 0.1% SDS and PCR amplified. The amplified products were sequenced.
In a bead based approach, 96 transposomes each with unique index in the transposons are assembled in a 96 well plate. The oligonucleotides comprised 3'-end biotin. Streptavidin beads are added to each well of the 96 well plate and incubated such that the transposomes are immobilized on the streptavidin beads. The beads are individually washed and pooled, HMW genomic DNA is added, and the tagmentation reaction is carried out in a single reaction vessel (one pot). The reaction product is treated with 0.1% SDS and PCR amplified. The amplified products were sequenced.
In the negative control, all 96 transposon sequences, each with unique index, are mixed together first. The oligonucleotides comprised 3'-end biotin. Transposomes are prepared from the individually mixed indexed transposons. Streptavidin beads are added to the mixture. HMW genomic DNA is added, and the tagmentation reaction is carried out. The reaction product is treated with 0.1% SDS and PCR amplified. The amplified products were sequenced.
The number of intra island reads is plotted versus the island size. The results, as shown in Figure 36, indicate that islands (proximity reads) are observed with the one-pot clonal indexed beads, similar to the solution based method. When indexed transposons were mixed before transposome formation, no islands (proximity reads) were observed. Mixing transposons before transposome formation gives beads with different indexes/transposomes per bead, i.e., beads that are not clonal.
Example 15— Structural variant analysis with CPT-seq
Detection of 60kb Heterozygous Deletion
The sequencing data are extracted as fastq files and go through the demultiplexing process to generate an individual fastq file for each barcode. The fastq files from the CPT sequencing are demultiplexed according to their indexes and aligned to the reference genome with the duplicates removed. The chromosomes are scanned by 5 kb/1 kb window, in which the number of the indexes showing any reads within the scanning window is recorded. Statistically, for a heterozygous deletion region only half the amount of DNA is available for library generation compared to its neighboring regions; therefore, the number of indexes should be roughly half that of its neighbors as well. The NA12878 chr1 60 kb heterozygous deletion is shown in Fig. 47A and 47B by scanning in a 5 kb window from 9216 indexed CPT sequencing data.
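The scanning step described above may be sketched in Python as follows. This is a simplified illustration under stated assumptions and not the procedure actually used: reads are assumed to be supplied as (position, index) pairs for a single chromosome after alignment and duplicate removal, windows are fixed and non-overlapping, and each window is compared with the chromosome-wide median rather than with its immediate neighbors. The function names and the ratio bounds `low` and `high` are hypothetical.

```python
# Illustrative sketch only; names and thresholds are assumptions.
from collections import defaultdict
from statistics import median


def indexes_per_window(reads, window=5000):
    """Count distinct indexes with at least one read in each window.

    reads -- iterable of (position, index) pairs on one chromosome
    Returns {window_number: number_of_distinct_indexes}.
    """
    seen = defaultdict(set)
    for pos, index in reads:
        seen[pos // window].add(index)
    return {w: len(indexes) for w, indexes in seen.items()}


def candidate_het_deletion_windows(counts, n_windows, low=0.35, high=0.65):
    """Flag windows whose index count is about half the median count."""
    values = [counts.get(w, 0) for w in range(n_windows)]
    baseline = median(values)
    if baseline == 0:
        return []
    return [w for w, v in enumerate(values) if low <= v / baseline <= high]
```

A run of consecutive flagged windows spanning tens of kilobases would be the signature expected for a deletion of the size discussed above; isolated flagged windows are more likely to reflect coverage noise.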
Detection of Gene Fusion
The fastq files from the CPT sequencing are demultiplexed according to their index and aligned to the reference genome with the duplicates removed. Chromosomes are scanned in 2 kb windows. Each 2 kb window is represented as a 36,864-element vector in which each element records how many reads from a unique index have been found in this 2 kb window. For every 2 kb window pair (X,Y) across the genome, the weighted Jaccard index is calculated. This index indicates the de facto distance between (X,Y) in the sample. Those indexes are displayed as the heatmap shown in Fig. 48, each data point representing a pair of 2 kb scanning windows; the top left square is for X,Y both from region 1, the bottom right is for X,Y both from region 2, and the top right is for X,Y pairs spanning region 1 and region 2. The gene fusion signal is revealed as the horizontal line in the middle in this case.
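The text does not spell out the formula used for the weighted Jaccard index. The following Python sketch assumes the commonly used definition, namely the sum of element-wise minima divided by the sum of element-wise maxima of the two per-index read-count vectors; the function name is hypothetical.

```python
# Illustrative sketch only; assumes the standard weighted Jaccard definition.
def weighted_jaccard(x, y):
    """Weighted Jaccard index of two equal-length read-count vectors.

    Each element holds the number of reads from one index in the window.
    Returns a value in [0, 1]; 0.0 is returned if both windows are empty.
    """
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    numerator = sum(min(a, b) for a, b in zip(x, y))
    denominator = sum(max(a, b) for a, b in zip(x, y))
    return numerator / denominator if denominator else 0.0
```

Under this definition, two windows that are tagmented by the same beads share many indexes and score close to 1, whereas windows on unlinked DNA share indexes only by chance and score close to 0, which is why a high score between otherwise distant regions is suggestive of a fusion.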
Detection of Deletions
The fastq files from the CPT sequencing are demultiplexed according to their index and aligned to the reference genome with the duplicates removed. Chromosomes are scanned in 1 kb windows. Fig. 49 shows results of detection of genetic deletions.
Example 16— Phasing and Methylation Detection
Bisulfite Conversion Efficiency Optimization
Conversion was assessed at the ME (mosaic element region) and gDNA region for index linked CPT-Seq libraries on beads. Promega's MethylEdge Bisulfite Conversion system was optimized to improve efficiency.
Figure imgf000054_0001
3 1 hr @ 60°C / 1M NaOH 1 hr @ 65°C / 0.3M NaOH
ME sequences were analyzed to determine the efficiency of the bisulfite conversion treatments, and the results are shown in Fig. 50. Approximately 95% bisulfite conversion (BSC) of indexed-linked libraries attached to beads was observed. Similar PCR yields were observed between bisulfite conditions; harsher bisulfite treatment did not appear to degrade the libraries, as shown in Fig. 51. The variables investigated to improve BSC (C's->U's) were temperature and NaOH concentration (denaturation). 60°C with 1M NaOH or 65°C with 0.3 M NaOH performed well.
The expected sequencing read structure was observed after sequencing BSC converted CPT-seq on-bead libraries. Percent base metrics are displayed in the IVC plot in Fig. 52.
Fig. 53 shows an image of agarose gel electrophoresis of indexed-linked libraries after PCR after bisulfite conversion. The expected size range of 200-500 bp libraries was observed. The reaction without DNA does not yield indexed-linked libraries.
Example 17— Targeted Phasing
Whole-genome indexed linked CPT-seq libraries were enriched. Fig. 54 shows the bioanalyzer trace of whole-genome indexed linked CPT-seq libraries before enrichment without size-selection. Fig. 55 shows the agarose gel analysis of libraries after enrichment.
Enrichment statistics for the HLA region are shown below:
Figure imgf000055_0001
Figure imgf000056_0001
Figure 56 shows the results of application of targeted haplotyping to the HLA region in the chromosome. The whole genome indexed linked read library is illustrated on the left. Each small bar represents an indexed short library. Clusters of indexed libraries are "islands", the regions that were clonally indexed on a single bead with the same index, hence the proximity of the reads ("island" character) on a genomic scale. Enrichment (see Selective enrichment of nucleic acids, WO 2012108864 A1) of libraries in the targeted region is displayed on the right. Reads are enriched for the HLA region. Additionally, when reads are sorted by index and aligned to the genome they again display the "island" structure, indicating that contiguity information is maintained from the indexed linked reads.
Example 18— Index Exchange
To evaluate the exchange of the mosaic ends (ME) of transposome complexes, beads with different indices were prepared. After mixing, index exchange was determined by sequencing the libraries and reporting the indices for each library. % "swapped" was calculated as (D4+D5+E3+E5+f )/(sum of all 96) and is shown in Figure 65.
Example 19— Decreasing library insert size by more densely populating transposome complexes to the streptavidin beads
Streptavidin magnetic beads were loaded with 1x, 6x, and 12x concentrations of TsTn5 transposome complex. The Epi-CPT seq protocol was performed for each bead type. The final PCR product was loaded on the Agilent BioAnalyzer for analysis and shown in Figure . Epi-CPT seq library fragments are smaller and have greater yield when more TsTn5 is loaded on beads.
Example 20— Fragmentation of DNA library during sodium bisulfite conversion.
After bisulfite conversion, DNA becomes damaged, resulting in loss of the common sequences (CS2) needed for PCR amplification. DNA fragments of CPTSeq and Epi-CPTSeq (Me-CPTSeq) libraries were analyzed by BioAnalyzer. Due to DNA damage during bisulfite conversion, the Epi-CPTSeq library has 5-fold lower yield and a smaller library size distribution compared to the CPTSeq library, as shown in Figure 70.
Example 21— TdT mediated ssDNA ligation reaction
Feasibility of DNA end-recovery by Terminal transferase (TdT) mediated ligation was tested. Briefly, 5 pmoles of ssDNA template were incubated with TdT (10/50 U), Attenuator/adapter duplex (0/15/25 pmoles) and DNA Ligase (0/10 U) for 15 min at 37 °C. DNA products of extension/ligation were analyzed on a TBE-Urea gel and the results are shown in Figure 71. Addition of all reaction components resulted in almost complete ligation of the adapter molecule (Lanes 5-8).
Feasibility of DNA end-recovery by Terminal transferase (TdT) mediated ligation was tested for a sodium bisulfite converted bead bound library, and the results are shown in Figure 72. Briefly, DNA was tagmented on beads (first two lanes), treated with Promega's MethylEdge bisulfite conversion kit (lanes 3 and 4) and subjected to the DNA rescue protocol (lanes 5 and 6). There is an obvious increase in yield and size of the DNA library after the rescue reaction. There is also an increase in abundance of self inserted transposons (Sis), indicating efficient ligation of the adapter molecule.
Results of Methyl-CPTSeq assay are presented in Figure 73.
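The bisulfite step in Examples 20 and 21 relies on the standard readout in which unmethylated cytosines are converted and read as T, while methylated cytosines are protected and still read as C. The following is a minimal in-silico sketch of that readout, not the assay's analysis pipeline. The function names, the toy sequence, and the restriction to ungapped forward-strand reads are illustrative assumptions.

```python
def bisulfite_convert(seq, methylated_positions):
    """Simulate bisulfite conversion of one DNA strand.

    Unmethylated C is read as T after conversion and amplification;
    C at an index listed in methylated_positions is protected and stays C.
    """
    return "".join(
        "T" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(seq)
    )


def call_methylation(reference, read):
    """Compare a converted read with its reference, position by position.

    Returns {index: True/False} for every reference C, where True means
    the read kept a C (methylated) and False means it shows a T
    (unmethylated). Assumes an ungapped, forward-strand alignment.
    """
    calls = {}
    for i, (ref_base, read_base) in enumerate(zip(reference, read)):
        if ref_base != "C":
            continue
        if read_base == "C":
            calls[i] = True
        elif read_base == "T":
            calls[i] = False
    return calls


# Toy example (hypothetical sequence): only the C at index 1 is methylated.
reference = "ACGTCCGA"
read = bisulfite_convert(reference, methylated_positions={1})
calls = call_methylation(reference, read)
```

In a barcoded library, each call would additionally carry the bead barcode of its read, so that methylation calls from the same original molecule can be grouped together.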

Claims

What is claimed is:
1. A method of preparing a library of barcoded DNA fragments of a target nucleic acid comprising:
a. contacting a target nucleic acid with a plurality of transposome complexes, each transposome complex comprising:
transposons and transposases, wherein the transposons comprise transferred strands and non-transferred strands, wherein at least one of the transposons of the transposome complex comprises an adaptor sequence capable of hybridizing to a complementary capture sequence;
b. fragmenting the target nucleic acid into a plurality of fragments and inserting a plurality of transferred strands to the 5' end of at least one strand of the fragments while maintaining the contiguity of the target nucleic acid;
c. contacting the plurality of fragments of the target nucleic acid with a plurality of solid supports, each of the solid supports in the plurality comprising a plurality of immobilized oligonucleotides, each of the oligonucleotides comprising a complementary capture sequence and a first barcode sequence, and wherein the first barcode sequence from each solid support in the plurality of the solid supports differs from the first barcode sequence from other solid supports in the plurality of solid supports;
d. transferring the barcode sequence information to the target nucleic acid fragments, thereby producing a library of double-stranded fragments wherein at least one strand is 5'-tagged with the first barcode, wherein at least two fragments of the same target nucleic acid receive identical barcode information.
2. A method for determining contiguity information of a target nucleic acid sequence, the method comprising:
a. contacting the target nucleic acid with a plurality of transposome complexes, each transposome complex comprising:
transposons and transposases, wherein the transposons comprise transferred strands and non-transferred strands, wherein at least one of the transposons of the transposome complex comprises an adaptor sequence capable of hybridizing to a complementary capture sequence;
b. fragmenting the target nucleic acid into a plurality of fragments and inserting a plurality of transferred strands while maintaining the contiguity of the target nucleic acid;
c. contacting the plurality of fragments of the target nucleic acid with a plurality of solid supports, each of the solid supports in the plurality comprising a plurality of immobilized oligonucleotides, each of the oligonucleotides comprising a complementary capture sequence and a first barcode sequence, and wherein the first barcode sequence from each solid support in the plurality of the solid supports differs from the first barcode sequence from other solid supports in the plurality of solid supports;
d. transferring the barcode sequence information to the target nucleic acid fragments, wherein at least two fragments of the same target nucleic acid receive identical barcode information;
e. determining the sequence of the target nucleic acid fragments and the barcode sequences;
f. determining the contiguity information of the target nucleic acid by identifying the barcode sequences.
3. A method for simultaneously determining phasing information and methylation status of a target nucleic acid sequence, the method comprising:
a. contacting the target nucleic acid with a plurality of transposome complexes, each transposome complex comprising:
transposons and transposases, wherein the transposons comprise transferred strands and non-transferred strands, wherein at least one of the transposons of the transposome complex comprises an adaptor sequence capable of hybridizing to a complementary capture sequence;
b. fragmenting the target nucleic acid into a plurality of fragments and inserting a plurality of transferred strands while maintaining the contiguity of the target nucleic acid;
c. contacting the plurality of fragments of the target nucleic acid with a plurality of solid supports, each of the solid supports in the plurality comprising a plurality of immobilized oligonucleotides, each of the oligonucleotides comprising a complementary capture sequence and a first barcode sequence, and wherein the first barcode sequence from each solid support in the plurality of the solid supports differs from the first barcode sequence from other solid supports in the plurality of solid supports;
d. transferring the barcode sequence information to the target nucleic acid fragments, wherein at least two fragments of the same target nucleic acid receive identical barcode information;
e. subjecting the target nucleic acid fragments comprising barcodes to bisulfite treatment, thereby generating bisulfite treated target nucleic acid fragments comprising barcodes;
f. determining the sequence of the bisulfite treated target nucleic acid fragments and the barcode sequences;
g. determining the contiguity information of the target nucleic acid by identifying the barcode sequences,
wherein the sequence information is indicative of the methylation status of the target nucleic acid and the contiguity information is indicative of the haplotype information.
4. The method of any one of claims 1-3, wherein a single barcode sequence is present in the plurality of immobilized oligonucleotides on each individual solid support.
5. The method of any one of claims 1-3, wherein different barcode sequences are present in the plurality of immobilized oligonucleotides on each individual solid support.
6. The method of any one of claims 1-5, wherein the transferring of the barcode sequence information to the target nucleic acid fragments is by ligation.
7. The method of any one of claims 1-5, wherein the transferring of the barcode sequence information to the target nucleic acid fragments is by polymerase extension.
8. The method of any one of claims 1-5, wherein the transferring of the barcode sequence information to the target nucleic acid fragments is by both ligation and polymerase extension.
9. The method of any one of claims 7-8, wherein the polymerase extension is by extending the 3'-end of the non-ligated transposon strand with a DNA polymerase using the ligated immobilized oligonucleotide as a template.
10. The method of any one of claims 1-9, wherein at least a portion of the adaptor sequences further comprise a second barcode sequence.
11. The method of any one of claims 1-10, wherein the transposome complexes are multimeric, and wherein the adaptor sequences of the transposons of each monomeric unit are different from the other monomeric unit in the same transposome complex.
12. The method of any one of claims 1-11, wherein the adaptor sequence further comprises a first primer binding sequence.
13. The method of claim 12, wherein the first primer binding site has no sequence homology to the capture sequence or to the complement of the capture sequence.
14. The method of any one of claims 1-13, wherein the immobilized oligonucleotides on the solid support further comprise a second primer binding sequence.
15. The method of any one of claims 1-14, wherein the transposome complexes are multimeric, and wherein the transposome monomeric units are linked to each other in the same transposome complex.
16. The method of claim 15, wherein the transposase of a transposome monomeric unit is linked to another transposase of another transposome monomeric unit of the same transposome complex.
17. The method of claim 15, wherein the transposons of a transposome monomeric unit are linked to transposons of another transposome monomeric unit of the same transposome complex.
18. The method of any one of claims 1-17, wherein the contiguity information of a target nucleic acid sequence is indicative of haplotype information.
19. The method of any one of claims 1-17, wherein the contiguity information of a target nucleic acid sequence is indicative of genomic variants.
20. The method of claim 19, wherein the genomic variants are selected from the group consisting of deletions, translocations, interchromosomal gene fusions, duplications, and paralogs.
21. The method of any one of claims 1-20, wherein the oligonucleotides immobilized on the solid support comprise a partially double stranded region and a partially single stranded region.
22. The method of claim 21, wherein the partially single stranded region of the oligonucleotide comprises the second barcode sequence and the second primer binding sequence.
23. The method of any one of claims 1-22, wherein target nucleic acid fragments comprising the barcodes are amplified prior to determining the sequence of the target nucleic acid fragments.
24. The method of claim 23, wherein the steps (a)-(d) and the subsequent amplification are carried out in a single reaction compartment prior to determining the sequence of the target nucleic acid fragments.
25. The method of claim 23, wherein a third barcode sequence is introduced to the target nucleic acid fragments during the amplification.
26. The method of any one of claims 1-24, further comprising: combining the target nucleic acid fragments comprising the barcodes of step (d) from a plurality of first set of reaction compartments into a pool of target nucleic acid fragments comprising the barcodes; redistributing the pool of target nucleic acid fragments comprising the barcodes to a plurality of second set of reaction compartments; and introducing a third barcode into the target nucleic acid fragments by amplifying the target nucleic acid fragments in the second set of reaction compartments prior to sequencing.
27. The method of any one of claims 1-26, further comprising pre-fragmenting the target nucleic acid prior to contacting the target nucleic acid with transposome complexes.
28. The method of claim 27, wherein the pre-fragmenting the target nucleic acid is by a method selected from the group consisting of sonication and restriction digestion.
29. A method of preparing an immobilized library of tagged DNA fragments comprising:
(a) providing a plurality of solid supports having transposome complexes immobilized thereon, wherein the transposome complexes are multimeric and the transposome monomeric units of the same transposome complex are linked to each other, and wherein said transposome monomeric units comprise a transposase bound to a first polynucleotide, said first polynucleotide comprising
(i) a 3' portion comprising a transposon end sequence, and
(ii) a first adaptor comprising a first barcode,
(b) applying a target DNA to the plurality of solid supports under conditions whereby the target DNA is fragmented by the transposome complexes, and the 3' transposon end sequence of the first polynucleotide is transferred to a 5' end of at least one strand of the fragments; thereby producing an immobilized library of double-stranded fragments wherein at least one strand is 5'- tagged with the first barcode.
30. The method of claim 29, wherein the transposase of a transposome monomeric unit is linked to another transposase of another transposome monomeric unit of the same transposome complex.
31. The method of claim 29, wherein the transposons of a transposome monomeric unit are linked to transposons of another transposome monomeric unit of the same transposome complex.
32. A method of preparing a sequencing library for determining the methylation status of a target nucleic acid comprising:
a. fragmenting the target nucleic acid into two or more fragments;
b. incorporating a first common adaptor sequence to the 5'-end of the fragments of the target nucleic acid, wherein the adaptor sequence comprises a first primer binding sequence and an affinity moiety, wherein the affinity moiety is one member of a binding pair;
c. denaturing the target nucleic acid fragments;
d. immobilizing the target nucleic acid fragments on a solid support, wherein the solid support comprises the other member of the binding pair and the immobilization of the target nucleic acid is by binding of the binding pair;
e. subjecting the immobilized target nucleic acid fragments to bisulfite treatment;
f. incorporating a second common adaptor sequence to the bisulfite treated immobilized target nucleic acid fragments, wherein the second common adaptor comprises a second primer binding site;
g. amplifying the bisulfite treated target nucleic acid fragments immobilized on solid support using a first and a second primer, thereby producing a sequencing library for determining the methylation status of a target nucleic acid.
33. The method of claim 32, wherein the first common adaptor sequence is incorporated to the 5'-end of the fragments of the target nucleic acid by one-sided transposition.
34. The method of claim 32, wherein the first common adaptor sequence is incorporated to the 5'-end of the fragments of the target nucleic acid by ligation.
35. The method of any one of claims 32-34, wherein incorporating the second common adaptor sequence into the bisulfite treated immobilized target nucleic acid fragments comprises
(i) extending the 3'-end of the immobilized target nucleic acid fragments using terminal transferase such that the 3'-end of the immobilized target nucleic acid fragments comprises a homopolymeric tail;
(ii) hybridizing an oligonucleotide comprising a first portion and a second portion, wherein the first portion comprises a single stranded homopolymeric portion that is complementary to the homopolymeric tail of the immobilized target nucleic acid fragments, and wherein the second portion comprises a double stranded portion comprising the second common adaptor sequence;
(iii) ligating the second common adaptor sequence to the immobilized target nucleic acid fragments, thereby incorporating the second common adaptor sequence into the bisulfite treated immobilized target nucleic acid fragments.
36. The method of any one of claims 1-35, wherein the target nucleic acid is from a single cell.
37. The method of any one of claims 1-36, wherein the target nucleic acid is from a single organelle.
38. The method of any one of claims 1-37, wherein the target nucleic acid is genomic DNA.
39. The method of any one of claims 1-38, wherein the target nucleic acid is cross-linked to other nucleic acids.
40. The method of any one of claims 1-39, wherein the target nucleic acid is cell free tumor DNA.
41. The method of claim 40, wherein the cell free tumor DNA is obtained from placental fluid.
42. The method of claim 40, wherein the cell free tumor DNA is obtained from plasma.
43. The method of claim 42, wherein the plasma is collected from whole blood using a membrane separator comprising a collection zone for the plasma.
44. The method of claim 43, wherein the collection zone for the plasma comprises transposome complexes immobilized on solid support.
45. The method of any one of claims 1-37, wherein the target nucleic acid is cDNA.
46. The method of any one of claims 1-37, wherein the target nucleic acid is from a formalin-fixed paraffin-embedded tissue sample.
47. The method of any one of claims 1-37, wherein the target nucleic acid is histone protected DNA.
48. The method of any one of claims 1-47, wherein the solid support is a bead.
49. The method of any one of claims 1-31, wherein the plurality of solid supports is a plurality of beads and wherein the plurality of beads is of different sizes.
50. A method of preparing a sequencing library for determining the methylation status of a target nucleic acid comprising:
a. providing a plurality of solid supports comprising transposome complexes immobilized thereon, wherein the transposome complexes comprise transposons and transposases, wherein the transposons comprise transferred strands and non-transferred strands, the transferred strand comprising
(i) a first portion at the 3'-end comprising the transposase recognition sequence, and
(ii) a second portion located 5' to the first portion comprising a first adaptor sequence and a first member of a binding pair, wherein the first member of the binding pair binds to a second member of the binding pair on the solid support, thereby immobilizing the transposon to the solid support, and wherein the first adaptor comprises a first primer binding sequence;
the non-transferred strand comprising
(i) a first portion at the 5'-end comprising the transposase recognition sequence, and
(ii) a second portion located 3' to the first portion comprising a second adaptor sequence, wherein the terminal nucleotide at the 3'-end is blocked, and wherein the second adaptor comprises a second primer binding sequence;
b. contacting a target nucleic acid with the plurality of solid supports comprising immobilized transposome complexes;
c. fragmenting the target nucleic acid into a plurality of fragments and inserting a plurality of transferred strands to the 5' end of at least one strand of the fragments, thereby immobilizing the target nucleic acid fragments to the solid support;
d. extending the 3'-end of the fragmented target nucleic acid with a DNA polymerase;
e. ligating the non-transferred strand to the 3'-end of the fragmented target nucleic acid;
f. subjecting the immobilized target nucleic acid fragments to bisulfite treatment;
g. extending the 3'-end of the immobilized target nucleic acid fragments damaged during the bisulfite treatment by using a DNA polymerase such that the 3'-end of the immobilized target nucleic acid fragments comprises a homopolymeric tail;
h. incorporating a second adaptor sequence to the 3'-end of the immobilized target nucleic acid fragments damaged during the bisulfite treatment;
i. amplifying the bisulfite treated target nucleic acid fragments immobilized on solid support using a first and a second primer, thereby producing a sequencing library for determining the methylation status of a target nucleic acid.
51. A method of preparing a sequencing library for determining the methylation status of a target nucleic acid comprising:
a. contacting the target nucleic acid with transposome complexes, wherein the transposome complexes comprise transposons and transposases, wherein the transposons comprise transferred strands and non-transferred strands, the transferred strand comprising
(i) a first portion at the 3'-end comprising the transposase recognition sequence, and
(ii) a second portion located 5' to the first portion comprising a first adaptor sequence and a first member of a binding pair, wherein the first member of the binding pair binds to a second member of the binding pair;
the non-transferred strand comprising
(i) a first portion at the 5'-end comprising the transposase recognition sequence, and
(ii) a second portion located 3' to the first portion comprising a second adaptor sequence, wherein the terminal nucleotide at the 3'-end is blocked, and wherein the second adaptor comprises a second primer binding sequence;
b. fragmenting the target nucleic acid into a plurality of fragments and inserting a plurality of transferred strands to the 5' end of at least one strand of the fragments, thereby immobilizing the target nucleic acid fragments to the solid support;
c. contacting the target nucleic acid fragments comprising the transposon end with a plurality of solid supports comprising the second member of the binding pair, wherein binding of the first member of the binding pair to the second member of the binding pair immobilizes the target nucleic acid to the solid support;
d. extending the 3'-end of the fragmented target nucleic acid with a DNA polymerase;
e. ligating the non-transferred strand to the 3'-end of the fragmented target nucleic acid;
f. subjecting the immobilized target nucleic acid fragments to bisulfite treatment;
g. extending the 3'-end of the immobilized target nucleic acid fragments damaged during the bisulfite treatment by using a DNA polymerase such that the 3'-end of the immobilized target nucleic acid fragments comprises a homopolymeric tail;
h. incorporating a second adaptor sequence to the 3'-end of the immobilized target nucleic acid fragments damaged during the bisulfite treatment;
i. amplifying the bisulfite treated target nucleic acid fragments immobilized on solid support using a first and a second primer, thereby producing a sequencing library for determining the methylation status of a target nucleic acid.
52. The method of any one of claims 50-51, wherein the solid support is a bead.
53. The method of any one of claims 50-52, wherein the first and second members of the binding pair are biotin and streptavidin.
54. The method of any one of claims 50-51, wherein the first adaptor further comprises a barcode.
55. The method of any one of claims 50-51, wherein the second adaptor further comprises a barcode.
56. The method of any one of claims 50-51, wherein the first and second adaptors comprise first and second barcodes.
57. The method of any one of claims 50-51, wherein extending the 3'-end of the immobilized target nucleic acid fragments damaged during the bisulfite treatment is by terminal transferase.
58. The method of any one of claims 50-57, wherein the terminal nucleotide at the 3'-end of the second adaptor is blocked by a member selected from the group consisting of a dideoxy nucleotide, a phosphate group, a thiophosphate group, and an azido group.
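Steps (e) and (f) of claim 2 derive contiguity from shared barcodes: fragments of one target molecule receive the same bead barcode, so sequenced reads with the same barcode that map close together can be attributed to the same original molecule. The following is a minimal sketch of that grouping, not the claimed method's actual analysis. The read tuple layout, the function names, and the `max_gap` threshold are hypothetical choices made for illustration.

```python
from collections import defaultdict


def group_reads_by_barcode(reads):
    """Collect mapped positions per barcode.

    reads: iterable of (barcode, chromosome, position) tuples.
    """
    groups = defaultdict(list)
    for barcode, chrom, pos in reads:
        groups[barcode].append((chrom, pos))
    return groups


def linked_islands(reads, max_gap=50_000):
    """Merge same-barcode reads into islands of presumed contiguity.

    Reads sharing a barcode and chromosome are merged while consecutive
    positions lie within max_gap bases (an assumed, tunable threshold).
    Returns (barcode, chromosome, start, end, read_count) tuples.
    """
    islands = []
    for barcode, hits in sorted(group_reads_by_barcode(reads).items()):
        hits.sort()
        chrom, start = hits[0]
        end, count = start, 1
        for next_chrom, pos in hits[1:]:
            if next_chrom == chrom and pos - end <= max_gap:
                end, count = pos, count + 1
            else:
                islands.append((barcode, chrom, start, end, count))
                chrom, start, end, count = next_chrom, pos, pos, 1
        islands.append((barcode, chrom, start, end, count))
    return islands


# Toy input: two barcodes, made-up coordinates.
reads = [
    ("BC01", "chr1", 1_000),
    ("BC01", "chr1", 21_000),
    ("BC02", "chr1", 5_000),
    ("BC01", "chr1", 900_000),
    ("BC02", "chr2", 300),
]
islands = linked_islands(reads)
```

Here the two nearby BC01 reads merge into one island, while the distant BC01 read and the BC02 reads on different chromosomes remain separate. Heterozygous variants covered by reads within one island could then be assigned to the same haplotype.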
PCT/US2015/056040 2014-10-17 2015-10-16 Contiguity preserving transposition WO2016061517A2 (en)

Priority Applications (23)

Application Number Priority Date Filing Date Title
BR122021026781-2A BR122021026781B1 (en) 2014-10-17 2015-10-16 CONTIGUITY PRESERVING TRANSPOSON
BR122021026779-0A BR122021026779B1 (en) 2014-10-17 2015-10-16 CONTIGUITY PRESERVING TRANSPOSON
CA2964799A CA2964799A1 (en) 2014-10-17 2015-10-16 Contiguity preserving transposition
AU2015331739A AU2015331739B2 (en) 2014-10-17 2015-10-16 Contiguity preserving transposition
EP15797490.8A EP3207134B1 (en) 2014-10-17 2015-10-16 Contiguity preserving transposition
ES15797490T ES2745814T3 (en) 2014-10-17 2015-10-16 Conservative transposition of contiguity
CN201580068116.1A CN107969137B (en) 2014-10-17 2015-10-16 Proximity-preserving transposons
RU2017116989A RU2709655C2 (en) 2014-10-17 2015-10-16 Transposition with preservation of gene adhesion
LTEP15797490.8T LT3207134T (en) 2014-10-17 2015-10-16 Contiguity preserving transposition
BR112017007912-7A BR112017007912B1 (en) 2014-10-17 2015-10-16 CONTIGUITY PRESERVING TRANSPOSON
EP19183798.8A EP3636757A1 (en) 2014-10-17 2015-10-16 Contiguity preserving transposition
US15/519,482 US11873480B2 (en) 2014-10-17 2015-10-16 Contiguity preserving transposition
IL299976A IL299976B1 (en) 2014-10-17 2015-10-16 Contiguity preserving transposition
JP2017520884A JP6808617B2 (en) 2014-10-17 2015-10-16 Dislocations that maintain continuity
KR1020177013242A KR102472027B1 (en) 2014-10-17 2015-10-16 Contiguity preserving transposition
DK15797490.8T DK3207134T3 (en) 2014-10-17 2015-10-16 CONTIGUITY CONVERSING TRANSPOSITION
KR1020227041250A KR102643955B1 (en) 2014-10-17 2015-10-16 Contiguity preserving transposition
SG11201703139VA SG11201703139VA (en) 2014-10-17 2015-10-16 Contiguity preserving transposition
IL251737A IL251737B (en) 2014-10-17 2017-04-13 Contiguity preserving transposition
US16/173,202 US20190048332A1 (en) 2014-10-17 2018-10-29 Contiguity Preserving Transposition
IL287853A IL287853B2 (en) 2014-10-17 2021-11-04 Contiguity preserving transposition
AU2022201205A AU2022201205A1 (en) 2014-10-17 2022-02-22 Contiguity Preserving Transposition
US17/719,276 US20220282242A1 (en) 2014-10-17 2022-04-12 Contiguity Preserving Transposition

Applications Claiming Priority (6)

Application Number Priority Date Filing Date Title
US201462065544P 2014-10-17 2014-10-17
US62/065,544 2014-10-17
US201562157396P 2015-05-05 2015-05-05
US62/157,396 2015-05-05
US201562242880P 2015-10-16 2015-10-16
US62/242,880 2015-10-16

Related Child Applications (2)

Application Number Title Priority Date Filing Date
US15/519,482 A-371-Of-International US11873480B2 (en) 2014-10-17 2015-10-16 Contiguity preserving transposition
US16/173,202 Continuation US20190048332A1 (en) 2014-10-17 2018-10-29 Contiguity Preserving Transposition

Publications (2)

Publication Number Publication Date
WO2016061517A2 true WO2016061517A2 (en) 2016-04-21
WO2016061517A3 WO2016061517A3 (en) 2016-06-23

Family

ID=55747561

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2015/056040 WO2016061517A2 (en) 2014-10-17 2015-10-16 Contiguity preserving transposition

Country Status (10)

Country Link
US (3) US11873480B2 (en)
JP (3) JP6808617B2 (en)
KR (2) KR102643955B1 (en)
AU (2) AU2015331739B2 (en)
BR (2) BR122021026781B1 (en)
CA (1) CA2964799A1 (en)
IL (3) IL299976B1 (en)
RU (2) RU2709655C2 (en)
SG (2) SG11201703139VA (en)
WO (1) WO2016061517A2 (en)

Cited By (128)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016154540A1 (en) * 2015-03-26 2016-09-29 Dovetail Genomics Llc Physical linkage preservation in dna storage
WO2016189331A1 (en) * 2015-05-28 2016-12-01 Illumina Cambridge Limited Surface-based tagmentation
WO2017034970A1 (en) * 2015-08-21 2017-03-02 The General Hospital Corporation Combinatorial single molecule analysis of chromatin
WO2017120531A1 (en) 2016-01-08 2017-07-13 Bio-Rad Laboratories, Inc. Multiple beads per droplet resolution
US9715573B2 (en) 2015-02-17 2017-07-25 Dovetail Genomics, Llc Nucleic acid sequence assembly
WO2017197300A1 (en) * 2016-05-13 2017-11-16 Dovetail Genomics Llc Recovering long-range linkage information from preserved samples
US20180016632A1 (en) * 2016-07-12 2018-01-18 Kapa Biosystems, Inc. System and method for transposase-mediated amplicon sequencing
WO2018057779A1 (en) * 2016-09-23 2018-03-29 Jianbiao Zheng Compositions of synthetic transposons and methods of use thereof
WO2018118971A1 (en) * 2016-12-19 2018-06-28 Bio-Rad Laboratories, Inc. Droplet tagging contiguity preserved tagmented dna
WO2018156519A1 (en) * 2017-02-21 2018-08-30 Illumina Inc. Tagmentation using immobilized transposomes with linkers
WO2018203141A1 (en) 2017-05-05 2018-11-08 Scipio Bioscience Methods for trapping and barcoding discrete biological units in hydrogel
US20180340172A1 (en) * 2017-05-26 2018-11-29 10X Genomics, Inc. Single cell analysis of transposase accessible chromatin
US10227648B2 (en) 2012-12-14 2019-03-12 10X Genomics, Inc. Methods and systems for processing polynucleotides
EP3365445A4 (en) * 2015-10-19 2019-03-20 Dovetail Genomics, LLC Methods for genome assembly, haplotype phasing, and target independent nucleic acid detection
CN109526228A (en) * 2017-05-26 2019-03-26 10X基因组学有限公司 The chromatinic single cell analysis of transposase accessibility
US10253364B2 (en) 2012-12-14 2019-04-09 10X Genomics, Inc. Method and systems for processing polynucleotides
WO2019076768A1 (en) * 2017-10-16 2019-04-25 Tervisetehnoloogiate Arenduskeskus As Method and kit for dna library preparation
US10273541B2 (en) 2012-08-14 2019-04-30 10X Genomics, Inc. Methods and systems for processing polynucleotides
WO2019089959A1 (en) 2017-11-02 2019-05-09 Bio-Rad Laboratories, Inc. Transposase-based genomic analysis
WO2019089836A1 (en) 2017-10-31 2019-05-09 Encodia, Inc. Kits for analysis using nucleic acid encoding and/or label
US10287623B2 (en) 2014-10-29 2019-05-14 10X Genomics, Inc. Methods and compositions for targeted nucleic acid sequencing
WO2019060722A3 (en) * 2017-09-22 2019-05-16 X Gen Us Co. Methods and compositions for preparing polynucleotides
US20190169602A1 (en) * 2016-01-12 2019-06-06 Seqwell, Inc. Compositions and methods for sequencing nucleic acids
US10323279B2 (en) 2012-08-14 2019-06-18 10X Genomics, Inc. Methods and systems for processing polynucleotides
US10343166B2 (en) 2014-04-10 2019-07-09 10X Genomics, Inc. Fluidic devices, systems, and methods for encapsulating and partitioning reagents, and applications of same
US10400280B2 (en) 2012-08-14 2019-09-03 10X Genomics, Inc. Methods and systems for processing polynucleotides
US10428326B2 (en) 2017-01-30 2019-10-01 10X Genomics, Inc. Methods and systems for droplet-based single cell barcoding
WO2019213294A1 (en) * 2018-05-03 2019-11-07 Becton, Dickinson And Company High throughput multiomics sample analysis
WO2019217452A1 (en) 2018-05-08 2019-11-14 Mgi Tech Co., Ltd. Single tube bead-based dna co-barcoding for accurate and cost-effective sequencing, haplotyping, and assembly
US10526641B2 (en) 2014-08-01 2020-01-07 Dovetail Genomics, Llc Tagging nucleic acids for sequence assembly
US10533221B2 (en) 2012-12-14 2020-01-14 10X Genomics, Inc. Methods and systems for processing polynucleotides
US10550429B2 (en) 2016-12-22 2020-02-04 10X Genomics, Inc. Methods and systems for processing polynucleotides
US10557158B2 (en) 2015-01-12 2020-02-11 10X Genomics, Inc. Processes and systems for preparation of nucleic acid sequencing libraries and libraries prepared using same
WO2020041293A1 (en) 2018-08-20 2020-02-27 Bio-Rad Laboratories, Inc. Nucleotide sequence generation by barcode bead-colocalization in partitions
US10619186B2 (en) 2015-09-11 2020-04-14 Cellular Research, Inc. Methods and compositions for library normalization
US10619203B2 (en) 2009-12-15 2020-04-14 Becton, Dickinson And Company Digital counting of individual molecules by stochastic attachment of diverse labels
US10640763B2 (en) 2016-05-31 2020-05-05 Cellular Research, Inc. Molecular indexing of internal sequences
US10669570B2 (en) 2017-06-05 2020-06-02 Becton, Dickinson And Company Sample indexing for single cells
US10676789B2 (en) 2012-12-14 2020-06-09 10X Genomics, Inc. Methods and systems for processing polynucleotides
US10697000B2 (en) 2015-02-24 2020-06-30 10X Genomics, Inc. Partition processing methods and systems
WO2020144373A1 (en) * 2019-01-11 2020-07-16 Illumina Cambridge Limited Complex surface-bound transposome complexes
US10725027B2 (en) 2018-02-12 2020-07-28 10X Genomics, Inc. Methods and systems for analysis of chromatin
JP2020523011A (en) * 2017-06-07 2020-08-06 オレゴン ヘルス アンド サイエンス ユニバーシティ Single-cell whole-genome library for methylation sequencing
WO2020157684A1 (en) 2019-01-29 2020-08-06 Mgi Tech Co., Ltd. High coverage stlfr
US10745742B2 (en) 2017-11-15 2020-08-18 10X Genomics, Inc. Functionalized gel beads
EP3497219A4 (en) * 2016-08-10 2020-08-19 President and Fellows of Harvard College Methods of de novo assembly of barcoded genomic dna fragments
WO2020165433A1 (en) 2019-02-14 2020-08-20 MAX-PLANCK-Gesellschaft zur Förderung der Wissenschaften e.V. Haplotagging - haplotype phasing and single-tube combinatorial barcoding of nucleic acid molecules using bead-immobilized tn5 transposase
US10752949B2 (en) 2012-08-14 2020-08-25 10X Genomics, Inc. Methods and systems for processing polynucleotides
US10774370B2 (en) 2015-12-04 2020-09-15 10X Genomics, Inc. Methods and compositions for nucleic acid analysis
US10815525B2 (en) 2016-12-22 2020-10-27 10X Genomics, Inc. Methods and systems for processing polynucleotides
US10829815B2 (en) 2017-11-17 2020-11-10 10X Genomics, Inc. Methods and systems for associating physical and genetic properties of biological particles
US10894990B2 (en) 2016-05-17 2021-01-19 Shoreline Biome, Llc High throughput method for identification and sequencing of unknown microbial and eukaryotic genomes from complex mixtures
WO2021008805A1 (en) * 2019-07-12 2021-01-21 Illumina Cambridge Limited Compositions and methods for preparing nucleic acid sequencing libraries using crispr/cas9 immobilized on a solid support
US10927419B2 (en) 2013-08-28 2021-02-23 Becton, Dickinson And Company Massively parallel single cell analysis
US10941396B2 (en) 2012-02-27 2021-03-09 Becton, Dickinson And Company Compositions and kits for molecular counting
WO2021055864A1 (en) * 2019-09-20 2021-03-25 Illumina, Inc. Methods and compositions for identifying ligands on arrays using indexes and barcodes
US10975417B2 (en) 2016-02-23 2021-04-13 Dovetail Genomics, Llc Generation of phased read-sets for genome assembly and haplotype phasing
CN112739829A (en) * 2018-09-27 2021-04-30 深圳华大生命科学研究院 Construction method of sequencing library, sequencing library obtained by construction method and sequencing method
US10995333B2 (en) 2017-02-06 2021-05-04 10X Genomics, Inc. Systems and methods for nucleic acid preparation
US11030276B2 (en) 2013-12-16 2021-06-08 10X Genomics, Inc. Methods and apparatus for sorting data
US11081208B2 (en) 2016-02-11 2021-08-03 10X Genomics, Inc. Systems, methods, and media for de novo assembly of whole genome sequence data
US11078522B2 (en) 2012-08-14 2021-08-03 10X Genomics, Inc. Capsule array devices and methods of use
US11084036B2 (en) 2016-05-13 2021-08-10 10X Genomics, Inc. Microfluidic systems and methods of use
US11135584B2 (en) 2014-11-05 2021-10-05 10X Genomics, Inc. Instrument systems for integrated sample processing
US11155881B2 (en) 2018-04-06 2021-10-26 10X Genomics, Inc. Systems and methods for quality control in single cell processing
US11193121B2 (en) 2013-02-08 2021-12-07 10X Genomics, Inc. Partitioning and processing of analytes and other species
US11192083B2 (en) 2019-01-29 2021-12-07 Illumina, Inc. Flow cells with chambers, depressions, and capture sites
USRE48913E1 (en) 2015-02-27 2022-02-01 Becton, Dickinson And Company Spatially addressable molecular barcoding
US11274343B2 (en) 2015-02-24 2022-03-15 10X Genomics, Inc. Methods and compositions for targeted nucleic acid sequence coverage
US11318462B2 (en) 2019-01-29 2022-05-03 Illumina, Inc. Flow cells with a hydrophobic barrier
US11319583B2 (en) 2017-02-01 2022-05-03 Becton, Dickinson And Company Selective amplification using blocking oligonucleotides
US11365438B2 (en) 2017-11-30 2022-06-21 10X Genomics, Inc. Systems and methods for nucleic acid preparation and analysis
US11365409B2 (en) 2018-05-03 2022-06-21 Becton, Dickinson And Company Molecular barcoding on opposite transcript ends
US11371094B2 (en) 2015-11-19 2022-06-28 10X Genomics, Inc. Systems and methods for nucleic acid processing using degenerate nucleotides
WO2022087150A3 (en) * 2020-10-21 2022-06-30 Illumina, Inc. Sequencing templates comprising multiple inserts and compositions and methods for improving sequencing throughput
US11390914B2 (en) 2015-04-23 2022-07-19 Becton, Dickinson And Company Methods and compositions for whole transcriptome amplification
US11460468B2 (en) 2016-09-26 2022-10-04 Becton, Dickinson And Company Measurement of protein expression using reagents with barcoded oligonucleotide sequences
US11459607B1 (en) 2018-12-10 2022-10-04 10X Genomics, Inc. Systems and methods for processing-nucleic acid molecules from a single cell using sequential co-partitioning and composite barcodes
US11467153B2 (en) 2019-02-12 2022-10-11 10X Genomics, Inc. Methods for processing nucleic acid molecules
US11492660B2 (en) 2018-12-13 2022-11-08 Becton, Dickinson And Company Selective extension in single cell whole transcriptome analysis
RU2783536C2 (en) * 2017-02-21 2022-11-14 Illumina, Inc. Tagmentation using immobilized transposomes with linkers
US11525157B2 (en) 2016-05-31 2022-12-13 Becton, Dickinson And Company Error correction in amplification of samples
US11535890B2 (en) 2019-01-29 2022-12-27 Illumina, Inc. Sequencing kits
US11535889B2 (en) 2016-03-31 2022-12-27 Agilent Technologies, Inc. Use of transposase and Y adapters to fragment and tag DNA
US11535882B2 (en) 2015-03-30 2022-12-27 Becton, Dickinson And Company Methods and compositions for combinatorial barcoding
EP3870704A4 (en) * 2018-10-25 2023-01-11 Illumina, Inc. Methods and compositions for identifying ligands on arrays using indexes and barcodes
US11584954B2 (en) 2017-10-27 2023-02-21 10X Genomics, Inc. Methods and systems for sample preparation and analysis
US11584953B2 (en) 2019-02-12 2023-02-21 10X Genomics, Inc. Methods for processing nucleic acid molecules
US11591637B2 (en) 2012-08-14 2023-02-28 10X Genomics, Inc. Compositions and methods for sample processing
US11629344B2 (en) 2014-06-26 2023-04-18 10X Genomics, Inc. Methods and systems for processing polynucleotides
US11634709B2 (en) 2019-04-30 2023-04-25 Encodia, Inc. Methods for preparing analytes and related kits
US11639928B2 (en) 2018-02-22 2023-05-02 10X Genomics, Inc. Methods and systems for characterizing analytes from individual cells or cell populations
US11639517B2 (en) 2018-10-01 2023-05-02 Becton, Dickinson And Company Determining 5′ transcript sequences
US11649497B2 (en) 2020-01-13 2023-05-16 Becton, Dickinson And Company Methods and compositions for quantitation of proteins and RNA
US11655499B1 (en) 2019-02-25 2023-05-23 10X Genomics, Inc. Detection of sequence elements in nucleic acid molecules
US11661625B2 (en) 2020-05-14 2023-05-30 Becton, Dickinson And Company Primers for immune repertoire profiling
US11661631B2 (en) 2019-01-23 2023-05-30 Becton, Dickinson And Company Oligonucleotides associated with antibodies
US11703427B2 (en) 2018-06-25 2023-07-18 10X Genomics, Inc. Methods and systems for cell and bead processing
US11725231B2 (en) 2017-10-26 2023-08-15 10X Genomics, Inc. Methods and systems for nucleic acid preparation and chromatin analysis
US11739443B2 (en) 2020-11-20 2023-08-29 Becton, Dickinson And Company Profiling of highly expressed and lowly expressed proteins
US11773436B2 (en) 2019-11-08 2023-10-03 Becton, Dickinson And Company Using random priming to obtain full-length V(D)J information for immune repertoire sequencing
EP4272764A1 (en) 2022-05-03 2023-11-08 Scipio Bioscience Method of complexing biological units with particles
US11845983B1 (en) 2019-01-09 2023-12-19 10X Genomics, Inc. Methods and systems for multiplexing of droplet based assays
US11845986B2 (en) 2016-05-25 2023-12-19 Becton, Dickinson And Company Normalization of nucleic acid libraries
US11851700B1 (en) 2020-05-13 2023-12-26 10X Genomics, Inc. Methods, kits, and compositions for processing extracellular molecules
US11851683B1 (en) 2019-02-12 2023-12-26 10X Genomics, Inc. Methods and systems for selective analysis of cellular samples
EP4299803A2 (en) 2016-05-02 2024-01-03 Encodia, Inc. Macromolecule analysis employing nucleic acid encoding
US11873530B1 (en) 2018-07-27 2024-01-16 10X Genomics, Inc. Systems and methods for metabolome analysis
US11873481B2 (en) 2017-11-21 2024-01-16 Arima Genomics, Inc. Preserving spatial-proximal contiguity and molecular contiguity in nucleic acid templates
US11884964B2 (en) 2017-10-04 2024-01-30 10X Genomics, Inc. Compositions, methods, and systems for bead formation using improved polymers
US11920183B2 (en) 2019-03-11 2024-03-05 10X Genomics, Inc. Systems and methods for processing optically tagged beads
US11932849B2 (en) 2018-11-08 2024-03-19 Becton, Dickinson And Company Whole transcriptome analysis of single cells using random priming
US11932901B2 (en) 2020-07-13 2024-03-19 Becton, Dickinson And Company Target enrichment using nucleic acid probes for scRNAseq
US11932899B2 (en) 2018-06-07 2024-03-19 10X Genomics, Inc. Methods and systems for characterizing nucleic acid molecules
EP4103741A4 (en) * 2020-02-12 2024-03-20 Universal Sequencing Technology Corporation Methods for intracellular barcoding and spatial barcoding
US11939622B2 (en) 2019-07-22 2024-03-26 Becton, Dickinson And Company Single cell chromatin immunoprecipitation sequencing assay
US11952626B2 (en) 2021-02-23 2024-04-09 10X Genomics, Inc. Probe-based analysis of nucleic acids and proteins
US11981891B2 (en) 2018-05-17 2024-05-14 Illumina, Inc. High-throughput single-cell sequencing with reduced amplification bias
EP3559268B1 (en) * 2016-12-23 2024-06-12 CS Genetics Limited Methods and reagents for molecular barcoding
US12018314B2 (en) 2015-07-02 2024-06-25 Arima Genomics, Inc. Accurate molecular deconvolution of mixture samples
WO2024137703A1 (en) * 2022-12-20 2024-06-27 Illumina, Inc. Multivalent assemblies for enhanced target hybridization
US12049621B2 (en) 2018-05-10 2024-07-30 10X Genomics, Inc. Methods and systems for molecular composition generation
US12054773B2 (en) 2018-02-28 2024-08-06 10X Genomics, Inc. Transcriptome sequencing through random ligation
US12065688B2 (en) 2018-08-20 2024-08-20 10X Genomics, Inc. Compositions and methods for cellular processing
US12071617B2 (en) 2019-02-14 2024-08-27 Becton, Dickinson And Company Hybrid targeted and whole transcriptome amplification
US12071659B2 (en) 2013-03-15 2024-08-27 Complete Genomics, Inc. Multiple tagging of long DNA fragments
US12084715B1 (en) 2020-11-05 2024-09-10 10X Genomics, Inc. Methods and systems for reducing artifactual antisense products
US12104200B2 (en) 2017-12-22 2024-10-01 10X Genomics, Inc. Systems and methods for processing nucleic acid molecules from one or more cells

Families Citing this family (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9074251B2 (en) 2011-02-10 2015-07-07 Illumina, Inc. Linking sequence reads using paired code tags
JP6017458B2 (en) 2011-02-02 2016-11-02 University of Washington Through Its Center for Commercialization Massively parallel contiguity mapping
WO2014142850A1 (en) 2013-03-13 2014-09-18 Illumina, Inc. Methods and compositions for nucleic acid sequencing
US11535883B2 (en) * 2016-07-22 2022-12-27 Illumina, Inc. Single cell whole genome libraries and combinatorial indexing methods of making thereof
JP2022539630A (en) 2019-07-12 2022-09-13 Illumina Cambridge Limited Preparation of nucleic acid libraries using electrophoresis
WO2021011803A1 (en) 2019-07-16 2021-01-21 Omniome, Inc. Synthetic nucleic acids having non-natural structures
EP4018001A4 (en) * 2019-08-19 2023-09-13 Universal Sequencing Technology Corporation Methods and compositions for tracking nucleic acid fragment origin for nucleic acid sequencing
WO2021077415A1 (en) * 2019-10-25 2021-04-29 Peking University Methylation detection and analysis of mammalian dna
AU2020398353A1 (en) * 2019-12-02 2021-10-14 Illumina Cambridge Limited Time-based cluster imaging of amplified contiguity preserved library fragments of genomic DNA
CA3134746A1 (en) 2019-12-19 2021-06-24 Illumina, Inc. High-throughput single-cell libraries and methods of making and of using
US11211144B2 (en) 2020-02-18 2021-12-28 Tempus Labs, Inc. Methods and systems for refining copy number variation in a liquid biopsy assay
US11211147B2 (en) 2020-02-18 2021-12-28 Tempus Labs, Inc. Estimation of circulating tumor fraction using off-target reads of targeted-panel sequencing
US11475981B2 (en) 2020-02-18 2022-10-18 Tempus Labs, Inc. Methods and systems for dynamic variant thresholding in a liquid biopsy assay
CN112950101B (en) * 2021-05-13 2021-08-17 北京富通东方科技有限公司 Automobile factory mixed flow line production scheduling method based on yield prediction
EP4430206A1 (en) 2021-11-10 2024-09-18 Encodia, Inc. Methods for barcoding macromolecules in individual cells
WO2023230553A2 (en) * 2022-05-26 2023-11-30 Illumina, Inc. Preparation of long read nucleic acid libraries
WO2023239907A1 (en) * 2022-06-09 2023-12-14 The Regents Of The University Of California Single cell co-sequencing of dna methylation and rna
WO2024089953A1 (en) 2022-10-27 2024-05-02 Sumitomo Chemical Company, Limited Method for producing oligonucleotide

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1995023875A1 (en) 1994-03-02 1995-09-08 The Johns Hopkins University In vitro transposition of artificial transposons
WO2012108864A1 (en) 2011-02-08 2012-08-16 Illumina, Inc. Selective enrichment of nucleic acids
US20150087027A1 (en) 2012-03-13 2015-03-26 Swift Biosciences, Inc. Methods and Compositions for Size-Controlled Homopolymer Tailing of Substrate Polynucleotides by a Nucleic Acid Polymerase

Family Cites Families (117)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA1323293C (en) 1987-12-11 1993-10-19 Keith C. Backman Assay using template-dependent nucleic acid probe reorganization
GB8810400D0 (en) 1988-05-03 1988-06-08 Southern E Analysing polynucleotide sequences
CA1341584C (en) 1988-04-06 2008-11-18 Bruce Wallace Method of amplifying and detecting nucleic acid sequences
WO1989009835A1 (en) 1988-04-08 1989-10-19 The Salk Institute For Biological Studies Ligase-based amplification method
DE68927373T2 (en) 1988-06-24 1997-03-20 Amgen Inc., Thousand Oaks, Calif. METHOD AND MEANS FOR DETECTING NUCLEIC ACID SEQUENCES
US5130238A (en) 1988-06-24 1992-07-14 Cangene Corporation Enhanced nucleic acid amplification process
EP0425563B1 (en) 1988-07-20 1996-05-15 David Segev Process for amplifying and detecting nucleic acid sequences
US5185243A (en) 1988-08-25 1993-02-09 Syntex (U.S.A.) Inc. Method for detection of specific nucleic acid sequences
CA2044616A1 (en) 1989-10-26 1991-04-27 Roger Y. Tsien Dna sequencing
US5573907A (en) 1990-01-26 1996-11-12 Abbott Laboratories Detecting and amplifying target nucleic acids using exonucleolytic activity
CA2035010C (en) 1990-01-26 1996-12-10 Keith C. Backman Method of amplifying target nucleic acids applicable to both polymerase and ligase chain reactions
US5223414A (en) 1990-05-07 1993-06-29 Sri International Process for nucleic acid hybridization and amplification
US5455166A (en) 1991-01-31 1995-10-03 Becton, Dickinson And Company Strand displacement amplification
CA2182517C (en) 1994-02-07 2001-08-21 Theo Nikiforov Ligase/polymerase-mediated primer extension of single nucleotide polymorphisms and its use in genetic analysis
CA2185239C (en) 1994-03-16 2002-12-17 Nanibhushan Dattagupta Isothermal strand displacement nucleic acid amplification
US5552278A (en) 1994-04-04 1996-09-03 Spectragen, Inc. DNA sequencing by stepwise ligation and cleavage
US5641658A (en) 1994-08-03 1997-06-24 Mosaic Technologies, Inc. Method for performing amplification of nucleic acid with two primers bound to a single solid support
RU2370541C2 (en) 1994-12-09 2009-10-20 Имерджент Продакт Дивелопмент Юк Лимитед Vgc2 dna of salmonella typhimurium, mutant bacterium with reduced adaptability to environment, and method of its obtainment
US5750341A (en) 1995-04-17 1998-05-12 Lynx Therapeutics, Inc. DNA sequencing by parallel oligonucleotide extensions
US5965443A (en) 1996-09-09 1999-10-12 Wisconsin Alumni Research Foundation System for in vitro transposition
US5925545A (en) 1996-09-09 1999-07-20 Wisconsin Alumni Research Foundation System for in vitro transposition
GB9620209D0 (en) 1996-09-27 1996-11-13 Cemu Bioteknik Ab Method of sequencing DNA
US5858671A (en) 1996-11-01 1999-01-12 The University Of Iowa Research Foundation Iterative and regenerative DNA sequencing method
GB9626815D0 (en) 1996-12-23 1997-02-12 Cemu Bioteknik Ab Method of sequencing DNA
AU6846698A (en) 1997-04-01 1998-10-22 Glaxo Group Limited Method of nucleic acid amplification
JP2001517948A (en) 1997-04-01 2001-10-09 グラクソ、グループ、リミテッド Nucleic acid sequencing
FI103809B1 (en) 1997-07-14 1999-09-30 Finnzymes Oy In vitro method for producing templates for DNA sequencing
AR021833A1 (en) 1998-09-30 2002-08-07 Applied Research Systems METHODS OF AMPLIFICATION AND SEQUENCING OF NUCLEIC ACID
US20010046669A1 (en) 1999-02-24 2001-11-29 McCombie William R. Genetically filtered shotgun sequencing of complex eukaryotic genomes
US20050244870A1 (en) 1999-04-20 2005-11-03 Illumina, Inc. Nucleic acid sequencing using microsphere arrays
US6355431B1 (en) 1999-04-20 2002-03-12 Illumina, Inc. Detection of nucleic acid amplification reactions using bead arrays
US7244559B2 (en) 1999-09-16 2007-07-17 454 Life Sciences Corporation Method of sequencing a nucleic acid
US6274320B1 (en) 1999-09-16 2001-08-14 Curagen Corporation Method of sequencing a nucleic acid
KR100612551B1 (en) 1999-11-08 2006-08-11 에이켄 카가꾸 가부시끼가이샤 Method for synthesizing nucleic acid
US7955794B2 (en) 2000-09-21 2011-06-07 Illumina, Inc. Multiplex nucleic acid reactions
US7582420B2 (en) 2001-07-12 2009-09-01 Illumina, Inc. Multiplex nucleic acid reactions
US7611869B2 (en) 2000-02-07 2009-11-03 Illumina, Inc. Multiplexed methylation detection methods
US7001792B2 (en) 2000-04-24 2006-02-21 Eagle Research & Development, Llc Ultra-fast nucleic acid sequencing device and a method for making and using the same
EP1290225A4 (en) 2000-05-20 2004-09-15 Univ Michigan Method of producing a dna library using positional amplification
US20030064366A1 (en) 2000-07-07 2003-04-03 Susan Hardin Real-time sequence determination
US6846658B1 (en) 2000-10-12 2005-01-25 New England Biolabs, Inc. Method for cloning and producing the MseI restriction endonuclease
WO2002044425A2 (en) 2000-12-01 2002-06-06 Visigen Biotechnologies, Inc. Enzymatic nucleic acid synthesis: compositions and methods for altering monomer incorporation fidelity
AR031640A1 (en) 2000-12-08 2003-09-24 Applied Research Systems ISOTHERMAL AMPLIFICATION OF NUCLEIC ACIDS IN A SOLID SUPPORT
US20040110191A1 (en) 2001-01-31 2004-06-10 Winkler Matthew M. Comparative analysis of nucleic acids using population tagging
US7138267B1 (en) 2001-04-04 2006-11-21 Epicentre Technologies Corporation Methods and compositions for amplifying DNA clone copy number
US6777187B2 (en) 2001-05-02 2004-08-17 Rubicon Genomics, Inc. Genome walking by selective amplification of nick-translate DNA library and amplification from complex mixtures of templates
GB0115194D0 (en) 2001-06-21 2001-08-15 Leuven K U Res & Dev Novel technology for genetic mapping
US7057026B2 (en) 2001-12-04 2006-06-06 Solexa Limited Labelled nucleotides
US7399590B2 (en) 2002-02-21 2008-07-15 Asm Scientific, Inc. Recombinase polymerase amplification
US20040002090A1 (en) 2002-03-05 2004-01-01 Pascal Mayer Methods for detecting genome-wide sequence variations associated with a phenotype
SI3363809T1 (en) 2002-08-23 2020-08-31 Illumina Cambridge Limited Modified nucleotides for polynucleotide sequencing
US7595883B1 (en) 2002-09-16 2009-09-29 The Board Of Trustees Of The Leland Stanford Junior University Biological analysis arrangement and approach therefor
WO2004042078A1 (en) 2002-11-05 2004-05-21 The University Of Queensland Nucleotide sequence analysis by quantification of mutagenesis
US7575865B2 (en) 2003-01-29 2009-08-18 454 Life Sciences Corporation Methods of amplifying and sequencing nucleic acids
DE602004031284D1 (en) 2003-02-10 2011-03-17 Max Delbrueck Centrum TRANSPOSON SYSTEM FOR TARGETED INTEGRATION
US20050053980A1 (en) 2003-06-20 2005-03-10 Illumina, Inc. Methods and compositions for whole genome amplification and genotyping
EP1682680B2 (en) 2003-10-31 2018-03-21 AB Advanced Genetic Analysis Corporation Methods for producing a paired tag from a nucleic acid sequence and methods of use thereof
EP3175914A1 (en) 2004-01-07 2017-06-07 Illumina Cambridge Limited Improvements in or relating to molecular arrays
US7595160B2 (en) 2004-01-13 2009-09-29 U.S. Genomics, Inc. Analyte detection using barcoded polymers
WO2005100585A2 (en) 2004-03-30 2005-10-27 Epicentre Methods for obtaining directionally truncated polypeptides
US20080268507A1 (en) 2004-05-25 2008-10-30 Airbus Deutschland Gmbh Recombinant Dna Nicking Endonuclease and Uses Thereof
US7319142B1 (en) 2004-08-31 2008-01-15 Monsanto Technology Llc Nucleotide and amino acid sequences from Xenorhabdus and uses thereof
WO2006044078A2 (en) 2004-09-17 2006-04-27 Pacific Biosciences Of California, Inc. Apparatus and method for analysis of molecules
US7449297B2 (en) 2005-04-14 2008-11-11 Euclid Diagnostics Llc Methods of copying the methylation pattern of DNA during isothermal amplification and microarrays
WO2007145612A1 (en) 2005-06-06 2007-12-21 454 Life Sciences Corporation Paired end sequencing
US7709197B2 (en) 2005-06-15 2010-05-04 Callida Genomics, Inc. Nucleic acid analysis by random mixtures of non-overlapping fragments
US7405281B2 (en) 2005-09-29 2008-07-29 Pacific Biosciences Of California, Inc. Fluorescent nucleotide analogs and uses therefor
GB0522310D0 (en) 2005-11-01 2005-12-07 Solexa Ltd Methods of preparing libraries of template polynucleotides
US20070128610A1 (en) 2005-12-02 2007-06-07 Buzby Philip R Sample preparation method and apparatus for nucleic acid sequencing
WO2007087312A2 (en) 2006-01-23 2007-08-02 Population Genetics Technologies Ltd. Molecular counting
WO2007107710A1 (en) 2006-03-17 2007-09-27 Solexa Limited Isothermal methods for creating clonal single molecule arrays
CA2648149A1 (en) 2006-03-31 2007-11-01 Solexa, Inc. Systems and devices for sequence by synthesis analysis
WO2008005459A2 (en) 2006-06-30 2008-01-10 Nugen Technologies, Inc. Methods for fragmentation and labeling of nucleic acids
US7754429B2 (en) 2006-10-06 2010-07-13 Illumina Cambridge Limited Method for pair-wise sequencing a plurity of target polynucleotides
AU2007309504B2 (en) 2006-10-23 2012-09-13 Pacific Biosciences Of California, Inc. Polymerase enzymes and reagents for enhanced nucleic acid sequencing
WO2008143640A1 (en) 2006-11-07 2008-11-27 Government Of The United States Of America, As Represented By The Secretary, Department Of Health and Human Services Influenza virus nucleic acid microarray and method of use
US20080242560A1 (en) 2006-11-21 2008-10-02 Gunderson Kevin L Methods for generating amplified nucleic acid arrays
EP2639578B1 (en) 2006-12-14 2016-09-14 Life Technologies Corporation Apparatus for measuring analytes using large scale fet arrays
US8262900B2 (en) 2006-12-14 2012-09-11 Life Technologies Corporation Methods and apparatus for measuring analytes using large scale FET arrays
US8349167B2 (en) 2006-12-14 2013-01-08 Life Technologies Corporation Methods and apparatus for detecting molecular interactions using FET arrays
JP5342456B2 (en) * 2007-01-19 2013-11-13 Epigenomics AG Methods and nucleic acids for detection of cell proliferative disorders
EP2173467B1 (en) 2007-07-13 2016-05-04 The Board Of Trustees Of The Leland Stanford Junior University Method and apparatus using electric field for improved biological assays
CN101802223A (en) 2007-08-15 2010-08-11 The University of Hong Kong Methods and compositions for high-throughput bisulphite DNA sequencing and utilities
US8415099B2 (en) 2007-11-05 2013-04-09 Complete Genomics, Inc. Efficient base determination in sequencing reactions
US8852864B2 (en) 2008-01-17 2014-10-07 Sequenom Inc. Methods and compositions for the analysis of nucleic acids
JP5539325B2 (en) 2008-04-30 2014-07-02 Integrated DNA Technologies, Inc. RNase H based assay using modified RNA monomers
CA2729159C (en) 2008-06-30 2020-01-14 Bionanomatrix, Inc. Methods and devices for single-molecule whole genome analysis
US8383345B2 (en) 2008-09-12 2013-02-26 University Of Washington Sequence tag directed subassembly of short sequencing reads into long sequencing reads
US20100137143A1 (en) 2008-10-22 2010-06-03 Ion Torrent Systems Incorporated Methods and apparatus for measuring analytes
US9080211B2 (en) 2008-10-24 2015-07-14 Epicentre Technologies Corporation Transposon end compositions and methods for modifying nucleic acids
WO2010048605A1 (en) * 2008-10-24 2010-04-29 Epicentre Technologies Corporation Transposon end compositions and methods for modifying nucleic acids
WO2010099301A2 (en) 2009-02-25 2010-09-02 The Johns Hopkins University Piggybac transposon variants and methods of use
US8709717B2 (en) 2009-04-03 2014-04-29 Illumina, Inc. Generation of uniform fragments of nucleic acids using patterned substrates
EP3072968A1 (en) 2010-02-25 2016-09-28 Advanced Liquid Logic, Inc. Method of making nucleic acid libraries
US9029103B2 (en) 2010-08-27 2015-05-12 Illumina Cambridge Limited Methods for sequencing polynucleotides
JP2013535986A (en) * 2010-08-27 2013-09-19 Genentech, Inc. Nucleic acid capture and sequencing
GB2497912B (en) * 2010-10-08 2014-06-04 Harvard College High-throughput single cell barcoding
US9096899B2 (en) 2010-10-27 2015-08-04 Illumina, Inc. Microdevices and biosensor cartridges for biological or chemical analysis and systems and methods for the same
CA2821299C (en) 2010-11-05 2019-02-12 Frank J. Steemers Linking sequence reads using paired code tags
US8829171B2 (en) * 2011-02-10 2014-09-09 Illumina, Inc. Linking sequence reads using paired code tags
US9074251B2 (en) 2011-02-10 2015-07-07 Illumina, Inc. Linking sequence reads using paired code tags
US8951781B2 (en) 2011-01-10 2015-02-10 Illumina, Inc. Systems, methods, and apparatuses to image a sample for biological or chemical analysis
EP3037536B1 (en) * 2011-01-28 2019-11-27 Illumina, Inc. Oligonucleotide replacement for di-tagged and directional libraries
JP6017458B2 (en) 2011-02-02 2016-11-02 University of Washington Through Its Center for Commercialization Massively parallel contiguity mapping
EP2702175B1 (en) 2011-04-25 2018-08-08 Bio-Rad Laboratories, Inc. Methods and compositions for nucleic acid analysis
US20130017978A1 (en) 2011-07-11 2013-01-17 Finnzymes Oy Methods and transposon nucleic acids for generating a dna library
NO2694769T3 (en) 2012-03-06 2018-03-03
CA2874413A1 (en) 2012-05-21 2013-11-28 The Scripps Research Institute Methods of sample preparation
US9012022B2 (en) 2012-06-08 2015-04-21 Illumina, Inc. Polymer coatings
US8895249B2 (en) 2012-06-15 2014-11-25 Illumina, Inc. Kinetic exclusion amplification of nucleic acid libraries
US9644199B2 (en) * 2012-10-01 2017-05-09 Agilent Technologies, Inc. Immobilized transposase complexes for DNA fragmentation and tagging
US9683230B2 (en) * 2013-01-09 2017-06-20 Illumina Cambridge Limited Sample preparation on a solid support
CN108753766A (en) 2013-02-08 2018-11-06 10X Genomics, Inc. Polynucleotides bar code generating at
CN105074010B (en) 2013-03-07 2018-04-17 Sekisui Medical Co., Ltd. Method for detecting methylated DNA
WO2014142850A1 (en) 2013-03-13 2014-09-18 Illumina, Inc. Methods and compositions for nucleic acid sequencing
GB2546833B (en) 2013-08-28 2018-04-18 Cellular Res Inc Microwell for single cell analysis comprising single cell and single bead oligonucleotide capture labels
WO2015103339A1 (en) 2013-12-30 2015-07-09 Atreca, Inc. Analysis of nucleic acids associated with single cells using nucleic acid barcodes

Non-Patent Citations (20)

* Cited by examiner, † Cited by third party
Title
BOEKE; CORCES, ANNU REV MICROBIOL., vol. 43, 1989, pages 403 - 34
BROWN ET AL., PROC NATL ACAD SCI USA, vol. 86, 1989, pages 2525 - 9
COLEGIO ET AL., J. BACTERIOL., vol. 183, 2001, pages 2384 - 8
CRAIG, N L, REVIEW IN: CURR TOP MICROBIOL IMMUNOL., vol. 204, 1996, pages 27 - 48
CRAIG, N L, SCIENCE, vol. 271, 1996, pages 1512
DEVINE; BOEKE, NUCLEIC ACIDS RES., vol. 22, 1994, pages 3765 - 72
GLOOR, G B, METHODS MOL. BIOL., vol. 260, 2004, pages 97 - 114
GORYSHIN; REZNIKOFF, J. BIOL. CHEM., vol. 273, 1998, pages 7367
HOSHIKA, S; CHEN, F; LEAL, NA; BENNER, SA, ANGEW. CHEM. INT. ED., vol. 49, no. 32, 2010, pages 5554 - 5557
ICHIKAWA; OHTSUBO, J BIOL. CHEM., vol. 265, 1990, pages 18829 - 32
KIRBY C ET AL., MOL. MICROBIOL., vol. 43, 2002, pages 173 - 86
KLECKNER N ET AL., CURR TOP MICROBIOL IMMUNOL., vol. 204, 1996, pages 49 - 82
LAMPE D J ET AL., EMBO J., vol. 15, 1996, pages 5470 - 9
LIU ET AL., ANAL CHEM., vol. 85, no. 21, 5 November 2013 (2013-11-05), pages 10463 - 70
MIZUUCHI, K., CELL, vol. 35, 1983, pages 785
OHTSUBO; SEKINE, CURR. TOP. MICROBIOL. IMMUNOL., vol. 204, 1996, pages 1 - 26
PLASTERK R H, CURR. TOPICS MICROBIOL. IMMUNOL., vol. 204, 1996, pages 125 - 43
SAVILAHTI, H ET AL., EMBO J., vol. 14, 1995, pages 4893
WILSON C. ET AL., J. MICROBIOL. METHODS, vol. 71, 2007, pages 332 - 5
ZHANG ET AL., PLOS GENET, vol. 5, 16 October 2009 (2009-10-16), pages e1000689

Cited By (250)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11970737B2 (en) 2009-12-15 2024-04-30 Becton, Dickinson And Company Digital counting of individual molecules by stochastic attachment of diverse labels
US10619203B2 (en) 2009-12-15 2020-04-14 Becton, Dickinson And Company Digital counting of individual molecules by stochastic attachment of diverse labels
US12060607B2 (en) 2009-12-15 2024-08-13 Becton, Dickinson And Company Digital counting of individual molecules by stochastic attachment of diverse labels
US11993814B2 (en) 2009-12-15 2024-05-28 Becton, Dickinson And Company Digital counting of individual molecules by stochastic attachment of diverse labels
US11634708B2 (en) 2012-02-27 2023-04-25 Becton, Dickinson And Company Compositions and kits for molecular counting
US10941396B2 (en) 2012-02-27 2021-03-09 Becton, Dickinson And Company Compositions and kits for molecular counting
US12037634B2 (en) 2012-08-14 2024-07-16 10X Genomics, Inc. Capsule array devices and methods of use
US10273541B2 (en) 2012-08-14 2019-04-30 10X Genomics, Inc. Methods and systems for processing polynucleotides
US11035002B2 (en) 2012-08-14 2021-06-15 10X Genomics, Inc. Methods and systems for processing polynucleotides
US11078522B2 (en) 2012-08-14 2021-08-03 10X Genomics, Inc. Capsule array devices and methods of use
US11441179B2 (en) 2012-08-14 2022-09-13 10X Genomics, Inc. Methods and systems for processing polynucleotides
US12098423B2 (en) 2012-08-14 2024-09-24 10X Genomics, Inc. Methods and systems for processing polynucleotides
US10400280B2 (en) 2012-08-14 2019-09-03 10X Genomics, Inc. Methods and systems for processing polynucleotides
US10752950B2 (en) 2012-08-14 2020-08-25 10X Genomics, Inc. Methods and systems for processing polynucleotides
US10584381B2 (en) 2012-08-14 2020-03-10 10X Genomics, Inc. Methods and systems for processing polynucleotides
US10752949B2 (en) 2012-08-14 2020-08-25 10X Genomics, Inc. Methods and systems for processing polynucleotides
US10450607B2 (en) 2012-08-14 2019-10-22 10X Genomics, Inc. Methods and systems for processing polynucleotides
US10597718B2 (en) 2012-08-14 2020-03-24 10X Genomics, Inc. Methods and systems for sample processing polynucleotides
US10323279B2 (en) 2012-08-14 2019-06-18 10X Genomics, Inc. Methods and systems for processing polynucleotides
US11021749B2 (en) 2012-08-14 2021-06-01 10X Genomics, Inc. Methods and systems for processing polynucleotides
US11359239B2 (en) 2012-08-14 2022-06-14 10X Genomics, Inc. Methods and systems for processing polynucleotides
US10626458B2 (en) 2012-08-14 2020-04-21 10X Genomics, Inc. Methods and systems for processing polynucleotides
US10669583B2 (en) 2012-08-14 2020-06-02 10X Genomics, Inc. Method and systems for processing polynucleotides
US11591637B2 (en) 2012-08-14 2023-02-28 10X Genomics, Inc. Compositions and methods for sample processing
US10676789B2 (en) 2012-12-14 2020-06-09 10X Genomics, Inc. Methods and systems for processing polynucleotides
US11421274B2 (en) 2012-12-14 2022-08-23 10X Genomics, Inc. Methods and systems for processing polynucleotides
US10612090B2 (en) 2012-12-14 2020-04-07 10X Genomics, Inc. Methods and systems for processing polynucleotides
US10253364B2 (en) 2012-12-14 2019-04-09 10X Genomics, Inc. Method and systems for processing polynucleotides
US10227648B2 (en) 2012-12-14 2019-03-12 10X Genomics, Inc. Methods and systems for processing polynucleotides
US10533221B2 (en) 2012-12-14 2020-01-14 10X Genomics, Inc. Methods and systems for processing polynucleotides
US11473138B2 (en) 2012-12-14 2022-10-18 10X Genomics, Inc. Methods and systems for processing polynucleotides
US11193121B2 (en) 2013-02-08 2021-12-07 10X Genomics, Inc. Partitioning and processing of analytes and other species
US12071659B2 (en) 2013-03-15 2024-08-27 Complete Genomics, Inc. Multiple tagging of long DNA fragments
US11702706B2 (en) 2013-08-28 2023-07-18 Becton, Dickinson And Company Massively parallel single cell analysis
US10954570B2 (en) 2013-08-28 2021-03-23 Becton, Dickinson And Company Massively parallel single cell analysis
US10927419B2 (en) 2013-08-28 2021-02-23 Becton, Dickinson And Company Massively parallel single cell analysis
US11618929B2 (en) 2013-08-28 2023-04-04 Becton, Dickinson And Company Massively parallel single cell analysis
US11030276B2 (en) 2013-12-16 2021-06-08 10X Genomics, Inc. Methods and apparatus for sorting data
US11853389B2 (en) 2013-12-16 2023-12-26 10X Genomics, Inc. Methods and apparatus for sorting data
US10343166B2 (en) 2014-04-10 2019-07-09 10X Genomics, Inc. Fluidic devices, systems, and methods for encapsulating and partitioning reagents, and applications of same
US12005454B2 (en) 2014-04-10 2024-06-11 10X Genomics, Inc. Fluidic devices, systems, and methods for encapsulating and partitioning reagents, and applications of same
US10480028B2 (en) 2014-06-26 2019-11-19 10X Genomics, Inc. Methods and systems for processing polynucleotides
US11713457B2 (en) 2014-06-26 2023-08-01 10X Genomics, Inc. Methods and systems for processing polynucleotides
US11629344B2 (en) 2014-06-26 2023-04-18 10X Genomics, Inc. Methods and systems for processing polynucleotides
US10760124B2 (en) 2014-06-26 2020-09-01 10X Genomics, Inc. Methods and systems for processing polynucleotides
US10344329B2 (en) 2014-06-26 2019-07-09 10X Genomics, Inc. Methods and systems for processing polynucleotides
US10457986B2 (en) 2014-06-26 2019-10-29 10X Genomics, Inc. Methods and systems for processing polynucleotides
US10337061B2 (en) 2014-06-26 2019-07-02 10X Genomics, Inc. Methods and systems for processing polynucleotides
US10526641B2 (en) 2014-08-01 2020-01-07 Dovetail Genomics, Llc Tagging nucleic acids for sequence assembly
US10287623B2 (en) 2014-10-29 2019-05-14 10X Genomics, Inc. Methods and compositions for targeted nucleic acid sequencing
US11739368B2 (en) 2014-10-29 2023-08-29 10X Genomics, Inc. Methods and compositions for targeted nucleic acid sequencing
US11135584B2 (en) 2014-11-05 2021-10-05 10X Genomics, Inc. Instrument systems for integrated sample processing
US10557158B2 (en) 2015-01-12 2020-02-11 10X Genomics, Inc. Processes and systems for preparation of nucleic acid sequencing libraries and libraries prepared using same
US11414688B2 (en) 2015-01-12 2022-08-16 10X Genomics, Inc. Processes and systems for preparation of nucleic acid sequencing libraries and libraries prepared using same
US9715573B2 (en) 2015-02-17 2017-07-25 Dovetail Genomics, Llc Nucleic acid sequence assembly
US10318706B2 (en) 2015-02-17 2019-06-11 Dovetail Genomics, Llc Nucleic acid sequence assembly
US11600361B2 (en) 2015-02-17 2023-03-07 Dovetail Genomics, Llc Nucleic acid sequence assembly
US10697000B2 (en) 2015-02-24 2020-06-30 10X Genomics, Inc. Partition processing methods and systems
US11603554B2 (en) 2015-02-24 2023-03-14 10X Genomics, Inc. Partition processing methods and systems
US11274343B2 (en) 2015-02-24 2022-03-15 10X Genomics, Inc. Methods and compositions for targeted nucleic acid sequence coverage
USRE48913E1 (en) 2015-02-27 2022-02-01 Becton, Dickinson And Company Spatially addressable molecular barcoding
WO2016154540A1 (en) * 2015-03-26 2016-09-29 Dovetail Genomics Llc Physical linkage preservation in dna storage
GB2554572A (en) * 2015-03-26 2018-04-04 Dovetail Genomics Llc Physical linkage preservation in DNA storage
US11807896B2 (en) 2015-03-26 2023-11-07 Dovetail Genomics, Llc Physical linkage preservation in DNA storage
GB2554572B (en) * 2015-03-26 2021-06-23 Dovetail Genomics Llc Physical linkage preservation in DNA storage
US11535882B2 (en) 2015-03-30 2022-12-27 Becton, Dickinson And Company Methods and compositions for combinatorial barcoding
US11390914B2 (en) 2015-04-23 2022-07-19 Becton, Dickinson And Company Methods and compositions for whole transcriptome amplification
US11453875B2 (en) 2015-05-28 2022-09-27 Illumina Cambridge Limited Surface-based tagmentation
WO2016189331A1 (en) * 2015-05-28 2016-12-01 Illumina Cambridge Limited Surface-based tagmentation
US12018314B2 (en) 2015-07-02 2024-06-25 Arima Genomics, Inc. Accurate molecular deconvolution of mixture samples
US11479805B2 (en) 2015-08-21 2022-10-25 The General Hospital Corporation Combinatorial single molecule analysis of chromatin
WO2017034970A1 (en) * 2015-08-21 2017-03-02 The General Hospital Corporation Combinatorial single molecule analysis of chromatin
US10619186B2 (en) 2015-09-11 2020-04-14 Cellular Research, Inc. Methods and compositions for library normalization
US11332776B2 (en) 2015-09-11 2022-05-17 Becton, Dickinson And Company Methods and compositions for library normalization
US10457934B2 (en) 2015-10-19 2019-10-29 Dovetail Genomics, Llc Methods for genome assembly, haplotype phasing, and target independent nucleic acid detection
EP3365445A4 (en) * 2015-10-19 2019-03-20 Dovetail Genomics, LLC Methods for genome assembly, haplotype phasing, and target independent nucleic acid detection
US11371094B2 (en) 2015-11-19 2022-06-28 10X Genomics, Inc. Systems and methods for nucleic acid processing using degenerate nucleotides
US10774370B2 (en) 2015-12-04 2020-09-15 10X Genomics, Inc. Methods and compositions for nucleic acid analysis
US11873528B2 (en) 2015-12-04 2024-01-16 10X Genomics, Inc. Methods and compositions for nucleic acid analysis
US11473125B2 (en) 2015-12-04 2022-10-18 10X Genomics, Inc. Methods and compositions for nucleic acid analysis
US11624085B2 (en) 2015-12-04 2023-04-11 10X Genomics, Inc. Methods and compositions for nucleic acid analysis
WO2017120531A1 (en) 2016-01-08 2017-07-13 Bio-Rad Laboratories, Inc. Multiple beads per droplet resolution
US20190169602A1 (en) * 2016-01-12 2019-06-06 Seqwell, Inc. Compositions and methods for sequencing nucleic acids
US11081208B2 (en) 2016-02-11 2021-08-03 10X Genomics, Inc. Systems, methods, and media for de novo assembly of whole genome sequence data
US10975417B2 (en) 2016-02-23 2021-04-13 Dovetail Genomics, Llc Generation of phased read-sets for genome assembly and haplotype phasing
US11535889B2 (en) 2016-03-31 2022-12-27 Agilent Technologies, Inc. Use of transposase and Y adapters to fragment and tag DNA
US11959922B2 (en) 2016-05-02 2024-04-16 Encodia, Inc. Macromolecule analysis employing nucleic acid encoding
US12019078B2 (en) 2016-05-02 2024-06-25 Encodia, Inc. Macromolecule analysis employing nucleic acid encoding
EP4299803A2 (en) 2016-05-02 2024-01-03 Encodia, Inc. Macromolecule analysis employing nucleic acid encoding
US12019077B2 (en) 2016-05-02 2024-06-25 Encodia, Inc. Macromolecule analysis employing nucleic acid encoding
US11084036B2 (en) 2016-05-13 2021-08-10 10X Genomics, Inc. Microfluidic systems and methods of use
US10947579B2 (en) 2016-05-13 2021-03-16 Dovetail Genomics, Llc Recovering long-range linkage information from preserved samples
US12065691B2 (en) 2016-05-13 2024-08-20 Dovetail Genomics, Llc Recovering long-range linkage information from preserved samples
WO2017197300A1 (en) * 2016-05-13 2017-11-16 Dovetail Genomics Llc Recovering long-range linkage information from preserved samples
US10894990B2 (en) 2016-05-17 2021-01-19 Shoreline Biome, Llc High throughput method for identification and sequencing of unknown microbial and eukaryotic genomes from complex mixtures
US11845986B2 (en) 2016-05-25 2023-12-19 Becton, Dickinson And Company Normalization of nucleic acid libraries
US11220685B2 (en) 2016-05-31 2022-01-11 Becton, Dickinson And Company Molecular indexing of internal sequences
US11525157B2 (en) 2016-05-31 2022-12-13 Becton, Dickinson And Company Error correction in amplification of samples
US10640763B2 (en) 2016-05-31 2020-05-05 Cellular Research, Inc. Molecular indexing of internal sequences
EP3485034B1 (en) * 2016-07-12 2021-08-18 Kapa Biosystems, Inc. System and method for transposase-mediated amplicon sequencing
US20180016632A1 (en) * 2016-07-12 2018-01-18 Kapa Biosystems, Inc. System and method for transposase-mediated amplicon sequencing
EP3497219A4 (en) * 2016-08-10 2020-08-19 President and Fellows of Harvard College Methods of de novo assembly of barcoded genomic dna fragments
WO2018057779A1 (en) * 2016-09-23 2018-03-29 Jianbiao Zheng Compositions of synthetic transposons and methods of use thereof
US11467157B2 (en) 2016-09-26 2022-10-11 Becton, Dickinson And Company Measurement of protein expression using reagents with barcoded oligonucleotide sequences
US11782059B2 (en) 2016-09-26 2023-10-10 Becton, Dickinson And Company Measurement of protein expression using reagents with barcoded oligonucleotide sequences
US11460468B2 (en) 2016-09-26 2022-10-04 Becton, Dickinson And Company Measurement of protein expression using reagents with barcoded oligonucleotide sequences
US11685947B2 (en) 2016-12-19 2023-06-27 Bio-Rad Laboratories, Inc. Droplet tagging contiguity preserved tagmented DNA
CN110139932B (en) * 2016-12-19 2024-05-17 Bio-Rad Laboratories, Inc. Droplet tagging contiguity preserved tagmented DNA
WO2018118971A1 (en) * 2016-12-19 2018-06-28 Bio-Rad Laboratories, Inc. Droplet tagging contiguity preserved tagmented dna
US11021738B2 (en) 2016-12-19 2021-06-01 Bio-Rad Laboratories, Inc. Droplet tagging contiguity preserved tagmented DNA
CN110139932A (en) * 2016-12-19 2019-08-16 Bio-Rad Laboratories, Inc. Droplet tagging contiguity preserved tagmented DNA
US11180805B2 (en) 2016-12-22 2021-11-23 10X Genomics, Inc. Methods and systems for processing polynucleotides
US10954562B2 (en) 2016-12-22 2021-03-23 10X Genomics, Inc. Methods and systems for processing polynucleotides
US12084716B2 (en) 2016-12-22 2024-09-10 10X Genomics, Inc. Methods and systems for processing polynucleotides
US10858702B2 (en) 2016-12-22 2020-12-08 10X Genomics, Inc. Methods and systems for processing polynucleotides
US10815525B2 (en) 2016-12-22 2020-10-27 10X Genomics, Inc. Methods and systems for processing polynucleotides
US11248267B2 (en) 2016-12-22 2022-02-15 10X Genomics, Inc. Methods and systems for processing polynucleotides
US10793905B2 (en) 2016-12-22 2020-10-06 10X Genomics, Inc. Methods and systems for processing polynucleotides
US10550429B2 (en) 2016-12-22 2020-02-04 10X Genomics, Inc. Methods and systems for processing polynucleotides
US11732302B2 (en) 2016-12-22 2023-08-22 10X Genomics, Inc. Methods and systems for processing polynucleotides
US12110549B2 (en) 2016-12-22 2024-10-08 10X Genomics, Inc. Methods and systems for processing polynucleotides
EP4421184A2 (en) 2016-12-23 2024-08-28 CS Genetics Limited Methods and reagents for molecular barcoding
EP3559268B1 (en) * 2016-12-23 2024-06-12 CS Genetics Limited Methods and reagents for molecular barcoding
US10428326B2 (en) 2017-01-30 2019-10-01 10X Genomics, Inc. Methods and systems for droplet-based single cell barcoding
US11193122B2 (en) 2017-01-30 2021-12-07 10X Genomics, Inc. Methods and systems for droplet-based single cell barcoding
US11319583B2 (en) 2017-02-01 2022-05-03 Becton, Dickinson And Company Selective amplification using blocking oligonucleotides
US10995333B2 (en) 2017-02-06 2021-05-04 10X Genomics, Inc. Systems and methods for nucleic acid preparation
US10920219B2 (en) 2017-02-21 2021-02-16 Illumina, Inc. Tagmentation using immobilized transposomes with linkers
KR20190120056A (en) * 2017-02-21 2019-10-23 Illumina, Inc. Tagmentation using immobilized transposomes with linkers
KR102607830B1 (en) 2017-02-21 2023-12-01 Illumina, Inc. Tagmentation using immobilized transposomes with linkers
WO2018156519A1 (en) * 2017-02-21 2018-08-30 Illumina Inc. Tagmentation using immobilized transposomes with linkers
JP2022177288A (en) * 2017-02-21 2022-11-30 Illumina, Inc. Tagmentation using immobilized transposomes with linkers
CN109415758B (en) * 2017-02-21 2022-08-30 Illumina, Inc. Tagmentation using immobilized transposomes with linkers
CN109415758A (en) * 2017-02-21 2019-03-01 Illumina, Inc. Tagmentation using immobilized transposomes with linkers
RU2783536C2 (en) * 2017-02-21 2022-11-14 Illumina, Inc. Tagmentation using immobilized transposomes with linkers
JP2020510401A (en) * 2017-02-21 2020-04-09 Illumina, Inc. Tagmentation using immobilized transposomes with linkers
IL263118B2 (en) * 2017-02-21 2023-12-01 Illumina Inc Tagmentation using immobilized transposomes with linkers
EP3783112A1 (en) * 2017-02-21 2021-02-24 Illumina, Inc. Tagmentation using immobilized transposomes with linkers
US11708573B2 (en) 2017-02-21 2023-07-25 Illumina, Inc. Tagmentation using immobilized transposomes with linkers
IL263118B1 (en) * 2017-02-21 2023-08-01 Illumina Inc Tagmentation using immobilized transposomes with linkers
US12123878B2 (en) 2017-05-02 2024-10-22 Encodia, Inc. Macromolecule analysis employing nucleic acid encoding
EP4345159A2 (en) 2017-05-05 2024-04-03 Scipio Bioscience Methods for trapping and barcoding discrete biological units in hydrogel
WO2018203141A1 (en) 2017-05-05 2018-11-08 Scipio Bioscience Methods for trapping and barcoding discrete biological units in hydrogel
US11773389B2 (en) 2017-05-26 2023-10-03 10X Genomics, Inc. Single cell analysis of transposase accessible chromatin
CN109526228A (en) * 2017-05-26 2019-03-26 10X Genomics, Inc. Single cell analysis of transposase accessible chromatin
US11198866B2 (en) 2017-05-26 2021-12-14 10X Genomics, Inc. Single cell analysis of transposase accessible chromatin
CN109526228B (en) * 2017-05-26 2022-11-25 10X Genomics, Inc. Single cell analysis of transposase accessible chromatin
US10844372B2 (en) 2017-05-26 2020-11-24 10X Genomics, Inc. Single cell analysis of transposase accessible chromatin
US20180340172A1 (en) * 2017-05-26 2018-11-29 10X Genomics, Inc. Single cell analysis of transposase accessible chromatin
US10927370B2 (en) 2017-05-26 2021-02-23 10X Genomics, Inc. Single cell analysis of transposase accessible chromatin
US11155810B2 (en) 2017-05-26 2021-10-26 10X Genomics, Inc. Single cell analysis of transposase accessible chromatin
US10400235B2 (en) 2017-05-26 2019-09-03 10X Genomics, Inc. Single cell analysis of transposase accessible chromatin
US10676779B2 (en) 2017-06-05 2020-06-09 Becton, Dickinson And Company Sample indexing for single cells
US10669570B2 (en) 2017-06-05 2020-06-02 Becton, Dickinson And Company Sample indexing for single cells
US12084712B2 (en) 2017-06-05 2024-09-10 Becton, Dickinson And Company Sample indexing for single cells
JP2020523011A (en) * 2017-06-07 2020-08-06 Oregon Health & Science University Single-cell whole-genome library for methylation sequencing
JP7407597B2 (en) 2017-06-07 2024-01-04 Oregon Health & Science University Single-cell whole-genome libraries for methylation sequencing
WO2019060722A3 (en) * 2017-09-22 2019-05-16 X Gen Us Co. Methods and compositions for preparing polynucleotides
US11634750B2 (en) 2017-09-22 2023-04-25 Cygnus Biosciences (Beijing) Co., Ltd. Methods and compositions for preparing polynucleotides
US11884964B2 (en) 2017-10-04 2024-01-30 10X Genomics, Inc. Compositions, methods, and systems for bead formation using improved polymers
WO2019076768A1 (en) * 2017-10-16 2019-04-25 Tervisetehnoloogiate Arenduskeskus As Method and kit for dna library preparation
US11725231B2 (en) 2017-10-26 2023-08-15 10X Genomics, Inc. Methods and systems for nucleic acid preparation and chromatin analysis
US11584954B2 (en) 2017-10-27 2023-02-21 10X Genomics, Inc. Methods and systems for sample preparation and analysis
US11782062B2 (en) 2017-10-31 2023-10-10 Encodia, Inc. Kits for analysis using nucleic acid encoding and/or label
WO2019089836A1 (en) 2017-10-31 2019-05-09 Encodia, Inc. Kits for analysis using nucleic acid encoding and/or label
WO2019089959A1 (en) 2017-11-02 2019-05-09 Bio-Rad Laboratories, Inc. Transposase-based genomic analysis
EP4180534A1 (en) 2017-11-02 2023-05-17 Bio-Rad Laboratories, Inc. Transposase-based genomic analysis
US10745742B2 (en) 2017-11-15 2020-08-18 10X Genomics, Inc. Functionalized gel beads
US11884962B2 (en) 2017-11-15 2024-01-30 10X Genomics, Inc. Functionalized gel beads
US10876147B2 (en) 2017-11-15 2020-12-29 10X Genomics, Inc. Functionalized gel beads
US10829815B2 (en) 2017-11-17 2020-11-10 10X Genomics, Inc. Methods and systems for associating physical and genetic properties of biological particles
US11873481B2 (en) 2017-11-21 2024-01-16 Arima Genomics, Inc. Preserving spatial-proximal contiguity and molecular contiguity in nucleic acid templates
US11365438B2 (en) 2017-11-30 2022-06-21 10X Genomics, Inc. Systems and methods for nucleic acid preparation and analysis
US12104200B2 (en) 2017-12-22 2024-10-01 10X Genomics, Inc. Systems and methods for processing nucleic acid molecules from one or more cells
US11739440B2 (en) 2018-02-12 2023-08-29 10X Genomics, Inc. Methods and systems for analysis of chromatin
US11131664B2 (en) 2018-02-12 2021-09-28 10X Genomics, Inc. Methods and systems for macromolecule labeling
US11002731B2 (en) 2018-02-12 2021-05-11 10X Genomics, Inc. Methods and systems for antigen screening
US12049712B2 (en) 2018-02-12 2024-07-30 10X Genomics, Inc. Methods and systems for analysis of chromatin
US10928386B2 (en) 2018-02-12 2021-02-23 10X Genomics, Inc. Methods and systems for characterizing multiple analytes from individual cells or cell populations
US10816543B2 (en) 2018-02-12 2020-10-27 10X Genomics, Inc. Methods and systems for analysis of major histocompatability complex
US11255847B2 (en) 2018-02-12 2022-02-22 10X Genomics, Inc. Methods and systems for analysis of cell lineage
US10725027B2 (en) 2018-02-12 2020-07-28 10X Genomics, Inc. Methods and systems for analysis of chromatin
US11852628B2 (en) 2018-02-22 2023-12-26 10X Genomics, Inc. Methods and systems for characterizing analytes from individual cells or cell populations
US11639928B2 (en) 2018-02-22 2023-05-02 10X Genomics, Inc. Methods and systems for characterizing analytes from individual cells or cell populations
US12092635B2 (en) 2018-02-22 2024-09-17 10X Genomics, Inc. Methods and systems for characterizing analytes from individual cells or cell populations
US12054773B2 (en) 2018-02-28 2024-08-06 10X Genomics, Inc. Transcriptome sequencing through random ligation
US11155881B2 (en) 2018-04-06 2021-10-26 10X Genomics, Inc. Systems and methods for quality control in single cell processing
US11365409B2 (en) 2018-05-03 2022-06-21 Becton, Dickinson And Company Molecular barcoding on opposite transcript ends
EP4234717A3 (en) * 2018-05-03 2023-11-01 Becton, Dickinson and Company High throughput multiomics sample analysis
WO2019213294A1 (en) * 2018-05-03 2019-11-07 Becton, Dickinson And Company High throughput multiomics sample analysis
US11773441B2 (en) 2018-05-03 2023-10-03 Becton, Dickinson And Company High throughput multiomics sample analysis
EP4245861A3 (en) * 2018-05-08 2023-10-11 MGI Tech Co., Ltd. Single tube bead-based dna co-barcoding for accurate and cost-effective sequencing, haplotyping, and assembly
WO2019217452A1 (en) 2018-05-08 2019-11-14 Mgi Tech Co., Ltd. Single tube bead-based dna co-barcoding for accurate and cost-effective sequencing, haplotyping, and assembly
EP3790967A4 (en) * 2018-05-08 2021-12-29 MGI Tech Co., Ltd. Single tube bead-based dna co-barcoding for accurate and cost-effective sequencing, haplotyping, and assembly
US12049621B2 (en) 2018-05-10 2024-07-30 10X Genomics, Inc. Methods and systems for molecular composition generation
US11981891B2 (en) 2018-05-17 2024-05-14 Illumina, Inc. High-throughput single-cell sequencing with reduced amplification bias
US11932899B2 (en) 2018-06-07 2024-03-19 10X Genomics, Inc. Methods and systems for characterizing nucleic acid molecules
US11703427B2 (en) 2018-06-25 2023-07-18 10X Genomics, Inc. Methods and systems for cell and bead processing
US12117378B2 (en) 2018-06-25 2024-10-15 10X Genomics, Inc. Methods and systems for cell and bead processing
US11873530B1 (en) 2018-07-27 2024-01-16 10X Genomics, Inc. Systems and methods for metabolome analysis
US12065688B2 (en) 2018-08-20 2024-08-20 10X Genomics, Inc. Compositions and methods for cellular processing
WO2020041293A1 (en) 2018-08-20 2020-02-27 Bio-Rad Laboratories, Inc. Nucleotide sequence generation by barcode bead-colocalization in partitions
EP4249651A2 (en) 2018-08-20 2023-09-27 Bio-Rad Laboratories, Inc. Nucleotide sequence generation by barcode bead-colocalization in partitions
US20210214783A1 (en) * 2018-09-27 2021-07-15 Bgi Shenzhen Method for constructing sequencing library, obtained sequencing library and sequencing method
EP3859014A4 (en) * 2018-09-27 2022-04-27 BGI Shenzhen Method for constructing sequencing library, obtained sequencing library and sequencing method
CN112739829A (en) * 2018-09-27 2021-04-30 BGI Shenzhen Method for constructing sequencing library, obtained sequencing library and sequencing method
US11639517B2 (en) 2018-10-01 2023-05-02 Becton, Dickinson And Company Determining 5′ transcript sequences
EP3870704A4 (en) * 2018-10-25 2023-01-11 Illumina, Inc. Methods and compositions for identifying ligands on arrays using indexes and barcodes
US11932849B2 (en) 2018-11-08 2024-03-19 Becton, Dickinson And Company Whole transcriptome analysis of single cells using random priming
US11459607B1 (en) 2018-12-10 2022-10-04 10X Genomics, Inc. Systems and methods for processing-nucleic acid molecules from a single cell using sequential co-partitioning and composite barcodes
US11492660B2 (en) 2018-12-13 2022-11-08 Becton, Dickinson And Company Selective extension in single cell whole transcriptome analysis
US11845983B1 (en) 2019-01-09 2023-12-19 10X Genomics, Inc. Methods and systems for multiplexing of droplet based assays
RU2790295C2 (en) * 2019-01-11 2023-02-16 Illumina Cambridge Limited Complex surface-bound transposome complexes
US11685946B2 (en) 2019-01-11 2023-06-27 Illumina Cambridge Limited Complex surface-bound transposome complexes
WO2020144373A1 (en) * 2019-01-11 2020-07-16 Illumina Cambridge Limited Complex surface-bound transposome complexes
US11306348B2 (en) 2019-01-11 2022-04-19 Illumina Cambridge Limited Complex surface-bound transposome complexes
US11661631B2 (en) 2019-01-23 2023-05-30 Becton, Dickinson And Company Oligonucleotides associated with antibodies
US11535890B2 (en) 2019-01-29 2022-12-27 Illumina, Inc. Sequencing kits
US11819843B2 (en) 2019-01-29 2023-11-21 Illumina, Inc. Flow cells with a hydrophobic barrier
US11318462B2 (en) 2019-01-29 2022-05-03 Illumina, Inc. Flow cells with a hydrophobic barrier
US11779897B2 (en) 2019-01-29 2023-10-10 Illumina, Inc. Flow cells using sequencing-ready nucleic acid fragments attached to carrier beads immobilized at capture sites of a plurality of chambers
CN113366115A (en) * 2019-01-29 2021-09-07 MGI Tech Co., Ltd. High coverage STLFR
US11192083B2 (en) 2019-01-29 2021-12-07 Illumina, Inc. Flow cells with chambers, depressions, and capture sites
WO2020157684A1 (en) 2019-01-29 2020-08-06 Mgi Tech Co., Ltd. High coverage stlfr
EP3918088A4 (en) * 2019-01-29 2023-01-18 MGI Tech Co., Ltd. High coverage stlfr
US11851683B1 (en) 2019-02-12 2023-12-26 10X Genomics, Inc. Methods and systems for selective analysis of cellular samples
US11467153B2 (en) 2019-02-12 2022-10-11 10X Genomics, Inc. Methods for processing nucleic acid molecules
US11584953B2 (en) 2019-02-12 2023-02-21 10X Genomics, Inc. Methods for processing nucleic acid molecules
US12071617B2 (en) 2019-02-14 2024-08-27 Becton, Dickinson And Company Hybrid targeted and whole transcriptome amplification
WO2020165433A1 (en) 2019-02-14 2020-08-20 MAX-PLANCK-Gesellschaft zur Förderung der Wissenschaften e.V. Haplotagging - haplotype phasing and single-tube combinatorial barcoding of nucleic acid molecules using bead-immobilized tn5 transposase
US11655499B1 (en) 2019-02-25 2023-05-23 10X Genomics, Inc. Detection of sequence elements in nucleic acid molecules
US11920183B2 (en) 2019-03-11 2024-03-05 10X Genomics, Inc. Systems and methods for processing optically tagged beads
US11634709B2 (en) 2019-04-30 2023-04-25 Encodia, Inc. Methods for preparing analytes and related kits
WO2021008805A1 (en) * 2019-07-12 2021-01-21 Illumina Cambridge Limited Compositions and methods for preparing nucleic acid sequencing libraries using crispr/cas9 immobilized on a solid support
US11939622B2 (en) 2019-07-22 2024-03-26 Becton, Dickinson And Company Single cell chromatin immunoprecipitation sequencing assay
RU2825578C1 (en) * 2019-09-20 2024-08-27 Illumina, Inc. Methods and compositions for identifying ligands on arrays using indexes and barcodes
WO2021055864A1 (en) * 2019-09-20 2021-03-25 Illumina, Inc. Methods and compositions for identifying ligands on arrays using indexes and barcodes
US11773436B2 (en) 2019-11-08 2023-10-03 Becton, Dickinson And Company Using random priming to obtain full-length V(D)J information for immune repertoire sequencing
US11649497B2 (en) 2020-01-13 2023-05-16 Becton, Dickinson And Company Methods and compositions for quantitation of proteins and RNA
EP4103741A4 (en) * 2020-02-12 2024-03-20 Universal Sequencing Technology Corporation Methods for intracellular barcoding and spatial barcoding
US11851700B1 (en) 2020-05-13 2023-12-26 10X Genomics, Inc. Methods, kits, and compositions for processing extracellular molecules
US11661625B2 (en) 2020-05-14 2023-05-30 Becton, Dickinson And Company Primers for immune repertoire profiling
US11932901B2 (en) 2020-07-13 2024-03-19 Becton, Dickinson And Company Target enrichment using nucleic acid probes for scRNAseq
WO2022087150A3 (en) * 2020-10-21 2022-06-30 Illumina, Inc. Sequencing templates comprising multiple inserts and compositions and methods for improving sequencing throughput
US12084715B1 (en) 2020-11-05 2024-09-10 10X Genomics, Inc. Methods and systems for reducing artifactual antisense products
US11739443B2 (en) 2020-11-20 2023-08-29 Becton, Dickinson And Company Profiling of highly expressed and lowly expressed proteins
US11952626B2 (en) 2021-02-23 2024-04-09 10X Genomics, Inc. Probe-based analysis of nucleic acids and proteins
EP4272764A1 (en) 2022-05-03 2023-11-08 Scipio Bioscience Method of complexing biological units with particles
WO2023214353A1 (en) 2022-05-03 2023-11-09 Scipio Bioscience Method of complexing biological units with particles
WO2024137703A1 (en) * 2022-12-20 2024-06-27 Illumina, Inc. Multivalent assemblies for enhanced target hybridization

Also Published As

Publication number Publication date
IL299976B1 (en) 2024-07-01
WO2016061517A3 (en) 2016-06-23
AU2015331739B2 (en) 2021-12-02
JP7127104B2 (en) 2022-08-29
RU2019138705A3 (en) 2020-05-22
US20190040382A1 (en) 2019-02-07
IL287853B2 (en) 2023-06-01
JP6808617B2 (en) 2021-01-06
KR102643955B1 (en) 2024-03-07
KR20220162873A (en) 2022-12-08
RU2736728C2 (en) 2020-11-19
BR122021026779B1 (en) 2023-12-19
US20190048332A1 (en) 2019-02-14
IL299976A (en) 2023-03-01
AU2022201205A1 (en) 2022-03-17
RU2017116989A (en) 2018-11-19
RU2709655C2 (en) 2019-12-19
KR102472027B1 (en) 2022-11-30
IL251737B (en) 2021-12-01
JP2018501776A (en) 2018-01-25
SG11201703139VA (en) 2017-07-28
BR122021026781B1 (en) 2023-11-14
KR20170107423A (en) 2017-09-25
RU2017116989A3 (en) 2019-05-20
JP2021052779A (en) 2021-04-08
IL251737A0 (en) 2017-06-29
IL287853A (en) 2022-01-01
AU2015331739A1 (en) 2017-05-11
CA2964799A1 (en) 2016-04-21
US11873480B2 (en) 2024-01-16
RU2019138705A (en) 2020-01-27
BR112017007912A2 (en) 2018-01-23
JP2022172158A (en) 2022-11-15
SG10201903408VA (en) 2019-05-30
US20220282242A1 (en) 2022-09-08
JP7532455B2 (en) 2024-08-13

Similar Documents

Publication Publication Date Title
US20220282242A1 (en) Contiguity Preserving Transposition
EP3207134B1 (en) Contiguity preserving transposition
US11299730B2 (en) Massively parallel contiguity mapping
EP3204518B1 (en) Universal blocking oligo system and improved hybridization capture methods for multiplexed capture reactions
CN111201329A (en) High throughput single cell sequencing with reduced amplification bias
WO2013177220A1 (en) Methods of sample preparation
EP3988669A1 (en) Method for nucleic acid detection by oligo hybridization and pcr-based amplification
BR112017007912B1 (en) CONTIGUITY PRESERVING TRANSPOSON

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application. Ref document number: 15797490; Country of ref document: EP; Kind code of ref document: A2
ENP Entry into the national phase. Ref document number: 2964799; Country of ref document: CA
WWE WIPO information: entry into national phase. Ref document number: 251737; Country of ref document: IL
ENP Entry into the national phase. Ref document number: 2017520884; Country of ref document: JP; Kind code of ref document: A
WWE WIPO information: entry into national phase. Ref document number: 122021026781; Country of ref document: BR
NENP Non-entry into the national phase. Ref country code: DE
REG Reference to national code. Ref country code: BR; Ref legal event code: B01A; Ref document number: 112017007912; Country of ref document: BR
ENP Entry into the national phase. Ref document number: 2015331739; Country of ref document: AU; Date of ref document: 20151016; Kind code of ref document: A
ENP Entry into the national phase. Ref document number: 20177013242; Country of ref document: KR; Kind code of ref document: A
REEP Request for entry into the European phase. Ref document number: 2015797490; Country of ref document: EP
ENP Entry into the national phase. Ref document number: 2017116989; Country of ref document: RU; Kind code of ref document: A
WWE WIPO information: entry into national phase. Ref document number: 11201703139V; Country of ref document: SG
ENP Entry into the national phase. Ref document number: 112017007912; Country of ref document: BR; Kind code of ref document: A2; Effective date: 20170417