US20160281137A1 - Methods and compositions for targeted nucleic acid sequencing - Google Patents

Methods and compositions for targeted nucleic acid sequencing

Publication number
US20160281137A1
Authority
US
United States
Prior art keywords
fragments
regions
targeted
nucleic acid
capture
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US15/174,923
Inventor
Mirna Jarosz
Michael Schnall-Levin
Serge Saxonov
Benjamin Hindson
Xinying Zheng
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
10X Genomics Inc
Original Assignee
10X Genomics Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 10X Genomics Inc
Priority to US15/174,923
Assigned to 10X GENOMICS, INC. Assignment of assignors interest (see document for details). Assignors: JAROSZ, MIRNA; SAXONOV, SERGE; HINDSON, BENJAMIN; SCHNALL-LEVIN, MICHAEL; ZHENG, XINYING
Publication of US20160281137A1

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6806: Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay
    • C12Q1/6813: Hybridisation assays
    • C12Q1/6834: Enzymatic or biochemical coupling of nucleic acids to a solid phase
    • C12Q1/6837: Enzymatic or biochemical coupling of nucleic acids to a solid phase using probe arrays or probe chips
    • C12Q1/6869: Methods for sequencing
    • C12Q1/6874: Methods for sequencing involving nucleic acid arrays, e.g. sequencing by hybridisation
    • C12Q2535/00: Reactions characterised by the assay type for determining the identity of a nucleotide base or a sequence of oligonucleotides
    • C12Q2535/122: Massive parallel sequencing
    • C12Q2537/00: Reactions characterised by the reaction format or use of a specific feature
    • C12Q2537/10: Reactions characterised by the reaction format or use of a specific feature the purpose or use of
    • C12Q2537/159: Reduction of complexity, e.g. amplification of subsets, removing duplicated genomic regions
    • C12Q2563/00: Nucleic acid detection characterized by the use of physical, structural and functional properties
    • C12Q2563/179: Nucleic acid detection characterized by the use of physical, structural and functional properties the label being a nucleic acid
    • C12Q2565/00: Nucleic acid analysis characterised by mode or means of detection
    • C12Q2565/50: Detection characterised by immobilisation to a surface
    • C12Q2565/514: Detection characterised by immobilisation to a surface characterised by the use of the arrayed oligonucleotides as identifier tags, e.g. universal addressable array, anti-tag or tag complement array

Definitions

  • exome sequencing offers advantages over whole genome sequencing: it is significantly less expensive, is more easily understood for functional interpretation, is significantly faster to analyze, makes very deep sequencing affordable, and results in a dataset that is easier to manage.
  • the present invention provides methods, systems and compositions for obtaining sequence information for targeted regions of the genome.
  • the present disclosure provides a method for sequencing one or more selected portions of a genome, the method generally including the steps of: (a) providing starting genomic material, (b) distributing individual nucleic acid molecules from the starting genomic material into discrete partitions such that each discrete partition contains a first individual nucleic acid molecule; (c) fragmenting the individual nucleic acid molecules in the discrete partitions to form a plurality of fragments, where each of the fragments further includes a barcode, and where fragments within a given discrete partition each include a common barcode, thereby associating each fragment with the individual nucleic acid molecule from which it is derived; (d) providing a population enriched for fragments including at least a portion of the one or more selected portions of the genome; (e) obtaining sequence information from the population, thereby sequencing one or more selected portions of a genome.
  • providing the population enriched for fragments including at least a portion of the one or more selected portions of the genome includes the steps of (i) hybridizing probes complementary to regions in or near the one or more selected portions of the genome to the fragments to form probe-fragment complexes; and (ii) capturing probe-fragment complexes to a surface of a solid support; thereby enriching the population with fragments including at least a portion of the one or more selected portions of the genome.
  • the solid support includes a bead.
  • the probes include binding moieties and the surface includes capture moieties, and the probe-fragment complexes are captured on the surface through a reaction between the binding moieties and the capture moieties.
  • the capture moieties include streptavidin and the binding moieties include biotin.
  • the capture moieties comprise streptavidin magnetic beads and the binding moieties comprise biotinylated RNA library baits.
  • the methods of the invention include the use of capture moieties that are directed to whole or partial exome capture, panel capture, targeted exon capture, anchored exome capture, or tiled genomic region capture.
  • the methods disclosed herein include an obtaining step that includes a sequencing reaction.
  • the sequencing reaction is a short read-length sequencing reaction or a long read-length sequencing reaction.
  • the sequencing reaction provides sequence information on less than 90%, less than 75%, or less than 50% of the starting genomic material.
  • the methods described herein further include linking two or more of the individual nucleic acid molecules in an inferred contig based upon overlapping sequences of the isolated fragments, wherein the inferred contig comprises a length N50 of at least 10 kb, 20 kb, 40 kb, 50 kb, 100 kb, or 200 kb.
  • the methods disclosed herein further include linking two or more of the individual nucleic acid molecules in a phase block based upon overlapping phased variants within the sequences of the isolated fragments, where the phase block comprises a length N50 of at least 10 kb, of at least 20 kb, of at least 40 kb, of at least 50 kb, of at least 100 kb or of at least 200 kb.
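The N50 thresholds quoted for inferred contigs and phase blocks can be made concrete with a short calculation. The sketch below uses the conventional definition of N50 (the block length at which the running total of lengths, sorted longest first, reaches half of the summed length); the function name and the example lengths are illustrative assumptions, not values from the disclosure.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or phase block lengths (in bases).

    Conventional definition: sort lengths from longest to shortest and return
    the length at which the running total first reaches half of the grand total.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    return 0  # empty input: no blocks, so no N50


# Hypothetical phase block lengths in bases (illustrative only).
blocks = [120_000, 80_000, 50_000, 30_000, 20_000]
block_n50 = n50(blocks)
meets_50kb = block_n50 >= 50_000
```

With these example lengths, the N50 would satisfy an "at least 50 kb" threshold but not an "at least 100 kb" one.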
  • the methods disclosed herein provide sequence information from selected portions of the genome that together cover an exome.
  • the individual nucleic acid molecules in the discrete partitions include genomic DNA from a single cell.
  • the discrete partitions each include genomic DNA from a different chromosome.
  • the present disclosure provides a method of obtaining sequence information from one or more targeted portions of a genomic sample.
  • a method includes without limitation the steps of: (a) providing individual first nucleic acid fragment molecules of the genomic sample in discrete partitions; (b) fragmenting the individual first nucleic acid fragment molecules within the discrete partitions to create a plurality of second fragments from each of the individual first nucleic acid fragment molecules; (c) attaching a common barcode sequence to the plurality of the second fragments within a discrete partition, such that each of the plurality of second fragments is attributable to the discrete partition in which it is contained; (d) applying a library of probes directed to the one or more targeted portions of the genomic sample to the second fragments; (e) conducting a sequencing reaction to identify sequences of the plurality of second fragments that hybridized to the library of probes, thereby obtaining sequence information from the one or more targeted portions of the genomic sample.
  • the library of probes is attached to binding moieties, and before the conducting step (e), the second fragments are captured on a surface comprising capture moieties through a reaction between the binding moieties and the capture moieties.
  • the second fragments are amplified before or after the second fragments are captured on the surface.
  • the binding moieties comprise biotin and the capture moieties comprise streptavidin.
  • the sequencing reaction is a short read, high accuracy sequencing reaction.
  • the second fragments are amplified such that the resultant amplification products are capable of forming partial or complete hairpin structures.
  • the present disclosure provides methods for obtaining sequence information from one or more targeted portions of a genomic sample while retaining molecular context.
  • Such methods include the steps of: (a) providing starting genomic material; (b) distributing individual nucleic acid molecules from the starting genomic material into discrete partitions such that each discrete partition contains a first individual nucleic acid molecule; (c) fragmenting the first individual nucleic acid molecules in the discrete partitions to form a plurality of fragments; (d) providing a population enriched for fragments that include at least a portion of the one or more selected portions of the genome; (e) obtaining sequence information from the population, thereby sequencing one or more targeted portions of the genomic sample while retaining molecular context.
  • the plurality of fragments are tagged with a barcode to associate each fragment with the discrete partition in which it was formed.
  • the individual nucleic acid molecules in step (b) are distributed such that molecular context of each first individual nucleic acid molecule is maintained.
  • the present disclosure provides methods of obtaining sequence information from one or more targeted portions of a genomic sample.
  • Such methods include without limitation steps of (a) providing individual nucleic acid molecules of the genomic sample; (b) fragmenting the individual nucleic acid molecules to form a plurality of fragments, where each of the fragments further includes a barcode, and where fragments from the same individual nucleic acid molecule have a common barcode, thereby associating each fragment with the individual nucleic acid molecule from which it is derived; (c) enriching the plurality of fragments for fragments containing the one or more targeted portions of the genomic sample; and (d) conducting a sequencing reaction to identify sequences of the enriched plurality of fragments, thereby obtaining sequence information from the one or more targeted portions of the genomic sample.
  • the enriching step includes applying a library of probes directed to the one or more targeted portions of the genomic sample.
  • the library of probes is attached to binding moieties, and prior to the conducting step, the fragments are captured through a reaction between the binding moieties and the capture moieties.
  • the reaction between the binding moieties and the capture moieties immobilizes the fragments on a surface.
  • FIG. 1 provides a schematic illustration of identification and analysis of targeted genomic regions using conventional processes versus the processes and systems described herein.
  • FIG. 2A and FIG. 2B provide schematic illustrations of identification and analysis of targeted genomic regions using processes and systems described herein.
  • FIG. 3 illustrates a typical workflow for performing an assay to detect sequence information, using the methods and compositions disclosed herein.
  • FIG. 4 provides a schematic illustration of a process for combining a nucleic acid sample with beads and partitioning the nucleic acids and beads into discrete droplets.
  • FIG. 5 provides a schematic illustration of a process for barcoding and amplification of chromosomal nucleic acid fragments.
  • FIG. 6 provides a schematic illustration of the use of barcoding of chromosomal nucleic acid fragments in attributing sequence data to individual chromosomes.
  • FIG. 7 illustrates a general embodiment of a method of the invention.
  • FIG. 8 illustrates a general embodiment of a method of the invention.
  • the practice of the present invention may employ, unless otherwise indicated, conventional techniques and descriptions of organic chemistry, polymer technology, molecular biology (including recombinant techniques), cell biology, biochemistry, and immunology, which are within the skill of the art.
  • Such conventional techniques include polymer array synthesis, hybridization, ligation, phage display, and detection of hybridization using a label.
  • Specific illustrations of suitable techniques can be had by reference to the example herein below. However, other equivalent conventional procedures can, of course, also be used.
  • Such conventional techniques and descriptions can be found in standard laboratory manuals such as Genome Analysis: A Laboratory Manual Series (Vols.
  • “Comprising” is intended to mean that the compositions and methods include the recited elements, but do not exclude others.
  • “Consisting essentially of,” when used to define compositions and methods, shall mean excluding other elements of any essential significance to the composition or method.
  • “Consisting of” shall mean excluding more than trace elements of other ingredients for claimed compositions and substantial method steps. Embodiments defined by each of these transition terms are within the scope of this invention. Accordingly, it is intended that the methods and compositions can include additional steps and components (comprising) or alternatively including steps and compositions of no significance (consisting essentially of) or alternatively, intending only the stated method steps or compositions (consisting of).
  • the present disclosure provides methods, compositions and systems useful for characterization of genetic material.
  • the methods, compositions and systems described herein provide genetic characterization of targeted regions of a genome, including without limitation particular chromosomes, regions of chromosomes, all exons (exomes), portions of exomes, specific genes, panels of genes (e.g., kinomes or other targeted gene panels), intronic regions, tiled portions of a genome, or any other chosen portion of a genome.
  • the methods and systems described herein accomplish targeted genomic sequencing by providing for the determination of the sequence of long individual nucleic acid molecules and/or the identification of direct molecular linkage as between two sequence segments separated by long stretches of sequence, which permit the identification and use of long range sequence information, but this sequencing information is obtained using methods that have the advantages of the extremely low sequencing error rates and high throughput of short read sequencing technologies.
  • the methods and systems described herein segment long nucleic acid molecules into smaller fragments that can be sequenced using high-throughput, higher accuracy short-read sequencing technologies, and that segmentation is accomplished in a manner that allows the sequence information derived from the smaller fragments to retain the original long range molecular sequence context, i.e., allowing the attribution of shorter sequence reads to originating longer individual nucleic acid molecules.
  • By attributing sequence reads to an originating longer nucleic acid molecule one can gain significant characterization information for that longer nucleic acid sequence that one cannot generally obtain from short sequence reads alone.
  • This long range molecular context is not only preserved through a sequencing process, but is also preserved through the targeted enrichment process used in targeted sequencing approaches described herein, where no other sequencing approach has shown this ability.
  • sequence information from smaller fragments will retain the original long range molecular sequence context through the use of a tagging procedure, including the addition of barcodes as described herein and known in the art.
  • fragments originating from the same original longer individual nucleic acid molecule will be tagged with a common barcode, such that any later sequence reads from those fragments can be attributed to that originating longer individual nucleic acid molecule.
  • Such barcodes can be added using any method known in the art, including addition of barcode sequences during amplification methods that amplify segments of the individual nucleic acid molecules as well as insertion of barcodes into the original individual nucleic acid molecules using transposons, including methods such as those described in Amini et al., Nature Genetics 46: 1343-1349 (2014) (advance online publication on Oct. 29, 2014), which is hereby incorporated by reference in its entirety for all purposes and in particular for all teachings related to adding adaptor and other oligonucleotides using transposons.
  • the resultant tagged fragments can be enriched using methods described herein such that the population of fragments represents targeted regions of the genome.
  • obtaining sequence reads from that population allows for targeted sequencing of select regions of the genome, and those sequence reads can also be attributed to the originating nucleic acid molecules, thus preserving the original long range molecular sequence context.
  • sequence reads can be obtained using any sequencing methods and platforms known in the art and described herein.
  • the methods and systems described herein can also provide other characterizations of genomic material, including without limitation haplotype phasing, identification of structural variations, and identifying copy number variations, as described in co-pending applications U.S. Ser. Nos. 14/752,589 and 14/752,602 (both filed on Jun. 26, 2015), which are herein incorporated by reference in their entirety for all purposes and in particular for all written description, figures and working examples directed to characterization of genomic material.
  • nucleic acids 102 and 104 are illustrated, each having a number of regions of interest, e.g., regions 106 and 108 in nucleic acid 102, and regions 110 and 112 in nucleic acid 104.
  • the regions of interest in each nucleic acid are linked within the same nucleic acid molecule, but may be relatively separated from each other, e.g., more than 1 kb apart, more than 5 kb apart, more than 10 kb apart, more than 20 kb apart, more than 30 kb apart, more than 40 kb apart, more than 50 kb apart, and in some cases, as much as 100 kb apart.
  • the regions may denote individual genes, gene groups, exons, or simply discrete and separate parts of the genome. Solely for ease of discussion, the regions shown in FIG. 1 will be referred to as exons 106, 108, 110 and 112.
  • each nucleic acid 102 and 104 is separated into its own partition 114 and 116, respectively.
  • these partitions are, in many cases, aqueous droplets in a water in oil emulsion.
  • portions of each fragment are copied in a manner that preserves the original molecular context of those fragments, e.g., as having originated from the same molecule. As shown, this is achieved through the inclusion in each copied fragment of a barcode sequence, e.g., barcode sequence “1” or “2” as illustrated, that is representative of the droplet into which the originating fragment was partitioned.
  • target enrichment steps may be applied to the libraries of barcoded sequence fragments in order to “pull down” the sequences associated with the desired targets.
  • These may include exon targeted pull downs, gene panel specific targeted pull downs, or the like.
  • a large number of targeted pull down kits that allow for the enriched separation of specific targeted regions of the genome are commercially available, such as the Agilent SureSelect exome pull down kits, and the like.
  • application of a targeted enrichment results in enriched, barcoded sequence library 118 .
  • because the pulled down fragments within library 118 retain their original molecular context, e.g., through the retention of the barcode information, they may be reassembled into their original molecular contexts with embedded long range linkage information, e.g., with inferred linkage as between each of the assembled regions of interest 106:108 and 110:112.
  • one may identify direct molecular linkage between two disparate targeted portions of the genome, e.g., two or more exons, and that direct molecular linkage may be used to identify structural variations and other genomic characteristics, as well as to identify the phase information as to the two or more exons, e.g., providing phased exons, including potentially an entire phased exome, or other phased targeted portions of a genome.
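The inference of linkage between regions of interest from shared barcodes, as in the FIG. 1 discussion, can be sketched as a simple grouping step. This is a minimal illustration, assuming each sequence read has already been reduced to a (barcode, region) pair; the function name, the read records, and the region labels are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def linked_regions(reads):
    """Count, for each pair of targeted regions, how many barcodes cover both.

    reads: iterable of (barcode, region_id) pairs. Regions seen under the same
    barcode are inferred to come from the same originating molecule.
    """
    regions_by_barcode = defaultdict(set)
    for barcode, region in reads:
        regions_by_barcode[barcode].add(region)

    support = defaultdict(int)
    for regions in regions_by_barcode.values():
        # Every pair of regions sharing this barcode gains one unit of support.
        for pair in combinations(sorted(regions), 2):
            support[pair] += 1
    return dict(support)


# Hypothetical reads mirroring FIG. 1: barcode "1" tags fragments of nucleic
# acid 102 (exons 106 and 108), barcode "2" tags nucleic acid 104 (110 and 112).
reads = [
    ("1", "exon106"), ("1", "exon108"), ("1", "exon106"),
    ("2", "exon110"), ("2", "exon112"),
]
links = linked_regions(reads)
```

In this toy input, exons 106 and 108 are linked through barcode "1" and exons 110 and 112 through barcode "2"; no linkage is inferred across barcodes.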
  • methods of the invention include steps as illustrated in FIG. 7, which provides a schematic overview of methods of the invention discussed in further detail herein.
  • FIG. 9 provides a schematic overview of methods of the invention discussed in further detail herein.
  • the method outlined in FIG. 9 is an exemplary embodiment that may be altered or modified as needed and as described herein.
  • each partition will include a single individual nucleic acid molecule from a particular locus that is then fragmented or copied in such a way as to preserve the original molecular context of the fragments (702), usually by barcoding the fragments that are specific to the partition in which they are contained.
  • Each partition may in some examples include more than one nucleic acid, and will in some instances contain several hundred nucleic acid molecules—in situations in which multiple nucleic acids are within a partition, any particular locus of the genome will generally be represented by a single individual nucleic acid prior to barcoding.
  • oligonucleotides are co-partitioned with the samples within the distinct partitions. Such oligonucleotides may comprise random sequences intended to randomly prime numerous different regions of the samples, or they may comprise a specific primer sequence targeted to prime upstream of a targeted region of the sample. In further examples, these oligonucleotides also contain a barcode sequence, such that the replication process also barcodes the resultant replicated fragment of the original sample nucleic acid. A particularly elegant process for use of these barcode oligonucleotides in amplifying and barcoding samples is described in detail in U.S. patent application Ser. Nos.
  • Extension reaction reagents, e.g., DNA polymerase, nucleoside triphosphates, and co-factors (e.g., Mg²⁺ or Mn²⁺), that are also contained in the partitions then extend the primer sequence using the sample as a template, to produce a complementary fragment to the strand of the template to which the primer annealed, and the complementary fragment includes the oligonucleotide and its associated barcode sequence.
  • Annealing and extension of multiple primers to different portions of the sample can result in a large pool of overlapping complementary fragments of the sample, each possessing its own barcode sequence indicative of the partition in which it was created.
  • these complementary fragments may themselves be used as a template primed by the oligonucleotides present in the partition to produce a complement of the complement that again, includes the barcode sequence.
  • this replication process is configured such that when the first complement is duplicated, it produces two complementary sequences at or near its termini to allow the formation of a hairpin structure or partial hairpin structure, which reduces the ability of the molecule to be the basis for producing further iterative copies.
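The hairpin condition described above (complementary sequences at or near the two termini of a copied fragment) can be checked with a reverse-complement comparison. The sketch below is a simplified assumption: it requires an exact match over a fixed stem length at the very ends of the sequence and ignores loop length and thermodynamics. The function names and sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def can_form_hairpin(seq, stem_len):
    """True if the first stem_len bases can pair with the last stem_len bases.

    The 5' terminus pairs with the 3' terminus when it equals the reverse
    complement of the 3' terminus.
    """
    if len(seq) < 2 * stem_len:
        return False
    return seq[:stem_len] == reverse_complement(seq[-stem_len:])


# Hypothetical copied fragment: the 3' end is the reverse complement of the 5' end.
with_hairpin = "GATTACA" + "CCCCTTTTAAAA" + "TGTAATC"
# Hypothetical fragment whose termini are identical rather than complementary.
without_hairpin = "GATTACA" + "CCCC" + "GATTACA"

hairpin_ok = can_form_hairpin(with_hairpin, 7)
hairpin_bad = can_form_hairpin(without_hairpin, 7)
```

Under this simplified model the first sequence would be able to fold back on itself, while the second would not.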
  • the barcoded fragments are then pooled (703).
  • Target enrichment techniques can then be applied (704) to “pull down” the targeted regions of interest.
  • Those targeted regions of interest are then sequenced (705) and the sequences of the fragments are attributed to their originating molecular context (706), such that the targeted regions of interest are both identified and also linked with that originating molecular context.
  • a unique feature of the methods and systems described herein and illustrated in FIG. 7 is that barcodes are attached to the fragments (702) prior to the targeted enrichment step (704).
  • An advantage of the methods and systems described herein is that attaching a partition- or sample-specific barcode to the copied fragments prior to enriching the fragments for targeted genomic regions preserves the original molecular context of those targeted regions, allowing them to be attributed to their original partition and thus their originating sample nucleic acid.
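The order of operations in FIG. 7 (barcode first, enrich second) can be illustrated with a toy model in which molecules and fragments are plain coordinate intervals. This is a sketch under stated assumptions, not the disclosed chemistry: one molecule per partition, fixed-length fragmentation, and enrichment modelled as interval overlap with target regions. All names, barcodes and coordinates are hypothetical.

```python
from collections import defaultdict


def fragment_and_barcode(partitions, fragment_len=100):
    """Steps 702-703: fragment each partitioned molecule, tag every fragment
    with its partition barcode, and pool the tagged fragments."""
    pooled = []
    for barcode, (start, end) in partitions.items():
        for pos in range(start, end, fragment_len):
            pooled.append((barcode, pos, min(pos + fragment_len, end)))
    return pooled


def enrich(pooled, targets):
    """Step 704: keep only fragments that overlap at least one target region."""
    return [
        frag for frag in pooled
        if any(frag[1] < t_end and t_start < frag[2] for t_start, t_end in targets)
    ]


def attribute(enriched):
    """Step 706: group the enriched fragments by barcode, recovering which
    originating molecule each targeted fragment came from."""
    by_barcode = defaultdict(list)
    for barcode, start, end in enriched:
        by_barcode[barcode].append((start, end))
    return dict(by_barcode)


# Hypothetical input: two molecules, each alone in its own partition.
partitions = {"BC1": (0, 1000), "BC2": (5000, 6000)}
targets = [(150, 250), (820, 880), (5400, 5450)]

pooled = fragment_and_barcode(partitions)
context = attribute(enrich(pooled, targets))
```

Because the barcode travels with each fragment through the enrichment step, the two distant targets on the first molecule remain attributable to the same barcode after pull down. Had the barcodes been attached after enrichment, that association would already have been lost.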
  • targeted genomic regions are enriched, isolated or separated, i.e., “pulled down,” for further analysis, particularly sequencing, using methods that include both chip-based and solution-based capture methods.
  • Such methods utilize probes that are complementary to the genomic regions of interest or to regions near or adjacent to the genomic regions of interest.
  • in hybrid (or chip-based) capture, microarrays containing capture probes (usually single-stranded oligonucleotides) with sequences that taken together cover the region of interest are fixed to a surface.
  • Genomic DNA is fragmented and may further undergo processing such as end-repair to produce blunt ends and/or addition of additional features such as universal priming sequences. These fragments are hybridized to the probes on the microarray.
  • Unhybridized fragments are washed away and the desired fragments are eluted or otherwise processed on the surface for sequencing or other analysis, and thus the population of fragments remaining on the surface is enriched for fragments containing the targeted regions of interest (e.g., the regions comprising the sequences complementary to those contained in the capture probes).
  • the enriched population of fragments may further be amplified using any amplification technologies known in the art.
  • Additional methods of targeted genomic region capture include solution-based methods, in which genomic DNA fragments are hybridized to oligonucleotide probes.
  • the oligonucleotide probes are often referred to as “baits”.
  • These baits are generally attached to a capture molecule, including without limitation a biotin molecule.
  • the baits are complementary to targeted regions of the genome (or to regions near or adjacent to the targeted regions of interest), such that upon application to genomic DNA fragments, the baits hybridize to the fragments, and the capture molecule (e.g., biotin) is then used to selectively pull down the targeted regions of interest (for example, with magnetic streptavidin beads) to thereby enrich the resultant population of fragments with those containing the targeted regions of interest.
  • capture protocols can include any of those known in the art, including without limitation any of the exome capture protocols and kits produced by Roche/NimbleGen, Illumina, and Agilent.
  • Capture of targeted genomic regions for use in the methods and systems described herein is not limited to whole exomes, and can include any one or combination of partial exomes, genes, panels of genes, introns, and combinations of introns and exons.
  • the procedure for capture of these different types of targeted regions follows the general method of using baits to pull down fragments containing the targeted regions of interest.
  • the design of the baits, particularly the oligonucleotide probe portions of the baits that hybridize to or near to the targeted regions of interest, will in part depend on the type of targeted region to be captured.
  • the baits can be designed to capture that part of the exome.
  • the specific identities of the portions of the exome that are needed are known, and the library of baits comprises oligonucleotides that are complementary to those identified portions or to regions that are near or adjacent to those portions.
  • Such examples can further include without limitation capture of specific genes and/or panels of genes, or identified portions of the exome known to be associated with a particular phenotype, such as a disorder or disease.
  • the baits used can be subsets of a library directed to a whole genome, and that subset can be chosen randomly or through any kind of intelligent design in which the library of baits is selected or enriched for probes that are complementary to the targeted subsections of the genome or exome.
  • the targeted regions can be captured using baits that comprise oligonucleotide probes that are complementary to the whole or part of a targeted region, or the oligonucleotide probes may be complementary to another region, e.g., an intronic region, that is near the targeted region or adjacent to the targeted region.
  • a genomic sequence 201 comprises exonic regions 202 and 203 .
  • Those exonic regions can be captured by directing the baits to one or more of the intronic sequences nearby (for example intronic region 204 and/or 205 to capture exonic region 202 and intronic region 206 for capture of exonic region 203 ).
  • a population of fragments comprising exonic regions 202 or 203 can be captured through the use of baits complementary to intronic regions 204 and/or 205 and 206 .
  • the intronic region used as an intronic bait for the nearby exonic region can be adjacent to the exonic region of interest—i.e., there is no gap between the intronic region and the targeted exonic region.
  • the intronic region used to capture the nearby exonic region may be near enough so that both regions are likely to be in the same fragment, but there is a gap of one or more nucleotides between the exonic region and the intronic region (for example 202 and 205 in FIG. 2A ).
  • the baits are designed to be complementary to portions of the genome at particular ranges or distances.
  • the library of baits can be designed to cover sequences every 5 kilobases (kb) along the genome, such that applying this library of baits to a fragmented genomic sample will capture only a certain subset of the genome—i.e., those regions that are contained in fragments containing complementary sequences to the baits.
  • the baits can be designed based on a reference sequence, such as a human genome reference sequence.
  • the tiled library of baits is designed to capture regions every 1, 2, 5, 10, 15, 20, 25, 50, 100, 200, 250, 500, 750, 1000, or 10000 kilobases of a genome.
  • the tiled library of baits is designed to capture a mixture of distances—that mixture can be a random mixture of distances or intelligently designed such that a specific portion or percentage of the genome is captured.
  • tiling methods of capture will capture both intronic and exonic regions of the genome for further analysis such as sequencing. Any of the tiling or other intronic baiting methods described herein provide a way to link sequence information from exons widely separated by long intervening intronic regions.
  • The tiling or other capture methods described herein will capture about 5%, 10%, 15%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, or 95% of the whole genome. In still further examples, the capture methods described herein capture about 1-10%, 5-20%, 10-30%, 15-40%, 20-50%, 25-60%, 30-70%, 35-80%, 40-90%, or 45-95% of the whole genome.
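The tiling idea above can be sketched in a few lines. This is a minimal illustration, not the method of any particular kit: the 120-base bait length, the function names, and the simplified rule that a base is "capturable" when a single fragment of a given length could contain both that base and a complete bait are all assumptions made for the example.

```python
def tile_baits(region_length_bp, spacing_bp, bait_length_bp=120):
    """Place one bait every spacing_bp; return 0-based half-open (start, end) intervals.

    bait_length_bp=120 is an illustrative assumption, not a value from the text.
    """
    if spacing_bp <= 0 or bait_length_bp <= 0:
        raise ValueError("spacing and bait length must be positive")
    last_start = region_length_bp - bait_length_bp
    return [(s, s + bait_length_bp) for s in range(0, last_start + 1, spacing_bp)]


def capturable_fraction(region_length_bp, baits, fragment_length_bp):
    """Fraction of bases that could sit on one fragment together with a full bait."""
    intervals = []
    for start, end in baits:
        lo = max(0, end - fragment_length_bp)
        hi = min(region_length_bp, start + fragment_length_bp)
        if hi > lo:
            intervals.append((lo, hi))
    intervals.sort()

    covered = 0
    cur_lo = cur_hi = None
    for lo, hi in intervals:
        if cur_hi is None or lo > cur_hi:
            # Close the previous merged interval and start a new one.
            if cur_hi is not None:
                covered += cur_hi - cur_lo
            cur_lo, cur_hi = lo, hi
        else:
            cur_hi = max(cur_hi, hi)
    if cur_hi is not None:
        covered += cur_hi - cur_lo
    return covered / region_length_bp


baits = tile_baits(region_length_bp=20_000, spacing_bp=5_000)
fraction = capturable_fraction(20_000, baits, fragment_length_bp=1_000)
```

Widening the spacing or shortening the fragments lowers the capturable fraction, which is one way to reason about the percentage-of-genome figures listed above.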
  • sample preparation methods including methods of fragmenting, amplifying, partitioning, and otherwise processing genomic DNA, can lead to biases or lower coverage of certain regions of a genome.
  • biases or lowered coverage can be compensated for in the methods and systems disclosed herein by altering the concentration or genomic locations of baits used to capture targeted regions of the genome.
  • the library of baits can be altered to increase the concentration of baits directed to those regions of low coverage—in other words, the population of baits used may be “spiked” to ensure that a sufficient number of fragments containing targeted regions of the genome in those low coverage areas are obtained in the final population of fragments to be sequenced.
  • Such spiking of baits may be conducted in commercially available whole exome kits, such that a custom library of baits directed toward the lower coverage regions are added to off-the-shelf exome capture kits.
  • Baits can be designed to target a region of the genome that is very close to the region of interest, but has more favorable coverage, as is also discussed in further detail herein and embodiments of which are schematically illustrated in FIG. 2 .
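As a rough illustration of "spiking", the sketch below derives a relative bait concentration for each region from its observed coverage. The assumption that coverage scales linearly with bait concentration, the cap `max_fold`, and all names are hypothetical choices for the example; real capture behavior would need to be measured empirically.

```python
def spike_weights(coverage_by_region, target_coverage, max_fold=10.0):
    """Return a relative bait concentration (1.0 = unchanged) for each region.

    Regions at or above the target keep a weight of 1.0; regions below it are
    boosted in proportion to the shortfall, capped at max_fold.
    """
    weights = {}
    for region, coverage in coverage_by_region.items():
        if coverage <= 0:
            fold = max_fold  # no observed coverage: use the cap
        else:
            fold = min(max_fold, max(1.0, target_coverage / coverage))
        weights[region] = fold
    return weights


weights = spike_weights(
    {"exon_A": 30.0, "exon_B": 10.0, "exon_C": 0.0},
    target_coverage=30.0,
)
```

The resulting weights could describe a custom bait pool to add to an off-the-shelf kit, with the boosted regions making up the spiked-in portion.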
  • the library of baits used in methods of the present invention is a product of informed design that fulfills one or more characteristics as further described herein.
  • This informed design includes instances in which the library of baits is directed to informative single nucleotide polymorphisms (SNPs).
  • the term “informative SNPs” as used herein refers to SNPs that are heterozygous.
  • the library of baits in some examples is designed to contain a plurality of probes that are directed to regions of the genomic sample that contain informative SNPs. By “directed to” as used herein is meant that the probes contain sequences that are complementary to sequences that encompass the SNPs.
  • the library of baits is designed to contain probes directed to SNPs that are at predetermined distances from the boundary of an exon and an intron.
  • the library of baits includes probes that tile across such regions at a predetermined distance and/or that hybridize to the first informative SNP within the next nearest intron or exon.
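A small sketch of how informative SNPs near an exon/intron boundary might be selected for bait design. The data layout (a list of position and genotype pairs) and the function names are assumptions for illustration; "informative" follows the definition given above, i.e., heterozygous.

```python
def is_informative(genotype):
    """A SNP is informative, in the sense used above, when it is heterozygous."""
    allele_a, allele_b = genotype
    return allele_a != allele_b


def nearest_informative_snp(snps, boundary_pos, max_distance):
    """Return the position of the heterozygous SNP closest to boundary_pos.

    snps is a list of (position, (allele_a, allele_b)) pairs. Returns None if
    no heterozygous SNP lies within max_distance of the boundary.
    """
    candidates = [
        (abs(pos - boundary_pos), pos)
        for pos, genotype in snps
        if is_informative(genotype) and abs(pos - boundary_pos) <= max_distance
    ]
    return min(candidates)[1] if candidates else None


snps = [(1050, ("A", "A")), (1200, ("C", "T")), (1900, ("G", "A"))]
target = nearest_informative_snp(snps, boundary_pos=1000, max_distance=500)
```

Here the homozygous SNP at 1050 is skipped even though it is closer, because it carries no phasing information.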
  • An advantage of the methods and systems described herein is that the targeted regions that are captured are processed prior to capture in such a way that even after the steps of capturing the targeted regions and conducting sequencing analyses, the original molecular context of those targeted regions is retained.
  • the ability to attribute specific targeted regions to their original molecular context (which can include the original chromosome or chromosomal region from which they are derived and/or the location of particular targeted regions in relation to each other within the full genome) provides a way to obtain sequence information from regions of the genome that are otherwise poorly mapped or have poor coverage using traditional sequencing techniques.
  • nucleic acid molecule 207 contains two exons (shaded bars) interrupted by a long intronic region ( 208 ).
  • Generally used sequencing technologies would be unable to span the distance across the intron to provide information on the relationship of the two exons.
  • the individual nucleic acid molecule 207 is distributed into its own discrete partition 209 and then fragmented such that different fragments contain different portions of the exons and the intron. Because each of those fragments is tagged such that any sequence information obtained from the fragments is then attributable to the discrete partition in which it was generated, each fragment is thus also attributable to the individual nucleic acid molecule 207 from which it was derived.
  • fragments from different partitions are combined together. Targeted capture methods can then be used to enrich the population of fragments that undergoes further analysis, such as sequencing, with fragments containing the targeted region of interest. In the example illustrated in FIG.
  • the baits used will enrich the population of fragments to capture only those containing a portion of one of the two exons and/or part of the intervening intron, but regions outside of the exons and intron (such as 209 and 210 ) would not be captured.
  • Short read, high accuracy sequencing technologies can then be used to identify the sequences of this enriched population of fragments, and because each of the fragments is tagged and thus attributable to its original molecular context, i.e., its original individual nucleic acid molecule, the short read sequences can provide information that spans over the long length of the intervening intron to provide information on the relationship between the two exons.
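The linkage argument above can be made concrete with a toy example: short reads that carry the same barcode are attributed to the same partition, so reads from two exons that share a barcode suggest the exons sat on the same originating molecule. The read representation (barcode, region label) is an assumption for illustration, and the sketch ignores the possibility that a partition holds more than one molecule.

```python
from collections import defaultdict


def regions_linked_by_barcode(reads):
    """Group region labels by barcode and keep barcodes that span several regions.

    reads is an iterable of (barcode, region_label) pairs.
    """
    regions_by_barcode = defaultdict(set)
    for barcode, region in reads:
        regions_by_barcode[barcode].add(region)
    return {
        barcode: regions
        for barcode, regions in regions_by_barcode.items()
        if len(regions) > 1
    }


reads = [("AAAC", "exon1"), ("AAAC", "exon2"), ("GGTT", "exon1")]
linked = regions_linked_by_barcode(reads)
```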
  • individual molecular context refers to sequence context beyond the specific sequence read, e.g., relation to adjacent or proximal sequences, that are not included within the sequence read itself, and as such, will typically be such that they would not be included in whole or in part in a short sequence read, e.g., a read of about 150 bases, or about 300 bases for paired reads.
  • the methods and systems provide long range sequence context for short sequence reads.
  • Such long range context includes relationship or linkage of a given sequence read to sequence reads that are within a distance of each other of longer than 1 kb, longer than 5 kb, longer than 10 kb, longer than 15 kb, longer than 20 kb, longer than 30 kb, longer than 40 kb, longer than 50 kb, longer than 60 kb, longer than 70 kb, longer than 80 kb, longer than 90 kb or even longer than 100 kb, or longer.
  • One can also derive the phasing information of variants within that individual molecular context, e.g., variants on a particular long molecule will be, by definition, commonly phased.
  • Sequence context can include mapping or providing linkage of fragments across different (generally on the kilobase scale) ranges of full genomic sequence.
  • This includes mapping the short sequence reads to the individual longer molecules or contigs of linked molecules, as well as long range sequencing of large portions of the longer individual molecules, e.g., having contiguous determined sequences of individual molecules where such determined sequences are longer than 1 kb, longer than 5 kb, longer than 10 kb, longer than 15 kb, longer than 20 kb, longer than 30 kb, longer than 40 kb, longer than 50 kb, longer than 60 kb, longer than 70 kb, longer than 80 kb, longer than 90 kb or even longer than 100 kb.
  • the attribution of short sequences to longer nucleic acids may include both mapping of short sequences against longer nucleic acid stretches to provide high level sequence context, as well as providing assembled sequences from the short sequences through these longer nucleic acids.
  • In addition to the long range sequence context associated with long individual molecules, having such long range sequence context also allows one to infer even longer range sequence context.
  • Using the long range molecular context described above, one can identify overlapping variant portions, e.g., phased variants, translocated sequences, etc., among long sequences from different originating molecules, allowing the inferred linkage between those molecules.
  • Such inferred linkages or molecular contexts are referred to herein as “inferred contigs”.
  • The inferred contigs may represent commonly phased sequences, e.g., where by virtue of overlapping phased variants, one can infer a phased contig of substantially greater length than the individual originating molecules.
  • These phased contigs are referred to herein as “phase blocks”.
  • By starting with longer single molecule reads (e.g., the “long virtual single molecule reads” discussed above), one can derive longer inferred contigs or phase blocks than would otherwise be attainable using short read sequencing technologies or other approaches to phased sequencing. See, e.g., published U.S. Patent Application No. 2013-0157870.
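One way to picture how overlapping phased variants join molecules into phase blocks is a union-find pass over the molecules. This is a simplified sketch under stated assumptions: each molecule is represented as a mapping from variant position to observed allele, two molecules are merged whenever they share at least one (position, allele) pair, and sequencing errors are ignored. Practical phasing tools are considerably more careful.

```python
def build_phase_blocks(molecules):
    """Merge molecules that share a (position, allele) pair into phase blocks.

    molecules maps a molecule id to {variant_position: allele}.
    Returns a sorted list of blocks, each a sorted list of molecule ids.
    """
    parent = {mol: mol for mol in molecules}

    def find(x):
        # Walk to the root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    first_carrier = {}  # (position, allele) -> first molecule seen carrying it
    for mol, variants in molecules.items():
        for key in variants.items():
            if key in first_carrier:
                parent[find(mol)] = find(first_carrier[key])
            else:
                first_carrier[key] = mol

    blocks = {}
    for mol in molecules:
        blocks.setdefault(find(mol), set()).add(mol)
    return sorted(sorted(block) for block in blocks.values())


molecules = {
    "m1": {100: "A", 200: "C"},
    "m2": {200: "C", 300: "G"},  # shares allele C at position 200 with m1
    "m3": {200: "T", 300: "A"},  # other haplotype at the same positions
    "m4": {900: "G"},            # no overlap with any other molecule
}
blocks = build_phase_blocks(molecules)
```

Molecules m1 and m2 end up in one block because they agree at position 200, while m3 carries the opposite alleles and stays separate.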
  • Inferred contig or phase block lengths having an N50 (where the sum of the block lengths that are greater than the stated N50 number is 50% of the sum of all block lengths) of at least about 100 kb, at least about 150 kb, at least about 200 kb, and in many cases, at least about 250 kb, at least about 300 kb, at least about 350 kb, at least about 400 kb, and in some cases, at least about 500 kb or more, are attained.
  • maximum phase block lengths in excess of 200 kb, in excess of 300 kb, in excess of 400 kb, in excess of 500 kb, in excess of 1 Mb, or even in excess of 2 Mb may be obtained.
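The N50 statistic referred to above can be computed as follows. The sketch uses the common convention: sort block lengths from longest to shortest and report the length at which the running total first reaches half of the sum of all lengths. The function name is illustrative.

```python
def n50(block_lengths):
    """Return the N50 of a collection of block lengths."""
    if not block_lengths:
        raise ValueError("need at least one block length")
    total = sum(block_lengths)
    running = 0
    for length in sorted(block_lengths, reverse=True):
        running += length
        if 2 * running >= total:  # running total has reached 50% of the sum
            return length


# Lengths in kb; total is 1200, and 500 + 300 first exceeds 600.
example_n50 = n50([500, 300, 200, 100, 100])
```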
  • the methods and systems described herein provide for the compartmentalization, depositing or partitioning of sample nucleic acids, or fragments thereof, into discrete compartments or partitions (referred to interchangeably herein as partitions), where each partition maintains separation of its own contents from the contents of other partitions.
  • Unique identifiers may be previously, subsequently or concurrently delivered to the partitions that hold the compartmentalized or partitioned sample nucleic acids, in order to allow for the later attribution of the characteristics, e.g., nucleic acid sequence information, to the sample nucleic acids included within a particular compartment, and particularly to relatively long stretches of contiguous sample nucleic acids that may be originally deposited into the partitions.
  • sample nucleic acids utilized in the methods described herein typically represent a number of overlapping portions of the overall sample to be analyzed, e.g., an entire chromosome, exome, or other large genomic portion. These sample nucleic acids may include whole genomes, individual chromosomes, exomes, amplicons, or any of a variety of different nucleic acids of interest.
  • the sample nucleic acids are typically partitioned such that the nucleic acids are present in the partitions in relatively long fragments or stretches of contiguous nucleic acid molecules.
  • these fragments of the sample nucleic acids may be longer than 1 kb, longer than 5 kb, longer than 10 kb, longer than 15 kb, longer than 20 kb, longer than 30 kb, longer than 40 kb, longer than 50 kb, longer than 60 kb, longer than 70 kb, longer than 80 kb, longer than 90 kb or even longer than 100 kb, which permits the longer range molecular context described above.
  • the sample nucleic acids are also typically partitioned at a level whereby a given partition has a very low probability of including two overlapping fragments of the starting sample nucleic acid. This is typically accomplished by providing the sample nucleic acid at a low input amount and/or concentration during the partitioning process. As a result, in preferred cases, a given partition may include a number of long, but non-overlapping fragments of the starting sample nucleic acids.
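To see why low input makes overlapping fragments within a partition unlikely, the following back-of-the-envelope sketch applies a Poisson approximation. The model is an assumption made for illustration: fragments are taken to land at uniformly random genomic positions, the genome length of 3.2e9 bases is an approximate human haploid figure, and the per-partition fragment count and fragment length are made-up inputs.

```python
import math


def prob_locus_covered_twice(fragments_per_partition, mean_fragment_bp, genome_bp):
    """Poisson estimate that a given locus is covered by 2+ fragments in one partition.

    c is the expected number of fragments covering the locus; the result is
    1 - exp(-c) * (1 + c), written with expm1 for accuracy at small c.
    """
    c = fragments_per_partition * mean_fragment_bp / genome_bp
    return -math.expm1(-c) - c * math.exp(-c)


# Hypothetical inputs: 10 fragments of 50 kb per partition, 3.2 Gb genome.
p_overlap = prob_locus_covered_twice(10, 50_000, 3.2e9)
```

For small c the probability behaves like c squared over two, so halving the input per partition cuts the chance of overlap roughly fourfold.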
  • the sample nucleic acids in the different partitions are then associated with unique identifiers, where for any given partition, nucleic acids contained therein possess the same unique identifier, but where different partitions may include different unique identifiers.
  • Because the partitioning step allocates the sample components into very small volume partitions or droplets, it will be appreciated that in order to achieve the desired allocation as set forth above, one need not conduct substantial dilution of the sample, as would be required in higher volume processes, e.g., in tubes, or wells of a multiwell plate. Further, because the systems described herein employ such high levels of barcode diversity, one can allocate diverse barcodes among higher numbers of genomic equivalents, as provided above. In particular, previously described multiwell plate approaches (see, e.g., U.S. Published Application Nos. 2013-0079231 and 2013-0157870) typically only operate with a hundred to a few hundred different barcode sequences, and employ a limiting dilution process of their sample in order to be able to attribute barcodes to different cells/nucleic acids. As such, they will generally operate with far fewer than 100 cells, which would typically provide a ratio of genomes:(barcode type) on the order of 1:10, and certainly well above 1:100.
  • The systems described herein, with their diverse barcode types, can operate at genome:(barcode type) ratios that are on the order of 1:50 or less, 1:100 or less, 1:1000 or less, or even smaller ratios, while also allowing for loading higher numbers of genomes (e.g., on the order of greater than 100 genomes per assay, greater than 500 genomes per assay, 1000 genomes per assay, or even more) while still providing for far improved barcode diversity per genome.
  • the sample is combined with a set of oligonucleotide tags that are releasably-attached to beads prior to the partitioning step. That combination can then lead to barcoding of nucleic acids in the samples using methods known in the art and described herein.
  • amplification methods are used to add barcodes to the resultant amplification products, which in some examples contain smaller segments (fragments) of the full originating nucleic acid molecule from which they are derived.
  • methods using transposons are utilized as described in Amini et al, Nature Genetics 46: 1343-1349 (2014) (advance online publication on Oct.
  • methods of attaching barcodes can include the use of nicking enzymes or polymerases and/or invasive probes such as recA to produce gaps along double stranded sample nucleic acids—barcodes can then be inserted into those gaps.
  • the oligonucleotide tags may comprise at least a first and second region.
  • The first region may be a barcode region that, as between oligonucleotides within a given partition, may be substantially the same barcode sequence, but as between different partitions, may be, and in most cases is, a different barcode sequence.
  • the second region may be an N-mer (either a random N-mer or an N-mer designed to target a particular sequence) that can be used to prime the nucleic acids within the sample within the partitions.
  • the N-mer may be designed to target a particular chromosome (e.g., chromosome 1, 13, 18, or 21), or region of a chromosome, e.g., an exome or other targeted region.
  • the N-mer may be designed to target a particular gene or genetic region, such as a gene or region associated with a disease or disorder (e.g., cancer).
  • an amplification reaction may be conducted using the second N-mer to prime the nucleic acid sample at different places along the length of the nucleic acid.
  • each partition may contain amplified products of the nucleic acid that are attached to an identical or near-identical barcode, and that may represent overlapping, smaller fragments of the nucleic acids in each partition.
  • The barcode can serve as a marker that signifies that a set of nucleic acids originated from the same partition, and thus potentially also originated from the same strand of nucleic acid.
  • the nucleic acids may be pooled, sequenced, and aligned using a sequencing algorithm.
  • Because shorter sequence reads may, by virtue of their associated barcode sequences, be aligned and attributed to a single, long fragment of the sample nucleic acid, all of the identified variants on that sequence can be attributed to a single originating fragment and single originating chromosome. Further, by aligning multiple co-located variants across multiple long fragments, one can further characterize that chromosomal contribution. Accordingly, conclusions regarding the phasing of particular genetic variants may then be drawn, as can analyses across long ranges of genomic sequence, for example, identification of sequence information across stretches of poorly characterized regions of the genome. Such information may also be useful for identifying haplotypes, which are generally a specified set of genetic variants that reside on the same nucleic acid strand or on different nucleic acid strands. Copy number variations may also be identified in this manner.
  • the described methods and systems provide significant advantages over current nucleic acid sequencing technologies and their associated sample preparation methods.
  • Ensemble sample preparation and sequencing methods are predisposed towards primarily identifying and characterizing the majority constituents in the sample, and are not designed to identify and characterize minority constituents, e.g., genetic material contributed by one chromosome, or by one or a few cells, or fragmented tumor cell DNA molecule circulating in the bloodstream, that constitute a small percentage of the total DNA in the extracted sample.
  • the described methods and systems also provide a significant advantage for detecting populations that are present within a larger sample.
  • the methods disclosed herein are also useful for providing sequence information over regions of the genome that are poorly characterized or are poorly represented in a population of nucleic acid targets due to biases introduced during sample preparation.
  • the use of the barcoding technique disclosed herein confers the unique capability of providing individual molecular context for a given set of genetic markers, i.e., attributing a given set of genetic markers (as opposed to a single marker) to individual sample nucleic acid molecules, and through variant coordinated assembly, to provide a broader or even longer range inferred individual molecular context, among multiple sample nucleic acid molecules, and/or to a specific chromosome.
  • These genetic markers may include specific genetic loci, e.g., variants, such as SNPs, or they may include short sequences.
  • the use of barcoding confers the additional advantages of facilitating the ability to discriminate between minority constituents and majority constituents of the total nucleic acid population extracted from the sample, e.g.
  • implementation in a microfluidics format confers the ability to work with extremely small sample volumes and low input quantities of DNA, as well as the ability to rapidly process large numbers of sample partitions (droplets) to facilitate genome-wide tagging.
  • an advantage of the methods and systems described herein is that they can achieve the desired results through the use of ubiquitously available, short read sequencing technologies.
  • Such technologies have the advantages of being readily available and widely dispersed within the research community, with protocols and reagent systems that are well characterized and highly effective.
  • These short read sequencing technologies include those available from, e.g., Illumina, Inc. (GXII, NextSeq, MiSeq, HiSeq, X10), Ion Torrent division of Thermo-Fisher (Ion Proton and Ion PGM), pyrosequencing methods, as well as others.
  • the methods and systems described herein utilize these short read sequencing technologies and do so with their associated low error rates.
  • the methods and systems described herein achieve the desired individual molecular readlengths or context, as described above, but with individual sequencing reads, excluding mate pair extensions, that are shorter than 1000 bp, shorter than 500 bp, shorter than 300 bp, shorter than 200 bp, shorter than 150 bp or even shorter; and with sequencing error rates for such individual molecular readlengths that are less than 5%, less than 1%, less than 0.5%, less than 0.1%, less than 0.05%, less than 0.01%, less than 0.005%, or even less than 0.001%.
  • the methods and systems described in the disclosure provide for depositing or partitioning individual samples (e.g., nucleic acids) into discrete partitions, where each partition maintains separation of its own contents from the contents in other partitions.
  • the partitions refer to containers or vessels that may include a variety of different forms, e.g., wells, tubes, micro or nanowells, through holes, or the like. In preferred aspects, however, the partitions are flowable within fluid streams. These vessels may be comprised of, e.g., microcapsules or micro-vesicles that have an outer barrier surrounding an inner fluid center or core, or they may be a porous matrix that is capable of entraining and/or retaining materials within its matrix.
  • these partitions may comprise droplets of aqueous fluid within a non-aqueous continuous phase, e.g., an oil phase.
  • a variety of different vessels are described in, for example, U.S. patent application Ser. No. 13/966,150, filed Aug. 13, 2013.
  • emulsion systems for creating stable droplets in non-aqueous or oil continuous phases are described in detail in, e.g., Published U.S. Patent Application No. 2010-0105112.
  • microfluidic channel networks are particularly suited for generating partitions as described herein. Examples of such microfluidic devices include those described in detail in U.S. patent application Ser. No. 14/682,952, filed Apr.
  • Partitioning of sample materials into discrete partitions may generally be accomplished by flowing an aqueous, sample-containing stream into a junction into which is also flowing a non-aqueous stream of partitioning fluid, e.g., a fluorinated oil, such that aqueous droplets are created within the flowing stream of partitioning fluid, where such droplets include the sample materials.
  • the partitions also typically include co-partitioned barcode oligonucleotides.
  • the relative amount of sample materials within any particular partition may be adjusted by controlling a variety of different parameters of the system, including, for example, the concentration of sample in the aqueous stream, the flow rate of the aqueous stream and/or the non-aqueous stream, and the like.
  • the partitions described herein are often characterized by having extremely small volumes.
  • the droplets may have overall volumes that are less than 1000 pL, less than 900 pL, less than 800 pL, less than 700 pL, less than 600 pL, less than 500 pL, less than 400 pL, less than 300 pL, less than 200 pL, less than 100 pL, less than 50 pL, less than 20 pL, less than 10 pL, or even less than 1 pL.
  • the sample fluid volume within the partitions may be less than 90% of the above described volumes, less than 80%, less than 70%, less than 60%, less than 50%, less than 40%, less than 30%, less than 20%, or even less than 10% the above described volumes.
  • the use of low reaction volume partitions is particularly advantageous in performing reactions with very small amounts of starting reagents, e.g., input nucleic acids.
  • the sample nucleic acids within partitions are generally provided with unique identifiers such that, upon characterization of those nucleic acids they may be attributed as having been derived from their respective origins. Accordingly, the sample nucleic acids are typically co-partitioned with the unique identifiers (e.g., barcode sequences).
  • the unique identifiers are provided in the form of oligonucleotides that comprise nucleic acid barcode sequences that may be attached to those samples.
  • The oligonucleotides are partitioned such that as between oligonucleotides in a given partition, the nucleic acid barcode sequences contained therein are the same, but as between different partitions, the oligonucleotides can, and preferably do, have differing barcode sequences. In preferred aspects, only one nucleic acid barcode sequence will be associated with a given partition, although in some cases, two or more different barcode sequences may be present.
  • The nucleic acid barcode sequences will typically include from 6 to about 20 or more nucleotides within the sequence of the oligonucleotides. These nucleotides may be completely contiguous, i.e., in a single stretch of adjacent nucleotides, or they may be separated into two or more separate subsequences that are separated by one or more nucleotides. Typically, separated subsequences may be from about 4 to about 16 nucleotides in length.
  • the co-partitioned oligonucleotides also typically comprise other functional sequences useful in the processing of the partitioned nucleic acids. These sequences include, e.g., targeted or random/universal amplification primer sequences for amplifying the genomic DNA from the individual nucleic acids within the partitions while attaching the associated barcode sequences, sequencing primers, hybridization or probing sequences, e.g., for identification of presence of the sequences, or for pulling down barcoded nucleic acids, or any of a number of other potential functional sequences.
  • beads are provided that each may include large numbers of the above described oligonucleotides releasably attached to the beads, where all of the oligonucleotides attached to a particular bead may include the same nucleic acid barcode sequence, but where a large number of diverse barcode sequences may be represented across the population of beads used.
  • the population of beads may provide a diverse barcode sequence library that may include at least 1000 different barcode sequences, at least 10,000 different barcode sequences, at least 100,000 different barcode sequences, or in some cases, at least 1,000,000 different barcode sequences.
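The barcode lengths and library sizes quoted above are related by simple counting: a barcode of n nucleotides drawn from four bases has at most 4^n distinct sequences. The sketch below computes that upper bound and the shortest length that could accommodate a given library size. It deliberately ignores practical constraints such as error tolerance or excluded sequences, which would call for longer barcodes; the function names are illustrative.

```python
def barcode_space(length_nt):
    """Maximum number of distinct barcodes of the given length over A, C, G, T."""
    return 4 ** length_nt


def min_barcode_length(library_size):
    """Shortest barcode length whose sequence space holds library_size barcodes."""
    length = 1
    while barcode_space(length) < library_size:
        length += 1
    return length


# Library sizes mentioned in the text above.
lengths_needed = {size: min_barcode_length(size) for size in (1_000, 10_000, 100_000, 1_000_000)}
```

A 6-nucleotide barcode therefore tops out at 4,096 sequences, while 10 nucleotides already allow just over one million.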
  • each bead may typically be provided with large numbers of oligonucleotide molecules attached.
  • The number of molecules of oligonucleotides including the barcode sequence on an individual bead may be at least about 10,000 oligonucleotide molecules, at least 100,000 oligonucleotide molecules, at least 1,000,000 oligonucleotide molecules, at least 100,000,000 oligonucleotide molecules, and in some cases at least 1 billion oligonucleotide molecules.
  • the oligonucleotides may be releasable from the beads upon the application of a particular stimulus to the beads.
  • the stimulus may be a photo-stimulus, e.g., through cleavage of a photo-labile linkage that may release the oligonucleotides.
  • A thermal stimulus may be used, where elevation of the temperature of the beads' environment may result in cleavage of a linkage or other release of the oligonucleotides from the beads.
  • a chemical stimulus may be used that cleaves a linkage of the oligonucleotides to the beads, or otherwise may result in release of the oligonucleotides from the beads.
  • the beads including the attached oligonucleotides may be co-partitioned with the individual samples, such that a single bead and a single sample are contained within an individual partition.
  • The flows and channel architectures are controlled so as to ensure a desired number of singly occupied partitions, less than a certain level of unoccupied partitions and less than a certain level of multiply occupied partitions.
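The trade-off between unoccupied, singly occupied, and multiply occupied partitions can be estimated with a Poisson model. This is a sketch under the assumption that beads arrive in partitions independently at random with a known mean; actual microfluidic systems may deviate from Poisson loading, and the function name is illustrative.

```python
import math


def occupancy_fractions(mean_beads_per_partition):
    """Return (unoccupied, singly occupied, multiply occupied) partition fractions."""
    lam = mean_beads_per_partition
    unoccupied = math.exp(-lam)
    single = lam * math.exp(-lam)
    multiple = 1.0 - unoccupied - single
    return unoccupied, single, multiple


# Compare a dilute loading with a denser one.
dilute = occupancy_fractions(0.1)
dense = occupancy_fractions(1.0)
```

Under this model, lowering the mean reduces multiply occupied partitions at the cost of more empty ones, which is the balance that the flow rates and channel architecture are tuned to strike.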
  • FIG. 3 illustrates one particular example method for barcoding and subsequently sequencing a sample nucleic acid, particularly for use for a copy number variation or haplotype assay.
  • a sample comprising nucleic acid may be obtained from a source, 300 , and a set of barcoded beads may also be obtained, 310 .
  • the beads are preferably linked to oligonucleotides containing one or more barcode sequences, as well as a primer, such as a random N-mer or other primer.
  • the barcode sequences are releasable from the barcoded beads, e.g., through cleavage of a linkage between the barcode and the bead or through degradation of the underlying bead to release the barcode, or a combination of the two.
  • the barcoded beads can be degraded or dissolved by an agent, such as a reducing agent to release the barcode sequences.
  • a low quantity of the sample comprising nucleic acid, 305 , barcoded beads, 315 , and optionally other reagents, e.g., a reducing agent, 320 are combined and subject to partitioning.
  • such partitioning may involve introducing the components to a droplet generation system, such as a microfluidic device, 325 .
  • a water-in-oil emulsion 330 may be formed, wherein the emulsion contains aqueous droplets that contain sample nucleic acid, 305 , reducing agent, 320 , and barcoded beads, 315 .
  • the reducing agent may dissolve or degrade the barcoded beads, thereby releasing the oligonucleotides with the barcodes and random N-mers from the beads within the droplets, 335 .
  • the random N-mers may then prime different regions of the sample nucleic acid, resulting in amplified copies of the sample after amplification, wherein each copy is tagged with a barcode sequence, 340 .
  • each droplet contains a set of oligonucleotides that contain identical barcode sequences and different random N-mer sequences.
  • additional sequences e.g., sequences that aid in particular sequencing methods, additional barcodes, etc.
  • Sequencing may then be performed, 355 , and an algorithm applied to interpret the sequencing data, 360 .
  • Sequencing algorithms are generally capable, for example, of performing analysis of barcodes to align sequencing reads and/or identify the sample to which a particular sequence read belongs. In addition, and as is described herein, these algorithms may also further be used to attribute the sequences of the copies to their originating molecular context.
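  • The barcode analysis step can be sketched as a simple grouping of reads by barcode. The example below is only an illustration: it assumes the barcode occupies a fixed-length prefix of each read and performs no error correction. The function name, the `barcode_len` parameter and the toy reads are hypothetical.

```python
from collections import defaultdict

BARCODE_LEN = 4  # hypothetical barcode length, chosen only for the example


def group_reads_by_barcode(reads, barcode_len=BARCODE_LEN):
    """Group read inserts by the barcode found at the start of each read.

    Assumes every read begins with its barcode (an illustrative assumption).
    Returns a dict mapping barcode -> list of the remaining read sequences.
    """
    groups = defaultdict(list)
    for read in reads:
        barcode, insert = read[:barcode_len], read[barcode_len:]
        groups[barcode].append(insert)
    return dict(groups)


if __name__ == "__main__":
    toy_reads = ["AAAAGGCT", "CCCCTTAG", "AAAATTGC"]
    print(group_reads_by_barcode(toy_reads))
```

  • Reads that share a barcode are candidates for having originated in the same partition, which is the starting point for attributing them to a molecular context.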
  • An example of a microfluidic channel structure for co-partitioning samples and beads comprising barcode oligonucleotides is schematically illustrated in FIG. 4 . As shown, channel segments 402 , 404 , 406 , 408 and 410 are provided in fluid communication at channel junction 412 . An aqueous stream comprising the individual samples 414 is flowed through channel segment 402 toward channel junction 412 . As described elsewhere herein, these samples may be suspended within an aqueous fluid prior to the partitioning process.
  • an aqueous stream comprising the barcode carrying beads 416 is flowed through channel segment 404 toward channel junction 412 .
  • a non-aqueous partitioning fluid is introduced into channel junction 412 from each of side channels 406 and 408 , and the combined streams are flowed into outlet channel 410 .
  • the two combined aqueous streams from channel segments 402 and 404 are combined, and partitioned into droplets 418 , that include co-partitioned samples 414 and beads 416 .
  • by controlling the flow of each of the fluids combining at channel junction 412 , one can optimize the combination and partitioning to achieve a desired occupancy level of beads, samples or both, within the partitions 418 that are generated.
  • reagents may be co-partitioned along with the samples and beads, including, for example, chemical stimuli, nucleic acid extension, transcription, and/or amplification reagents such as polymerases, reverse transcriptases, nucleoside triphosphates or NTP analogues, primer sequences and additional cofactors such as divalent metal ions used in such reactions, ligation reaction reagents, such as ligase enzymes and ligation sequences, dyes, labels, or other tagging reagents.
  • the oligonucleotides disposed upon the bead may be used to barcode and amplify the partitioned samples.
  • a particularly elegant process for use of these barcode oligonucleotides in amplifying and barcoding samples is described in detail in U.S. patent application Ser. Nos. 14/316,383, 14/316,398, 14/316,416, 14/316,431, 14/316,447, 14/316,463, all filed Jun. 26, 2014, the full disclosures of which are hereby incorporated by reference in their entireties.
  • the oligonucleotides present on the beads are co-partitioned with the samples and released from their beads into the partition with the samples.
  • the oligonucleotides typically include, along with the barcode sequence, a primer sequence at their 5′ end.
  • This primer sequence may be random or structured. Random primer sequences are generally intended to randomly prime numerous different regions of the samples. Structured primer sequences can include a range of different structures including defined sequences targeted to prime upstream of a specific targeted region of the sample as well as primers that have some sort of partially defined structure, including without limitation primers containing a percentage of specific bases (such as a percentage of GC N-mers), primers containing partially or wholly degenerate sequences, and/or primers containing sequences that are partially random and partially structured in accordance with any of the description herein. As will be appreciated, any one or more of the above types of random and structured primers may be included in oligonucleotides in any combination.
  • the primer portion of the oligonucleotide can anneal to a complementary region of the sample.
  • Extension reaction reagents, e.g., DNA polymerase, nucleoside triphosphates, co-factors (e.g., Mg2+ or Mn2+ etc.), that are also co-partitioned with the samples and beads, then extend the primer sequence using the sample as a template, to produce a complementary fragment to the strand of the template to which the primer annealed, which complementary fragment includes the oligonucleotide and its associated barcode sequence.
  • Annealing and extension of multiple primers to different portions of the sample may result in a large pool of overlapping complementary fragments of the sample, each possessing its own barcode sequence indicative of the partition in which it was created.
  • these complementary fragments may themselves be used as a template primed by the oligonucleotides present in the partition to produce a complement of the complement that again, includes the barcode sequence.
  • this replication process is configured such that when the first complement is duplicated, it produces two complementary sequences at or near its termini, to allow the formation of a hairpin structure or partial hairpin structure, which reduces the ability of the molecule to be the basis for producing further iterative copies. A schematic illustration of one example of this is shown in FIG. 5 .
  • oligonucleotides that include a barcode sequence are co-partitioned in, e.g., a droplet 502 in an emulsion, along with a sample nucleic acid 504 .
  • the oligonucleotides 508 may be provided on a bead 506 that is co-partitioned with the sample nucleic acid 504 , which oligonucleotides are preferably releasable from the bead 506 , as shown in panel A.
  • the oligonucleotides 508 include a barcode sequence 512 , in addition to one or more functional sequences, e.g., sequences 510 , 514 and 516 .
  • oligonucleotide 508 is shown as comprising barcode sequence 512 , as well as sequence 510 that may function as an attachment or immobilization sequence for a given sequencing system, e.g., a P5 sequence used for attachment in flow cells of an Illumina HiSeq or MiSeq system.
  • the oligonucleotides also include a primer sequence 516 , which may include a random or targeted N-mer for priming replication of portions of the sample nucleic acid 504 .
  • Also included within oligonucleotide 508 is a sequence 514 which may provide a sequencing priming region, such as a “read1” or R1 priming region, that is used to prime polymerase mediated, template directed sequencing by synthesis reactions in sequencing systems.
  • the barcode sequence 512 , immobilization sequence 510 and R1 sequence 514 may be common to all of the oligonucleotides attached to a given bead.
  • the primer sequence 516 may vary for random N-mer primers, or may be common to the oligonucleotides on a given bead for certain targeted applications.
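  • The layout of the bead-bound oligonucleotides can be sketched as a concatenation of a shared attachment sequence, a shared barcode, a shared R1 region and a per-oligonucleotide random N-mer. The 5′-to-3′ order used here, the function name and all sequences are placeholders assumed for illustration; they are not actual P5 or R1 sequences.

```python
import random


def build_bead_oligos(attach_seq, barcode, r1_seq, n_oligos, nmer_len, seed=0):
    """Build toy oligonucleotides for one bead.

    The attachment sequence, barcode and R1 region are common to every
    oligonucleotide on the bead; only the random N-mer varies. The order of
    the segments is an assumption made for this sketch.
    """
    rng = random.Random(seed)  # seeded so the example is reproducible
    oligos = []
    for _ in range(n_oligos):
        nmer = "".join(rng.choice("ACGT") for _ in range(nmer_len))
        oligos.append(attach_seq + barcode + r1_seq + nmer)
    return oligos


if __name__ == "__main__":
    # Placeholder sequences, not real adapter sequences.
    for oligo in build_bead_oligos("TTTT", "ACGTAC", "GGGG", n_oligos=3, nmer_len=8):
        print(oligo)
```

  • For a targeted application, the random N-mer would instead be replaced by a single primer sequence shared by all oligonucleotides on the bead.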
  • the oligonucleotides are able to prime the sample nucleic acid as shown in panel B, which allows for extension of the oligonucleotides 508 and 508 a using polymerase enzymes and other extension reagents also co-partitioned with the bead 506 and sample nucleic acid 504 .
  • as shown in panel C, following extension of the oligonucleotides that, for random N-mer primers, would anneal to multiple different regions of the sample nucleic acid 504 , multiple overlapping complements or fragments of the nucleic acid are created, e.g., fragments 518 and 520 .
  • although they include sequence portions that are complementary to portions of the sample nucleic acid, e.g., sequences 522 and 524 , these constructs are generally referred to herein as comprising fragments of the sample nucleic acid 504 , having the attached barcode sequences.
  • the replicated portions of the template sequences as described above are often referred to herein as “fragments” of that template sequence.
  • the term “fragment” encompasses any representation of a portion of the originating nucleic acid sequence, e.g., a template or sample nucleic acid, including those created by other mechanisms of providing portions of the template sequence, such as actual fragmentation of a given molecule of sequence, e.g., through enzymatic, chemical or mechanical fragmentation.
  • fragments of a template or sample nucleic acid sequence will denote replicated portions of the underlying sequence or complements thereof.
  • the barcoded nucleic acid fragments may then be subjected to characterization, e.g., through sequence analysis, or they may be further amplified in the process, as shown in panel D.
  • additional oligonucleotides, e.g., oligonucleotide 508 b, also released from bead 506 , may prime the fragments 518 and 520 .
  • the oligonucleotide anneals with the fragment 518 , and is extended to create a complement 526 to at least a portion of fragment 518 which includes sequence 528 , that comprises a duplicate of a portion of the sample nucleic acid sequence. Extension of the oligonucleotide 508 b continues until it has replicated through the oligonucleotide portion 508 of fragment 518 .
  • the oligonucleotides may be configured to prompt a stop in the replication by the polymerase at a desired point, e.g., after replicating through sequences 516 and 514 of oligonucleotide 508 that is included within fragment 518 .
  • this may be accomplished by different methods, including, for example, the incorporation of different nucleotides and/or nucleotide analogues that are not capable of being processed by the polymerase enzyme used.
  • this may include the inclusion of uracil containing nucleotides within the sequence region 512 to prompt a non-uracil tolerant polymerase to cease replication of that region.
  • a fragment 526 is created that includes the full-length oligonucleotide 508 b at one end, including the barcode sequence 512 , the attachment sequence 510 , the R1 primer region 514 , and the random N-mer sequence 516 b.
  • the complement 516 ′ to the random N-mer of the first oligonucleotide 508 will be included, as well as a complement to all or a portion of the R1 sequence, shown as sequence 514 ′.
  • the R1 sequence 514 and its complement 514 ′ are then able to hybridize together to form a partial hairpin structure 528 .
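  • because the random N-mers differ among different oligonucleotides, sequence 516 ′, which is the complement to random N-mer 516 , would not be expected to be complementary to random N-mer sequence 516 b, and so these sequences would not be expected to participate in hairpin formation.
  • this would not be the case for other applications, e.g., targeted primers, where the N-mers would be common among oligonucleotides within a given partition.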
  • forming these partial hairpin structures allows for the removal of first level duplicates of the sample sequence from further replication, e.g., preventing iterative copying of copies.
  • the partial hairpin structure also provides a useful structure for subsequent processing of the created fragments, e.g., fragment 526 .
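  • The condition for partial hairpin formation can be sketched as a check that a fragment carries the R1 sequence and, further along the same strand, its reverse complement. This is a simplified string-level illustration that ignores thermodynamics and partial matches; the function names and sequences are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def can_form_partial_hairpin(fragment: str, r1_seq: str) -> bool:
    """True if `fragment` contains `r1_seq` followed, anywhere downstream,
    by the reverse complement of `r1_seq` (exact matches only)."""
    start = fragment.find(r1_seq)
    if start < 0:
        return False
    downstream = fragment[start + len(r1_seq):]
    return reverse_complement(r1_seq) in downstream


if __name__ == "__main__":
    r1 = "ACGGTC"  # placeholder, not a real R1 sequence
    copy_of_copy = "TT" + r1 + "AAAAAAAA" + reverse_complement(r1)
    first_copy = "TT" + r1 + "AAAAAAAA"
    print(can_form_partial_hairpin(copy_of_copy, r1))
    print(can_form_partial_hairpin(first_copy, r1))
```

  • In this toy model only the second-level copy, which carries both the R1 region and its complement, satisfies the check, mirroring the way the described structure limits further iterative copying.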
  • a nucleic acid 604 originating from a first source 600 (e.g., individual chromosome, strand of nucleic acid, etc.) and a nucleic acid 606 derived from a different chromosome 602 or strand of nucleic acid are each partitioned along with their own sets of barcode oligonucleotides as described above.
  • each nucleic acid 604 and 606 is then processed to separately provide overlapping sets of second fragments of the first fragment(s), e.g., second fragment sets 608 and 610 .
  • This processing also provides the second fragments with a barcode sequence that is the same for each of the second fragments derived from a particular first fragment.
  • the barcode sequence for second fragment set 608 is denoted by “1” while the barcode sequence for fragment set 610 is denoted by “2”.
  • a diverse library of barcodes may be used to differentially barcode large numbers of different fragment sets. However, it is not necessary for every second fragment set from a different first fragment to be barcoded with different barcode sequences. In fact, in many cases, multiple different first fragments may be processed concurrently to include the same barcode sequence. Diverse barcode libraries are described in detail elsewhere herein.
  • the barcoded fragments may then be pooled for sequencing using, for example, sequence by synthesis technologies available from Illumina or the Ion Torrent division of Thermo Fisher, Inc.
  • sequence reads 612 can be attributed to their respective fragment set, e.g., as shown in aggregated reads 614 and 616 , at least in part based upon the included barcodes, and optionally, and preferably, in part based upon the sequence of the fragment itself.
  • the attributed sequence reads for each fragment set are then assembled to provide the assembled sequence for each sample fragment, e.g., sequences 618 and 620 , which in turn, may be further attributed back to their respective original chromosomes ( 600 and 602 ).
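  • Because multiple first fragments may share a barcode, attribution can use both the barcode and the read sequence itself. The sketch below assumes reads have already been aligned and are given as `(barcode, chrom, pos)` tuples; reads with the same barcode and chromosome are split into separate fragment sets wherever consecutive positions are more than `max_gap` apart. The tuple layout, the function name and the `max_gap` default are assumptions for illustration.

```python
from collections import defaultdict


def cluster_reads(reads, max_gap=50_000):
    """Cluster aligned reads into putative originating fragments.

    `reads` is an iterable of (barcode, chrom, pos). Reads sharing a barcode
    and chromosome are split into separate clusters when neighbouring
    positions are more than `max_gap` bases apart.
    Returns a list of (barcode, chrom, sorted_positions).
    """
    by_key = defaultdict(list)
    for barcode, chrom, pos in reads:
        by_key[(barcode, chrom)].append(pos)

    clusters = []
    for (barcode, chrom), positions in sorted(by_key.items()):
        positions.sort()
        current = [positions[0]]
        for pos in positions[1:]:
            if pos - current[-1] > max_gap:
                clusters.append((barcode, chrom, current))
                current = [pos]
            else:
                current.append(pos)
        clusters.append((barcode, chrom, current))
    return clusters


if __name__ == "__main__":
    toy = [("1", "chr1", 100), ("1", "chr1", 20_000),
           ("1", "chr1", 900_000), ("2", "chr2", 500)]
    for cluster in cluster_reads(toy):
        print(cluster)
```

  • Each resulting cluster could then be assembled separately and attributed back to its originating chromosome.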
  • Methods and systems for assembling genomic sequences are described in, for example, U.S. patent application Ser. No. 14/752,773, filed Jun. 26, 2015, the full disclosure of which is hereby incorporated by reference in its entirety.
  • by “targeted regions of a genome” is meant a whole genome or any one or more regions of a genome identified as of interest and/or selected through one or more methods described herein.
  • the targeted regions of the genome sequenced by methods and systems described herein include without limitation introns, exons, intergenic regions, or any combination thereof.
  • the methods and systems described herein provide sequence information on whole exomes, portions of exomes, one or more selected genes (including selected panels of genes), one or more introns, and combinations of intronic and exonic sequences.
  • Targeted regions of the genome may also include certain portions or percentages of the genome rather than regions identified by sequence.
  • targeted regions of the genome captured and analyzed in accordance with the methods described herein include portions of the genome located every 1, 2, 5, 10, 15, 20, 25, 50, 100, 200, 250, 500, 750, 1000, or 10000 kilobases of a genome.
  • targeted regions of the genome comprise 5%, 10%, 15%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95% of the whole genome.
  • the targeted regions comprise 1-10%, 5-20%, 10-30%, 15-40%, 20-50%, 25-60%, 30-70%, 35-80%, 40-90%, or 45-95% of the whole genome.
  • targeted regions of a genome are captured for use in any sequencing methods known in the art and described herein.
  • by “captured” as used herein is meant any method or system for enriching a population of nucleic acid and/or nucleic acid fragments such that the resultant population contains an increased percentage of the targeted regions of interest as compared to the genomic regions that are not of interest.
  • the enriched population contains at least 50%, 55%, 60%, 70%, 75%, 80%, 85%, 90%, 95%, 97%, 98%, 99%, or 100% nucleic acids/nucleic acid fragments comprising the targeted regions.
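  • The degree of enrichment can be expressed as the fraction of fragments that overlap a targeted region. The following sketch assumes fragments and targets are given as half-open `(chrom, start, end)` intervals; the function names and coordinates are hypothetical.

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True if half-open intervals [a_start, a_end) and [b_start, b_end) overlap."""
    return a_start < b_end and b_start < a_end


def on_target_fraction(fragments, targets):
    """Fraction of fragments overlapping at least one target interval.

    Both arguments are lists of (chrom, start, end) half-open intervals.
    Returns 0.0 for an empty fragment list.
    """
    if not fragments:
        return 0.0
    hits = sum(
        any(fc == tc and overlaps(fs, fe, ts, te) for tc, ts, te in targets)
        for fc, fs, fe in fragments
    )
    return hits / len(fragments)


if __name__ == "__main__":
    targets = [("chr1", 100, 200)]
    fragments = [("chr1", 150, 400), ("chr1", 500, 600),
                 ("chr2", 100, 200), ("chr1", 0, 101)]
    print(on_target_fraction(fragments, targets))
```

  • Comparing this fraction before and after capture gives a simple measure of how much the population has been enriched for the targeted regions.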
  • Capture methods generally include chip-based methods, in which targeted regions are captured through hybridization or other association with capture molecules on a surface, and solution based methods, in which oligonucleotide probes (baits), which are complementary to the targeted regions (or to regions near the targeted regions) are hybridized to genomic fragment libraries.
  • the probes used in the capture methods disclosed herein are generally attached to capture molecules, such as biotin, which can be used to “pull down” the probes and the fragments to which they are hybridized—these pull down methods include any methods by which the baits hybridized to nucleic acids or nucleic acid fragments that contain the targeted regions of interest are separated from fragments that do not contain the regions of interest.
  • the probes are biotinylated
  • magnetic streptavidin beads are used to selectively pull-down and enrich baits with bound targeted regions.
  • a library of baits is used that covers all the targeted regions desired for further study.
  • a library of baits thus includes oligonucleotide probes that together cover the full exome.
  • only portions of the exome are needed for further analysis.
  • the baits are designed to target that subset of the exome. This design can be accomplished using methods and algorithms known in the art and in general is based upon a reference sequence, such as the human genome.
  • the targeted genomic regions processed and sequenced in accordance with the methods and systems described herein are full or partial exomes.
  • These full or partial exomes can be captured for sequencing using any methods known in the art, including without limitation any of the Roche/NimbleGen exome protocols, including the NimbleGen 2.1M Human Exome array and the NimbleGen SeqCap EZ Exome Library, any of the Agilent SureSelect products, any Illumina exome capture products, including the TruSeq and Nextera Exome products, and any other products, methods, systems and protocols known in the art.
  • the baits used to capture those targeted regions may be designed to be complementary to those exonic sequences.
  • the baits are not complementary to the exonic sequences themselves but are instead complementary to sequences near the exonic sequence or to intronic sequences between two exons.
  • Such designs are also referred to herein as “anchored exome capture” or “intronic baiting,” by which, as discussed herein, is meant a process in which one or more portions of an exome are captured through the use of baits complementary to one or more intronic sequences near or adjacent to the one or more portions of the exome that are of interest. For example, as schematically illustrated in FIG.
  • a genomic sequence 201 comprises exonic regions 202 and 203 .
  • Those exonic regions can be captured by utilizing baits directed to one or more of the intronic sequences nearby (for example intronic region 204 and/or 205 to capture exonic region 202 and intronic region 206 for capture of exonic region 203 ).
  • a population of fragments comprising exonic regions 202 or 203 would be captured through the use of baits complementary to intronic regions 204 and/or 205 and 206 .
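  • Intronic baiting can be sketched as choosing bait intervals in the sequence flanking an exon rather than in the exon itself. The example below places one bait on each side of an exon at a fixed offset; the function name, the offset and the bait length are illustrative assumptions and are not values from the description.

```python
def flanking_bait_intervals(exon_start, exon_end, offset, bait_len):
    """Return half-open bait intervals flanking an exon.

    One bait ends `offset` bases upstream of the exon start and one begins
    `offset` bases downstream of the exon end. Intervals that would start
    before coordinate 0 are dropped.
    """
    upstream = (exon_start - offset - bait_len, exon_start - offset)
    downstream = (exon_end + offset, exon_end + offset + bait_len)
    return [iv for iv in (upstream, downstream) if iv[0] >= 0]


if __name__ == "__main__":
    # Hypothetical exon at [1000, 1200), baits 120 bases long placed 50 bases away.
    print(flanking_bait_intervals(1000, 1200, offset=50, bait_len=120))
```

  • Fragments long enough to span both a bait interval and the neighbouring exon would be pulled down together with the exon, even though no bait is complementary to the exon itself.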
  • intronic baiting is used to bridge exons separated by long intronic regions by sparsely baiting longer introns.
  • the baits are not necessarily targeting intronic regions that are close to the exonic regions of interest, but the baits are instead designed to target regions separated by particular distances (or sets of distances) or are designed to tile across the intronic regions by a particular number of bases or combinations of numbers of bases. Such embodiments are described in further detail below.
  • the intronic regions used for anchored exome capture/intronic baiting techniques of the invention are adjacent to the exonic region to be captured.
  • the intronic regions are separated from the exonic region to be captured by about 1-50, 2-45, 3-40, 4-35, 5-30, 6-25, 7-20, 8-15, 9-10, 2-20, 3-15, 4-10, 5-30, 10-40, 15-50, 20-75, 25-100 nucleotides.
  • the intronic regions are separated from the exonic regions to be captured by about 5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 125, 150, 175, 200, 300, 400, or 500 nucleotides.
  • the intronic regions are separated from the exonic regions to be captured by distances on the orders of kilobases, e.g., 1-20, 2-18, 3-16, 4-14, 5-12, 6-10 kilobases. Since the original molecular context of the enriched population of oligonucleotides is retained, this sparse baiting of intronic regions allows for the linking of sequence information between exonic regions separated by long introns.
  • the baits are instead designed to be complementary to portions of the genome at particular ranges or distances.
  • the library of baits can be designed to hybridize to sequences located every 5 kilobases (kb) along the genome, such that applying this library of baits to a fragmented genomic sample will capture only a certain subset of the genome—i.e., those regions that are contained in fragments containing complementary sequences to the baits.
  • the baits can be designed based on a reference sequence, such as a human genome reference sequence.
  • the tiled library of baits is designed to capture regions every 1, 2, 5, 10, 15, 20, 25, 50, 100, 200, 250, 500, 750, 1000, or 10000 kilobases of a genome.
  • this tiling method has the effect of sparsely capturing intronic regions, thus providing a way to link sequence information of exonic regions that are separated by long intronic regions, because the original molecular context of those exonic regions captured through sparse capture of intronic regions is retained.
  • the baits are designed to tile the genome in a random or combined manner—for example, a mixture of tiled libraries can be used where some of the libraries capture regions every 1 kb, whereas other libraries in the mixture capture regions every 100 kb.
  • the tiled libraries are designed so that the baits target within a range of positions within the genome—for example, the baits may target regions of every 1-10, 2-5, 5-200, 10-175, 15-150, 20-125, 30-100, 40-75, 50-60 kb of the genome.
  • the tiled or other capture methods described herein will capture about 5%, 10%, 15%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95% of the whole genome. As will be appreciated, such tiling methods of capture will capture both intronic and exonic regions of the genome for further analysis such as sequencing.
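  • A tiled bait design can be sketched as a set of start coordinates at regular spacing along a reference, and a combined design as the union of several such tilings. The function names are hypothetical, and coordinates are in bases along a single reference sequence.

```python
def tiled_positions(region_start, region_end, spacing):
    """Bait start positions every `spacing` bases in [region_start, region_end)."""
    if spacing <= 0:
        raise ValueError("spacing must be positive")
    return list(range(region_start, region_end, spacing))


def mixed_tiling(region_start, region_end, spacings):
    """Union of several tilings, returned as a sorted list of unique positions."""
    positions = set()
    for spacing in spacings:
        positions.update(tiled_positions(region_start, region_end, spacing))
    return sorted(positions)


if __name__ == "__main__":
    # Baits every 5 kb over a 20 kb region.
    print(tiled_positions(0, 20_000, 5_000))
    # A toy mixture of 5 kb and 2 kb tilings over a 10 kb region.
    print(mixed_tiling(0, 10_000, [5_000, 2_000]))
```

  • In practice each library in a mixture could be applied to a different region, so that some regions are tiled densely and others sparsely.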
  • the library of baits used in methods of the present invention is a product of informed design that fulfills one or more characteristics as further described herein.
  • This informed design includes instances in which the library of baits is directed to informative single nucleotide polymorphisms (SNPs).
  • the term “informative SNPs” as used herein refers to SNPs that are heterozygous.
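  • Selecting informative SNPs from genotype calls amounts to keeping the heterozygous sites. The sketch below assumes genotypes are supplied as a mapping from position to a pair of alleles; this data layout and the function name are assumptions for illustration.

```python
def informative_snps(genotypes):
    """Return sorted positions whose two alleles differ (heterozygous sites).

    `genotypes` maps a position to an (allele1, allele2) tuple.
    """
    return sorted(pos for pos, (a, b) in genotypes.items() if a != b)


if __name__ == "__main__":
    calls = {100: ("A", "G"), 200: ("C", "C"), 300: ("T", "A")}
    print(informative_snps(calls))
```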
  • the library of baits in some examples is designed to contain a plurality of probes that are directed to regions of the genomic sample that contain informative SNPs.
  • by “directed to” as used herein is meant that the probes contain sequences that are complementary to those regions of the genomic sequences.
  • Informed bait design provides the ability to optimize targeted sequencing methods by allowing for targeted enrichment with full coverage while at the same time reducing the number of probes needed (and thus reducing costs and streamlining the work flow).
  • the libraries of baits are designed to include baits directed to particular sequences in targeted regions of the genome based on the presence or absence of informative SNPs in those regions and/or the location(s) of those informative SNPs.
  • An exemplary illustration of general considerations for informed bait design is provided in FIG. 8 .
  • a region of the genome 801 can include exons ( 802 and 803 ).
  • an informative SNP 804 will be located at the boundary between the exon ( 802 ) and the adjacent intron.
  • the bait library can be designed to include probes directed to one or more nucleotides ( 805 ) at a specified distance away from the boundary.
  • the bait library can be designed to include probes directed to one or more positions in the intron near that boundary ( 807 and 808 ). Those positions will preferably include informative SNPs, but may also include other SNPs and/or other sequences as needed.
  • the bait library can be designed to include probes directed to several positions 810 , 811 , and 812 in the adjacent intron that include a mixture of informative and non-informative SNPs (as well as any other sequences as needed).
  • one or more input characteristics are used to design a probe bait library that is directed to shifting locations along the genome based on those input characteristics as well as map quality in various regions.
  • This design is generally based on spacing between informative SNPs rather than on the locations of introns and exons.
  • any of the descriptions provided herein related to bait design based on intron and exon locations can also be used in combination with the informed bait design methods based on informative SNPs.
  • Input characteristics used in informed bait design include without limitation and in any combination locations of exons, introns, intergenic regions, informative SNPs, as well as regions of repeating sequences (such as GC-rich regions), centromeres, and sample nucleic acid lengths.
  • probe bait libraries are designed to include probes directed to regions that have a high likelihood of containing informative SNPs in a given sample.
  • targets may include individual bases (the informative SNPs themselves) or one or more bases that are proximal or adjacent to the informative SNPs.
  • the targets for the probe baits may be directly adjacent to the informative SNPs or separated by distances from about 1-200, 10-190, 20-180, 30-170, 40-160, 50-150, 60-140, 70-130, 80-120, 90-100 bases from an informative SNP.
  • the probe bait libraries include probes directed to regions of particular densities related to the average length of the nucleic acid molecules.
  • the probes can be designed to include probes at a density of target sequences that is x-fold more dense than the average length of the nucleic acid molecules/fragments to which the probes are hybridizing, where x can be without limitation 1, 5, 10, 20, 50, 75, 100, 125, 150, or 200.
  • Increasing the density of the probe targets relative to the length of the nucleic acids increases the ability to link probes across loci on the same physical molecule.
  • Such methods can also improve the probability that the linked regions will include informative SNPs, thus further improving the ability of the probe bait libraries to attach to targeted regions of the genome.
  • the density of the probe targets may also be increased in situations in which (at the population level) there is not a high probability of informative SNPs in a given region of interest. In such regions, tiling methods such as those described herein can be used to direct probes at periodic spacings along the region. In certain embodiments, the density of the spacing can be differentially based, such that the probe spacing in these regions lacking informative SNPs is at a 1-, 2-, 5-, 10-, 25-, or 50-fold shorter distance than the probe spacing in regions containing informative SNPs.
  • the probe bait library is designed to consider only informative SNP distribution within a gene (including exons and introns).
  • This method of design is directed to capture a sufficient number of heterozygous SNPs at key locations to link/phase from one end of the gene to the other.
  • Such a design method includes baits directed to sets of targets that combine exonic informative SNPs with one or more non-exonic SNPs such that the distance between informative SNPs in a gene is below the above described densities of spacing.
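  • One way to combine exonic and non-exonic SNPs so that no gap exceeds a chosen spacing is a greedy selection. The sketch below is not the design method of the description; it is an assumed illustration in which every exonic informative SNP is kept and, wherever two consecutive selected SNPs are more than `max_gap` apart, the farthest reachable non-exonic SNP is added. All names and values are hypothetical.

```python
def select_linking_snps(exonic, non_exonic, max_gap):
    """Greedily pick SNP targets so consecutive targets are within `max_gap`.

    All exonic SNP positions are kept. Non-exonic SNPs are added only when
    needed to bridge a gap larger than `max_gap`. If a gap cannot be bridged
    with the available SNPs, it is left as is.
    """
    required = sorted(exonic)
    if not required:
        return []
    candidates = sorted(non_exonic)
    chosen = [required[0]]
    for target in required[1:]:
        while target - chosen[-1] > max_gap:
            last = chosen[-1]
            reachable = [c for c in candidates if last < c <= last + max_gap and c < target]
            if not reachable:
                break  # no non-exonic SNP can extend the chain any further
            chosen.append(max(reachable))
        chosen.append(target)
    return chosen


if __name__ == "__main__":
    print(select_linking_snps(exonic=[0, 100], non_exonic=[30, 45, 70, 90], max_gap=50))
```

  • A greedy choice keeps the number of added targets small, which is in line with the stated aim of reducing the number of probes while still allowing a gene to be phased end to end.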
  • Such informed design methods allow detection of not only general targeted regions of the genome, but also allow the detection and phasing of genomic structural variations, such as translocations and gene fusions. By ensuring that any individual gene can be phased, it follows that the vast majority of gene fusion events can be detected and phased using the methods described herein.
  • the bait libraries are designed to target probes at distances of about 1 kb to about 2 Mb. In further embodiments, the distances are from about 1-50, 5-45, 10-40, 15-35, 20-30, 10-50 kb.
  • the nucleic acid fragments being targeted by the probe baits are from about 2 kb to about 250 Mb. In still further embodiments, the fragments are from about 10-1000, 20-900, 30-800, 40-700, 50-600, 60-500, 70-400, 80-300, 90-200, 100-150, 50-500, 25-300 kb.
  • the probe bait libraries are designed such that about 60-95% of the probes hybridize to sequences containing informative SNPs. In further embodiments, the probe bait libraries are designed such that about 65%-85%, 70%-80%, 60-90%, 80-90%, 90-95%, 95%-99% of the probes in the library of probes are designed to hybridize to informative SNPs. In still further embodiments, at least 65%, 75%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97% of the probes in the library of probes are designed to hybridize to informative SNPs. As will be appreciated, for a probe to be designed to “hybridize to” an informative SNP means that such a probe hybridizes to a sequence region that includes that informative SNP.
  • the probe bait libraries are designed to include a plurality of probes directed to informative SNPs that are located within both exons and introns in targeted portions of the genomic sample.
  • the libraries are designed such that a majority of the probes in the library hybridize to informative SNPs spaced apart by about 1-15, 5-10, 3-6 kb. In yet further embodiments, the majority of the probes in the library of probes are further designed to hybridize to informative SNPs spaced apart by about 1, 3, 5, 10, 20, 30, 50 kb.
  • a plurality of probes within the library of probes are designed such that for targeted portions of the genomic samples in which there are no informative SNPs within 5-300, 10-50, 20-100, 30-150, or 40-200 kb of boundaries between exons and introns, the plurality of probes is designed to hybridize at an informative SNP within an intron from those boundaries.
  • a plurality of probes within the library of probes are designed such that for targeted portions of the genomic samples in which there is a first informative SNP within an exon and that first informative SNP is located 5-300, 10-50, 20-100, 30-150, or 40-200 kb from a boundary with an adjacent intron and a second informative SNP within the adjacent intron and that second informative SNP is located 10-50 kb from the boundary, the plurality of probes is designed to hybridize to a region of the genomic sample between the first and second informative SNPs.
  • a plurality of probes within the library of probes are designed such that for targeted portions of the genomic samples comprising no informative SNPs for at least 5-300, 10-50, 20-100, 30-150, or 40-200 kb, the plurality of probes is designed to hybridize every 0.5, 1, 3, or 5 kb to those targeted portions of the genomic samples. In further embodiments, the plurality of probes is designed to hybridize every 0.1, 0.5, 1, 1.5, 3, 5, 10, 15, 20, 30, 35, 40, 45, 50 kb along those targeted portions of the genomic samples.
  • a plurality of probes within the library of probes are designed such that for targeted portions of the genomic samples in which there are no informative SNPs within 5-300, 10-50, 20-100, 30-150, or 40-200 kb of boundaries between exons and introns, the plurality of probes are designed to hybridize to the next closest informative SNP to the exon-intron boundaries.
  • the library of probes comprises probes designed to hybridize to regions of the genomic sample that flank exons at a density that provides linkage information across barcodes.
  • the range of coverage represented by the library of probes is inversely proportional to the distribution of lengths of the individual nucleic acid fragment molecules of the genomic sample in the discrete partitions, such that methods containing a higher proportion of longer individual nucleic acid fragment molecules use libraries of probes with smaller ranges of coverage.
  • the library of probes is optimized for coverage of the targeted portions of the genomic sample.
  • the density of coverage may be lower for regions of high map quality, particularly for those regions containing informative SNPs, and the density may further be higher for regions of low map quality to ensure that linkage information is provided across targeted regions.
  • the library of probes has features informed by characteristics of the one or more targeted portions of a genomic sample, such that for targeted portions with high map quality, the library of probes comprises probes that hybridize to informative SNPs within 1 kb-1 Mb of boundaries of exons and introns.
  • the library of probes may in such situations further include probes that hybridize to informative SNPs within 10-500, 20-450, 30-400, 40-350, 50-300, 60-250, 70-200, 80-150, 90-100 kb of boundaries of exons and introns.
  • the library of probes has features informed by characteristics of the one or more targeted portions of a genomic sample, such that for targeted portions in which the distribution of lengths of the barcoded fragments has a high proportion of fragments longer than about 100, 150, 200, 250 kb, the library of probes comprise probes that hybridize to informative SNPs separated by at least 50 kb.
  • the library of probes may in such situations further include probes that hybridize to informative SNPs separated by at least 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 75, 100, 125, 150, 175, 200 kb.
  • the library of probes has features informed by characteristics of the one or more targeted portions of a genomic sample, such that for targeted portions with low map quality, the library of probes comprises probes that hybridize to informative SNPs within 1 kb of exon-intron boundaries.
  • the library of probes may in such situations further include probes that hybridize to informative SNPs within 2, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 75, 100, 125, 150, 175, 200 kb of exon-intron boundaries.
  • the library will further include probes that hybridize to informative SNPs within exons, within introns, or both.
  • the library of probes has features informed by characteristics of the one or more targeted portions of a genomic sample, such that for targeted portions comprising intergenic regions, the library of probes comprises probes that hybridize to informative SNPs spaced apart at distances of at least 1, 2, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 75, 100 kb.
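  • The probe-placement embodiments above (probes at informative SNPs, plus regularly spaced probes across stretches that lack informative SNPs) can be illustrated with a minimal sketch. The function name `place_probes`, its parameters, and the coordinates are hypothetical and chosen only for illustration; the source does not describe a specific placement algorithm.

```python
def place_probes(region_start, region_end, snp_positions, max_gap, tile_step):
    """Return sorted probe positions (bp) for one targeted region.

    Assumption for illustration: a probe is placed at every informative SNP,
    and any stretch of at least `max_gap` bp without an informative SNP is
    tiled with probes every `tile_step` bp.
    """
    probes = {p for p in snp_positions if region_start <= p <= region_end}
    anchors = [region_start] + sorted(probes) + [region_end]
    for left, right in zip(anchors, anchors[1:]):
        if right - left >= max_gap:
            pos = left + tile_step
            while pos < right:
                probes.add(pos)
                pos += tile_step
    return sorted(probes)


# Hypothetical 30 kb region with a 22 kb stretch lacking informative SNPs.
probes = place_probes(
    region_start=0,
    region_end=30_000,
    snp_positions=[2_000, 4_000, 26_000],
    max_gap=10_000,   # e.g. "no informative SNPs for at least 10 kb"
    tile_step=5_000,  # e.g. "hybridize every 5 kb"
)
print(probes)
```

  • In this sketch only the 22 kb gap between the SNPs at 4 kb and 26 kb is tiled; the shorter gaps are left to the SNP-directed probes.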
  • the baits used in the capture methods described herein can be of any size or structure that is useful for enriching a population of fragments for fragments containing targeted regions of the genome.
  • the baits of use in the present invention comprise oligonucleotide probes that are attached to a capture molecule, such as biotin.
  • the oligonucleotide probes may be complementary to sequences within a targeted region of interest, or they may be complementary to regions outside of the targeted region but close enough to that targeted region that both the “anchoring” region and the targeted region are within the same fragment, such that the bait is able to pull down the targeted region by hybridizing to that nearby region (such as a flanking intron).
  • the capture molecule attached to the bait may be any capture molecule that can be used for isolating the bait and its hybridization partner from other fragments in a population.
  • the baits used herein are attached to biotin, and then solid supports comprising streptavidin (including without limitation magnetic streptavidin beads) can be used to capture the baits and the fragments to which they are hybridized.
  • Other capture molecule pairs may include without limitation biotin/neutravidin, antigen/antibody, or complementary oligonucleotide sequences.
  • the oligonucleotide probe portion of the baits can be of any length suitable for hybridizing to targeted regions or to regions near targeted regions.
  • the oligonucleotide probe portion is about 5-10, 10-50, 20-100, 30-90, 40-80, or 50-70 nucleotides in length.
  • any of the oligonucleotide probe portions described herein may comprise RNA, DNA, non-natural nucleotides such as PNAs, LNAs, and so on, or any combinations thereof.
  • An advantage of the methods and systems described herein is that the targeted regions that are captured are processed prior to capture in such a way that even after the steps of capturing the targeted regions and conducting sequencing analyses, the original molecular context of those targeted regions is retained.
  • the ability to attribute specific targeted regions to their original molecular context (which can include the original chromosome or chromosomal region from which they are derived and/or the location of particular targeted regions in relation to each other within the full genome) provides a way to obtain sequence information from regions of the genome that are otherwise poorly mapped or have poor coverage using traditional sequencing techniques.
  • Short-read technologies are often preferable sequencing technologies, because they possess superior accuracy as compared to long-read technologies.
  • generally used short-read technologies are unable to span across long regions of the genome, and thus information may not be obtainable using these conventional technologies in regions of the genome that are difficult to characterize due to structural characteristics such as long lengths of tandem repeating sequences, high GC content, and exons separated by long introns.
  • the molecular context of targeted regions is retained, generally through the tagging procedure illustrated in FIG. 1 and described in further detail herein. As such, links can be made across extended regions of the genome.
  • nucleic acid molecule 207 contains two exons (shaded bars) with a long intronic region ( 208 ).
  • the individual nucleic acid molecule 207 is distributed into its own discrete partition 211 and then fragmented such that different fragments contain different portions of the exons and the intron. Because each of those fragments is tagged such that any sequence information obtained from the fragments is then attributable to the discrete partition in which it was generated, each fragment is thus also attributable to the individual nucleic acid molecule 207 from which it was derived.
  • fragments from different partitions are combined together.
  • Targeted capture methods can then be used to enrich the population of fragments that undergoes further analysis, such as sequencing, with fragments containing the targeted region of interest.
  • the baits used will enrich the population of fragments to capture only those containing a portion of the exons, but regions outside of the exon and intron (such as 209 and 210 ) would not be captured.
  • the final population of fragments that undergoes sequencing will be enriched for the fragments containing the portions of the exons, even if those exons are separated by a long intronic region.
  • Short read, high accuracy sequencing technologies can then be used to identify the sequences of this enriched population of fragments, and because each of the fragments is tagged and thus attributable to its original molecular context, i.e., its original individual nucleic acid molecule, the short read sequences can be pieced together to provide information about the relationship between the exons.
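  • The attribution step described above, in which short reads that share a barcode are traced back to the same original molecule, can be sketched as follows. This is a simplified illustration under stated assumptions: reads are represented as hypothetical `(barcode, target)` pairs that have already been aligned and annotated, and each barcode is assumed to mark a single partition containing a single molecule.

```python
from collections import defaultdict


def link_reads_by_barcode(reads):
    """Group annotated reads by barcode.

    `reads` is an iterable of (barcode, target_name) pairs. Targets observed
    under the same barcode are treated as coming from the same original
    molecule (an assumption that holds only when each partition holds one
    molecule covering that locus).
    """
    targets_by_barcode = defaultdict(set)
    for barcode, target in reads:
        targets_by_barcode[barcode].add(target)
    return {bc: sorted(targets) for bc, targets in targets_by_barcode.items()}


# Hypothetical reads: two exons separated by a long intron, captured from
# fragments that carry the same partition barcode.
reads = [
    ("BC01", "exon1"),
    ("BC01", "exon2"),
    ("BC02", "exon1"),
    ("BC01", "exon1"),
]
linked = link_reads_by_barcode(reads)
print(linked)
```

  • Here the two exons appear together under barcode BC01, which is the kind of evidence that would link them to one molecule even though no single short read spans the intron.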
  • the baits used to capture fragments containing all or part of one or more exons are complementary to one or more portions of the one or more exons themselves.
  • the baits are complementary to one or more portions of the intervening introns or to sequences adjacent to or near the exon on either the 3′ or 5′ side of the exon regions (such baits are also referred to herein as “intronic baits”).
  • the baits used to capture the fragments containing all or part of the exon include baits complementary to the exon itself and intronic baits.
  • the ability to retain the molecular context of the targeted regions captured for sequencing also provides the advantage of allowing for sequencing across poorly characterized regions of the genome.
  • a significant percentage (at least 5-10% according to, for example, Altemose et al., PLOS Computational Biology, May 15, 2014, Vol. 10, Issue 5)
  • the reference assembly generally annotates these missing regions as multi-megabase heterochromatic gaps, found primarily near centromeres and on the short arms of the acrocentric chromosomes.
  • This missing fraction of the genome includes structural features that remain resistant to accurate characterization using generally used sequencing technologies.
  • sample preparation methods including methods of fragmenting, amplifying, partitioning, and otherwise processing genomic DNA, can lead to biases or lower coverage of certain regions of a genome.
  • biases or lowered coverage can be compensated for in the methods and systems disclosed herein by altering the concentration of baits used to capture targeted regions of the genome. For example, in some situations it is known that certain regions of the genome will have low coverage after the fragment library is processed, such as regions containing high GC content or other structural variations that lead to bias toward certain areas of the genome over others.
  • the library of baits can be altered to increase the concentration of baits directed to those regions of low coverage—in other words, the population of baits used may be “spiked” to ensure that a sufficient number of fragments containing targeted regions of the genome in those low coverage areas are obtained in the final population of fragments to be sequenced.
  • Such spiking of baits may be conducted through design of custom libraries in some embodiments.
  • the spiking of baits can be conducted in commercially available whole exome kits, such that a custom library of baits directed toward the lower coverage regions are added to off-the-shelf exome capture kits.
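  • One way to picture the bait "spiking" described above is as a per-region concentration multiplier that grows as expected coverage falls. The source does not give a formula; the inverse-proportional rule, the cap `max_fold`, the function name `spike_factors`, and the coverage values below are all assumptions made for illustration.

```python
def spike_factors(coverage_by_region, target_coverage, max_fold=10.0):
    """Return a bait concentration multiplier for each region.

    Assumed rule (not from the source): multiplier = target / observed
    coverage, never below 1.0 and capped at `max_fold`. Regions with no
    observed coverage receive the cap.
    """
    factors = {}
    for region, coverage in coverage_by_region.items():
        if coverage <= 0:
            factors[region] = max_fold
        else:
            factors[region] = min(max_fold, max(1.0, target_coverage / coverage))
    return factors


# Hypothetical mean coverage from a pilot capture run.
pilot_coverage = {"high_gc_exon": 10.0, "typical_exon": 40.0, "dropout": 0.0}
factors = spike_factors(pilot_coverage, target_coverage=40.0)
print(factors)
```

  • In practice the multipliers would need empirical tuning, since capture efficiency does not necessarily scale linearly with bait concentration.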
  • nucleic acid molecule 207 contains exons (shaded bars) interrupted by a long intronic region.
  • sequencing technologies would be unable to span the distance across the intron to provide information on the relationship between the two exons.
  • the individual nucleic acid molecule 207 is distributed into its own discrete partition 209 and then fragmented such that different fragments contain different portions of the exons and the intron. Because each of those fragments is tagged such that any sequence information obtained from the fragments is then attributable to the discrete partition in which it was generated, each fragment is thus also attributable to the individual nucleic acid molecule 207 from which it was derived.
  • fragments from different partitions are combined together. Targeted capture methods can then be used to enrich the population of fragments that undergoes further analysis, such as sequencing, with fragments containing the targeted region of interest. In the example illustrated in FIG.
  • the baits used will enrich the population of fragments to capture only those containing a portion of one of the exons, but regions outside of the exons (such as 209 and 210) would not be captured. Thus, the final population of fragments that undergoes sequencing will be enriched for the fragments containing the exons of interest.
  • Short read, high accuracy sequencing technologies can then be used to identify the sequences of this enriched population of fragments, and because each of the fragments is tagged and thus attributable to its original molecular context, i.e., its original individual nucleic acid molecule, the short read sequences can be pieced together to span across the length of the intervening intron (which can in some examples be on the order of 1, 2, 5, 10 or more kilobases in length) to provide linked sequence information on the two exons.
  • individual molecular context refers to sequence context beyond the specific sequence read, e.g., relation to adjacent or proximal sequences, that are not included within the sequence read itself, and as such, will typically be such that they would not be included in whole or in part in a short sequence read, e.g., a read of about 150 bases, or about 300 bases for paired reads.
  • the methods and systems provide long range sequence context for short sequence reads.
  • Such long range context includes relationship or linkage of a given sequence read to sequence reads that are within a distance of each other of longer than 1 kb, longer than 5 kb, longer than 10 kb, longer than 15 kb, longer than 20 kb, longer than 30 kb, longer than 40 kb, longer than 50 kb, longer than 60 kb, longer than 70 kb, longer than 80 kb, longer than 90 kb or even longer than 100 kb, or longer.
  • the methods and systems of the invention also provide much longer inferred molecular context.
  • Sequence context can include lower resolution context, e.g., from mapping the short sequence reads to the individual longer molecules or contigs of linked molecules, as well as the higher resolution sequence context, e.g., from long range sequencing of large portions of the longer individual molecules, e.g., having contiguous determined sequences of individual molecules where such determined sequences are longer than 1 kb, longer than 5 kb, longer than 10 kb, longer than 15 kb, longer than 20 kb, longer than 30 kb, longer than 40 kb, longer than 50 kb, longer than 60 kb, longer than 70 kb, longer than 80 kb, longer than 90 kb or even longer than 100 kb.
  • the attribution of short sequences to longer nucleic acids may include both mapping of short sequences against longer nucleic acid stretches to provide high level sequence context, as well as providing assembled sequences from the short sequences through these longer nucleic acids.
  • genomic material may be obtained from a sample taken from a patient.
  • samples and types of genomic material of use in the methods and systems discussed herein include without limitation polynucleotides, nucleic acids, oligonucleotides, circulating cell-free nucleic acid, circulating tumor cell (CTC), nucleic acid fragments, nucleotides, DNA, RNA, peptide polynucleotides, complementary DNA (cDNA), double stranded DNA (dsDNA), single stranded DNA (ssDNA), plasmid DNA, cosmid DNA, chromosomal DNA, genomic DNA (gDNA), viral DNA, bacterial DNA, mtDNA (mitochondrial DNA), ribosomal RNA, cell-free DNA, cell free fetal DNA (cffDNA), mRNA, rRNA, tRNA, nRNA, siRNA, snRNA, s
  • the substance may be a fluid, e.g., a biological fluid.
  • a fluidic substance may include, but is not limited to, blood, cord blood, saliva, urine, sweat, serum, semen, vaginal fluid, gastric and digestive fluid, spinal fluid, placental fluid, cavity fluid, ocular fluid, breast milk, lymphatic fluid, or combinations thereof.
  • the substance may be solid, for example, a biological tissue.
  • the substance may comprise normal healthy tissues, diseased tissues, or a mix of healthy and diseased tissues. In some cases, the substance may comprise tumors. Tumors may be benign (non-cancer) or malignant (cancer).
  • Non-limiting examples of tumors may include: fibrosarcoma, myxosarcoma, liposarcoma, chondrosarcoma, osteogenic sarcoma, chordoma, angiosarcoma, endotheliosarcoma, lymphangiosarcoma, lymphangioendotheliosarcoma, synovioma, mesothelioma, Ewing's, leiomyosarcoma, rhabdomyosarcoma, gastrointestinal system carcinomas, colon carcinoma, pancreatic cancer, breast cancer, genitourinary system carcinomas, ovarian cancer, prostate cancer, squamous cell carcinoma, basal cell carcinoma, adenocarcinoma, sweat gland carcinoma, sebaceous gland carcinoma, papillary carcinoma, papillary adenocarcinomas, cystadenocarcinoma, medullary carcinoma, bronchogenic carcinoma, renal cell carcinoma, hepatoma, bile duct carcinoma, chor
  • the substance may be associated with various types of organs.
  • organs may include brain, liver, lung, kidney, prostate, ovary, spleen, lymph node (including tonsil), thyroid, pancreas, heart, skeletal muscle, intestine, larynx, esophagus, stomach, or combinations thereof.
  • the substance may comprise a variety of cells, including but not limited to: eukaryotic cells, prokaryotic cells, fungi cells, heart cells, lung cells, kidney cells, liver cells, pancreas cells, reproductive cells, stem cells, induced pluripotent stem cells, gastrointestinal cells, blood cells, cancer cells, bacterial cells, bacterial cells isolated from a human microbiome sample, etc.
  • the substance may comprise contents of a cell, such as, for example, the contents of a single cell or the contents of multiple cells.
  • Samples may be obtained from various subjects.
  • a subject may be a living subject or a dead subject. Examples of subjects may include, but are not limited to, humans, mammals, non-human mammals, rodents, amphibians, reptiles, canines, felines, bovines, equines, goats, ovines, hens, avians, mice, rabbits, insects, slugs, microbes, bacteria, parasites, or fish.
  • the subject may be a patient who is having, suspected of having, or at a risk of developing a disease or disorder.
  • the subject may be a pregnant woman.
  • the subject may be a normal healthy pregnant woman.
  • the subject may be a pregnant woman who is at risk of carrying a baby with a certain birth defect.
  • a sample may be obtained from a subject by any means known in the art.
  • a sample may be obtained from a subject through accessing the circulatory system (e.g., intravenously or intra-arterially via a syringe or other apparatus), collecting a secreted biological sample (e.g., saliva, sputum, urine, feces, etc.), surgically (e.g., biopsy) acquiring a biological sample (e.g., intra-operative samples, post-surgical samples, etc.), swabbing (e.g., buccal swab, oropharyngeal swab), or pipetting.
  • Genomic DNA from the NA12878 human cell line was subjected to size based separation of fragments using a Blue Pippin DNA sizing system to recover fragments that were greater than or equal to approximately 10 kb in length.
  • the size selected sample nucleic acids were then copartitioned with barcode beads in aqueous droplets within a fluorinated oil continuous phase using a microfluidic partitioning system (See, e.g., U.S. patent application Ser. No. 14/682,952, filed Apr.
  • aqueous droplets also included the dNTPs, thermostable DNA polymerase and other reagents for carrying out amplification within the droplets, as well as DTT for releasing the barcode oligonucleotides from the beads.
  • BC denotes the barcode portion of the oligonucleotide
  • Nmer denotes a random 10 base N-mer priming sequence used to prime the template nucleic acids. See, e.g., U.S. patent application Ser. No. 14/316,383, filed Jun. 26, 2014, the full disclosure of which is hereby incorporated herein by reference in its entirety for all purposes.
  • the droplets were thermocycled to allow for primer extension of the barcode oligos against the template of the sample nucleic acids within each droplet. This resulted in amplified copy fragments of the sample nucleic acids that included the barcode sequence representative of the originating partition, in addition to the other included sequences set forth above.
  • the emulsion of droplets including the amplified copy fragments was broken and the additional sequencer required components, e.g., read2 primer sequence and P7 attachment sequence, were added to the copy fragments through an additional amplification step, which attached these sequences to the other end of the copy fragments.
  • the barcoded DNA was then subjected to hybrid capture using an Agilent SureSelect Exome capture kit.
  • the three different versions listed above represent three different shear lengths for the barcoded fragments before the second adapter attachment step.
  • Genomic DNA from the NA19701 and NA19661 cell lines was prepared according to the methods described above in Example 1. Data, including phasing data, from those two cell lines is provided in the table below:

Abstract

The present invention is directed to methods, compositions and systems for capturing and analyzing sequence information contained in targeted regions of a genome. Such targeted regions may include exomes, partial exomes, introns, combinations of exonic and intronic regions, genes, panels of genes, and any other subsets of a whole genome that may be of interest.

Description

    CROSS-REFERENCE TO RELATED APPLICATIONS
  • This application is a continuation of U.S. application Ser. No. 14/927,297, filed Oct. 29, 2015, which claims the benefit of U.S. Provisional Application No. 62/072,164, filed Oct. 29, 2014, each of which is expressly incorporated herein by reference in its entirety for all purposes.
    BACKGROUND OF THE INVENTION
  • The ability to sequence genomes accurately and rapidly is revolutionizing biology and medicine. The study of complex genomes, and in particular, the search for the genetic basis of disease in humans, involves genetic analysis on a massive scale. Such genetic analysis on a whole genome level is costly not only monetarily but also in time and labor. These costs increase with protocols involving analyses of separate individual DNA samples. Sequencing (and re-sequencing) of polymorphic areas in the genome that are linked to disease development will contribute greatly to the understanding of diseases, such as cancer, and therapeutic development and will help meet the pharmacogenomics challenge to identify the genes and functional polymorphisms associated with the variability in drug response. Screens for numerous genetic markers performed for populations large enough to yield statistically significant data are needed before associations can be made between a given genotype and a particular disease.
  • One way to reduce the costs associated with genome sequencing while retaining the benefits of genomic analysis on a large scale is to perform high throughput, high accuracy sequencing on targeted regions of the genome. A widely used approach captures much of the entire protein coding region of a genome (the exome), which makes up about 1% of the human genome, and has become a routine technique in clinical and basic research. Exome sequencing offers advantages over whole genome sequencing: it is significantly less expensive, is more easily understood for functional interpretation, is significantly faster to analyze, makes very deep sequencing affordable, and results in a dataset that is easier to manage. A need exists for methods, systems and compositions for the enrichment of target regions of interest for high accuracy and high throughput sequencing and genetic analysis.
    SUMMARY OF THE INVENTION
  • Accordingly, the present invention provides methods, systems and compositions for obtaining sequence information for targeted regions of the genome.
  • In some aspects, the present disclosure provides a method for sequencing one or more selected portions of a genome, the method generally including the steps of: (a) providing starting genomic material, (b) distributing individual nucleic acid molecules from the starting genomic material into discrete partitions such that each discrete partition contains a first individual nucleic acid molecule; (c) fragmenting the individual nucleic acid molecules in the discrete partitions to form a plurality of fragments, where each of the fragments further includes a barcode, and where fragments within a given discrete partition each include a common barcode, thereby associating each fragment with the individual nucleic acid molecule from which it is derived; (d) providing a population enriched for fragments including at least a portion of the one or more selected portions of the genome; (e) obtaining sequence information from the population, thereby sequencing one or more selected portions of a genome.
  • In further embodiments and in accordance with the above, providing the population enriched for fragments including at least a portion of the one or more selected portions of the genome includes the steps of (i) hybridizing probes complementary to regions in or near the one or more selected portions of the genome to the fragments to form probe-fragment complexes; and (ii) capturing probe-fragment complexes to a surface of a solid support; thereby enriching the population with fragments including at least a portion of the one or more selected portions of the genome. In yet further embodiments, the solid support includes a bead. In still further embodiments, the probes include binding moieties and the surface includes capture moieties, and the probe-fragment complexes are captured on the surface through a reaction between the binding moieties and the capture moieties. In further examples, the capture moieties include streptavidin and the binding moieties include biotin. In still further examples, the capture moieties comprise streptavidin magnetic beads and the binding moieties comprise biotinylated RNA library baits.
  • In some embodiments and in accordance with any of the above, the methods of the invention include the use of capture moieties that are directed to whole or partial exome capture, panel capture, targeted exon capture, anchored exome capture, or tiled genomic region capture.
  • In yet further embodiments and in accordance with any of the above, the methods disclosed herein include an obtaining step that includes a sequencing reaction. In further embodiments, the sequencing reaction is a short read-length sequencing reaction or a long read-length sequencing reaction. In still further examples, the sequencing reaction provides sequence information on less than 90%, less than 75%, or less than 50% of the starting genomic material.
  • In still further embodiments, the methods described herein further include linking two or more of the individual nucleic acid molecules in an inferred contig based upon overlapping sequences of the isolated fragments, wherein the inferred contig comprises a length N50 of at least 10 kb, 20 kb, 40 kb, 50 kb, 100 kb, or 200 kb.
  • In yet further examples and in accordance with any of the above, the methods disclosed herein further include linking two or more of the individual nucleic acid molecules in a phase block based upon overlapping phased variants within the sequences of the isolated fragments, where the phase block comprises a length N50 of at least 10 kb, of at least 20 kb, of at least 40 kb, of at least 50 kb, of at least 100 kb or of at least 200 kb.
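  • The N50 thresholds recited above for inferred contigs and phase blocks use the standard N50 statistic: the length L such that blocks of length L or longer together account for at least half of the total length. A minimal sketch follows; the function name and the block lengths are hypothetical and are not data from this disclosure.

```python
def n50(lengths):
    """Return the N50 of a collection of block lengths.

    Blocks are taken longest first; the N50 is the length of the block at
    which the running total first reaches half of the summed length.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("n50 requires at least one length")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical phase block lengths in base pairs.
phase_blocks = [100_000, 80_000, 50_000, 40_000, 30_000]
block_n50 = n50(phase_blocks)
print(block_n50)
```

  • With these example lengths the total is 300 kb, and the running sum reaches 150 kb at the 80 kb block, so this set would meet an "N50 of at least 50 kb" threshold but not "at least 100 kb".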
  • In still further embodiments and in accordance with any of the above, the methods disclosed herein provide sequence information from selected portions of the genome that together cover an exome. In yet further embodiments, the individual nucleic acid molecules in the discrete partitions include genomic DNA from a single cell. In still further embodiments, the discrete partitions each include genomic DNA from a different chromosome.
  • In further aspects, the present disclosure provides a method of obtaining sequence information from one or more targeted portions of a genomic sample. Such a method includes without limitation the steps of: (a) providing individual first nucleic acid fragment molecules of the genomic sample in discrete partitions; (b) fragmenting the individual first nucleic acid fragment molecules within the discrete partitions to create a plurality of second fragments from each of the individual first nucleic acid fragment molecules; (c) attaching a common barcode sequence to the plurality of the second fragments within a discrete partition, such that each of the plurality of second fragments are attributable to the discrete partition in which they are contained; (d) applying a library of probes directed to the one or more targeted portions of the genomic sample to the second fragments; (e) conducting a sequencing reaction to identify sequences of the plurality of second fragments that hybridized to the library of probes, thereby obtaining sequence information from the one or more targeted portions of the genomic sample. In further embodiments, the library of probes are attached to binding moieties, and before the conducting step (e), the second fragments are captured on a surface comprising capture moieties through a reaction between the binding moieties and the capture moieties. In still further embodiments and prior to the conducting step (e), the second fragments are amplified before or after the second fragments are captured on the surface. In yet further embodiments, the binding moieties comprise biotin and the capture moieties comprise streptavidin. In still further embodiments, the sequencing reaction is a short read, high accuracy sequencing reaction. In still further embodiments, the second fragments are amplified such that the resultant amplification products are capable of forming partial or complete hairpin structures.
  • In further aspects and in accordance with any of the above, the present disclosure provides methods for obtaining sequence information from one or more targeted portions of a genomic sample while retaining molecular context. Such methods include the steps of: (a) providing starting genomic material; (b) distributing individual nucleic acid molecules from the starting genomic material into discrete partitions such that each discrete partition contains a first individual nucleic acid molecule; (c) fragmenting the first individual nucleic acid molecules in the discrete partitions to form a plurality of fragments; (d) providing a population enriched for fragments that include at least a portion of the one or more selected portions of the genome; (e) obtaining sequence information from the population, thereby sequencing one or more targeted portions of the genomic sample while retaining molecular context. In further embodiments, prior to the obtaining step (e), the plurality of fragments are tagged with a barcode to associate each fragment with the discrete partition in which it was formed. In still further embodiments, the individual nucleic acid molecules in step (b) are distributed such that molecular context of each first individual nucleic acid molecule is maintained.
  • In some aspects, the present disclosure provides methods of obtaining sequence information from one or more targeted portions of a genomic sample. Such methods include without limitation steps of (a) providing individual nucleic acid molecules of the genomic sample; (b) fragmenting the individual nucleic acid molecules to form a plurality of fragments, where each of the fragments further includes a barcode, and where fragments from the same individual nucleic acid molecule have a common barcode, thereby associating each fragment with the individual nucleic acid molecule from which it is derived; (c) enriching the plurality of fragments for fragments containing the one or more targeted portions of the genomic sample; and (d) conducting a sequencing reaction to identify sequences of the enriched plurality of fragments, thereby obtaining sequence information from the one or more targeted portions of the genomic sample. In further embodiments, the enriching step includes applying a library of probes directed to the one or more targeted portions of the genomic sample. In yet further embodiments, the library of probes is attached to binding moieties, and prior to the conducting step, the fragments are captured through a reaction between the binding moieties and capture moieties. In exemplary embodiments, the reaction between the binding moieties and the capture moieties immobilizes the fragments on a surface.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • FIG. 1 provides a schematic illustration of identification and analysis of targeted genomic regions using conventional processes versus the processes and systems described herein.
  • FIG. 2A and FIG. 2B provide schematic illustrations of identification and analysis of targeted genomic regions using processes and systems described herein.
  • FIG. 3 illustrates a typical workflow for performing an assay to detect sequence information, using the methods and compositions disclosed herein.
  • FIG. 4 provides a schematic illustration of a process for combining a nucleic acid sample with beads and partitioning the nucleic acids and beads into discrete droplets.
  • FIG. 5 provides a schematic illustration of a process for barcoding and amplification of chromosomal nucleic acid fragments.
  • FIG. 6 provides a schematic illustration of the use of barcoding of chromosomal nucleic acid fragments in attributing sequence data to individual chromosomes.
  • FIG. 7 illustrates a general embodiment of a method of the invention.
  • FIG. 8 illustrates a general embodiment of a method of the invention.
  • DETAILED DESCRIPTION OF THE INVENTION
  • The practice of the present invention may employ, unless otherwise indicated, conventional techniques and descriptions of organic chemistry, polymer technology, molecular biology (including recombinant techniques), cell biology, biochemistry, and immunology, which are within the skill of the art. Such conventional techniques include polymer array synthesis, hybridization, ligation, phage display, and detection of hybridization using a label. Specific illustrations of suitable techniques can be had by reference to the example herein below. However, other equivalent conventional procedures can, of course, also be used. Such conventional techniques and descriptions can be found in standard laboratory manuals such as Genome Analysis: A Laboratory Manual Series (Vols. I-IV), Using Antibodies: A Laboratory Manual, Cells: A Laboratory Manual, PCR Primer: A Laboratory Manual, and Molecular Cloning: A Laboratory Manual (all from Cold Spring Harbor Laboratory Press), Stryer, L. (1995) Biochemistry (4th Ed.) Freeman, New York, Gait, “Oligonucleotide Synthesis: A Practical Approach” 1984, IRL Press, London, Nelson and Cox (2000), Lehninger, Principles of Biochemistry 3rd Ed., W. H. Freeman Pub., New York, N.Y. and Berg et al. (2002) Biochemistry 5th Ed., W. H. Freeman Pub., New York, N.Y., all of which are herein incorporated in their entirety by reference for all purposes.
  • Note that as used herein and in the appended claims, the singular forms “a,” “an,” and “the” include plural referents unless the context clearly dictates otherwise. Thus, for example, reference to “a polymerase” refers to one agent or mixtures of such agents, and reference to “the method” includes reference to equivalent steps and methods known to those skilled in the art, and so forth.
  • Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. All publications mentioned herein are incorporated herein by reference for the purpose of describing and disclosing devices, compositions, formulations and methodologies which are described in the publication and which might be used in connection with the presently described invention.
  • Where a range of values is provided, it is understood that each intervening value, to the tenth of the unit of the lower limit unless the context clearly dictates otherwise, between the upper and lower limit of that range and any other stated or intervening value in that stated range is encompassed within the invention. The upper and lower limits of these smaller ranges may independently be included in the smaller ranges, and are also encompassed within the invention, subject to any specifically excluded limit in the stated range. Where the stated range includes one or both of the limits, ranges excluding either or both of those included limits are also included in the invention.
  • In the following description, numerous specific details are set forth to provide a more thorough understanding of the present invention. However, it will be apparent to one of skill in the art that the present invention may be practiced without one or more of these specific details. In other instances, features and procedures well known to those skilled in the art have not been described in order to avoid obscuring the invention.
  • As used herein, the term “comprising” is intended to mean that the compositions and methods include the recited elements, but not excluding others. “Consisting essentially of” when used to define compositions and methods, shall mean excluding other elements of any essential significance to the composition or method. “Consisting of” shall mean excluding more than trace elements of other ingredients for claimed compositions and substantial method steps. Embodiments defined by each of these transition terms are within the scope of this invention. Accordingly, it is intended that the methods and compositions can include additional steps and components (comprising) or alternatively including steps and compositions of no significance (consisting essentially of) or alternatively, intending only the stated method steps or compositions (consisting of).
  • All numerical designations, e.g., pH, temperature, time, concentration, and molecular weight, including ranges, are approximations which are varied (+) or (−) by increments of 0.1. It is to be understood, although not always explicitly stated that all numerical designations are preceded by the term “about”. The term “about” also includes the exact value “X” in addition to minor increments of “X” such as “X+0.1” or “X−0.1.” It also is to be understood, although not always explicitly stated, that the reagents described herein are merely exemplary and that equivalents of such are known in the art.
  • I. Overview
  • This disclosure provides methods, compositions and systems useful for characterization of genetic material. In particular, the methods, compositions and systems described herein provide genetic characterization of targeted regions of a genome, including without limitation particular chromosomes, regions of chromosomes, all exons (exomes), portions of exomes, specific genes, panels of genes (e.g., kinomes or other targeted gene panels), intronic regions, tiled portions of a genome, or any other chosen portion of a genome.
  • In general, the methods and systems described herein accomplish targeted genomic sequencing by providing for the determination of the sequence of long individual nucleic acid molecules and/or the identification of direct molecular linkage as between two sequence segments separated by long stretches of sequence, which permit the identification and use of long range sequence information, but this sequencing information is obtained using methods that have the advantages of the extremely low sequencing error rates and high throughput of short read sequencing technologies. The methods and systems described herein segment long nucleic acid molecules into smaller fragments that can be sequenced using high-throughput, higher accuracy short-read sequencing technologies, and that segmentation is accomplished in a manner that allows the sequence information derived from the smaller fragments to retain the original long range molecular sequence context, i.e., allowing the attribution of shorter sequence reads to originating longer individual nucleic acid molecules. By attributing sequence reads to an originating longer nucleic acid molecule, one can gain significant characterization information for that longer nucleic acid sequence that one cannot generally obtain from short sequence reads alone. This long range molecular context is not only preserved through a sequencing process, but is also preserved through the targeted enrichment process used in targeted sequencing approaches described herein, where no other sequencing approach has shown this ability.
  • In general, sequence information from smaller fragments will retain the original long range molecular sequence context through the use of a tagging procedure, including the addition of barcodes as described herein and known in the art. In specific examples, fragments originating from the same original longer individual nucleic acid molecule will be tagged with a common barcode, such that any later sequence reads from those fragments can be attributed to that originating longer individual nucleic acid molecule. Such barcodes can be added using any method known in the art, including addition of barcode sequences during amplification methods that amplify segments of the individual nucleic acid molecules as well as insertion of barcodes into the original individual nucleic acid molecules using transposons, including methods such as those described in Amini et al., Nature Genetics 46: 1343-1349 (2014) (advance online publication on Oct. 29, 2014), which is hereby incorporated by reference in its entirety for all purposes and in particular for all teachings related to adding adaptor and other oligonucleotides using transposons. Once nucleic acids have been tagged using such methods, the resultant tagged fragments can be enriched using methods described herein such that the population of fragments represents targeted regions of the genome. As such, sequence reads from that population allows for targeted sequencing of select regions of the genome, and those sequence reads can also be attributed to the originating nucleic acid molecules, thus preserving the original long range molecular sequence context. The sequence reads can be obtained using any sequencing methods and platforms known in the art and described herein.
  • In addition to providing the ability to obtain sequence information from targeted regions of the genome, the methods and systems described herein can also provide other characterizations of genomic material, including without limitation haplotype phasing, identification of structural variations, and identifying copy number variations, as described in co-pending applications U.S. Ser. Nos. 14/752,589 and 14/752,602 (both filed on Jun. 26, 2015), which are herein incorporated by reference in their entirety for all purposes and in particular for all written description, figures and working examples directed to characterization of genomic material.
  • Methods of processing and sequencing nucleic acids in accordance with the methods and systems described in the present application are also described in further detail in U.S. Ser. Nos. 14/316,383; 14/316,398; 14/316,416; 14/316,431; 14/316,447; and 14/316,463 which are herein incorporated by reference in their entirety for all purposes and in particular for all written description, figures and working examples directed to processing nucleic acids and sequencing and other characterizations of genomic material.
  • In general, as shown in FIG. 1, the methods and systems described herein may be used to characterize nucleic acids. In particular, as shown, two discrete individual nucleic acids 102 and 104 are illustrated, each having a number of regions of interest, e.g., regions 106 and 108 in nucleic acid 102, and regions 110 and 112 in nucleic acid 104. The regions of interest in each nucleic acid are linked within the same nucleic acid molecule, but may be relatively separated from each other, e.g., more than 1 kb apart, more than 5 kb apart, more than 10 kb apart, more than 20 kb apart, more than 30 kb apart, more than 40 kb apart, more than 50 kb apart, and in some cases, as much as 100 kb apart. The regions may denote individual genes, gene groups, exons, or simply discrete and separate parts of the genome. Solely for ease of discussion, the regions shown in FIG. 1 will be referred to as exons 106, 108, 110 and 112. As shown, each nucleic acid 102 and 104 is separated into its own partition 114 and 116, respectively. As noted elsewhere herein, these partitions are, in many cases, aqueous droplets in a water in oil emulsion. Within each droplet, portions of each fragment are copied in a manner that preserves the original molecular context of those fragments, e.g., as having originated from the same molecule. As shown, this is achieved through the inclusion in each copied fragment of a barcode sequence, e.g., barcode sequence “1” or “2” as illustrated, that is representative of the droplet into which the originating fragment was partitioned. For whole genome sequence analysis applications, one could simply pool all of the copied fragments and their associated barcodes, in order to sequence and reassemble the full range sequence information from each of the originating nucleic acids 102 and 104. 
However, in many cases, it is more desirable to only analyze specific targeted portions of the overall genome, e.g., the exome, specific genes, or the like, in order to provide greater focus on scientifically relevant portions of the genome, and to minimize the time and expense of performing sequencing on less relevant or irrelevant portions of the genome.
  • In accordance with the methods described herein, target enrichment steps may be applied to the libraries of barcoded sequence fragments in order to “pull down” the sequences associated with the desired targets. These may include exon targeted pull downs, gene panel specific targeted pull downs, or the like. A large number of targeted pull down kits that allow for the enriched separation of specific targeted regions of the genome are commercially available, such as the Agilent SureSelect exome pull down kits, and the like. As shown in FIG. 1, application of a targeted enrichment results in enriched, barcoded sequence library 118. Further, because the pulled down fragments within library 118 retain their original molecular context, e.g., through the retention of the barcode information, they may be reassembled into their original molecular contexts with embedded long range linkage information, e.g., with inferred linkage as between each of the assembled regions of interest 106:108 and 110:112. By way of example, one may identify direct molecular linkage between two disparate targeted portions of the genome, e.g., two or more exons, and that direct molecular linkage may be used to identify structural variations and other genomic characteristics, as well as to identify the phase information as to the two or more exons, e.g. providing phased exons, including potentially an entire phased exome, or other phased targeted portions of a genome.
  • Generally, methods of the invention include steps as illustrated in FIG. 7, which provides a schematic overview of methods of the invention discussed in further detail herein. As will be appreciated, the method outlined in FIG. 7 is an exemplary embodiment that may be altered or modified as needed and as described herein.
  • As shown in FIG. 7, the methods described herein will in most examples include a step in which sample nucleic acids containing the targeted regions of interest are partitioned (701). Generally, each partition will include a single individual nucleic acid molecule from a particular locus that is then fragmented or copied in such a way as to preserve the original molecular context of the fragments (702), usually by barcoding the fragments that are specific to the partition in which they are contained. Each partition may in some examples include more than one nucleic acid, and will in some instances contain several hundred nucleic acid molecules. In situations in which multiple nucleic acids are within a partition, any particular locus of the genome will generally be represented by a single individual nucleic acid prior to barcoding. The barcoded fragments of step 702 can be generated using any methods known in the art. In some examples, oligonucleotides are co-partitioned with the samples within the distinct partitions. Such oligonucleotides may comprise random sequences intended to randomly prime numerous different regions of the samples, or they may comprise a specific primer sequence targeted to prime upstream of a targeted region of the sample. In further examples, these oligonucleotides also contain a barcode sequence, such that the replication process also barcodes the resultant replicated fragment of the original sample nucleic acid. A particularly elegant process for use of these barcode oligonucleotides in amplifying and barcoding samples is described in detail in U.S. patent application Ser. Nos. 14/316,383, 14/316,398, 14/316,416, 14/316,431, 14/316,447, 14/316,463, all filed Jun. 26, 2014, each of which is herein incorporated by reference in its entirety for all purposes. 
Extension reaction reagents, e.g., DNA polymerase, nucleoside triphosphates, co-factors (e.g., Mg2+ or Mn2+ etc.), that are also contained in the partitions, then extend the primer sequence using the sample as a template, to produce a complementary fragment to the strand of the template to which the primer annealed, and the complementary fragment includes the oligonucleotide and its associated barcode sequence. Annealing and extension of multiple primers to different portions of the sample can result in a large pool of overlapping complementary fragments of the sample, each possessing its own barcode sequence indicative of the partition in which it was created. In some cases, these complementary fragments may themselves be used as a template primed by the oligonucleotides present in the partition to produce a complement of the complement that again, includes the barcode sequence. In further examples, this replication process is configured such that when the first complement is duplicated, it produces two complementary sequences at or near its termini to allow the formation of a hairpin structure or partial hairpin structure, which reduces the ability of the molecule to be the basis for producing further iterative copies.
  • Returning to the method exemplified in FIG. 7, once the partition-specific barcodes are attached to the copied fragments, the barcoded fragments are then pooled (703). Target enrichment techniques can then be applied (704) to “pull down” the targeted regions of interest. Those targeted regions of interest are then sequenced (705) and the sequences of the fragments are attributed to their originating molecular context (706), such that the targeted regions of interest are both identified and also linked with that originating molecular context. A unique feature of the methods and systems described herein and illustrated in FIG. 7 is that barcodes are attached to the fragments (702) prior to the targeted enrichment step (704). An advantage of the methods and systems described herein is that attaching a partition- or sample-specific barcode to the copied fragments prior to enriching the fragments for targeted genomic regions preserves the original molecular context of those targeted regions, allowing them to be attributed to their original partition and thus their originating sample nucleic acid.
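  • The ordering of steps 703 through 706 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation of the disclosed method: the `Fragment` record, the reference coordinates, the barcode labels "BC1" and "BC2", and the interval-overlap test standing in for probe hybridization are all hypothetical. It shows only that a barcode attached before enrichment is still available afterwards to group the enriched fragments by originating partition.

```python
from collections import defaultdict
from typing import NamedTuple


class Fragment(NamedTuple):
    barcode: str  # partition-specific barcode attached in step 702 (hypothetical label)
    start: int    # hypothetical reference coordinate, 0-based inclusive
    end: int      # hypothetical reference coordinate, exclusive


def overlaps(frag: Fragment, target: tuple[int, int]) -> bool:
    """Stand-in for bait hybridization: true if the fragment overlaps the target interval."""
    t_start, t_end = target
    return frag.start < t_end and t_start < frag.end


def enrich(pool: list[Fragment], targets: list[tuple[int, int]]) -> list[Fragment]:
    """Step 704: keep only fragments that overlap at least one targeted region."""
    return [f for f in pool if any(overlaps(f, t) for t in targets)]


def attribute(fragments: list[Fragment]) -> dict[str, list[Fragment]]:
    """Step 706: group fragments by barcode, i.e. by originating partition."""
    by_barcode: dict[str, list[Fragment]] = defaultdict(list)
    for f in fragments:
        by_barcode[f.barcode].append(f)
    return dict(by_barcode)


# Step 703: barcoded fragments from two partitions, pooled together.
pool = [
    Fragment("BC1", 100, 300),
    Fragment("BC1", 5000, 5200),    # off-target, removed by enrichment
    Fragment("BC1", 50100, 50300),
    Fragment("BC2", 150, 350),
    Fragment("BC2", 70000, 70200),  # off-target, removed by enrichment
]
targets = [(0, 400), (50000, 50400)]  # two targeted regions about 50 kb apart

linked = attribute(enrich(pool, targets))
for barcode, frags in sorted(linked.items()):
    print(barcode, [(f.start, f.end) for f in frags])
```

  • In this toy pool, the two on-target fragments carrying "BC1" end up in the same group even though they lie about 50 kb apart, which is the linkage that would be lost if fragments were enriched before any barcode was attached.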
  • In general, targeted genomic regions are enriched, isolated or separated, i.e., “pulled down,” for further analysis, particularly sequencing, using methods that include both chip-based and solution-based capture methods. Such methods utilize probes that are complementary to the genomic regions of interest or to regions near or adjacent to the genomic regions of interest. For example, in hybrid (or chip-based) capture, microarrays containing capture probes (usually single-stranded oligonucleotides) with sequences that taken together cover the region of interest are fixed to a surface. Genomic DNA is fragmented and may further undergo processing such as end-repair to produce blunt ends and/or addition of additional features such as universal priming sequences. These fragments are hybridized to the probes on the microarray. Unhybridized fragments are washed away and the desired fragments are eluted or otherwise processed on the surface for sequencing or other analysis, and thus the population of fragments remaining on the surface is enriched for fragments containing the targeted regions of interest (e.g., the regions comprising the sequences complementary to those contained in the capture probes). The enriched population of fragments may further be amplified using any amplification technologies known in the art.
  • Additional methods of targeted genomic region capture include solution-based methods, in which genomic DNA fragments are hybridized to oligonucleotide probes. The oligonucleotide probes are often referred to as “baits”. These baits are generally attached to a capture molecule, including without limitation a biotin molecule. The baits are complementary to targeted regions of the genome (or to regions near or adjacent to the targeted regions of interest), such that upon application to genomic DNA fragments, the baits hybridize to the fragments, and the capture molecule (e.g., biotin) is then used to selectively pull down the targeted regions of interest (for example, with magnetic streptavidin beads) to thereby enrich the resultant population of fragments with those containing the targeted regions of interest.
  • In examples in which targeted regions covering the whole exome are needed, a library of baits that together cover the whole exome is used to capture those targeted sequences. In such examples, capture protocols can include any of those known in the art, including without limitation any of the exome capture protocols and kits produced by Roche/NimbleGen, Illumina, and Agilent.
  • Capture of targeted genomic regions for use in the methods and systems described herein is not limited to whole exomes, and can include any one or combination of partial exomes, genes, panels of genes, introns, and combinations of introns and exons. The procedure for capture of these different types of targeted regions follows the general method of using baits to pull down fragments containing the targeted regions of interest. The design of the baits, particularly the oligonucleotide probe portions of the baits that hybridize to or near to the targeted regions of interest, will in part depend on the type of targeted region to be captured.
  • In examples in which only a partial exome is needed for further analysis, the baits can be designed to capture that part of the exome. In certain examples, the specific identities of the portions of the exome that are needed are known, and the library of baits comprises oligonucleotides that are complementary to those identified portions or to regions that are near or adjacent to those portions. Such examples can further include without limitation capture of specific genes and/or panels of genes, or identified portions of the exome known to be associated with a particular phenotype, such as a disorder or disease. In some examples, it may be that a certain portion of the exome or the whole genome (including both intronic and exonic regions) is needed for further analysis, but the specific sequences for the portions of the genome to be captured are not known. In such embodiments, the baits used can be subsets of a library directed to a whole genome, and that subset can be chosen randomly or through any kind of intelligent design in which the library of baits is selected or enriched for probes that are complementary to the targeted subsections of the genome or exome.
  • For any of the methods described herein, the targeted regions can be captured using baits that comprise oligonucleotide probes that are complementary to the whole or part of a targeted region, or the oligonucleotide probes may be complementary to another region, e.g., an intronic region, that is near the targeted region or adjacent to the targeted region. For example, as schematically illustrated in FIG. 2A, a genomic sequence 201 comprises exonic regions 202 and 203. Those exonic regions can be captured by directing the baits to one or more of the intronic sequences nearby (for example intronic region 204 and/or 205 to capture exonic region 202 and intronic region 206 for capture of exonic region 203). In other words, a population of fragments comprising exonic regions 202 or 203 can be captured through the use of baits complementary to intronic regions 204 and/or 205 and 206. As shown in FIG. 2A, the intronic region used as an intronic bait for the nearby exonic region can be adjacent to the exonic region of interest—i.e., there is no gap between the intronic region and the targeted exonic region. In other examples, the intronic region used to capture the nearby exonic region may be near enough so that both regions are likely to be in the same fragment, but there is a gap of one or more nucleotides between the exonic region and the intronic region (for example 202 and 205 in FIG. 2A).
  • In some examples, rather than designing the baits to target particular regions of the genome, a tiling approach is used. In such an approach, rather than targeting specific exonic or intronic regions, the baits are designed to be complementary to portions of the genome at particular ranges or distances. For example, the library of baits can be designed to cover sequences every 5 kilobases (kb) along the genome, such that applying this library of baits to a fragmented genomic sample will capture only a certain subset of the genome—i.e., those regions that are contained in fragments containing complementary sequences to the baits. As will be appreciated, the baits can be designed based on a reference sequence, such as a human genome reference sequence. In further examples, the tiled library of baits is designed to capture regions every 1, 2, 5, 10, 15, 20, 25, 50, 100, 200, 250, 500, 750, 1000, or 10000 kilobases of a genome. In still further examples, the tiled library of baits is designed to capture a mixture of distances—that mixture can be a random mixture of distances or intelligently designed such that a specific portion or percentage of the genome is captured. As will be appreciated, such tiling methods of capture will capture both intronic and exonic regions of the genome for further analysis such as sequencing. Any of the tiling or other intronic baiting methods described herein provide a way to link sequence information from exons widely separated by long intervening intronic regions.
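  • As a rough illustration of the tiling approach, the following Python sketch places one bait at a fixed spacing along a reference region and reports the fraction of bases lying directly under a bait. The function names and the 120-base bait length are assumptions made for the example and are not specified in this disclosure. Fragments captured by a bait extend beyond the bait itself, so the captured fraction of the genome would generally be larger than the figure computed here.

```python
def tiled_bait_starts(region_length: int, spacing: int, bait_length: int = 120) -> list[int]:
    """Start coordinates (0-based) of baits placed every `spacing` bases.

    `bait_length` defaults to 120 bases purely as an illustrative assumption.
    """
    if spacing <= 0 or bait_length <= 0:
        raise ValueError("spacing and bait_length must be positive")
    # Only place baits that fit entirely inside the region.
    return list(range(0, region_length - bait_length + 1, spacing))


def covered_fraction(region_length: int, spacing: int, bait_length: int = 120) -> float:
    """Fraction of the region lying directly under at least one bait."""
    if region_length <= 0:
        raise ValueError("region_length must be positive")
    covered = 0
    prev_end = 0
    for start in tiled_bait_starts(region_length, spacing, bait_length):
        end = start + bait_length
        if end > prev_end:
            # Count only bases not already covered by the previous bait.
            covered += end - max(start, prev_end)
            prev_end = end
    return covered / region_length


# One bait every 5 kb across a hypothetical 20 kb region.
starts = tiled_bait_starts(region_length=20_000, spacing=5_000)
fraction = covered_fraction(region_length=20_000, spacing=5_000)
print(starts, round(fraction, 4))
```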
  • In further examples, the tiling or other capture methods described herein will capture about 5%, 10%, 15%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95% of the whole genome. In still further examples, the capture methods described herein capture about 1-10%, 5-20%, 10-30%, 15-40%, 20-50%, 25-60%, 30-70%, 35-80%, 40-90%, or 45-95% of the whole genome.
  • In some examples, sample preparation methods, including methods of fragmenting, amplifying, partitioning, and otherwise processing genomic DNA, can lead to biases or lower coverage of certain regions of a genome. Such biases or lowered coverage can be compensated for in the methods and systems disclosed herein by altering the concentration or genomic locations of baits used to capture targeted regions of the genome. In some examples, it may be known that certain regions of the genome containing high GC content or other structural variations will lead to low coverage. In such situations, the library of baits can be altered to increase the concentration of baits directed to those regions of low coverage. In other words, the population of baits used may be “spiked” to ensure that a sufficient number of fragments containing targeted regions of the genome in those low coverage areas are obtained in the final population of fragments to be sequenced. Such spiking of baits may be conducted in commercially available whole exome kits, such that a custom library of baits directed toward the lower coverage regions is added to off-the-shelf exome capture kits. Additionally, baits can be designed to target a region of the genome that is very close to the region of interest, but has more favorable coverage, as is also discussed in further detail herein and embodiments of which are schematically illustrated in FIG. 2A.
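  • One simple way to think about spiking is as a per-region multiplier on bait concentration. The Python sketch below assumes a rule that is not stated in this disclosure: the multiplier is the ratio of a desired depth to the observed depth, never below 1 and capped at a maximum. The region names, depths and the cap of 10 are hypothetical values chosen only to make the example concrete.

```python
def spike_factors(
    coverage: dict[str, float],
    target_depth: float,
    max_factor: float = 10.0,
) -> dict[str, float]:
    """Relative bait concentration per region.

    Assumed rule (illustrative only): factor = target_depth / observed depth,
    clamped to the range [1.0, max_factor]. Regions with no observed coverage
    receive max_factor.
    """
    factors = {}
    for region, depth in coverage.items():
        if depth <= 0:
            factors[region] = max_factor
        else:
            factors[region] = min(max_factor, max(1.0, target_depth / depth))
    return factors


# Hypothetical mean depths from a pilot run with an unmodified bait library.
coverage = {"exon_A": 30.0, "exon_B_high_GC": 6.0, "exon_C": 0.0}
factors = spike_factors(coverage, target_depth=30.0)
print(factors)
```

  • Regions already at the desired depth keep a factor of 1.0, so an off-the-shelf library would be left unchanged for them and only the low coverage regions would receive additional custom baits.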
  • In further examples, the library of baits used in methods of the present invention is a product of informed design that fulfills one or more characteristics as further described herein. This informed design includes instances in which the library of baits is directed to informative single nucleotide polymorphisms (SNPs). The term “informative SNPs” as used herein refers to SNPs that are heterozygous. The library of baits in some examples is designed to contain a plurality of probes that are directed to regions of the genomic sample that contain informative SNPs. By “directed to” as used herein is meant that the probes contain sequences that are complementary to sequences that encompass the SNPs. In further examples, the library of baits is designed to contain probes directed to SNPs that are at predetermined distances from the boundary of an exon and an intron. In situations in which the targeted regions of the genome include regions that are devoid of or contain very few SNPs, the library of baits includes probes that tile across such regions at a predetermined distance and/or that hybridize to the first informative SNP within the next nearest intron or exon.
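  • The selection of informative SNPs for bait design can be sketched as a filter over genotype calls. In the Python example below, the genotype table, the positions and the boundary coordinates are hypothetical, and the "within a maximum distance of an exon/intron boundary" test is one possible reading of "predetermined distances" rather than a rule taken from this disclosure.

```python
def informative_snps(genotypes: dict[int, tuple[str, str]]) -> list[int]:
    """Positions whose two alleles differ, i.e. heterozygous (informative) SNPs."""
    return sorted(pos for pos, (a, b) in genotypes.items() if a != b)


def snps_near_boundary(
    snp_positions: list[int],
    boundaries: list[int],
    max_distance: int,
) -> list[int]:
    """SNPs lying within `max_distance` bases of any exon/intron boundary (assumed criterion)."""
    return [
        p for p in snp_positions
        if any(abs(p - b) <= max_distance for b in boundaries)
    ]


# Hypothetical diploid genotype calls keyed by reference position.
genotypes = {
    1050: ("A", "G"),  # heterozygous
    1500: ("C", "C"),  # homozygous, not informative
    2980: ("T", "C"),  # heterozygous
    9000: ("G", "A"),  # heterozygous, but far from the boundaries below
}
het = informative_snps(genotypes)
near = snps_near_boundary(het, boundaries=[1000, 3000], max_distance=100)
print(het, near)
```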
  • An advantage of the methods and systems described herein is that the targeted regions that are captured are processed prior to capture in such a way that even after the steps of capturing the targeted regions and conducting sequencing analyses, the original molecular context of those targeted regions is retained. As is discussed in further detail herein, the ability to attribute specific targeted regions to their original molecular context (which can include the original chromosome or chromosomal region from which they are derived and/or the location of particular targeted regions in relation to each other within the full genome) provides a way to obtain sequence information from regions of the genome that are otherwise poorly mapped or have poor coverage using traditional sequencing techniques.
  • For example, some genes possess long introns that are too long to span using generally available sequencing techniques, particularly using short-read technologies that possess superior accuracy as compared to long-read technologies. In the methods and systems described herein, however, the molecular context of targeted regions is retained, generally through the tagging procedure illustrated in FIG. 1 and described in further detail herein. As such, links can be made across extended regions of the genome. For example, as schematically illustrated in FIG. 2B, nucleic acid molecule 207 contains two exons (shaded bars) interrupted by a long intronic region (208). Generally used sequencing technologies would be unable to span the distance across the intron to provide information on the relationship of the two exons. In the methods described herein, the individual nucleic acid molecule 207 is distributed into its own discrete partition 209 and then fragmented such that different fragments contain different portions of the exons and the intron. Because each of those fragments is tagged such that any sequence information obtained from the fragments is then attributable to the discrete partition in which it was generated, each fragment is thus also attributable to the individual nucleic acid molecule 207 from which it was derived. In general, and as is described in further detail herein, after fragmentation and tagging, fragments from different partitions are combined together. Targeted capture methods can then be used to enrich the population of fragments that undergoes further analysis, such as sequencing, with fragments containing the targeted region of interest. In the example illustrated in FIG. 2B, the baits used will enrich the population of fragments to capture only those containing a portion of one of the two exons and/or part of the intervening intron, but regions outside of the exons and intron (such as 209 and 210) would not be captured. 
Thus, the final population of fragments that undergoes sequencing will be enriched for the fragments containing portions of the two exons of interest. Short read, high accuracy sequencing technologies can then be used to identify the sequences of this enriched population of fragments, and because each of the fragments is tagged and thus attributable to its original molecular context, i.e., its original individual nucleic acid molecule, the short read sequences can provide information that spans over the long length of the intervening intron to provide information on the relationship between the two exons.
  • As noted above, the methods and systems described herein provide individual molecular context for short sequence reads of longer nucleic acids. As used herein, individual molecular context refers to sequence context beyond the specific sequence read, e.g., relation to adjacent or proximal sequences, that are not included within the sequence read itself, and as such, will typically be such that they would not be included in whole or in part in a short sequence read, e.g., a read of about 150 bases, or about 300 bases for paired reads. In particularly preferred aspects, the methods and systems provide long range sequence context for short sequence reads. Such long range context includes relationship or linkage of a given sequence read to sequence reads that are within a distance of each other of longer than 1 kb, longer than 5 kb, longer than 10 kb, longer than 15 kb, longer than 20 kb, longer than 30 kb, longer than 40 kb, longer than 50 kb, longer than 60 kb, longer than 70 kb, longer than 80 kb, longer than 90 kb or even longer than 100 kb, or longer. As will be appreciated, by providing long range individual molecular context, one can also derive the phasing information of variants within that individual molecular context, e.g., variants on a particular long molecule will be, by definition commonly phased.
  • By providing longer range individual molecular context, the methods and systems of the invention also provide much longer inferred molecular context (also referred to herein as a “long virtual single molecule read”). Sequence context, as described herein can include mapping or providing linkage of fragments across different (generally on the kilobase scale) ranges of full genomic sequence. These methods include mapping the short sequence reads to the individual longer molecules or contigs of linked molecules, as well as long range sequencing of large portions of the longer individual molecules, e.g., having contiguous determined sequences of individual molecules where such determined sequences are longer than 1 kb, longer than 5 kb, longer than 10 kb, longer than 15 kb, longer than 20 kb, longer than 30 kb, longer than 40 kb, longer than 50 kb, longer than 60 kb, longer than 70 kb, longer than 80 kb, longer than 90 kb or even longer than 100 kb. As with sequence context, the attribution of short sequences to longer nucleic acids, e.g., both individual long nucleic acid molecules or collections of linked nucleic acid molecules or contigs, may include both mapping of short sequences against longer nucleic acid stretches to provide high level sequence context, as well as providing assembled sequences from the short sequences through these longer nucleic acids.
  • Furthermore, while one may utilize the long range sequence context associated with long individual molecules, having such long range sequence context also allows one to infer even longer range sequence context. By way of one example, by providing the long range molecular context described above, one can identify overlapping variant portions, e.g., phased variants, translocated sequences, etc., among long sequences from different originating molecules, allowing the inferred linkage between those molecules. Such inferred linkages or molecular contexts are referred to herein as “inferred contigs”. In some cases when discussed in the context of phased sequences, the inferred contigs may represent commonly phased sequences, e.g., where by virtue of overlapping phased variants, one can infer a phased contig of substantially greater length than the individual originating molecules. These phased contigs are referred to herein as “phase blocks”.
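  • The inference of longer contigs from overlapping phased variants can be sketched as follows. This is a minimal illustration only, not the method of the disclosure: it assumes each originating molecule has already been reduced to a mapping of variant position to observed allele, and it links two molecules when they share at least `min_shared` variant positions and agree at all of them. The function name `infer_phase_blocks`, the `min_shared` threshold, and the example data are all hypothetical.

```python
from itertools import combinations


def infer_phase_blocks(molecules, min_shared=2):
    """Group molecules into inferred contigs ("phase blocks").

    molecules: dict mapping a molecule name to {variant_position: allele}.
    Two molecules are linked when they share at least `min_shared` variant
    positions and carry the same allele at every shared position
    (an illustrative rule, assumed here for simplicity).
    Returns a list of sets of molecule names, one set per inferred block.
    """
    parent = {name: name for name in molecules}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(molecules, 2):
        shared = molecules[a].keys() & molecules[b].keys()
        agree = all(molecules[a][p] == molecules[b][p] for p in shared)
        if len(shared) >= min_shared and agree:
            parent[find(a)] = find(b)

    blocks = {}
    for name in molecules:
        blocks.setdefault(find(name), set()).add(name)
    return sorted(blocks.values(), key=lambda s: sorted(s))


# Hypothetical data: m1-m3 tile one haplotype, m4 carries the other alleles.
example_molecules = {
    "m1": {100: "A", 200: "G", 300: "T"},
    "m2": {200: "G", 300: "T", 400: "C"},
    "m3": {300: "T", 400: "C", 500: "A"},
    "m4": {200: "C", 300: "A"},
}
print(infer_phase_blocks(example_molecules))
```

In this toy input, m1 and m3 share only one variant position and are not linked directly, yet they fall into the same block through m2. That is the sense in which the inferred contig can be longer than any single originating molecule.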
  • By starting with longer single molecule reads (e.g., the “long virtual single molecule reads” discussed above), one can derive longer inferred contigs or phase blocks than would otherwise be attainable using short read sequencing technologies or other approaches to phased sequencing. See, e.g., published U.S. Patent Application No. 2013-0157870. In particular, using the methods and systems described herein, one can obtain inferred contig or phase block lengths having an N50 (where the sum of the block lengths that are greater than the stated N50 number is 50% of the sum of all block lengths) of at least about 10 kb, at least about 20 kb, at least about 50 kb. In more preferred aspects, inferred contig or phase block lengths having an N50 of at least about 100 kb, at least about 150 kb, at least about 200 kb, and in many cases, at least about 250 kb, at least about 300 kb, at least about 350 kb, at least about 400 kb, and in some cases, at least about 500 kb or more, are attained. In still other cases, maximum phase block lengths in excess of 200 kb, in excess of 300 kb, in excess of 400 kb, in excess of 500 kb, in excess of 1 Mb, or even in excess of 2 Mb may be obtained.
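  • As a worked illustration of the N50 statistic referred to above, the sketch below uses the conventional reading of the definition: the N50 is the largest length L such that blocks of length L or greater account for at least half of the summed block length. The function name `n50` and the example block lengths are hypothetical and are not data from the disclosure.

```python
def n50(block_lengths):
    """Return the N50 of a collection of block lengths.

    Blocks are sorted from longest to shortest and accumulated until the
    running total reaches half of the summed length; the length of the
    block at which that happens is the N50.
    """
    lengths = sorted(block_lengths, reverse=True)
    if not lengths:
        raise ValueError("need at least one block length")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length


# Hypothetical phase block lengths in kb (total 1200 kb, half is 600 kb).
phase_blocks_kb = [500, 300, 200, 100, 50, 50]
print(n50(phase_blocks_kb))  # 500 + 300 = 800 >= 600, so the N50 is 300
```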
  • In one aspect, and in conjunction with any of the capture methods described above and later herein, the methods and systems described herein provide for the compartmentalization, depositing or partitioning of sample nucleic acids, or fragments thereof, into discrete compartments or partitions (referred to interchangeably herein as partitions), where each partition maintains separation of its own contents from the contents of other partitions. Unique identifiers, e.g., barcodes, may be previously, subsequently or concurrently delivered to the partitions that hold the compartmentalized or partitioned sample nucleic acids, in order to allow for the later attribution of the characteristics, e.g., nucleic acid sequence information, to the sample nucleic acids included within a particular compartment, and particularly to relatively long stretches of contiguous sample nucleic acids that may be originally deposited into the partitions.
  • The sample nucleic acids utilized in the methods described herein typically represent a number of overlapping portions of the overall sample to be analyzed, e.g., an entire chromosome, exome, or other large genomic portion. These sample nucleic acids may include whole genomes, individual chromosomes, exomes, amplicons, or any of a variety of different nucleic acids of interest. The sample nucleic acids are typically partitioned such that the nucleic acids are present in the partitions in relatively long fragments or stretches of contiguous nucleic acid molecules. Typically, these fragments of the sample nucleic acids may be longer than 1 kb, longer than 5 kb, longer than 10 kb, longer than 15 kb, longer than 20 kb, longer than 30 kb, longer than 40 kb, longer than 50 kb, longer than 60 kb, longer than 70 kb, longer than 80 kb, longer than 90 kb or even longer than 100 kb, which permits the longer range molecular context described above.
  • The sample nucleic acids are also typically partitioned at a level whereby a given partition has a very low probability of including two overlapping fragments of the starting sample nucleic acid. This is typically accomplished by providing the sample nucleic acid at a low input amount and/or concentration during the partitioning process. As a result, in preferred cases, a given partition may include a number of long, but non-overlapping fragments of the starting sample nucleic acids. The sample nucleic acids in the different partitions are then associated with unique identifiers, where for any given partition, nucleic acids contained therein possess the same unique identifier, but where different partitions may include different unique identifiers. Moreover, because the partitioning step allocates the sample components into very small volume partitions or droplets, it will be appreciated that in order to achieve the desired allocation as set forth above, one need not conduct substantial dilution of the sample, as would be required in higher volume processes, e.g., in tubes, or wells of a multiwell plate. Further, because the systems described herein employ such high levels of barcode diversity, one can allocate diverse barcodes among higher numbers of genomic equivalents, as provided above. In particular, previously described, multiwell plate approaches (see, e.g., U.S. Published Application No. 2013-0079231 and 2013-0157870) typically only operate with a hundred to a few hundred different barcode sequences, and employ a limiting dilution process of their sample in order to be able to attribute barcodes to different cells/nucleic acids. As such, they will generally operate with far fewer than 100 cells, which would typically provide a ratio of genomes:(barcode type) on the order of 1:10, and certainly well above 1:100. 
The systems described herein, on the other hand, because of the high level of barcode diversity, e.g., in excess of 10,000, 100,000, 500,000, etc. diverse barcode types, can operate at genome:(barcode type) ratios that are on the order of 1:50 or less, 1:100 or less, 1:1000 or less, or even smaller ratios, while also allowing for loading higher numbers of genomes (e.g., on the order of greater than 100 genomes per assay, greater than 500 genomes per assay, 1000 genomes per assay, or even more) while still providing for far improved barcode diversity per genome.
  • Often, the sample is combined with a set of oligonucleotide tags that are releasably-attached to beads prior to the partitioning step. That combination can then lead to barcoding of nucleic acids in the samples using methods known in the art and described herein. In some examples, amplification methods are used to add barcodes to the resultant amplification products, which in some examples contain smaller segments (fragments) of the full originating nucleic acid molecule from which they are derived. In some examples, methods using transposons are utilized as described in Amini et al, Nature Genetics 46: 1343-1349 (2014) (advance online publication on Oct. 29, 2014), which is herein incorporated by reference in its entirety for all purposes and in particular for all teachings related to attaching barcodes or other oligonucleotide tags to nucleic acids. In further examples, methods of attaching barcodes can include the use of nicking enzymes or polymerases and/or invasive probes such as recA to produce gaps along double stranded sample nucleic acids—barcodes can then be inserted into those gaps.
  • In examples in which amplification is used to tag nucleic acid fragments, the oligonucleotide tags may comprise at least a first and second region. The first region may be a barcode region that, as between oligonucleotides within a given partition, may be substantially the same barcode sequence, but as between different partitions, may be, and in most cases is, a different barcode sequence. The second region may be an N-mer (either a random N-mer or an N-mer designed to target a particular sequence) that can be used to prime the nucleic acids within the sample within the partitions. In some cases, where the N-mer is designed to target a particular sequence, it may be designed to target a particular chromosome (e.g., chromosome 1, 13, 18, or 21), or region of a chromosome, e.g., an exome or other targeted region. In some cases, the N-mer may be designed to target a particular gene or genetic region, such as a gene or region associated with a disease or disorder (e.g., cancer). Within the partitions, an amplification reaction may be conducted using the second N-mer to prime the nucleic acid sample at different places along the length of the nucleic acid. As a result of the amplification, each partition may contain amplified products of the nucleic acid that are attached to an identical or near-identical barcode, and that may represent overlapping, smaller fragments of the nucleic acids in each partition. The barcode can serve as a marker that signifies that a set of nucleic acids originated from the same partition, and thus potentially also originated from the same strand of nucleic acid. Following amplification, the nucleic acids may be pooled, sequenced, and aligned using a sequencing algorithm. 
Because shorter sequence reads may, by virtue of their associated barcode sequences, be aligned and attributed to a single, long fragment of the sample nucleic acid, all of the identified variants on that sequence can be attributed to a single originating fragment and single originating chromosome. Further, by aligning multiple co-located variants across multiple long fragments, one can further characterize that chromosomal contribution. Accordingly, conclusions regarding the phasing of particular genetic variants may then be drawn, as can analyses across long ranges of genomic sequence—for example, identification of sequence information across stretches of poorly characterized regions of the genome. Such information may also be useful for identifying haplotypes, which are generally a specified set of genetic variants that reside on the same nucleic acid strand or on different nucleic acid strands. Copy number variations may also be identified in this manner.
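  • The attribution of short reads to a single long originating fragment by way of a shared barcode can be sketched as follows. This is a simplified illustration under stated assumptions, not the algorithm of the disclosure: reads are assumed to be already aligned and represented as `(barcode, chromosome, position)` tuples, and reads sharing a barcode are assigned to the same inferred molecule when neighboring positions lie within a hypothetical `max_gap` distance. The function name and the example reads are likewise hypothetical.

```python
from collections import defaultdict


def group_reads_into_molecules(reads, max_gap=50_000):
    """Cluster aligned short reads into inferred long molecules.

    reads: iterable of (barcode, chromosome, position) tuples.
    Reads with the same barcode on the same chromosome are merged into one
    inferred molecule as long as consecutive positions are no more than
    `max_gap` bases apart (an assumed, tunable threshold).
    Returns a list of (barcode, chromosome, start, end, read_count).
    """
    by_barcode = defaultdict(list)
    for barcode, chrom, pos in reads:
        by_barcode[(barcode, chrom)].append(pos)

    molecules = []
    for (barcode, chrom), positions in sorted(by_barcode.items()):
        positions.sort()
        start = prev = positions[0]
        count = 1
        for pos in positions[1:]:
            if pos - prev > max_gap:
                # Gap too large: close the current molecule, start a new one.
                molecules.append((barcode, chrom, start, prev, count))
                start, count = pos, 0
            prev = pos
            count += 1
        molecules.append((barcode, chrom, start, prev, count))
    return molecules


# Hypothetical reads: two exon reads 40 kb apart share barcode "ACGT".
example_reads = [
    ("ACGT", "chr1", 1_000),
    ("ACGT", "chr1", 41_000),
    ("ACGT", "chr1", 900_000),
    ("TTAG", "chr1", 1_200),
]
for molecule in group_reads_into_molecules(example_reads):
    print(molecule)
```

The two "ACGT" reads at positions 1,000 and 41,000 are reported as one inferred molecule, which is how short reads on either side of a long intron can be linked. The distant read at 900,000 is treated as a separate molecule that happened to receive the same barcode.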
  • The described methods and systems provide significant advantages over current nucleic acid sequencing technologies and their associated sample preparation methods. Ensemble sample preparation and sequencing methods are predisposed towards primarily identifying and characterizing the majority constituents in the sample, and are not designed to identify and characterize minority constituents, e.g., genetic material contributed by one chromosome, or by one or a few cells, or fragmented tumor cell DNA molecules circulating in the bloodstream, that constitute a small percentage of the total DNA in the extracted sample. The described methods and systems also provide a significant advantage for detecting populations that are present within a larger sample. As such, they are particularly useful for assessing haplotype and copy number variations. The methods disclosed herein are also useful for providing sequence information over regions of the genome that are poorly characterized or are poorly represented in a population of nucleic acid targets due to biases introduced during sample preparation.
  • The use of the barcoding technique disclosed herein confers the unique capability of providing individual molecular context for a given set of genetic markers, i.e., attributing a given set of genetic markers (as opposed to a single marker) to individual sample nucleic acid molecules, and through variant coordinated assembly, to provide a broader or even longer range inferred individual molecular context, among multiple sample nucleic acid molecules, and/or to a specific chromosome. These genetic markers may include specific genetic loci, e.g., variants, such as SNPs, or they may include short sequences. Furthermore, the use of barcoding confers the additional advantages of facilitating the ability to discriminate between minority constituents and majority constituents of the total nucleic acid population extracted from the sample, e.g. for detection and characterization of circulating tumor DNA in the bloodstream, and also reduces or eliminates amplification bias during optional amplification steps. In addition, implementation in a microfluidics format confers the ability to work with extremely small sample volumes and low input quantities of DNA, as well as the ability to rapidly process large numbers of sample partitions (droplets) to facilitate genome-wide tagging.
  • As described previously, an advantage of the methods and systems described herein is that they can achieve the desired results through the use of ubiquitously available, short read sequencing technologies. Such technologies have the advantages of being readily available and widely dispersed within the research community, with protocols and reagent systems that are well characterized and highly effective. These short read sequencing technologies include those available from, e.g., Illumina, Inc. (GXII, NextSeq, MiSeq, HiSeq, X10), Ion Torrent division of Thermo-Fisher (Ion Proton and Ion PGM), pyrosequencing methods, as well as others.
  • Of particular advantage is that the methods and systems described herein utilize these short read sequencing technologies and do so with their associated low error rates. In particular, the methods and systems described herein achieve the desired individual molecular readlengths or context, as described above, but with individual sequencing reads, excluding mate pair extensions, that are shorter than 1000 bp, shorter than 500 bp, shorter than 300 bp, shorter than 200 bp, shorter than 150 bp or even shorter; and with sequencing error rates for such individual molecular readlengths that are less than 5%, less than 1%, less than 0.5%, less than 0.1%, less than 0.05%, less than 0.01%, less than 0.005%, or even less than 0.001%.
  • II. Work Flow Overview
  • In one exemplary aspect, the methods and systems described in the disclosure provide for depositing or partitioning individual samples (e.g., nucleic acids) into discrete partitions, where each partition maintains separation of its own contents from the contents in other partitions. As used herein, the partitions refer to containers or vessels that may include a variety of different forms, e.g., wells, tubes, micro or nanowells, through holes, or the like. In preferred aspects, however, the partitions are flowable within fluid streams. These vessels may be comprised of, e.g., microcapsules or micro-vesicles that have an outer barrier surrounding an inner fluid center or core, or they may be a porous matrix that is capable of entraining and/or retaining materials within its matrix. In preferred aspects, however, these partitions may comprise droplets of aqueous fluid within a non-aqueous continuous phase, e.g., an oil phase. A variety of different vessels are described in, for example, U.S. patent application Ser. No. 13/966,150, filed Aug. 13, 2013. Likewise, emulsion systems for creating stable droplets in non-aqueous or oil continuous phases are described in detail in, e.g., Published U.S. Patent Application No. 2010-0105112. In certain cases, microfluidic channel networks are particularly suited for generating partitions as described herein. Examples of such microfluidic devices include those described in detail in U.S. patent application Ser. No. 14/682,952, filed Apr. 9, 2015, the full disclosure of which is incorporated herein by reference in its entirety for all purposes. Alternative mechanisms may also be employed in the partitioning of individual cells, including porous membranes through which aqueous mixtures of cells are extruded into non-aqueous fluids. Such systems are generally available from, e.g., Nanomi, Inc.
  • In the case of droplets in an emulsion, partitioning of sample materials, e.g., nucleic acids, into discrete partitions may generally be accomplished by flowing an aqueous, sample-containing stream into a junction into which is also flowing a non-aqueous stream of partitioning fluid, e.g., a fluorinated oil, such that aqueous droplets are created within the flowing stream of partitioning fluid, where such droplets include the sample materials. As described below, the partitions, e.g., droplets, also typically include co-partitioned barcode oligonucleotides. The relative amount of sample materials within any particular partition may be adjusted by controlling a variety of different parameters of the system, including, for example, the concentration of sample in the aqueous stream, the flow rate of the aqueous stream and/or the non-aqueous stream, and the like. The partitions described herein are often characterized by having extremely small volumes. For example, in the case of droplet based partitions, the droplets may have overall volumes that are less than 1000 pL, less than 900 pL, less than 800 pL, less than 700 pL, less than 600 pL, less than 500 pL, less than 400 pL, less than 300 pL, less than 200 pL, less than 100 pL, less than 50 pL, less than 20 pL, less than 10 pL, or even less than 1 pL. Where co-partitioned with beads, it will be appreciated that the sample fluid volume within the partitions may be less than 90% of the above described volumes, less than 80%, less than 70%, less than 60%, less than 50%, less than 40%, less than 30%, less than 20%, or even less than 10% of the above described volumes. In some cases, the use of low reaction volume partitions is particularly advantageous in performing reactions with very small amounts of starting reagents, e.g., input nucleic acids. Methods and systems for analyzing samples with low input nucleic acids are presented in U.S. patent application Ser. No. 14/752,602, filed Jun. 26, 2015, the full disclosure of which is hereby incorporated by reference in its entirety.
  • Once the samples are introduced into their respective partitions, in accordance with the methods and systems described herein, the sample nucleic acids within partitions are generally provided with unique identifiers such that, upon characterization of those nucleic acids, they may be attributed as having been derived from their respective origins. Accordingly, the sample nucleic acids are typically co-partitioned with the unique identifiers (e.g., barcode sequences). In particularly preferred aspects, the unique identifiers are provided in the form of oligonucleotides that comprise nucleic acid barcode sequences that may be attached to those samples. The oligonucleotides are partitioned such that as between oligonucleotides in a given partition, the nucleic acid barcode sequences contained therein are the same, but as between different partitions, the oligonucleotides can, and preferably do, have differing barcode sequences. In preferred aspects, only one nucleic acid barcode sequence will be associated with a given partition, although in some cases, two or more different barcode sequences may be present.
  • The nucleic acid barcode sequences will typically include from 6 to about 20 or more nucleotides within the sequence of the oligonucleotides. These nucleotides may be completely contiguous, i.e., in a single stretch of adjacent nucleotides, or they may be separated into two or more separate subsequences that are separated by one or more nucleotides. Typically, separated subsequences may be from about 4 to about 16 nucleotides in length.
  • The co-partitioned oligonucleotides also typically comprise other functional sequences useful in the processing of the partitioned nucleic acids. These sequences include, e.g., targeted or random/universal amplification primer sequences for amplifying the genomic DNA from the individual nucleic acids within the partitions while attaching the associated barcode sequences, sequencing primers, hybridization or probing sequences, e.g., for identification of presence of the sequences, or for pulling down barcoded nucleic acids, or any of a number of other potential functional sequences. Again, co-partitioning of oligonucleotides and associated barcodes and other functional sequences, along with sample materials is described in, for example, U.S. patent application Ser. Nos. 14/316,383, 14/316,398, 14/316,416, 14/316,431, 14/316,447, 14/316,463, all filed Jun. 26, 2014, as well as U.S. patent application Ser. No. 14/175,935, filed Feb. 7, 2014, the full disclosures of which are hereby incorporated by reference in their entireties.
  • Briefly, in one exemplary process, beads are provided that each may include large numbers of the above described oligonucleotides releasably attached to the beads, where all of the oligonucleotides attached to a particular bead may include the same nucleic acid barcode sequence, but where a large number of diverse barcode sequences may be represented across the population of beads used. Typically, the population of beads may provide a diverse barcode sequence library that may include at least 1000 different barcode sequences, at least 10,000 different barcode sequences, at least 100,000 different barcode sequences, or in some cases, at least 1,000,000 different barcode sequences. Additionally, each bead may typically be provided with large numbers of oligonucleotide molecules attached. In particular, the number of molecules of oligonucleotides including the barcode sequence on an individual bead may be at least about 10,000 oligonucleotide molecules, at least 100,000 oligonucleotide molecules, at least 1,000,000 oligonucleotide molecules, at least 100,000,000 oligonucleotide molecules, and in some cases at least 1 billion oligonucleotide molecules.
  • The oligonucleotides may be releasable from the beads upon the application of a particular stimulus to the beads. In some cases, the stimulus may be a photo-stimulus, e.g., through cleavage of a photo-labile linkage that may release the oligonucleotides. In some cases, a thermal stimulus may be used, where elevation of the temperature of the beads' environment may result in cleavage of a linkage or other release of the oligonucleotides from the beads. In some cases, a chemical stimulus may be used that cleaves a linkage of the oligonucleotides to the beads, or otherwise may result in release of the oligonucleotides from the beads.
  • In accordance with the methods and systems described herein, the beads including the attached oligonucleotides may be co-partitioned with the individual samples, such that a single bead and a single sample are contained within an individual partition. In some cases, where single bead partitions are desired, it may be desirable to control the relative flow rates of the fluids such that, on average, the partitions contain less than one bead per partition, in order to ensure that those partitions that are occupied, are primarily singly occupied. Likewise, one may wish to control the flow rate to provide that a higher percentage of partitions are occupied, e.g., allowing for only a small percentage of unoccupied partitions. In preferred aspects, the flows and channel architectures are controlled as to ensure a desired number of singly occupied partitions, less than a certain level of unoccupied partitions and less than a certain level of multiply occupied partitions.
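  • The trade-off described above, between keeping occupied partitions singly occupied and limiting the fraction of empty partitions, can be illustrated with a simple loading model. The sketch below assumes, purely for illustration, that beads are distributed among partitions at random so that the number of beads per partition follows a Poisson distribution. The disclosure does not state this model, and controlled flows and channel architectures may produce different occupancy statistics. The function name and the example loading values are hypothetical.

```python
import math


def occupancy_fractions(mean_beads_per_partition):
    """Fractions of empty, singly and multiply occupied partitions.

    Assumes Poisson-distributed bead loading with the given mean
    (an illustrative assumption, not a statement of the disclosed system).
    """
    lam = mean_beads_per_partition
    if lam <= 0:
        raise ValueError("mean occupancy must be positive")
    empty = math.exp(-lam)
    single = lam * math.exp(-lam)
    multiple = 1.0 - empty - single
    return {
        "empty": empty,
        "single": single,
        "multiple": multiple,
        # Share of occupied partitions that hold exactly one bead.
        "single_among_occupied": single / (1.0 - empty),
    }


# Hypothetical mean loadings, from sparse to roughly one bead per partition.
for mean in (0.1, 0.5, 1.0):
    fractions = occupancy_fractions(mean)
    print(mean, {key: round(value, 3) for key, value in fractions.items()})
```

Under this model, lowering the mean occupancy raises the share of occupied partitions that are singly occupied, but at the cost of a larger fraction of empty partitions.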
  • FIG. 3 illustrates one particular example method for barcoding and subsequently sequencing a sample nucleic acid, particularly for use for a copy number variation or haplotype assay. First, a sample comprising nucleic acid may be obtained from a source, 300, and a set of barcoded beads may also be obtained, 310. The beads are preferably linked to oligonucleotides containing one or more barcode sequences, as well as a primer, such as a random N-mer or other primer. Preferably, the barcode sequences are releasable from the barcoded beads, e.g., through cleavage of a linkage between the barcode and the bead or through degradation of the underlying bead to release the barcode, or a combination of the two. For example, in certain preferred aspects, the barcoded beads can be degraded or dissolved by an agent, such as a reducing agent to release the barcode sequences. In this example, a low quantity of the sample comprising nucleic acid, 305, barcoded beads, 315, and optionally other reagents, e.g., a reducing agent, 320, are combined and subject to partitioning. By way of example, such partitioning may involve introducing the components to a droplet generation system, such as a microfluidic device, 325. With the aid of the microfluidic device 325, a water-in-oil emulsion 330 may be formed, wherein the emulsion contains aqueous droplets that contain sample nucleic acid, 305, reducing agent, 320, and barcoded beads, 315. The reducing agent may dissolve or degrade the barcoded beads, thereby releasing the oligonucleotides with the barcodes and random N-mers from the beads within the droplets, 335. The random N-mers may then prime different regions of the sample nucleic acid, resulting in amplified copies of the sample after amplification, wherein each copy is tagged with a barcode sequence, 340. Preferably, each droplet contains a set of oligonucleotides that contain identical barcode sequences and different random N-mer sequences. 
Subsequently, the emulsion is broken, 345, and additional sequences (e.g., sequences that aid in particular sequencing methods, additional barcodes, etc.) may be added, via, for example, amplification methods, 350 (e.g., PCR). Sequencing may then be performed, 355, and an algorithm applied to interpret the sequencing data, 360. Sequencing algorithms are generally capable, for example, of performing analysis of barcodes to align sequencing reads and/or identify the sample to which a particular sequence read belongs. In addition, and as is described herein, these algorithms may also further be used to attribute the sequences of the copies to their originating molecular context.
  • As noted above, while single bead occupancy may be the most desired state, it will be appreciated that multiply occupied partitions or unoccupied partitions may often be present. An example of a microfluidic channel structure for co-partitioning samples and beads comprising barcode oligonucleotides is schematically illustrated in FIG. 4. As shown, channel segments 402, 404, 406, 408 and 410 are provided in fluid communication at channel junction 412. An aqueous stream comprising the individual samples 414 is flowed through channel segment 402 toward channel junction 412. As described elsewhere herein, these samples may be suspended within an aqueous fluid prior to the partitioning process.
  • Concurrently, an aqueous stream comprising the barcode carrying beads 416 is flowed through channel segment 404 toward channel junction 412. A non-aqueous partitioning fluid is introduced into channel junction 412 from each of side channels 406 and 408, and the combined streams are flowed into outlet channel 410. Within channel junction 412, the two combined aqueous streams from channel segments 402 and 404 are combined, and partitioned into droplets 418, that include co-partitioned samples 414 and beads 416. As noted previously, by controlling the flow characteristics of each of the fluids combining at channel junction 412, as well as controlling the geometry of the channel junction, one can optimize the combination and partitioning to achieve a desired occupancy level of beads, samples or both, within the partitions 418 that are generated.
  • As will be appreciated, a number of other reagents may be co-partitioned along with the samples and beads, including, for example, chemical stimuli, nucleic acid extension, transcription, and/or amplification reagents such as polymerases, reverse transcriptases, nucleoside triphosphates or NTP analogues, primer sequences and additional cofactors such as divalent metal ions used in such reactions, ligation reaction reagents, such as ligase enzymes and ligation sequences, dyes, labels, or other tagging reagents.
  • Once co-partitioned, the oligonucleotides disposed upon the bead may be used to barcode and amplify the partitioned samples. A particularly elegant process for use of these barcode oligonucleotides in amplifying and barcoding samples is described in detail in U.S. patent application Ser. Nos. 14/316,383, 14/316,398, 14/316,416, 14/316,431, 14/316,447, 14/316,463, all filed Jun. 26, 2014, the full disclosures of which are hereby incorporated by reference in their entireties. Briefly, in one aspect, the oligonucleotides present on the beads are co-partitioned with the samples and released from their beads into the partition with the samples. The oligonucleotides typically include, along with the barcode sequence, a primer sequence at their 5′ end. This primer sequence may be random or structured. Random primer sequences are generally intended to randomly prime numerous different regions of the samples. Structured primer sequences can include a range of different structures including defined sequences targeted to prime upstream of a specific targeted region of the sample as well as primers that have some sort of partially defined structure, including without limitation primers containing a percentage of specific bases (such as a percentage of GC N-mers), primers containing partially or wholly degenerate sequences, and/or primers containing sequences that are partially random and partially structured in accordance with any of the description herein. As will be appreciated, any one or more of the above types of random and structured primers may be included in oligonucleotides in any combination.
  • Once released, the primer portion of the oligonucleotide can anneal to a complementary region of the sample. Extension reaction reagents, e.g., DNA polymerase, nucleoside triphosphates, co-factors (e.g., Mg2+ or Mn2+ etc.), that are also co-partitioned with the samples and beads, then extend the primer sequence using the sample as a template, to produce a complementary fragment to the strand of the template to which the primer annealed, where the complementary fragment includes the oligonucleotide and its associated barcode sequence. Annealing and extension of multiple primers to different portions of the sample may result in a large pool of overlapping complementary fragments of the sample, each possessing its own barcode sequence indicative of the partition in which it was created. In some cases, these complementary fragments may themselves be used as a template primed by the oligonucleotides present in the partition to produce a complement of the complement that, again, includes the barcode sequence. In some cases, this replication process is configured such that when the first complement is duplicated, it produces two complementary sequences at or near its termini, to allow the formation of a hairpin structure or partial hairpin structure, which reduces the ability of the molecule to be the basis for producing further iterative copies. A schematic illustration of one example of this is shown in FIG. 5.
  • As the figure shows, oligonucleotides that include a barcode sequence are co-partitioned in, e.g., a droplet 502 in an emulsion, along with a sample nucleic acid 504. As noted elsewhere herein, the oligonucleotides 508 may be provided on a bead 506 that is co-partitioned with the sample nucleic acid 504, which oligonucleotides are preferably releasable from the bead 506, as shown in panel A. The oligonucleotides 508 include a barcode sequence 512, in addition to one or more functional sequences, e.g., sequences 510, 514 and 516. For example, oligonucleotide 508 is shown as comprising barcode sequence 512, as well as sequence 510 that may function as an attachment or immobilization sequence for a given sequencing system, e.g., a P5 sequence used for attachment in flow cells of an Illumina Hiseq or Miseq system. As shown, the oligonucleotides also include a primer sequence 516, which may include a random or targeted N-mer for priming replication of portions of the sample nucleic acid 504. Also included within oligonucleotide 508 is a sequence 514 which may provide a sequencing priming region, such as a “read1” or R1 priming region, that is used to prime polymerase mediated, template directed sequencing by synthesis reactions in sequencing systems. In many cases, the barcode sequence 512, immobilization sequence 510 and R1 sequence 514 may be common to all of the oligonucleotides attached to a given bead. The primer sequence 516 may vary for random N-mer primers, or may be common to the oligonucleotides on a given bead for certain targeted applications.
  • Based upon the presence of primer sequence 516, the oligonucleotides are able to prime the sample nucleic acid as shown in panel B, which allows for extension of the oligonucleotides 508 and 508 a using polymerase enzymes and other extension reagents also co-partitioned with the bead 506 and sample nucleic acid 504. As shown in panel C, following extension of the oligonucleotides that, for random N-mer primers, would anneal to multiple different regions of the sample nucleic acid 504, multiple overlapping complements or fragments of the nucleic acid are created, e.g., fragments 518 and 520. Although including sequence portions that are complementary to portions of sample nucleic acid, e.g., sequences 522 and 524, these constructs are generally referred to herein as comprising fragments of the sample nucleic acid 504, having the attached barcode sequences. As will be appreciated, the replicated portions of the template sequences as described above are often referred to herein as “fragments” of that template sequence. Notwithstanding the foregoing, however, the term “fragment” encompasses any representation of a portion of the originating nucleic acid sequence, e.g., a template or sample nucleic acid, including those created by other mechanisms of providing portions of the template sequence, such as actual fragmentation of a given molecule of sequence, e.g., through enzymatic, chemical or mechanical fragmentation. In preferred aspects, however, fragments of a template or sample nucleic acid sequence will denote replicated portions of the underlying sequence or complements thereof.
  • The barcoded nucleic acid fragments may then be subjected to characterization, e.g., through sequence analysis, or they may be further amplified in the process, as shown in panel D. For example, additional oligonucleotides, e.g., oligonucleotide 508 b, also released from bead 506, may prime the fragments 518 and 520. In particular, again, based upon the presence of the random N-mer primer 516 b in oligonucleotide 508 b (which in many cases will be different from other random N-mers in a given partition, e.g., primer sequence 516), the oligonucleotide anneals with the fragment 518, and is extended to create a complement 526 to at least a portion of fragment 518 which includes sequence 528, that comprises a duplicate of a portion of the sample nucleic acid sequence. Extension of the oligonucleotide 508 b continues until it has replicated through the oligonucleotide portion 508 of fragment 518. As noted elsewhere herein, and as illustrated in panel D, the oligonucleotides may be configured to prompt a stop in the replication by the polymerase at a desired point, e.g., after replicating through sequences 516 and 514 of oligonucleotide 508 that is included within fragment 518. As described herein, this may be accomplished by different methods, including, for example, the incorporation of different nucleotides and/or nucleotide analogues that are not capable of being processed by the polymerase enzyme used. For example, this may include the inclusion of uracil containing nucleotides within the sequence region 512 to cause a non-uracil tolerant polymerase to cease replication of that region. As a result, a fragment 526 is created that includes the full-length oligonucleotide 508 b at one end, including the barcode sequence 512, the attachment sequence 510, the R1 primer region 514, and the random N-mer sequence 516 b. At the other end of the sequence will be included the complement 516′ to the random N-mer of the first oligonucleotide 508, as well as a complement to all or a portion of the R1 sequence, shown as sequence 514′. The R1 sequence 514 and its complement 514′ are then able to hybridize together to form a partial hairpin structure 528. As will be appreciated, because the random N-mers differ among different oligonucleotides, these sequences and their complements would not be expected to participate in hairpin formation, e.g., sequence 516′, which is the complement to random N-mer 516, would not be expected to be complementary to random N-mer sequence 516 b. This would not be the case for other applications, e.g., targeted primers, where the N-mers would be common among oligonucleotides within a given partition.
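  • The partial hairpin logic described for FIG. 5 can be illustrated with a minimal sketch. It assumes toy sequences that are purely hypothetical (the actual R1 and N-mer sequences are not given here) and models annealing as exact reverse complementarity between the two ends of a single strand, which ignores partial matches and thermodynamics.

```python
# Hypothetical toy sequences; real R1 and N-mer sequences are not specified in the text.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def can_self_anneal(five_prime_segment, three_prime_segment):
    """True if the 3' segment is the exact reverse complement of the 5' segment."""
    return reverse_complement(five_prime_segment) == three_prime_segment


r1 = "AAGGC"      # stands in for the common R1 sequence 514
nmer_a = "ACCT"   # stands in for random N-mer 516 of the first oligonucleotide
nmer_b = "GGTA"   # stands in for random N-mer 516 b of the second oligonucleotide

r1_complement = reverse_complement(r1)        # plays the role of 514'
nmer_a_complement = reverse_complement(nmer_a)  # plays the role of 516'

# The shared R1 sequence and its complement can pair to form the hairpin stem.
stem_forms = can_self_anneal(r1, r1_complement)
# Differing random N-mers are not expected to pair with each other's complements.
nmers_pair = can_self_anneal(nmer_b, nmer_a_complement)
```

  • In this sketch only the common R1 region contributes to the stem, which mirrors the statement that random N-mers would not be expected to participate in hairpin formation.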
  • Forming these partial hairpin structures allows for the removal of first level duplicates of the sample sequence from further replication, e.g., preventing iterative copying of copies. The partial hairpin structure also provides a useful structure for subsequent processing of the created fragments, e.g., fragment 526.
  • All of the fragments from multiple different partitions may then be pooled for sequencing on high throughput sequencers as described herein. Because each fragment is coded as to its partition of origin, the sequence of that fragment may be attributed back to its origin based upon the presence of the barcode. This is schematically illustrated in FIG. 6. As shown in one example, a nucleic acid 604 originated from a first source 600 (e.g., individual chromosome, strand of nucleic acid, etc.) and a nucleic acid 606 derived from a different chromosome 602 or strand of nucleic acid are each partitioned along with their own sets of barcode oligonucleotides as described above.
  • Within each partition, each nucleic acid 604 and 606 is then processed to separately provide overlapping sets of second fragments of the first fragment(s), e.g., second fragment sets 608 and 610. This processing also provides the second fragments with a barcode sequence that is the same for each of the second fragments derived from a particular first fragment. As shown, the barcode sequence for second fragment set 608 is denoted by “1” while the barcode sequence for fragment set 610 is denoted by “2”. A diverse library of barcodes may be used to differentially barcode large numbers of different fragment sets. However, it is not necessary for every second fragment set from a different first fragment to be barcoded with different barcode sequences. In fact, in many cases, multiple different first fragments may be processed concurrently to include the same barcode sequence. Diverse barcode libraries are described in detail elsewhere herein.
  • The barcoded fragments, e.g., from fragment sets 608 and 610, may then be pooled for sequencing using, for example, sequence by synthesis technologies available from Illumina or Ion Torrent division of Thermo Fisher, Inc. Once sequenced, the sequence reads 612 can be attributed to their respective fragment set, e.g., as shown in aggregated reads 614 and 616, at least in part based upon the included barcodes, and optionally, and preferably, in part based upon the sequence of the fragment itself. The attributed sequence reads for each fragment set are then assembled to provide the assembled sequence for each sample fragment, e.g., sequences 618 and 620, which in turn, may be further attributed back to their respective original chromosomes (600 and 602). Methods and systems for assembling genomic sequences are described in, for example, U.S. patent application Ser. No. 14/752,773, filed Jun. 26, 2015, the full disclosure of which is hereby incorporated by reference in its entirety.
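  • The attribution of pooled reads to fragment sets by barcode, as illustrated in FIG. 6, can be sketched as follows. This is a simplified illustration only: it assumes that each read begins with a fixed-length barcode, and the function name, read layout, and example reads are hypothetical rather than taken from the described systems.

```python
from collections import defaultdict


def group_reads_by_barcode(reads, barcode_length):
    """Group pooled reads by a leading barcode (assumed fixed-length prefix)."""
    groups = defaultdict(list)
    for read in reads:
        barcode = read[:barcode_length]
        insert = read[barcode_length:]  # remainder is treated as sample sequence
        groups[barcode].append(insert)
    return dict(groups)


# Hypothetical pooled reads from two partitions with barcodes "AAA" and "CCC".
pooled = ["AAACGTAC", "CCCTTGCA", "AAAGTACG", "CCCGCATT"]
by_barcode = group_reads_by_barcode(pooled, barcode_length=3)
```

  • Each resulting group corresponds to one aggregated read set, which would then be passed to assembly. Attribution that also uses the fragment sequence itself, as the text prefers, is not modeled here.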
  • III. Application of Methods and Systems to Targeted Sequencing
  • In one aspect, the systems and methods described herein are used to obtain sequence information from targeted regions of a genome.
  • By “targeted” regions of a genome (as well as any grammatical equivalents thereof) is meant a whole genome or any one or more regions of a genome identified as of interest and/or selected through one or more methods described herein. The targeted regions of the genome sequenced by methods and systems described herein include without limitation introns, exons, intergenic regions, or any combination thereof. In certain examples, the methods and systems described herein provide sequence information on whole exomes, portions of exomes, one or more selected genes (including selected panels of genes), one or more introns, and combinations of intronic and exonic sequences.
  • Targeted regions of the genome may also include certain portions or percentages of the genome rather than regions identified by sequence. In certain embodiments, targeted regions of the genome captured and analyzed in accordance with the methods described herein include portions of the genome located every 1, 2, 5, 10, 15, 20, 25, 50, 100, 200, 250, 500, 750, 1000, or 10000 kilobases of a genome. In further embodiments, targeted regions of the genome comprise 5%, 10%, 15%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95% of the whole genome. In still further embodiments, the targeted regions comprise 1-10%, 5-20%, 10-30%, 15-40%, 20-50%, 25-60%, 30-70%, 35-80%, 40-90%, or 45-95% of the whole genome.
  • In general, targeted regions of a genome are captured for use in any sequencing methods known in the art and described herein. By “captured” as used herein is meant any method or system for enriching a population of nucleic acid and/or nucleic acid fragments such that the resultant population contains an increased percentage of the targeted regions of interest as compared to the genomic regions that are not of interest. In further embodiments, the enriched population contains at least 50%, 55%, 60%, 70%, 75%, 80%, 85%, 90%, 95%, 97%, 98%, 99%, or 100% nucleic acids/nucleic acid fragments comprising the targeted regions.
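  • The enrichment criterion in the definition of “captured” can be made concrete with a short sketch that computes the fraction of fragments overlapping at least one targeted region. The half-open coordinate convention, the function names, and the example intervals are assumptions made for illustration.

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True if half-open intervals [a_start, a_end) and [b_start, b_end) overlap."""
    return a_start < b_end and b_start < a_end


def on_target_fraction(fragments, targets):
    """Fraction of fragments that overlap at least one targeted region."""
    if not fragments:
        return 0.0
    hits = sum(
        1
        for f_start, f_end in fragments
        if any(overlaps(f_start, f_end, t_start, t_end) for t_start, t_end in targets)
    )
    return hits / len(fragments)


# Hypothetical coordinates on a single reference sequence.
targets = [(1000, 2000)]
before_capture = [(0, 500), (900, 1100), (3000, 3500), (5000, 5600)]
after_capture = [(900, 1100), (1500, 2500), (3000, 3500), (1900, 2100)]

fraction_before = on_target_fraction(before_capture, targets)
fraction_after = on_target_fraction(after_capture, targets)
```

  • A population counts as enriched in this sketch when the fraction after capture exceeds the fraction before capture.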
  • Capture methods generally include chip-based methods, in which targeted regions are captured through hybridization or other association with capture molecules on a surface, and solution based methods, in which oligonucleotide probes (baits), which are complementary to the targeted regions (or to regions near the targeted regions) are hybridized to genomic fragment libraries. The probes used in the capture methods disclosed herein are generally attached to capture molecules, such as biotin, which can be used to “pull down” the probes and the fragments to which they are hybridized. These pull down methods include any methods by which the baits hybridized to nucleic acids or nucleic acid fragments that contain the targeted regions of interest are separated from fragments that do not contain the regions of interest. In embodiments in which the probes are biotinylated, magnetic streptavidin beads are used to selectively pull down and enrich baits with bound targeted regions.
  • In further aspects, a library of baits is used that covers all the targeted regions desired for further study. In the case of whole exome analysis, such a library of baits thus includes oligonucleotide probes that together cover the full exome. In certain embodiments, only portions of the exome are needed for further analysis. In such embodiments, the baits are designed to target that subset of the exome. This design can be accomplished using methods and algorithms known in the art and in general is based upon a reference sequence, such as the human genome.
  • In some examples, the targeted genomic regions processed and sequenced in accordance with the methods and systems described herein are full or partial exomes. These full or partial exomes can be captured for sequencing using any methods known in the art, including without limitation any of the Roche/NimbleGen exome protocols, including the NimbleGen 2.1M Human Exome array and the NimbleGen SeqCap EZ Exome Library, any of the Agilent SureSelect products, any Illumina exome capture products, including the TruSeq and Nextera Exome products, and any other products, methods, systems and protocols known in the art.
  • In further embodiments, when the targeted regions of interest comprise whole or portions of the exome, the baits used to capture those targeted regions may be designed to be complementary to those exonic sequences. In other embodiments, the baits are not complementary to the exonic sequences themselves but are instead complementary to sequences near the exonic sequence or to intronic sequences between two exons. Such designs are also referred to herein as “anchored exome capture” or “intronic baiting,” by which, as discussed herein, is meant a process in which one or more portions of an exome are captured through the use of baits complementary to one or more intronic sequences near or adjacent to the one or more portions of the exome that are of interest. For example, as schematically illustrated in FIG. 2, a genomic sequence 201 comprises exonic regions 202 and 203. Those exonic regions can be captured by utilizing baits directed to one or more of the intronic sequences nearby (for example intronic region 204 and/or 205 to capture exonic region 202 and intronic region 206 for capture of exonic region 203). In other words, a population of fragments comprising exonic regions 202 or 203 would be captured through the use of baits complementary to intronic regions 204 and/or 205 and 206. In some embodiments, intronic baiting is used to bridge exons separated by long intronic regions by sparsely baiting longer introns. In such embodiments, the baits are not necessarily targeting intronic regions that are close to the exonic regions of interest, but the baits are instead designed to target regions separated by particular distances (or sets of distances) or are designed to tile across the intronic regions by a particular number of bases or combinations of numbers of bases. Such embodiments are described in further detail below.
  • In some embodiments, the intronic regions used for anchored exome capture/intronic baiting techniques of the invention are adjacent to the exonic region to be captured. In further embodiments, the intronic regions are separated from the exonic region to be captured by about 1-50, 2-45, 3-40, 4-35, 5-30, 6-25, 7-20, 8-15, 9-10, 2-20, 3-15, 4-10, 5-30, 10-40, 15-50, 20-75, 25-100 nucleotides. In still further embodiments, the intronic regions are separated from the exonic regions to be captured by about 5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 125, 150, 175, 200, 300, 400, or 500 nucleotides. In further embodiments, particularly for situations in which sparse baiting of intronic regions is of use (such as for phase variant detection or identification of linked exonic regions across large intronic distances) the intronic regions are separated from the exonic regions to be captured by distances on the orders of kilobases, e.g., 1-20, 2-18, 3-16, 4-14, 5-12, 6-10 kilobases. Since the original molecular context of the enriched population of oligonucleotides is retained, this sparse baiting of intronic regions allows for the linking of sequence information between exonic regions separated by long introns.
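  • The placement of intronic baits at a set separation from an exon can be sketched as below. The function name, the half-open coordinates, and the 120-nucleotide bait length are hypothetical choices for illustration. The text specifies the separation distances but not a bait length.

```python
def flanking_bait_intervals(exon_start, exon_end, offset, bait_length):
    """Return (upstream, downstream) bait intervals in the flanking introns.

    Coordinates are half-open [start, end). Each bait is separated from the
    exon boundary by `offset` nucleotides; the upstream bait is clipped at 0.
    """
    upstream_end = max(0, exon_start - offset)
    upstream = (max(0, upstream_end - bait_length), upstream_end)
    downstream_start = exon_end + offset
    downstream = (downstream_start, downstream_start + bait_length)
    return upstream, downstream


# Hypothetical exon at 10,000-10,500 with baits placed 50 nucleotides away.
upstream_bait, downstream_bait = flanking_bait_intervals(
    exon_start=10_000, exon_end=10_500, offset=50, bait_length=120
)
```

  • For sparse baiting across long introns, the same function could be called with offsets on the order of kilobases.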
  • In further aspects, rather than designing the baits to target particular regions of the genome, a tiling approach is used. In such an approach, rather than targeting specific exonic or intronic regions, the baits are instead designed to be complementary to portions of the genome at particular ranges or distances. For example, the library of baits can be designed to hybridize to sequences located every 5 kilobases (kb) along the genome, such that applying this library of baits to a fragmented genomic sample will capture only a certain subset of the genome—i.e., those regions that are contained in fragments containing complementary sequences to the baits. As will be appreciated, the baits can be designed based on a reference sequence, such as a human genome reference sequence. In further embodiments, the tiled library of baits is designed to capture regions every 1, 2, 5, 10, 15, 20, 25, 50, 100, 200, 250, 500, 750, 1000, or 10000 kilobases of a genome. In some examples, this tiling method has the effect of sparsely capturing intronic regions, thus providing a way to link sequence information of exonic regions that are separated by long intronic regions, because the original molecular context of those exonic regions captured through sparse capture of intronic regions is retained.
  • In still further embodiments, the baits are designed to tile the genome in a random or combined manner—for example, a mixture of tiled libraries can be used where some of the libraries capture regions every 1 kb, whereas other libraries in the mixture capture regions every 100 kb. In still further embodiments, the tiled libraries are designed so that the baits target within a range of positions within the genome—for example, the baits may target regions of every 1-10, 2-5, 5-200, 10-175, 15-150, 20-125, 30-100, 40-75, 50-60 kb of the genome. In further examples, the tiled or other capture methods described herein will capture about 5%, 10%, 15%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95% of the whole genome. As will be appreciated, such tiling methods of capture will capture both intronic and exonic regions of the genome for further analysis such as sequencing.
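  • The tiling approach, including a mixture of tiled libraries with different spacings, can be sketched as follows. The function names and the example region are hypothetical, and real bait design would also take account of the reference sequence content, which this sketch ignores.

```python
def tiled_bait_starts(region_start, region_end, spacing_bp):
    """Bait start positions placed every `spacing_bp` bases across a region."""
    if spacing_bp <= 0:
        raise ValueError("spacing_bp must be positive")
    return list(range(region_start, region_end, spacing_bp))


def mixed_tiling(region_start, region_end, spacings_bp):
    """Union of several tiled libraries, each with its own spacing."""
    starts = set()
    for spacing in spacings_bp:
        starts.update(tiled_bait_starts(region_start, region_end, spacing))
    return sorted(starts)


# A library tiled every 5 kb over a hypothetical 20 kb region.
every_5kb = tiled_bait_starts(0, 20_000, 5_000)
# A mixture of a 5 kb library and a 2 kb library over a 12 kb region.
mixed = mixed_tiling(0, 12_000, [5_000, 2_000])
```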
  • In yet further embodiments and in accordance with any of the methods described herein, the library of baits used in methods of the present invention is a product of informed design that fulfills one or more characteristics as further described herein. This informed design includes instances in which the library of baits is directed to informative single nucleotide polymorphisms (SNPs). As discussed above, the term “informative SNPs” as used herein refers to SNPs that are heterozygous. The library of baits in some examples is designed to contain a plurality of probes that are directed to regions of the genomic sample that contain informative SNPs. By “directed to” as used herein is meant that the probes contain sequences that are complementary to those regions of the genomic sequences. Informed bait design provides the ability to optimize targeted sequencing methods by allowing for targeted enrichment with full coverage while at the same time reducing the number of probes needed (and thus reducing costs and streamlining the work flow).
  • In general, for methods utilizing informed bait design, the libraries of baits are designed to include baits directed to particular sequences in targeted regions of the genome based on the presence or absence of informative SNPs in those regions and/or the location(s) of those informative SNPs. An exemplary illustration of general considerations for informed bait design is provided in FIG. 8. A region of the genome 801 can include exons (802 and 803). In some examples, an informative SNP 804 will be located at the boundary between the exon (802) and the adjacent intron. In such a situation, the bait library can be designed to include probes directed to one or more nucleotides (805) at a specified distance away from the boundary. In further examples in which there is no informative SNP at the boundary between the exon and the adjacent intron (806), the bait library can be designed to include probes directed to one or more positions in the intron near that boundary (807 and 808). Those positions will preferably include informative SNPs, but may also include other SNPs and/or other sequences as needed. In still further examples in which an exon 803 contains an informative SNP 809 in the interior of the exon but no informative SNPs at the boundaries, the bait library can be designed to include probes directed to several positions 810, 811, and 812 in the adjacent intron that include a mixture of informative and non-informative SNPs (as well as any other sequences as needed).
  • In some aspects, one or more input characteristics are used to design a probe bait library that is directed to shifting locations along the genome based on those input characteristics as well as map quality in various regions. This design is generally based on spacing between informative SNPs rather than on the locations of introns and exons. However, as will be appreciated, any of the descriptions provided herein related to bait design based on intron and exon locations can also be used in combination with the informed bait design methods based on informative SNPs. Input characteristics used in informed bait design include without limitation and in any combination locations of exons, introns, intergenic regions, informative SNPs, as well as regions of repeating sequences (such as GC-rich regions), centromeres, and sample nucleic acid lengths.
  • For ease of discussion, different characteristics of informed design probe libraries are described below in terms of different potential embodiments. As will be appreciated, any of the probe libraries discussed herein, whether using any of the informed design elements or any of the other types of design discussed above can be used singly or in any combination. The design elements utilized are selected based on the targeted genomic regions of interest as well as sample input and the quality of mapping for those regions of interest.
  • In some embodiments, probe bait libraries are designed to include probes directed to regions that have a high likelihood of containing informative SNPs in a given sample. Such targets may include individual bases (the informative SNPs themselves) or one or more bases that are proximal or adjacent to the informative SNPs. In still further embodiments, the targets for the probe baits may be directly adjacent to the informative SNPs or separated by distances from about 1-200, 10-190, 20-180, 30-170, 40-160, 50-150, 60-140, 70-130, 80-120, 90-100 bases from an informative SNP.
  • In further embodiments, the probe bait libraries include probes directed to regions of particular densities related to the average length of the nucleic acid molecules. For example, the probes can be designed to include probes at a density of target sequences that is x-fold more dense than the average length of the nucleic acid molecules/fragments to which the probes are hybridizing, where x can be without limitation 1, 5, 10, 20, 50, 75, 100, 125, 150, or 200. Increasing the density of the probe targets relative to the length of the nucleic acids increases the ability to link probes across loci on the same physical molecule. Such methods can also improve the probability that the linked regions will include informative SNPs, thus further improving the ability of the probe bait libraries to attach to targeted regions of the genome.
  • The density of the probe targets may also be increased in situations in which (at the population level) there is not a high probability of informative SNPs in a given region of interest. In such regions, tiling methods such as those described herein can be used to direct probes at periodic spacings along the region. In certain embodiments, the density of the spacing can be differentially based, such that the probe spacing in these regions lacking informative SNPs is at a 1, 2, 5, 10, 25, or 50-fold shorter distance than the probe spacing in regions containing informative SNPs.
  • In further embodiments, the probe bait library is designed to consider only informative SNP distribution within a gene (including exons and introns). This method of design is directed to capture a sufficient number of heterozygous SNPs at key locations to link/phase from one end of the gene to the other. Such a design method includes baits directed to sets of targets that combine exonic informative SNPs with one or more non-exonic SNPs such that the distance between informative SNPs in a gene is below the above described densities of spacing.
  • Such informed design methods allow not only the detection of general targeted regions of the genome, but also the detection and phasing of genomic structural variations, such as translocations and gene fusions. Because these designs ensure that any individual gene can be phased, it follows that the vast majority of gene fusion events can be detected and phased using the methods described herein.
  • In certain embodiments and in accordance with any of the above, the bait libraries are designed to target probes at distances of about 1 kb to about 2 Mb. In further embodiments, the distances are from about 1-50, 5-45, 10-40, 15-35, 20-30, 10-50 kb.
  • In further embodiments, the nucleic acid fragments being targeted by the probe baits are from about 2 kb to about 250 Mb. In still further embodiments, the fragments are from about 10-1000, 20-900, 30-800, 40-700, 50-600, 60-500, 70-400, 80-300, 90-200, 100-150, 50-500, 25-300 kb.
  • In some embodiments, the probe bait libraries are designed such that about 60-95% of the probes hybridize to sequences containing informative SNPs. In further embodiments, the probe bait libraries are designed such that about 65%-85%, 70%-80%, 60-90%, 80-90%, 90-95%, 95%-99% of the probes in the library of probes are designed to hybridize to informative SNPs. In still further embodiments, at least 65%, 75%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97% of the probes in the library of probes are designed to hybridize to informative SNPs. As will be appreciated, for a probe to be designed to “hybridize to” an informative SNP means that such a probe hybridizes to a sequence region that includes that informative SNP.
  • In further embodiments, the probe bait libraries are designed to include a plurality of probes directed to informative SNPs that are located within both exons and introns in targeted portions of the genomic sample.
  • In still further embodiments, the libraries are designed such that a majority of the probes in the library hybridize to informative SNPs spaced apart by about 1-15, 5-10, 3-6 kb. In yet further embodiments, the majority of the probes in the library of probes are further designed to hybridize to informative SNPs spaced apart by about 1, 3, 5, 10, 20, 30, 50 kb.
  • In further embodiments, a plurality of probes within the library of probes are designed such that for targeted portions of the genomic samples in which there are no informative SNPs within 5-300, 10-50, 20-100, 30-150, or 40-200 kb of boundaries between exons and introns, the plurality of probes is designed to hybridize at an informative SNP within an intron from those boundaries.
  • In further embodiments, a plurality of probes within the library of probes are designed such that for targeted portions of the genomic samples in which there is a first informative SNP within an exon and that first informative SNP is located 5-300, 10-50, 20-100, 30-150, or 40-200 kb from a boundary with an adjacent intron and a second informative SNP within the adjacent intron and that second informative SNP is located 10-50 kb from the boundary, the plurality of probes is designed to hybridize to a region of the genomic sample between the first and second informative SNPs.
  • In further embodiments, a plurality of probes within the library of probes are designed such that for targeted portions of the genomic samples comprising no informative SNPs for at least 5-300, 10-50, 20-100, 30-150, or 40-200 kb, the plurality of probes is designed to hybridize every 0.5, 1, 3, or 5 kb to those targeted portions of the genomic samples. In further embodiments, the plurality of probes is designed to hybridize every 0.1, 0.5, 1, 1.5, 3, 5, 10, 15, 20, 30, 35, 40, 45, 50 kb along those targeted portions of the genomic samples.
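  • The rule of placing periodic probes across stretches that lack informative SNPs can be sketched as below. The function name, the treatment of the region ends as gap boundaries, and the example positions are assumptions for illustration only.

```python
def fill_snp_gaps(region_start, region_end, snp_positions, min_gap, spacing):
    """Place periodic probes inside stretches with no informative SNPs.

    A stretch qualifies when consecutive informative SNPs (or a SNP and a
    region end) are at least `min_gap` bases apart; probes are then placed
    every `spacing` bases strictly inside that stretch.
    """
    inside = sorted(p for p in snp_positions if region_start <= p <= region_end)
    points = [region_start] + inside + [region_end]
    probes = []
    for left, right in zip(points, points[1:]):
        if right - left >= min_gap:
            probes.extend(range(left + spacing, right, spacing))
    return probes


# Hypothetical 30 kb region with a 20 kb stretch lacking informative SNPs.
probe_positions = fill_snp_gaps(
    region_start=0,
    region_end=30_000,
    snp_positions=[2_000, 4_000, 24_000],
    min_gap=10_000,
    spacing=5_000,
)
```

  • Here only the stretch between positions 4,000 and 24,000 meets the 10 kb threshold, so probes are added every 5 kb within it.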
  • In further embodiments, a plurality of probes within the library of probes are designed such that for targeted portions of the genomic samples in which there are no informative SNPs within 5-300, 10-50, 20-100, 30-150, or 40-200 kb of boundaries between exons and introns, the plurality of probes are designed to hybridize to the next closest informative SNP to the exon-intron boundaries.
  • In further embodiments, the library of probes comprises probes designed to hybridize to regions of the genomic sample that flank exons at a density that provides linkage information across barcodes.
  • In still further embodiments, the range of coverage represented by the library of probes is inversely proportional to the distribution of lengths of the individual nucleic acid fragment molecules of the genomic sample in the discrete partitions, such that methods containing a higher proportion of longer individual nucleic acid fragment molecules use libraries of probes with smaller ranges of coverage.
  • In still further embodiments, the library of probes is optimized for coverage of the targeted portions of the genomic sample. In yet further embodiments, the density of coverage may be lower for regions of high map quality, particularly for those regions containing informative SNPs, and the density may further be higher for regions of low map quality to ensure that linkage information is provided across targeted regions.
  • In yet further embodiments, the library of probes has features informed by characteristics of the one or more targeted portions of a genomic sample, such that for targeted portions with high map quality, the library of probes comprises probes that hybridize to informative SNPs within 1 kb-1 Mb of boundaries of exons and introns. The library of probes may in such situations further include probes that hybridize to informative SNPs within 10-500, 20-450, 30-400, 40-350, 50-300, 60-250, 70-200, 80-150, 90-100 kb of boundaries of exons and introns.
  • In yet further embodiments, the library of probes has features informed by characteristics of the one or more targeted portions of a genomic sample, such that for targeted portions in which the distribution of lengths of the barcoded fragments has a high proportion of fragments longer than about 100, 150, 200, or 250 kb, the library of probes comprises probes that hybridize to informative SNPs separated by at least 50 kb. The library of probes may in such situations further include probes that hybridize to informative SNPs separated by at least 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 75, 100, 125, 150, 175, or 200 kb.
  • In yet further embodiments, the library of probes has features informed by characteristics of the one or more targeted portions of a genomic sample, such that for targeted portions with low map quality, the library of probes comprises probes that hybridize to informative SNPs within 1 kb of exon-intron boundaries. The library of probes may in such situations further include probes that hybridize to informative SNPs within 2, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 75, 100, 125, 150, 175, or 200 kb of exon-intron boundaries. In such situations, the library will further include probes that hybridize to informative SNPs within exons, within introns, or both.
  • In yet further embodiments, the library of probes has features informed by characteristics of the one or more targeted portions of a genomic sample, such that for targeted portions comprising intergenic regions, the library of probes comprises probes that hybridize to informative SNPs spaced apart at distances of at least 1, 2, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 75, 100 kb.
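  • By way of illustration only, the probe placement rules described in the preceding embodiments can be sketched as follows. This is a minimal sketch, not the design procedure of the present disclosure: the function names, the 20 kb SNP-free threshold, and the 1 kb spacing are assumptions chosen from within the ranges recited above, and coordinates are hypothetical base-pair positions along a single targeted portion.

```python
from typing import List, Tuple


def tile_probe_starts(region_start: int, region_end: int, spacing_bp: int) -> List[int]:
    """Return probe start coordinates every spacing_bp along [region_start, region_end)."""
    if spacing_bp <= 0:
        raise ValueError("spacing_bp must be positive")
    return list(range(region_start, region_end, spacing_bp))


def design_probes(
    region: Tuple[int, int],
    informative_snps: List[int],
    snp_free_threshold_bp: int = 20_000,  # assumed value within the recited ranges
    spacing_bp: int = 1_000,              # assumed value; the text recites 0.5, 1, 3, or 5 kb
) -> List[int]:
    """Target informative SNPs where present; tile long SNP-free stretches at fixed spacing."""
    start, end = region
    snps = sorted(p for p in informative_snps if start <= p < end)
    boundaries = [start] + snps + [end]
    probes = set(snps)
    for left, right in zip(boundaries, boundaries[1:]):
        # A stretch with no informative SNP for at least the threshold gets tiled probes.
        if right - left >= snp_free_threshold_bp:
            probes.update(tile_probe_starts(left, right, spacing_bp))
    return sorted(probes)
```

  • In this sketch, a 50 kb region with informative SNPs only at 5 kb and 10 kb would receive probes at both SNPs and tiled probes across the remaining 40 kb stretch.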
  • The baits used in the capture methods described herein can be of any size or structure that is useful for enriching a population of fragments for fragments containing targeted regions of the genome. As discussed above, generally the baits of use in the present invention comprise oligonucleotide probes that are attached to a capture molecule, such as biotin. The oligonucleotide probes may be complementary to sequences within a targeted region of interest, or they may be complementary to regions outside of the targeted region but close enough to that targeted region that both the “anchoring” region and the targeted region are within the same fragment, such that the bait is able to pull down the targeted region by hybridizing to that nearby region (such as a flanking intron).
  • The capture molecule attached to the bait may be any capture molecule that can be used for isolating the bait and its hybridization partner from other fragments in a population. In general, the baits used herein are attached to biotin, and then solid supports comprising streptavidin (including without limitation magnetic streptavidin beads) can be used to capture the baits and the fragments to which they are hybridized. Other capture molecule pairs may include without limitation biotin/neutravidin, antigen/antibody, or complementary oligonucleotide sequences.
  • In further embodiments, the oligonucleotide probe portion of the baits can be of any length suitable for hybridizing to targeted regions or to regions near targeted regions. In some embodiments, the oligonucleotide probe portion of the baits used in accordance with the methods described herein (i.e., the portion that hybridizes to the targeted region of the genome or to a region near the targeted region) generally has a length of about 10 to about 150 nucleotides (e.g., 35 nucleotides, 50 nucleotides, 100 nucleotides) and is chosen to specifically hybridize to a target sequence of interest. In further embodiments, the oligonucleotide probe portion has a length of about 5-10, 10-50, 20-100, 30-90, 40-80, or 50-70 nucleotides. As will be appreciated, any of the oligonucleotide probe portions described herein may comprise RNA, DNA, non-natural nucleotides such as PNAs, LNAs, and so on, or any combinations thereof.
  • An advantage of the methods and systems described herein is that the targeted regions that are captured are processed prior to capture in such a way that even after the steps of capturing the targeted regions and conducting sequencing analyses, the original molecular context of those targeted regions is retained. The ability to attribute specific targeted regions to their original molecular context (which can include the original chromosome or chromosomal region from which they are derived and/or the location of particular targeted regions in relation to each other within the full genome) provides a way to obtain sequence information from regions of the genome that are otherwise poorly mapped or have poor coverage using traditional sequencing techniques.
  • For example, some genes possess long introns that are too long to span using generally available sequencing techniques, particularly using short-read technologies. Short-read technologies are often preferred because they possess superior accuracy as compared to long-read technologies. However, generally used short-read technologies are unable to span across long regions of the genome, and thus information may not be obtainable using these conventional technologies in regions of the genome that are difficult to characterize due to structural characteristics such as long lengths of tandem repeating sequences, high GC content, and genes containing long introns. In the methods and systems described herein, however, the molecular context of targeted regions is retained, generally through the tagging procedure illustrated in FIG. 1 and described in further detail herein. As such, links can be made across extended regions of the genome. For example, as schematically illustrated in FIG. 2B, nucleic acid molecule 207 contains two exons (shaded bars) with a long intronic region (208). In the methods described herein, the individual nucleic acid molecule 207 is distributed into its own discrete partition 211 and then fragmented such that different fragments contain different portions of the exons and the intron. Because each of those fragments is tagged such that any sequence information obtained from the fragments is then attributable to the discrete partition in which it was generated, each fragment is thus also attributable to the individual nucleic acid molecule 207 from which it was derived.
  • In general, and as is described in further detail herein, after fragmentation and tagging, fragments from different partitions are combined together. Targeted capture methods can then be used to enrich the population of fragments that undergoes further analysis, such as sequencing, with fragments containing the targeted region of interest. In the example illustrated in FIG. 2B, the baits used will enrich the population of fragments to capture only those containing a portion of the exons, but regions outside of the exon and intron (such as 209 and 210) would not be captured. Thus, the final population of fragments that undergoes sequencing will be enriched for the fragments containing the portions of the exons, even if those exons are separated by a long intronic region. Short read, high accuracy sequencing technologies can then be used to identify the sequences of this enriched population of fragments, and because each of the fragments is tagged and thus attributable to its original molecular context, i.e., its original individual nucleic acid molecule, the short read sequences can be pieced together to provide information about the relationship between the exons. In some embodiments, the baits used to capture fragments containing all or part of one or more exons are complementary to one or more portions of the one or more exons themselves. In other embodiments, the baits are complementary to one or more portions of the intervening introns or to sequences adjacent to or near the exon on either the 3′ or 5′ side of the exon regions (such baits are also referred to herein as “intronic baits”). In further embodiments, the baits used to capture the fragments containing all or part of the exon include baits complementary to the exon itself and intronic baits.
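  • By way of illustration only, the attribution of short reads to a common molecule of origin through a shared barcode can be sketched as follows. This is a simplified sketch under stated assumptions: each read is represented as a hypothetical (barcode, target label) pair, and each barcode is assumed to correspond to a single originating molecule, which may not hold where a partition contains more than one molecule.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

Read = Tuple[str, str]  # (barcode, target label such as "exon1"); hypothetical representation


def targets_by_barcode(reads: Iterable[Read]) -> Dict[str, Set[str]]:
    """Group the captured targets observed for each barcode."""
    groups: Dict[str, Set[str]] = defaultdict(set)
    for barcode, target in reads:
        groups[barcode].add(target)
    return dict(groups)


def linked_target_pairs(reads: Iterable[Read]) -> Set[Tuple[str, str]]:
    """Return pairs of targets seen under a common barcode, i.e. candidate linked targets."""
    pairs: Set[Tuple[str, str]] = set()
    for targets in targets_by_barcode(reads).values():
        ordered = sorted(targets)
        for i, first in enumerate(ordered):
            for second in ordered[i + 1:]:
                pairs.add((first, second))
    return pairs
```

  • In this sketch, two exons separated by a long intron are reported as linked when reads from both exons carry the same barcode, even though no single short read spans the intron.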
  • The ability to retain the molecular context of the targeted regions captured for sequencing also provides the advantage of allowing for sequencing across poorly characterized regions of the genome. As will be appreciated, a significant percentage (at least 5-10% according to, for example Altemose et al., PLOS Computational Biology, May 15, 2014, Vol. 10, Issue 5) of the human genome remains unassembled, unmapped, and poorly characterized. The reference assembly generally annotates these missing regions as multi-megabase heterochromatic gaps, found primarily near centromeres and on the short arms of the acrocentric chromosomes. This missing fraction of the genome includes structural features that remain resistant to accurate characterization using generally used sequencing technologies. By providing the ability to link information across extended regions of the genome, the methods described herein provide a way to allow for sequencing across these poorly characterized regions.
  • In some examples, sample preparation methods, including methods of fragmenting, amplifying, partitioning, and otherwise processing genomic DNA, can lead to biases or lower coverage of certain regions of a genome. Such biases or lowered coverage can be compensated for in the methods and systems disclosed herein by altering the concentration of baits used to capture targeted regions of the genome. For example, in some situations it is known that certain regions of the genome will have low coverage after the fragment library is processed, such as regions containing high GC content or other structural variations that lead to bias toward certain areas of the genome over others. In such situations, the library of baits can be altered to increase the concentration of baits directed to those regions of low coverage—in other words, the population of baits used may be “spiked” to ensure that a sufficient number of fragments containing targeted regions of the genome in those low coverage areas are obtained in the final population of fragments to be sequenced. Such spiking of baits may be conducted through design of custom libraries in some embodiments. In further embodiments, the spiking of baits can be conducted in commercially available whole exome kits, such that a custom library of baits directed toward the lower coverage regions are added to off-the-shelf exome capture kits.
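  • By way of illustration only, one possible way of choosing relative bait concentrations for spiking is sketched below. The inverse-proportional weighting, the cap on fold increase, and the function name are assumptions made for this sketch; the present disclosure does not prescribe a particular formula.

```python
from typing import Dict


def spiked_bait_weights(coverage: Dict[str, float], max_fold: float = 4.0) -> Dict[str, float]:
    """Return a relative bait concentration per region.

    Assumed scheme: weight is inversely proportional to expected coverage,
    normalized so the best-covered region has weight 1.0, and capped at max_fold.
    """
    if not coverage:
        return {}
    if any(value <= 0 for value in coverage.values()):
        raise ValueError("expected coverage values must be positive")
    reference = max(coverage.values())
    return {region: min(reference / value, max_fold) for region, value in coverage.items()}
```

  • In this sketch, a region expected to receive one quarter of the coverage of the best-covered region would receive baits at four times the relative concentration.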
  • An advantage of the methods and systems described herein is that the targeted regions that are captured are processed prior to capture in such a way that even after the steps of capturing the targeted regions and conducting sequencing analyses, the original molecular context of those targeted regions is retained. As is discussed in further detail herein, the ability to attribute specific targeted regions to their original molecular context (which can include the original chromosome or chromosomal region from which they are derived and/or the location of particular targeted regions in relation to each other within the full genome) provides a way to obtain sequence information from regions of the genome that are otherwise poorly mapped or have poor coverage using traditional sequencing techniques.
  • For example, some genes possess long introns that are too long to span using generally available sequencing techniques, particularly using short-read technologies that possess superior accuracy as compared to long-read technologies. In the methods and systems described herein, however, the molecular context of targeted regions is retained, generally through the tagging procedure illustrated in FIG. 1 and described in further detail herein. As such, links can be made across extended regions of the genome. For example, as schematically illustrated in FIG. 2B, nucleic acid molecule 207 contains exons (shaded bars) interrupted by a long intronic region. Generally used sequencing technologies would be unable to span the distance across the intron to provide information on the relationship between the two exons. In the methods described herein, the individual nucleic acid molecule 207 is distributed into its own discrete partition 209 and then fragmented such that different fragments contain different portions of the exons and the intron. Because each of those fragments is tagged such that any sequence information obtained from the fragments is then attributable to the discrete partition in which it was generated, each fragment is thus also attributable to the individual nucleic acid molecule 207 from which it was derived. In general, and as is described in further detail herein, after fragmentation and tagging, fragments from different partitions are combined together. Targeted capture methods can then be used to enrich the population of fragments that undergoes further analysis, such as sequencing, with fragments containing the targeted region of interest. In the example illustrated in FIG. 2B, the baits used will enrich the population of fragments to capture only those containing a portion of one of the exons, but regions outside of the exons (such as 209 and 210) would not be captured. Thus, the final population of fragments that undergoes sequencing will be enriched for the fragments containing the exons of interest. Short read, high accuracy sequencing technologies can then be used to identify the sequences of this enriched population of fragments, and because each of the fragments is tagged and thus attributable to its original molecular context, i.e., its original individual nucleic acid molecule, the short read sequences can be pieced together to span across the length of the intervening intron (which can in some examples be on the order of 1, 2, 5, 10 or more kilobases in length) to provide linked sequence information on the two exons.
  • As noted above, the methods and systems described herein provide individual molecular context for short sequence reads of longer nucleic acids. As used herein, individual molecular context refers to sequence context beyond the specific sequence read, e.g., relation to adjacent or proximal sequences, that are not included within the sequence read itself, and as such, will typically be such that they would not be included in whole or in part in a short sequence read, e.g., a read of about 150 bases, or about 300 bases for paired reads. In particularly preferred aspects, the methods and systems provide long range sequence context for short sequence reads. Such long range context includes relationship or linkage of a given sequence read to sequence reads that are within a distance of each other of longer than 1 kb, longer than 5 kb, longer than 10 kb, longer than 15 kb, longer than 20 kb, longer than 30 kb, longer than 40 kb, longer than 50 kb, longer than 60 kb, longer than 70 kb, longer than 80 kb, longer than 90 kb or even longer than 100 kb, or longer. By providing longer range individual molecular context, the methods and systems of the invention also provide much longer inferred molecular context. Sequence context, as described herein can include lower resolution context, e.g., from mapping the short sequence reads to the individual longer molecules or contigs of linked molecules, as well as the higher resolution sequence context, e.g., from long range sequencing of large portions of the longer individual molecules, e.g., having contiguous determined sequences of individual molecules where such determined sequences are longer than 1 kb, longer than 5 kb, longer than 10 kb, longer than 15 kb, longer than 20 kb, longer than 30 kb, longer than 40 kb, longer than 50 kb, longer than 60 kb, longer than 70 kb, longer than 80 kb, longer than 90 kb or even longer than 100 kb. 
As with sequence context, the attribution of short sequences to longer nucleic acids, e.g., both individual long nucleic acid molecules or collections of linked nucleic acid molecules or contigs, may include both mapping of short sequences against longer nucleic acid stretches to provide high level sequence context, as well as providing assembled sequences from the short sequences through these longer nucleic acids.
  • IV. Samples
  • As will be appreciated, the methods and systems discussed herein can be used to obtain targeted sequence information from any type of genomic material. Such genomic material may be obtained from a sample taken from a patient. Exemplary samples and types of genomic material of use in the methods and systems discussed herein include without limitation polynucleotides, nucleic acids, oligonucleotides, circulating cell-free nucleic acid, circulating tumor cell (CTC), nucleic acid fragments, nucleotides, DNA, RNA, peptide polynucleotides, complementary DNA (cDNA), double stranded DNA (dsDNA), single stranded DNA (ssDNA), plasmid DNA, cosmid DNA, chromosomal DNA, genomic DNA (gDNA), viral DNA, bacterial DNA, mtDNA (mitochondrial DNA), ribosomal RNA, cell-free DNA, cell free fetal DNA (cffDNA), mRNA, rRNA, tRNA, nRNA, siRNA, snRNA, snoRNA, scaRNA, microRNA, dsRNA, viral RNA, and the like. In summary, the samples that are used may vary depending on the particular processing needs.
  • Any substance that comprises nucleic acid may be the source of a sample. The substance may be a fluid, e.g., a biological fluid. A fluidic substance may include, but is not limited to, blood, cord blood, saliva, urine, sweat, serum, semen, vaginal fluid, gastric and digestive fluid, spinal fluid, placental fluid, cavity fluid, ocular fluid, breast milk, lymphatic fluid, or combinations thereof. The substance may be solid, for example, a biological tissue. The substance may comprise normal healthy tissues, diseased tissues, or a mix of healthy and diseased tissues. In some cases, the substance may comprise tumors. Tumors may be benign (non-cancer) or malignant (cancer). Non-limiting examples of tumors may include: fibrosarcoma, myxosarcoma, liposarcoma, chondrosarcoma, osteogenic sarcoma, chordoma, angiosarcoma, endotheliosarcoma, lymphangiosarcoma, lymphangioendotheliosarcoma, synovioma, mesothelioma, Ewing's, leiomyosarcoma, rhabdomyosarcoma, gastrointestinal system carcinomas, colon carcinoma, pancreatic cancer, breast cancer, genitourinary system carcinomas, ovarian cancer, prostate cancer, squamous cell carcinoma, basal cell carcinoma, adenocarcinoma, sweat gland carcinoma, sebaceous gland carcinoma, papillary carcinoma, papillary adenocarcinomas, cystadenocarcinoma, medullary carcinoma, bronchogenic carcinoma, renal cell carcinoma, hepatoma, bile duct carcinoma, choriocarcinoma, seminoma, embryonal carcinoma, Wilms' tumor, cervical cancer, endocrine system carcinomas, testicular tumor, lung carcinoma, small cell lung carcinoma, non-small cell lung carcinoma, bladder carcinoma, epithelial carcinoma, glioma, astrocytoma, medulloblastoma, craniopharyngioma, ependymoma, pinealoma, hemangioblastoma, acoustic neuroma, oligodendroglioma, meningioma, melanoma, neuroblastoma, retinoblastoma, or combinations thereof. The substance may be associated with various types of organs. Non-limiting examples of organs may include brain, liver, lung, kidney, prostate, ovary, spleen, lymph node (including tonsil), thyroid, pancreas, heart, skeletal muscle, intestine, larynx, esophagus, stomach, or combinations thereof. In some cases, the substance may comprise a variety of cells, including but not limited to: eukaryotic cells, prokaryotic cells, fungal cells, heart cells, lung cells, kidney cells, liver cells, pancreas cells, reproductive cells, stem cells, induced pluripotent stem cells, gastrointestinal cells, blood cells, cancer cells, bacterial cells, bacterial cells isolated from a human microbiome sample, etc. In some cases, the substance may comprise contents of a cell, such as, for example, the contents of a single cell or the contents of multiple cells. Methods and systems for analyzing individual cells are provided in, e.g., U.S. patent application Ser. No. 14/752,641, filed Jun. 26, 2015, the full disclosure of which is hereby incorporated by reference in its entirety, particularly all teachings related to analyzing nucleic acids from individual cells.
  • Samples may be obtained from various subjects. A subject may be a living subject or a dead subject. Examples of subjects may include, but are not limited to, humans, mammals, non-human mammals, rodents, amphibians, reptiles, canines, felines, bovines, equines, goats, ovines, hens, avians, mice, rabbits, insects, slugs, microbes, bacteria, parasites, or fish. In some cases, the subject may be a patient who has, is suspected of having, or is at risk of developing a disease or disorder. In some cases, the subject may be a pregnant woman. In some cases, the subject may be a normal healthy pregnant woman. In some cases, the subject may be a pregnant woman who is at risk of carrying a baby with a certain birth defect.
  • A sample may be obtained from a subject by any means known in the art. For example, a sample may be obtained from a subject through accessing the circulatory system (e.g., intravenously or intra-arterially via a syringe or other apparatus), collecting a secreted biological sample (e.g., saliva, sputum, urine, feces, etc.), surgically (e.g., biopsy) acquiring a biological sample (e.g., intra-operative samples, post-surgical samples, etc.), swabbing (e.g., buccal swab, oropharyngeal swab), or pipetting.
  • While preferred embodiments of the present invention have been shown and described herein, it will be obvious to those skilled in the art that such embodiments are provided by way of example only. Numerous variations, changes, and substitutions will now occur to those skilled in the art without departing from the invention. It should be understood that various alternatives to the embodiments of the invention described herein may be employed in practicing the invention. It is intended that the following claims define the scope of the invention and that methods and structures within the scope of these claims and their equivalents be covered thereby.
  • EXAMPLES
  • Example 1 Whole Exome Capture and Sequencing: NA12878
  • Genomic DNA from the NA12878 human cell line was subjected to size based separation of fragments using a Blue Pippin DNA sizing system to recover fragments that were greater than or equal to approximately 10 kb in length. The size selected sample nucleic acids were then copartitioned with barcode beads in aqueous droplets within a fluorinated oil continuous phase using a microfluidic partitioning system (See, e.g., U.S. patent application Ser. No. 14/682,952, filed Apr. 9, 2015, and incorporated herein by reference in its entirety for all purposes), where the aqueous droplets also included the dNTPs, thermostable DNA polymerase and other reagents for carrying out amplification within the droplets, as well as DTT for releasing the barcode oligonucleotides from the beads. This was repeated both for 1 ng of total input DNA and 2 ng of total input DNA. The barcode beads were obtained as a subset of a stock library that represented barcode diversity of over 700,000 different barcode sequences. The barcode containing oligonucleotides included additional sequence components and had the general structure:
      • Bead-P5-BC-R1-Nmer
  • Where P5 and R1 refer to the Illumina attachment and Read1 primer sequences, respectively, BC denotes the barcode portion of the oligonucleotide, and Nmer denotes a random 10 base N-mer priming sequence used to prime the template nucleic acids. See, e.g., U.S. patent application Ser. No. 14/316,383, filed Jun. 26, 2014, the full disclosure of which is hereby incorporated herein by reference in its entirety for all purposes.
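  • By way of illustration only, the P5-BC-R1-Nmer arrangement of the barcode oligonucleotide can be sketched as a simple string layout. The P5 and R1 strings below are short stand-ins and are not the actual attachment or primer sequences, the barcode used in the usage note is hypothetical, and the function names are assumptions of this sketch; only the component order and the 10 base random N-mer length are taken from the description above.

```python
import random

P5_PLACEHOLDER = "ACGTACGT"  # stand-in only, NOT the actual P5 attachment sequence
R1_PLACEHOLDER = "TGCATGCA"  # stand-in only, NOT the actual Read1 primer sequence
NMER_LENGTH = 10             # random 10 base N-mer, as stated in the text


def build_oligo(barcode: str, rng: random.Random) -> str:
    """Assemble an oligonucleotide in the order P5-BC-R1-Nmer (the bead attachment is omitted)."""
    nmer = "".join(rng.choice("ACGT") for _ in range(NMER_LENGTH))
    return P5_PLACEHOLDER + barcode + R1_PLACEHOLDER + nmer


def extract_barcode(oligo: str, barcode_length: int) -> str:
    """Recover the barcode portion, which directly follows the P5 segment."""
    start = len(P5_PLACEHOLDER)
    return oligo[start:start + barcode_length]
```

  • For instance, calling build_oligo("GATTACA", random.Random(0)) yields a 33 base string in this sketch, from which extract_barcode(oligo, 7) recovers the hypothetical barcode.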
  • Following bead dissolution, the droplets were thermocycled to allow for primer extension of the barcode oligos against the template of the sample nucleic acids within each droplet. This resulted in amplified copy fragments of the sample nucleic acids that included the barcode sequence representative of the originating partition, in addition to the other included sequences set forth above.
  • After barcode labeling of the copy fragments, the emulsion of droplets including the amplified copy fragments was broken and the additional sequencer required components, e.g., read2 primer sequence and P7 attachment sequence, were added to the copy fragments through an additional amplification step, which attached these sequences to the other end of the copy fragments. The barcoded DNA was then subjected to hybrid capture using an Agilent SureSelect Exome capture kit.
  • The table below provides targeting statistics for the NA12878 genome:
  • Sample        Median Insert Size   % Fragments on Target   % Bases on Target
    Version 1.A   258                  81%                     51%
    Version 1.B   224                  81%                     55%
    Version 1.C   165                  81%                     63%
  • The three different versions listed above represent three different shear lengths for the barcoded fragments before the second adapter attachment step.
  • Example 2 Whole Exome Capture and Sequencing: NA19701 and NA19661
  • Genomic DNA from the NA19701 and NA19661 cell lines was prepared according to the methods described above in Example 1. Data, including phasing data, from those two cell lines are provided in the table below:
  • Metric                            NA19661   NA19701
    N50_phase_block                   29,535    83,953
    N90_phase_block                   8,595     25,584
    mean_phase_block                  5,968     21,128
    median_phase_block                0         76.5
    longest_phase_block               209,323   504,140
    fract_genes_phased                0.719     0.841
    fract_genes_completely_phased     0.679     0.778
    fract_snps_phased                 0.869     0.832
    fract_snps_barcode                0.644     0.607
    fract_snps_barcode_both_alleles   0.328     0.351
    prob_snp_correct_in_gene          0.906     0.927
    prob_snp_phased_in_gene           0.807     0.889
    snp_short_switch_error            0.013     0.013
    snp_long_switch_error             0.012     0.013
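  • By way of illustration only, summary statistics such as N50_phase_block and N90_phase_block are conventionally computed as sketched below. The sketch assumes the conventional definition, under which the N50 is the block length at which blocks of that length or longer account for at least half of the total phased length; the exact definition used to produce the values in the table above is not stated herein, and the function name is hypothetical.

```python
from typing import Iterable


def nx_length(lengths: Iterable[int], fraction: float) -> int:
    """Return the length L such that items of length >= L cover at least `fraction` of the total."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    ordered = sorted(lengths, reverse=True)
    target = fraction * sum(ordered)
    running = 0
    for length in ordered:
        running += length
        if running >= target:
            return length
    raise ValueError("lengths must be non-empty")
```

  • Under this definition, nx_length(blocks, 0.5) gives an N50 and nx_length(blocks, 0.9) gives an N90 for a list of phase block lengths.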
  • The present specification provides a complete description of the methodologies, systems and/or structures and uses thereof in example aspects of the presently-described technology. Although various aspects of this technology have been described above with a certain degree of particularity, or with reference to one or more individual aspects, those skilled in the art could make numerous alterations to the disclosed aspects without departing from the spirit or scope of the technology hereof. Since many aspects can be made without departing from the spirit and scope of the presently described technology, the appropriate scope resides in the claims hereinafter appended. Other aspects are therefore contemplated. Furthermore, it should be understood that any operations may be performed in any order, unless explicitly claimed otherwise or a specific order is inherently necessitated by the claim language. It is intended that all matter contained in the above description and shown in the accompanying drawings shall be interpreted as illustrative only of particular aspects and are not limiting to the embodiments shown. Unless otherwise clear from the context or expressly stated, any concentration values provided herein are generally given in terms of admixture values or percentages without regard to any conversion that occurs upon or following addition of the particular component of the mixture. To the extent not already expressly incorporated herein, all published references and patent documents referred to in this disclosure are incorporated herein by reference in their entirety for all purposes. Changes in detail or structure may be made without departing from the basic elements of the present technology as defined in the following claims.

Claims (16)

What is claimed:
1. A method of obtaining sequence information from one or more targeted portions of a genomic sample, the method comprising:
(a) generating a plurality of barcoded fragments from a plurality of individual nucleic acid molecules from the genomic sample, wherein the individual nucleic acid molecules have lengths greater than 1 kilobase (kb), and wherein fragments from the same individual nucleic molecule comprise a common barcode;
(b) enriching the plurality of barcoded fragments with fragments comprising sequences associated with a disease or disorder, thereby producing an enriched plurality of fragments; and
(c) conducting a sequencing reaction to identify sequences of the enriched plurality of fragments, thereby obtaining sequence information from the one or more targeted portions of the genomic sample.
2. The method of claim 1, wherein the disease or disorder is cancer.
3. The method of claim 1, wherein the genomic sample is from a tumor.
4. The method of claim 1, wherein the genomic sample is from a pregnant woman who is at risk of carrying a baby with a birth defect.
5. The method of claim 1, wherein the enriching step (b) comprises:
(i) hybridizing probes complementary to regions in or near the one or more targeted portions of the genomic samples to the barcoded fragments to form probe-fragment complexes;
(ii) capturing the probe-fragment complexes to a surface of a solid support;
thereby enriching the plurality of barcoded fragments with fragments comprising sequences associated with a disease or disorder.
6. The method of claim 5, wherein the probes comprise binding moieties and the surface comprises capture moieties, and wherein the probe-fragment complexes are captured on the surface through a reaction between the binding moieties and the capture moieties.
7. The method of claim 6, wherein the capture moieties are directed to a member selected from the group consisting of: whole or partial exome capture, panel capture, targeted exon capture, anchored exome capture, and tiled genomic region capture.
8. The method of claim 6, wherein the binding moieties comprise biotin and the capture moieties comprise streptavidin.
9. The method of claim 1, wherein the sequencing reaction is a short read, high accuracy sequencing reaction.
10. The method of claim 1, wherein prior to the enriching step (b), the barcoded fragments are amplified such that the resultant amplification products are capable of forming partial or complete hairpin structures.
11. The method of claim 1, wherein the individual nucleic acid molecules have lengths greater than 10 kb.
12. The method of claim 1, wherein the individual nucleic acid molecules have lengths greater than 20 kb.
13. The method of claim 1, wherein the sequencing reaction is a short read, high accuracy sequencing reaction.
14. The method of claim 1, wherein the method further comprises step (d) determining linkage of two or more targeted portions of the genomic sample based upon common barcode sequences.
15. The method of claim 1, wherein the two or more targeted portions of the genomic sample comprise about 45-95% of the genomic sample.
16. The method of claim 1, wherein the two or more targeted portions of the genomic sample comprise about 35-80% of the genomic sample.
US15/174,923 2014-10-29 2016-06-06 Methods and compositions for targeted nucleic acid sequencing Abandoned US20160281137A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US15/174,923 US20160281137A1 (en) 2014-10-29 2016-06-06 Methods and compositions for targeted nucleic acid sequencing

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US201462072164P 2014-10-29 2014-10-29
US14/927,297 US20160122817A1 (en) 2014-10-29 2015-10-29 Methods and compositions for targeted nucleic acid sequencing
US15/174,923 US20160281137A1 (en) 2014-10-29 2016-06-06 Methods and compositions for targeted nucleic acid sequencing

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US14/927,297 Continuation US20160122817A1 (en) 2014-10-29 2015-10-29 Methods and compositions for targeted nucleic acid sequencing

Publications (1)

Publication Number Publication Date
US20160281137A1 (en) 2016-09-29

Family

ID=54540216

Family Applications (7)

Application Number Title Priority Date Filing Date
US14/927,297 Abandoned US20160122817A1 (en) 2014-10-29 2015-10-29 Methods and compositions for targeted nucleic acid sequencing
US15/174,928 Active US10287623B2 (en) 2014-10-29 2016-06-06 Methods and compositions for targeted nucleic acid sequencing
US15/174,919 Abandoned US20160281136A1 (en) 2014-10-29 2016-06-06 Methods and compositions for targeted nucleic acid sequencing
US15/174,922 Abandoned US20160281161A1 (en) 2014-10-29 2016-06-06 Methods and compositions for targeted nucleic acid sequencing
US15/174,923 Abandoned US20160281137A1 (en) 2014-10-29 2016-06-06 Methods and compositions for targeted nucleic acid sequencing
US17/064,508 Active 2035-12-09 US11739368B2 (en) 2014-10-29 2020-10-06 Methods and compositions for targeted nucleic acid sequencing
US18/346,011 Pending US20240035066A1 (en) 2014-10-29 2023-06-30 Methods and compositions for targeted nucleic acid sequencing

Country Status (10)

Country Link
US (7) US20160122817A1 (en)
EP (1) EP3212807B1 (en)
JP (1) JP2017532042A (en)
KR (1) KR20170073667A (en)
CN (2) CN114807307A (en)
AU (1) AU2015339148B2 (en)
BR (1) BR112017008877A2 (en)
CA (1) CA2964472A1 (en)
MX (1) MX2017005267A (en)
WO (1) WO2016069939A1 (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10357771B2 (en) 2017-08-22 2019-07-23 10X Genomics, Inc. Method of producing emulsions
US10544413B2 (en) 2017-05-18 2020-01-28 10X Genomics, Inc. Methods and systems for sorting droplets and beads
US10550429B2 (en) 2016-12-22 2020-02-04 10X Genomics, Inc. Methods and systems for processing polynucleotides
US10815525B2 (en) 2016-12-22 2020-10-27 10X Genomics, Inc. Methods and systems for processing polynucleotides
US11660601B2 (en) 2017-05-18 2023-05-30 10X Genomics, Inc. Methods for sorting particles
US11833515B2 (en) 2017-10-26 2023-12-05 10X Genomics, Inc. Microfluidic channel networks for partitioning

Families Citing this family (91)

Publication number Priority date Publication date Assignee Title
US20190300945A1 (en) 2010-04-05 2019-10-03 Prognosys Biosciences, Inc. Spatially Encoded Biological Assays
US11591637B2 (en) 2012-08-14 2023-02-28 10X Genomics, Inc. Compositions and methods for sample processing
US10221442B2 (en) 2012-08-14 2019-03-05 10X Genomics, Inc. Compositions and methods for sample processing
US10323279B2 (en) 2012-08-14 2019-06-18 10X Genomics, Inc. Methods and systems for processing polynucleotides
US10584381B2 (en) 2012-08-14 2020-03-10 10X Genomics, Inc. Methods and systems for processing polynucleotides
US9951386B2 (en) 2014-06-26 2018-04-24 10X Genomics, Inc. Methods and systems for processing polynucleotides
US10273541B2 (en) 2012-08-14 2019-04-30 10X Genomics, Inc. Methods and systems for processing polynucleotides
CA2881685C (en) 2012-08-14 2023-12-05 10X Genomics, Inc. Microcapsule compositions and methods
US10752949B2 (en) 2012-08-14 2020-08-25 10X Genomics, Inc. Methods and systems for processing polynucleotides
US9701998B2 (en) 2012-12-14 2017-07-11 10X Genomics, Inc. Methods and systems for processing polynucleotides
EP3567116A1 (en) 2012-12-14 2019-11-13 10X Genomics, Inc. Methods and systems for processing polynucleotides
US10533221B2 (en) 2012-12-14 2020-01-14 10X Genomics, Inc. Methods and systems for processing polynucleotides
EP2954065B1 (en) 2013-02-08 2021-07-28 10X Genomics, Inc. Partitioning and processing of analytes and other species
WO2014210225A1 (en) 2013-06-25 2014-12-31 Prognosys Biosciences, Inc. Methods and systems for determining spatial patterns of biological targets in a sample
US9824068B2 (en) 2013-12-16 2017-11-21 10X Genomics, Inc. Methods and apparatus for sorting data
AU2015243445B2 (en) 2014-04-10 2020-05-28 10X Genomics, Inc. Fluidic devices, systems, and methods for encapsulating and partitioning reagents, and applications of same
CN114214314A (en) 2014-06-24 2022-03-22 生物辐射实验室股份有限公司 Digital PCR barcoding
CN106795553B (en) 2014-06-26 2021-06-04 10X基因组学有限公司 Methods of analyzing nucleic acids from individual cells or cell populations
KR20170073667A (en) 2014-10-29 2017-06-28 10엑스 제노믹스, 인크. Methods and compositions for targeted nucleic acid sequencing
US9975122B2 (en) 2014-11-05 2018-05-22 10X Genomics, Inc. Instrument systems for integrated sample processing
CN112126675B (en) 2015-01-12 2022-09-09 10X基因组学有限公司 Method and system for preparing nucleic acid sequencing library and library prepared by using same
EP3262188B1 (en) 2015-02-24 2021-05-05 10X Genomics, Inc. Methods for targeted nucleic acid sequence coverage
WO2016137973A1 (en) 2015-02-24 2016-09-01 10X Genomics Inc Partition processing methods and systems
US10774374B2 (en) 2015-04-10 2020-09-15 Spatial Transcriptomics AB and Illumina, Inc. Spatially distinguished, multiplex nucleic acid analysis of biological specimens
US11371094B2 (en) 2015-11-19 2022-06-28 10X Genomics, Inc. Systems and methods for nucleic acid processing using degenerate nucleotides
US10774370B2 (en) 2015-12-04 2020-09-15 10X Genomics, Inc. Methods and compositions for nucleic acid analysis
SG11201806757XA (en) 2016-02-11 2018-09-27 10X Genomics Inc Systems, methods, and media for de novo assembly of whole genome sequence data
WO2017197338A1 (en) 2016-05-13 2017-11-16 10X Genomics, Inc. Microfluidic systems and methods of use
WO2017218864A1 (en) * 2016-06-17 2017-12-21 Mayo Foundation For Medical Education And Research Methods and materials for the effective use of combined targeted enrichment of genomic regions and low coverage whole genome sequencing
CA3027919C (en) * 2016-09-30 2023-02-28 Guardant Health, Inc. Methods for multi-resolution analysis of cell-free nucleic acids
IL266197B2 (en) 2016-10-24 2024-03-01 Geneinfosec Inc Concealing information present within nucleic acids
US10011872B1 (en) 2016-12-22 2018-07-03 10X Genomics, Inc. Methods and systems for processing polynucleotides
WO2018140966A1 (en) 2017-01-30 2018-08-02 10X Genomics, Inc. Methods and systems for droplet-based single cell barcoding
US10995333B2 (en) 2017-02-06 2021-05-04 10X Genomics, Inc. Systems and methods for nucleic acid preparation
US10844372B2 (en) 2017-05-26 2020-11-24 10X Genomics, Inc. Single cell analysis of transposase accessible chromatin
EP4230746A3 (en) 2017-05-26 2023-11-01 10X Genomics, Inc. Single cell analysis of transposase accessible chromatin
US10590244B2 (en) 2017-10-04 2020-03-17 10X Genomics, Inc. Compositions, methods, and systems for bead formation using improved polymers
US10837047B2 (en) 2017-10-04 2020-11-17 10X Genomics, Inc. Compositions, methods, and systems for bead formation using improved polymers
WO2019084043A1 (en) 2017-10-26 2019-05-02 10X Genomics, Inc. Methods and systems for nuclecic acid preparation and chromatin analysis
EP4241882A3 (en) 2017-10-27 2023-12-06 10X Genomics, Inc. Methods for sample preparation and analysis
EP3954782A1 (en) 2017-11-15 2022-02-16 10X Genomics, Inc. Functionalized gel beads
US10829815B2 (en) 2017-11-17 2020-11-10 10X Genomics, Inc. Methods and systems for associating physical and genetic properties of biological particles
WO2019108851A1 (en) 2017-11-30 2019-06-06 10X Genomics, Inc. Systems and methods for nucleic acid preparation and analysis
CN108165646A (en) * 2017-12-26 2018-06-15 河北省农林科学院谷子研究所 A kind of simplification genome banking process suitable for millet
WO2019157529A1 (en) 2018-02-12 2019-08-15 10X Genomics, Inc. Methods characterizing multiple analytes from individual cells or cell populations
US11639928B2 (en) 2018-02-22 2023-05-02 10X Genomics, Inc. Methods and systems for characterizing analytes from individual cells or cell populations
US20190284625A1 (en) * 2018-03-16 2019-09-19 Gencove Inc. Methods for joint low-pass and targeted sequencing
CN112262218A (en) 2018-04-06 2021-01-22 10X基因组学有限公司 System and method for quality control in single cell processing
WO2019217486A1 (en) * 2018-05-08 2019-11-14 Memorial Sloan Kettering Cancer Center Methods and compositions for detecting myeloma
US11932899B2 (en) 2018-06-07 2024-03-19 10X Genomics, Inc. Methods and systems for characterizing nucleic acid molecules
US20200149097A1 (en) * 2018-06-11 2020-05-14 Foundation Medicine, Inc. Compositions and methods for evaluating genomic alterations
US11703427B2 (en) 2018-06-25 2023-07-18 10X Genomics, Inc. Methods and systems for cell and bead processing
US20200032335A1 (en) 2018-07-27 2020-01-30 10X Genomics, Inc. Systems and methods for metabolome analysis
MX2021001763A (en) 2018-08-13 2021-04-19 Rootpath Genomics Inc High throughput cloning of paired bipartite immunoreceptor polynucleotides and applications thereof.
US11519033B2 (en) 2018-08-28 2022-12-06 10X Genomics, Inc. Method for transposase-mediated spatial tagging and analyzing genomic DNA in a biological sample
WO2020123316A2 (en) 2018-12-10 2020-06-18 10X Genomics, Inc. Methods for determining a location of a biological analyte in a biological sample
US11459607B1 (en) 2018-12-10 2022-10-04 10X Genomics, Inc. Systems and methods for processing-nucleic acid molecules from a single cell using sequential co-partitioning and composite barcodes
US11649485B2 (en) 2019-01-06 2023-05-16 10X Genomics, Inc. Generating capture probes for spatial analysis
US11926867B2 (en) 2019-01-06 2024-03-12 10X Genomics, Inc. Generating capture probes for spatial analysis
US11845983B1 (en) 2019-01-09 2023-12-19 10X Genomics, Inc. Methods and systems for multiplexing of droplet based assays
US11467153B2 (en) 2019-02-12 2022-10-11 10X Genomics, Inc. Methods for processing nucleic acid molecules
US11851683B1 (en) 2019-02-12 2023-12-26 10X Genomics, Inc. Methods and systems for selective analysis of cellular samples
SG11202108788TA (en) 2019-02-12 2021-09-29 10X Genomics Inc Methods for processing nucleic acid molecules
US11655499B1 (en) 2019-02-25 2023-05-23 10X Genomics, Inc. Detection of sequence elements in nucleic acid molecules
US11920183B2 (en) 2019-03-11 2024-03-05 10X Genomics, Inc. Systems and methods for processing optically tagged beads
MX2021012207A (en) 2019-04-05 2021-12-10 Rootpath Genomics Inc Compositions and methods for t-cell receptor gene assembly.
WO2021034974A1 (en) * 2019-08-19 2021-02-25 Universal Sequencing Technology Corporation Methods and compositions for tracking nucleic acid fragment origin for nucleic acid sequencing
JP2022552194A (en) 2019-10-10 2022-12-15 1859,インク. Methods and systems for microfluidic screening
WO2021092433A2 (en) 2019-11-08 2021-05-14 10X Genomics, Inc. Enhancing specificity of analyte binding
FI3891300T3 (en) 2019-12-23 2023-05-10 10X Genomics Inc Methods for spatial analysis using rna-templated ligation
US11702693B2 (en) 2020-01-21 2023-07-18 10X Genomics, Inc. Methods for printing cells and generating arrays of barcoded cells
US11732299B2 (en) 2020-01-21 2023-08-22 10X Genomics, Inc. Spatial assays with perturbed cells
US11898205B2 (en) 2020-02-03 2024-02-13 10X Genomics, Inc. Increasing capture efficiency of spatial assays
US11732300B2 (en) 2020-02-05 2023-08-22 10X Genomics, Inc. Increasing efficiency of spatial analysis in a biological sample
AU2021224760A1 (en) * 2020-02-21 2022-09-15 10X Genomics, Inc. Capturing genetic targets using a hybridization approach
US11891654B2 (en) 2020-02-24 2024-02-06 10X Genomics, Inc. Methods of making gene expression libraries
EP4242325A3 (en) 2020-04-22 2023-10-04 10X Genomics, Inc. Methods for spatial analysis using targeted rna depletion
US11851700B1 (en) 2020-05-13 2023-12-26 10X Genomics, Inc. Methods, kits, and compositions for processing extracellular molecules
EP4153776A1 (en) 2020-05-22 2023-03-29 10X Genomics, Inc. Spatial analysis to detect sequence variants
EP4153775A1 (en) 2020-05-22 2023-03-29 10X Genomics, Inc. Simultaneous spatio-temporal measurement of gene expression and cellular activity
WO2021242834A1 (en) 2020-05-26 2021-12-02 10X Genomics, Inc. Method for resetting an array
WO2021252499A1 (en) 2020-06-08 2021-12-16 10X Genomics, Inc. Methods of determining a surgical margin and methods of use thereof
WO2021252591A1 (en) 2020-06-10 2021-12-16 10X Genomics, Inc. Methods for determining a location of an analyte in a biological sample
AU2021294334A1 (en) 2020-06-25 2023-02-02 10X Genomics, Inc. Spatial analysis of DNA methylation
US11761038B1 (en) 2020-07-06 2023-09-19 10X Genomics, Inc. Methods for identifying a location of an RNA in a biological sample
US11926822B1 (en) 2020-09-23 2024-03-12 10X Genomics, Inc. Three-dimensional spatial analysis
US11827935B1 (en) 2020-11-19 2023-11-28 10X Genomics, Inc. Methods for spatial analysis using rolling circle amplification and detection probes
WO2022140028A1 (en) 2020-12-21 2022-06-30 10X Genomics, Inc. Methods, compositions, and systems for capturing probes and/or barcodes
WO2022182682A1 (en) 2021-02-23 2022-09-01 10X Genomics, Inc. Probe-based analysis of nucleic acids and proteins
AU2022271320A1 (en) * 2021-05-05 2023-11-02 The Board Of Trustees Of The Leland Stanford Junior University Methods and systems for analyzing nucleic acid molecules
WO2023034489A1 (en) 2021-09-01 2023-03-09 10X Genomics, Inc. Methods, compositions, and kits for blocking a capture probe on a spatial array

Citations (2)

Publication number Priority date Publication date Assignee Title
US20130130919A1 (en) * 2011-10-18 2013-05-23 The Regents Of The University Of California Long-Range Barcode Labeling-Sequencing
US20140120529A1 (en) * 2012-10-15 2014-05-01 Life Technologies Corporation Compositions, methods, systems and kits for target nucleic acid enrichment

Family Cites Families (524)

Publication number Priority date Publication date Assignee Title
US1124638A (en) 1914-06-11 1915-01-12 James G Coffin Method of producing planographic printing-forms.
US2797149A (en) 1953-01-08 1957-06-25 Technicon International Ltd Methods of and apparatus for analyzing liquids containing crystalloid and non-crystalloid constituents
US3047367A (en) 1959-12-01 1962-07-31 Technicon Instr Automatic analysis with fluid segmentation
US3479141A (en) 1967-05-17 1969-11-18 Technicon Corp Method and apparatus for analysis
US4124638A (en) 1977-09-12 1978-11-07 Hansen John N Solubilizable polyacrylamide gels containing disulfide cross-linkages
US4253846A (en) 1979-11-21 1981-03-03 Technicon Instruments Corporation Method and apparatus for automated analysis of fluid samples
GB2097692B (en) 1981-01-10 1985-05-22 Shaw Stewart P D Combining chemical reagents
DE3230289A1 (en) 1982-08-14 1984-02-16 Bayer Ag, 5090 Leverkusen PRODUCTION OF PHARMACEUTICAL OR COSMETIC DISPERSIONS
US4582802A (en) 1983-09-30 1986-04-15 The United States Of America As Represented By The Department Of Health And Human Services Stimulation of enzymatic ligation of DNA by high concentrations of nonspecific polymers
JPS60227826A (en) 1984-04-27 1985-11-13 Sogo Yatsukou Kk Graft capsule responding to ph
WO1988002057A1 (en) 1986-09-22 1988-03-24 Bergwerksverband Gmbh Combined rigid profiled and expanding anchor
US4916070A (en) 1986-04-14 1990-04-10 The General Hospital Corporation Fibrin-specific antibodies and method of screening for the antibodies
DE3619763A1 (en) 1986-06-12 1987-12-17 Binz Gmbh & Co HEALTH TRANSPORT OR RESCUE VEHICLE
US5618711A (en) 1986-08-22 1997-04-08 Hoffmann-La Roche Inc. Recombinant expression vectors and purification methods for Thermus thermophilus DNA polymerase
US5418149A (en) 1990-07-24 1995-05-23 Hoffmann-La Roche Inc. Reduction of non-specific amplification glycosylase using DUTP and DNA uracil
US4872895A (en) 1986-12-11 1989-10-10 American Telephone And Telegraph Company, At&T Bell Laboratories Method for fabricating articles which include high silica glass bodies
US5525464A (en) 1987-04-01 1996-06-11 Hyseq, Inc. Method of sequencing by hybridization of oligonucleotide probes
US5202231A (en) 1987-04-01 1993-04-13 Drmanac Radoje T Method of sequencing of genomes by hybridization of oligonucleotide probes
US5149625A (en) 1987-08-11 1992-09-22 President And Fellows Of Harvard College Multiplex analysis of DNA
US5137829A (en) 1987-10-05 1992-08-11 Washington University DNA transposon TN5SEQ1
US5185099A (en) 1988-04-20 1993-02-09 Institut National De Recherche Chimique Appliquee Visco-elastic, isotropic materials based on water, fluorinate sufactants and fluorinated oils, process for their preparation, and their use in various fields, such as optics, pharmacology and electrodynamics
US5237016A (en) 1989-01-05 1993-08-17 Siska Diagnostics, Inc. End-attachment of oligonucleotides to polyacrylamide solid supports for capture and detection of nucleic acids
US5756334A (en) 1990-04-26 1998-05-26 New England Biolabs, Inc. Thermostable DNA polymerase from 9°N-7 and methods for producing the same
US5489523A (en) 1990-12-03 1996-02-06 Stratagene Exonuclease-deficient thermostable Pyrococcus furiosus DNA polymerase I
US5270183A (en) 1991-02-08 1993-12-14 Beckman Research Institute Of The City Of Hope Device and method for the automated cycling of solutions between two or more temperatures
US5994056A (en) 1991-05-02 1999-11-30 Roche Molecular Systems, Inc. Homogeneous methods for nucleic acid amplification and detection
AU669489B2 (en) 1991-09-18 1996-06-13 Affymax Technologies N.V. Method of synthesizing diverse collections of oligomers
US5413924A (en) 1992-02-13 1995-05-09 Kosak; Kenneth M. Preparation of wax beads containing a reagent for release by heating
AU3816993A (en) 1992-03-19 1993-10-21 Regents Of The University Of California, The Multiple tag labeling method for DNA sequencing
JP3558294B2 (en) 1992-05-01 2004-08-25 トラスティーズ・オブ・ザ・ユニバーシティ・オブ・ペンシルベニア Polynucleotide amplification analysis using microfabrication equipment
US5587128A (en) 1992-05-01 1996-12-24 The Trustees Of The University Of Pennsylvania Mesoscale polynucleotide amplification devices
US5840865A (en) 1992-09-14 1998-11-24 Institute Of Molecular Biology And Biotechnology/Forth Eukaryotic transposable element
US5569364A (en) 1992-11-05 1996-10-29 Soane Biosciences, Inc. Separation media for electrophoresis
JPH08506664A (en) 1993-02-01 1996-07-16 セック,リミテッド Method and apparatus for DNA sequencing
WO1994019101A1 (en) 1993-02-16 1994-09-01 Alliance Pharmaceutical Corp. Method of microemulsifying fluorinated oils
JPH08511418A (en) 1993-04-19 1996-12-03 メディソーブ・テクノロジーズ・インターナショナル・リミテッド・パートナーシップ Long-acting treatment by sustained release delivery of antisense oligodeoxyribonucleotides from biodegradable ultrafine particles.
DE69429038T2 (en) 1993-07-28 2002-03-21 Pe Corp Ny Norwalk Device and method for nucleic acid amplification
WO1995004069A1 (en) 1993-07-30 1995-02-09 Affymax Technologies N.V. Biotinylation of proteins
US5512131A (en) 1993-10-04 1996-04-30 President And Fellows Of Harvard College Formation of microstamped patterns on surfaces and derivative articles
US20030044777A1 (en) 1993-10-28 2003-03-06 Kenneth L. Beattie Flowthrough devices for multiple discrete binding reactions
US5605793A (en) 1994-02-17 1997-02-25 Affymax Technologies N.V. Methods for in vitro recombination
US5558071A (en) 1994-03-07 1996-09-24 Combustion Electromagnetics, Inc. Ignition system power converter and controller
US5648211A (en) 1994-04-18 1997-07-15 Becton, Dickinson And Company Strand displacement amplification using thermophilic enzymes
ATE190727T1 (en) 1994-05-11 2000-04-15 Genera Tech Ltd METHOD FOR CAPTURE A LIGAND FROM A LIQUID AND DEVICE FOR PERFORMING SAME
US5705628A (en) 1994-09-20 1998-01-06 Whitehead Institute For Biomedical Research DNA purification and isolation using magnetic particles
US5846719A (en) 1994-10-13 1998-12-08 Lynx Therapeutics, Inc. Oligonucleotide tags for sorting and identification
US5585069A (en) 1994-11-10 1996-12-17 David Sarnoff Research Center, Inc. Partitioned microelectronic and fluidic device array for clinical diagnostics and chemical synthesis
WO1996029629A2 (en) 1995-03-01 1996-09-26 President And Fellows Of Harvard College Microcontact printing on surfaces and derivative articles
US5700642A (en) 1995-05-22 1997-12-23 Sri International Oligonucleotide sizing using immobilized cleavable primers
EP0832287B1 (en) 1995-06-07 2007-10-10 Solexa, Inc Oligonucleotide tags for sorting and identification
DE69637285T2 (en) 1995-06-07 2008-07-10 Solexa, Inc., Hayward Oligonucleotide tags for sorting and identification
US5856174A (en) 1995-06-29 1999-01-05 Affymetrix, Inc. Integrated nucleic acid diagnostic device
US6866760B2 (en) 1998-08-27 2005-03-15 E Ink Corporation Electrophoretic medium and process for the production thereof
US5872010A (en) 1995-07-21 1999-02-16 Northeastern University Microscale fluid handling system
US6057149A (en) 1995-09-15 2000-05-02 The University Of Michigan Microscale devices and reactions in microscale devices
US5851769A (en) 1995-09-27 1998-12-22 The Regents Of The University Of California Quantitative DNA fiber mapping
US5736330A (en) 1995-10-11 1998-04-07 Luminex Corporation Method and compositions for flow cytometric determination of DNA sequences
US5736332A (en) 1995-11-30 1998-04-07 Mandecki; Wlodek Method of determining the sequence of nucleic acids employing solid-phase particles carrying transponders
US6001571A (en) 1995-11-30 1999-12-14 Mandecki; Wlodek Multiplex assay for nucleic acids employing transponders
US6051377A (en) 1995-11-30 2000-04-18 Pharmaseq, Inc. Multiplex assay for nucleic acids employing transponders
US6355198B1 (en) 1996-03-15 2002-03-12 President And Fellows Of Harvard College Method of forming articles including waveguides via capillary micromolding and microtransfer molding
EP0832436A1 (en) 1996-04-15 1998-04-01 Dade Behring Inc. Apparatus and method for analysis
US5846727A (en) 1996-06-06 1998-12-08 Board Of Supervisors Of Louisiana State University And Agricultural & Mechanical College Microsystem for rapid DNA sequencing
ATE206633T1 (en) 1996-07-15 2001-10-15 Calcitech Ltd PRODUCTION OF POWDER
US5965443A (en) 1996-09-09 1999-10-12 Wisconsin Alumni Research Foundation System for in vitro transposition
US6133436A (en) 1996-11-06 2000-10-17 Sequenom, Inc. Beads bound to a solid support and to nucleic acids
US5900481A (en) 1996-11-06 1999-05-04 Sequenom, Inc. Bead linkers for immobilizing nucleic acids to solid supports
US6379929B1 (en) 1996-11-20 2002-04-30 The Regents Of The University Of Michigan Chip-based isothermal amplification devices and methods
US5958703A (en) 1996-12-03 1999-09-28 Glaxo Group Limited Use of modified tethers in screening compound libraries
US20050042625A1 (en) 1997-01-15 2005-02-24 Xzillion Gmbh & Co. Mass label linked hybridisation probes
US20020034737A1 (en) 1997-03-04 2002-03-21 Hyseq, Inc. Methods and compositions for detection or quantification of nucleic acid species
US6297006B1 (en) 1997-01-16 2001-10-02 Hyseq, Inc. Methods for sequencing repetitive sequences and for determining the order of sequence subfragments
US7622294B2 (en) 1997-03-14 2009-11-24 Trustees Of Tufts College Methods for detecting target analytes and enzymatic reactions
US6327410B1 (en) 1997-03-14 2001-12-04 The Trustees Of Tufts College Target analyte sensors utilizing Microspheres
US6391622B1 (en) 1997-04-04 2002-05-21 Caliper Technologies Corp. Closed-loop biochemical analyzers
US6143496A (en) 1997-04-17 2000-11-07 Cytonix Corporation Method of sampling, amplifying and quantifying segment of nucleic acid, polymerase chain reaction assembly having nanoliter-sized sample chambers, and method of filling assembly
WO1998052691A1 (en) 1997-05-16 1998-11-26 Alberta Research Council Microfluidic system and methods of use
US6969488B2 (en) 1998-05-22 2005-11-29 Solexa, Inc. System and apparatus for sequential processing of analytes
EP0985142A4 (en) 1997-05-23 2006-09-13 Lynx Therapeutics Inc System and apparaus for sequential processing of analytes
US20040241759A1 (en) 1997-06-16 2004-12-02 Eileen Tozer High throughput screening of libraries
CA2792122C (en) 1997-07-07 2015-09-08 Medical Research Council In vitro sorting method
GB9714716D0 (en) 1997-07-11 1997-09-17 Brax Genomics Ltd Characterising nucleic acids
FI103809B (en) 1997-07-14 1999-09-30 Finnzymes Oy An in vitro method for producing templates for DNA sequencing
US6974669B2 (en) 2000-03-28 2005-12-13 Nanosphere, Inc. Bio-barcodes based on oligonucleotide-modified nanoparticles
US6368871B1 (en) 1997-08-13 2002-04-09 Cepheid Non-planar microstructures for manipulation of fluid samples
WO1999009217A1 (en) 1997-08-15 1999-02-25 Hyseq, Inc. Methods and compositions for detection or quantification of nucleic acid species
WO1999014368A2 (en) 1997-09-15 1999-03-25 Whitehead Institute For Biomedical Research Methods and apparatus for processing a sample of biomolecular analyte using a microfabricated device
US20020092767A1 (en) 1997-09-19 2002-07-18 Aclara Biosciences, Inc. Multiple array microfluidic device units
US7214298B2 (en) 1997-09-23 2007-05-08 California Institute Of Technology Microfabricated cell sorter
EP1029244A4 (en) 1997-10-02 2003-07-23 Aclara Biosciences Inc Capillary assays involving separation of free and bound species
US5842787A (en) 1997-10-09 1998-12-01 Caliper Technologies Corporation Microfluidic systems incorporating varied channel dimensions
CA2305449A1 (en) 1997-10-10 1999-04-22 President & Fellows Of Harvard College Replica amplification of nucleic acid arrays
US6485944B1 (en) 1997-10-10 2002-11-26 President And Fellows Of Harvard College Replica amplification of nucleic acid arrays
US6511803B1 (en) 1997-10-10 2003-01-28 President And Fellows Of Harvard College Replica amplification of nucleic acid arrays
WO1999019515A1 (en) 1997-10-14 1999-04-22 Luminex Corporation Precision fluorescently dyed particles and methods of making and using same
WO1999064867A1 (en) 1997-12-04 1999-12-16 Amersham Pharmacia Biotech Uk Limited Multiple assay method
AU1726199A (en) 1997-12-31 1999-07-19 Chiron Corporation Metastatic cancer regulated gene
AU3196099A (en) 1998-03-27 1999-10-18 President And Fellows Of Harvard College Systematic identification of essential genes by (in vitro) transposon mutagenesis
US6022716A (en) 1998-04-10 2000-02-08 Genset Sa High throughput DNA sequencing vector
AU3555599A (en) 1998-04-13 1999-11-01 Luminex Corporation Liquid labeling with fluorescent microparticles
US5997636A (en) 1998-05-01 1999-12-07 Instrumentation Technology Associates, Inc. Method and apparatus for growing crystals
US6780591B2 (en) 1998-05-01 2004-08-24 Arizona Board Of Regents Method of determining the nucleotide sequence of oligonucleotides and DNA molecules
US6306590B1 (en) 1998-06-08 2001-10-23 Caliper Technologies Corp. Microfluidic matrix localization apparatus and methods
DE69931497T2 (en) 1998-08-07 2007-05-03 Cellay LLC, Cambridge GEL MICRO-DROPS FOR GENETIC ANALYSIS
US6159736A (en) 1998-09-23 2000-12-12 Wisconsin Alumni Research Foundation Method for making insertional mutations using a Tn5 synaptic complex
AR021833A1 (en) 1998-09-30 2002-08-07 Applied Research Systems METHODS OF AMPLIFICATION AND SEQUENCING OF NUCLEIC ACID
JP2002527250A (en) 1998-10-13 2002-08-27 バイオマイクロ システムズ インコーポレイテッド Fluid circuit components based on passive hydrodynamics
US6489096B1 (en) 1998-10-15 2002-12-03 Princeton University Quantitative analysis of hybridization patterns and intensities in oligonucleotide arrays
SE9803614L (en) 1998-10-19 2000-04-20 Muhammed Mamoun Method and apparatus for producing nanoparticles
WO2000026412A1 (en) 1998-11-02 2000-05-11 Kenneth Loren Beattie Nucleic acid analysis using sequence-targeted tandem hybridization
US6569631B1 (en) 1998-11-12 2003-05-27 3-Dimensional Pharmaceuticals, Inc. Microplate thermal shift assay for ligand development using 5-(4″dimethylaminophenyl)-2-(4′-phenyl)oxazole derivative fluorescent dyes
NO986133D0 (en) 1998-12-23 1998-12-23 Preben Lexow Method of DNA Sequencing
GB9900298D0 (en) 1999-01-07 1999-02-24 Medical Res Council Optical sorting method
US6416642B1 (en) 1999-01-21 2002-07-09 Caliper Technologies Corp. Method and apparatus for continuous liquid flow in microscale channels using pressure injection, wicking, and electrokinetic injection
US6635419B1 (en) 1999-02-16 2003-10-21 Applera Corporation Polynucleotide sequencing method
US20030027214A1 (en) 1999-02-17 2003-02-06 Kamb Carl Alexander Methods for substrate-ligand interaction screening
EP1163052B1 (en) 1999-02-23 2010-06-02 Caliper Life Sciences, Inc. Manipulation of microparticles in microfluidic systems
US6171850B1 (en) 1999-03-08 2001-01-09 Caliper Technologies Corp. Integrated devices and systems for performing temperature controlled reactions and analyses
US6303343B1 (en) 1999-04-06 2001-10-16 Caliper Technologies Corp. Inefficient fast PCR
US6908737B2 (en) 1999-04-15 2005-06-21 Vitra Bioscience, Inc. Systems and methods of conducting multiplexed experiments
US20060275782A1 (en) 1999-04-20 2006-12-07 Illumina, Inc. Detection of nucleic acid reactions on bead arrays
US6291243B1 (en) 1999-04-28 2001-09-18 The Board Of Trustees Of The Leland Stanford Jr. University P element derived vector and methods for its use
US6399952B1 (en) 1999-05-12 2002-06-04 Aclara Biosciences, Inc. Multiplexed fluorescent detection in microfluidic devices
WO2000070095A2 (en) 1999-05-17 2000-11-23 Dade Behring Inc. Homogeneous isothermal amplification and detection of nucleic acids using a template switch oligonucleotide
US20020051971A1 (en) 1999-05-21 2002-05-02 John R. Stuelpnagel Use of microfluidic systems in the detection of target analytes using microsphere arrays
US6846622B1 (en) 1999-05-26 2005-01-25 Oregon Health & Science University Tagged epitope protein transposable element
US20030124509A1 (en) 1999-06-03 2003-07-03 Kenis Paul J.A. Laminar flow patterning and articles made thereby
US6372813B1 (en) 1999-06-25 2002-04-16 Motorola Methods and compositions for attachment of biomolecules to solid supports, hydrogels, and hydrogel arrays
AU6068300A (en) 1999-07-06 2001-01-22 Caliper Technologies Corporation Microfluidic systems and methods for determining modulator kinetics
US6977145B2 (en) 1999-07-28 2005-12-20 Serono Genetics Institute S.A. Method for carrying out a biochemical protocol in continuous flow in a microreactor
US6524456B1 (en) 1999-08-12 2003-02-25 Ut-Battelle, Llc Microfluidic devices for the controlled manipulation of small volumes
WO2001013086A2 (en) 1999-08-13 2001-02-22 Brandeis University Detection of nucleic acids
AU6788100A (en) 1999-08-20 2001-03-19 Luminex Corporation Liquid array technology
US6492118B1 (en) 1999-08-27 2002-12-10 Matrix Technologies Corporation Methods of immobilizing ligands on solid supports
US6982146B1 (en) 1999-08-30 2006-01-03 The United States Of America As Represented By The Department Of Health And Human Services High speed parallel molecular nucleic acid sequencing
EP1224453A2 (en) 1999-10-13 2002-07-24 Signature Bioscience, Inc. System and method for detecting and identifying molecular events in a test sample
EP1174622A4 (en) 1999-10-21 2003-01-29 Kurosaki Corp Vertical pump
US6958225B2 (en) 1999-10-27 2005-10-25 Affymetrix, Inc. Complexity management of genomic DNA
AU1100201A (en) 1999-10-28 2001-05-08 Board Of Trustees Of The Leland Stanford Junior University Methods of in vivo gene transfer using a sleeping beauty transposon system
JP4721603B2 (en) 1999-11-08 2011-07-13 栄研化学株式会社 Mutation and / or polymorphism detection method
US6432290B1 (en) 1999-11-26 2002-08-13 The Governors Of The University Of Alberta Apparatus and method for trapping bead based reagents within microfluidic analysis systems
US20020051882A1 (en) 2000-02-18 2002-05-02 Lawton Ernest L. Forming size compositions, glass fibers coated with the same and fabrics woven from such coated fibers
US6720157B2 (en) 2000-02-23 2004-04-13 Zyomyx, Inc. Chips having elevated sample surfaces
IL134830A0 (en) 2000-03-01 2001-05-20 Chay 13 Medical Res Group N V Peptides and immunostimulatory and anti-bacterial pharmaceutical compositions containing them
MXPA02009031A (en) 2000-03-14 2004-08-19 Amylin Pharmaceuticals Inc Effects of glucagon like peptide 1 (7 36) on antro pyloro duodenal motility.
AU5121801A (en) 2000-03-31 2001-10-15 Micronics Inc Protein crystallization in microfluidic structures
WO2001077683A1 (en) 2000-04-06 2001-10-18 Caliper Technologies Corp. Methods and devices for achieving long incubation times in high-throughput systems
US6481453B1 (en) 2000-04-14 2002-11-19 Nanostream, Inc. Microfluidic branch metering systems and methods
US6800298B1 (en) 2000-05-11 2004-10-05 Clemson University Biological lubricant composition and method of applying lubricant composition
US20060008799A1 (en) 2000-05-22 2006-01-12 Hong Cai Rapid haplotyping by single molecule detection
WO2001089696A2 (en) 2000-05-24 2001-11-29 Micronics, Inc. Microfluidic concentration gradient loop
US6645432B1 (en) 2000-05-25 2003-11-11 President & Fellows Of Harvard College Microfluidic systems including three-dimensionally arrayed channel networks
US20060263888A1 (en) 2000-06-02 2006-11-23 Honeywell International Inc. Differential white blood count on a disposable card
US6632606B1 (en) 2000-06-12 2003-10-14 Aclara Biosciences, Inc. Methods for single nucleotide polymorphism detection
ES2259666T3 (en) 2000-06-21 2006-10-16 Bioarray Solutions Ltd MOLECULAR ANALYSIS OF MULTIPLE ANALYTICS USING SERIES OF RANDOM PARTICLES WITH APPLICATION SPECIFICITY.
AU2001281076A1 (en) 2000-08-07 2002-02-18 Nanostream, Inc. Fluidic mixer in microfluidic system
US6610499B1 (en) 2000-08-31 2003-08-26 The Regents Of The University Of California Capillary array and related methods
US6773566B2 (en) 2000-08-31 2004-08-10 Nanolytics, Inc. Electrostatic actuators for microfluidics and methods for using same
DE60140553D1 (en) 2000-09-14 2009-12-31 Caliper Life Sciences Inc MICROFLUIDIC DEVICES AND METHODS FOR CARRYING OUT TEMPERATURE-MEDIATED REACTIONS
EP1334347A1 (en) 2000-09-15 2003-08-13 California Institute Of Technology Microfabricated crossflow devices and methods
AU2001294775A1 (en) 2000-09-29 2002-04-08 Hnc Software, Inc. Score based decisioning
EP1364052A2 (en) 2000-10-10 2003-11-26 Diversa Corporation High throughput or capillary-based screening for a bioactivity or biomolecule
JP2002155305A (en) 2000-11-14 2002-05-31 Akira Kawasaki Equipment and method for manufacturing monodispersed particle, and monodispersed particle manufactured by the manufacturing method
CA2332186A1 (en) 2001-02-08 2002-08-08 Her Majesty In Right Of Canada As Represented By The Minister Of Agriculture And Agri-Food Canada Replicative in vivo gene targeting
US7670559B2 (en) 2001-02-15 2010-03-02 Caliper Life Sciences, Inc. Microfluidic systems with enhanced detection systems
US6620927B2 (en) 2001-02-22 2003-09-16 Anika Therapeutics, Inc. Thiol-modified hyaluronan
CA2438856C (en) 2001-02-23 2007-08-07 Japan Science And Technology Corporation Process and apparatus for producing emulsion and microcapsules
US7211654B2 (en) 2001-03-14 2007-05-01 Regents Of The University Of Michigan Linkers and co-coupling agents for optimization of oligonucleotide synthesis and purification on solid supports
US20020159920A1 (en) 2001-04-03 2002-10-31 Weigl Bernhard H. Multiple redundant microfluidic structures cross reference to related applications
US7138267B1 (en) 2001-04-04 2006-11-21 Epicentre Technologies Corporation Methods and compositions for amplifying DNA clone copy number
US20030027221A1 (en) 2001-04-06 2003-02-06 Scott Melissa E. High-throughput screening assays by encapsulation
US7572642B2 (en) 2001-04-18 2009-08-11 Ambrigen, Llc Assay based on particles, which specifically bind with targets in spatially distributed characteristic patterns
WO2004040406A2 (en) 2001-04-27 2004-05-13 The Directv Group, Inc. Estimating the operating point on a nonlinear traveling wave tube amplifier
ATE410680T1 (en) 2001-05-26 2008-10-15 One Cell Systems Inc SECRETION OF PROTEINS BY ENCAPSULATED CELLS
US6880576B2 (en) 2001-06-07 2005-04-19 Nanostream, Inc. Microfluidic devices for methods development
US7179423B2 (en) 2001-06-20 2007-02-20 Cytonome, Inc. Microfluidic system including a virtual wall fluid interface port for interfacing fluids with the microfluidic system
US6613523B2 (en) 2001-06-29 2003-09-02 Agilent Technologies, Inc. Method of DNA sequencing using cleavable tags
US7682353B2 (en) 2001-06-29 2010-03-23 Coloplast A/S Catheter device
US7077152B2 (en) 2001-07-07 2006-07-18 Nanostream, Inc. Microfluidic metering systems and methods
US6767731B2 (en) 2001-08-27 2004-07-27 Intel Corporation Electron induced fluorescent method for nucleic acid sequencing
KR100415894B1 (en) 2001-08-31 2004-01-24 주식회사 테크노모아 A wire secession preventing device of an overhead door
US7297485B2 (en) 2001-10-15 2007-11-20 Qiagen Gmbh Method for nucleic acid amplification that results in low amplification bias
US20030089605A1 (en) 2001-10-19 2003-05-15 West Virginia University Research Corporation Microfluidic system for proteome analysis
US6783647B2 (en) 2001-10-19 2004-08-31 Ut-Battelle, Llc Microfluidic systems and methods of transport and lysis of cells and analysis of cell lysate
US20030149307A1 (en) 2001-10-24 2003-08-07 Baxter International Inc. Process for the preparation of polyethylene glycol bis amine
WO2003038558A2 (en) 2001-10-30 2003-05-08 Nanomics Biosystems Pty, Ltd. Device and methods for directed synthesis of chemical libraries
GB0127564D0 (en) 2001-11-16 2002-01-09 Medical Res Council Emulsion compositions
US7335153B2 (en) 2001-12-28 2008-02-26 Bio Array Solutions Ltd. Arrays of microparticles and methods of preparation thereof
AU2003210438A1 (en) 2002-01-04 2003-07-24 Board Of Regents, The University Of Texas System Droplet-based microfluidic oligonucleotide synthesis engine
CA2473376A1 (en) 2002-01-16 2003-07-31 Dynal Biotech Asa Method for isolating nucleic acids and protein from a single sample
KR100459870B1 (en) 2002-02-22 2004-12-04 한국과학기술원 CONSTRUCTION OF NOVEL STRAINS CONTAINING MINIMIZING GENOME BY Tn5-COUPLED Cre/loxP EXCISION SYSTEM
EP1488006B1 (en) 2002-03-20 2008-05-28 InnovativeBio.Biz Microcapsules with controllable permeability encapsulating a nucleic acid amplification reaction mixture and their use as reaction compartments for parallel reactions
ATE479899T1 (en) 2002-05-09 2010-09-15 Univ Chicago EQUIPMENT AND METHODS FOR PRESSURE CONTROLLED PLUG TRANSPORT AND REACTION
US7901939B2 (en) 2002-05-09 2011-03-08 University Of Chicago Method for performing crystallization and reactions in pressure-driven fluid plugs
US7527966B2 (en) 2002-06-26 2009-05-05 Transgenrx, Inc. Gene regulation in transgenic animals using a transposon-based vector
JP2006507921A (en) 2002-06-28 2006-03-09 プレジデント・アンド・フェロウズ・オブ・ハーバード・カレッジ Method and apparatus for fluid dispersion
US7927791B2 (en) 2002-07-24 2011-04-19 Ptc Therapeutics, Inc. Methods for identifying small molecules that modulate premature translation termination and nonsense mediated mRNA decay
WO2004021151A2 (en) 2002-08-30 2004-03-11 Orasee Corp. Multi-dimensional image system for digital image input and output
IL151660A0 (en) 2002-09-09 2003-04-10 Univ Ben Gurion Method for isolating and culturing unculturable microorganisms
US20040081962A1 (en) 2002-10-23 2004-04-29 Caifu Chen Methods for synthesizing complementary DNA
WO2005023331A2 (en) 2003-09-04 2005-03-17 The United States Of America As Represented By The Department Of Veterans Affairs Hydrogel nanocomposites for ophthalmic applications
US20040248299A1 (en) 2002-12-27 2004-12-09 Sumedha Jayasena RNA interference
WO2004065617A2 (en) 2003-01-17 2004-08-05 The Trustees Of Boston University Haplotype analysis
AU2004254552B2 (en) 2003-01-29 2008-04-24 454 Life Sciences Corporation Methods of amplifying and sequencing nucleic acids
ATE536419T1 (en) 2003-02-10 2011-12-15 Max Delbrueck Centrum TRANSPOSON BASED TARGETING SYSTEM
US20070088208A1 (en) 2003-02-17 2007-04-19 Mikito Yasuzawa Linear device
US7041481B2 (en) 2003-03-14 2006-05-09 The Regents Of The University Of California Chemical amplification based on fluid partitioning
US7316903B2 (en) 2003-03-28 2008-01-08 United States Of America As Represented By The Department Of Health And Human Services Detection of nucleic acid sequence variations using phage Mu transposase
GB0307428D0 (en) 2003-03-31 2003-05-07 Medical Res Council Compartmentalised combinatorial chemistry
US20060078893A1 (en) 2004-10-12 2006-04-13 Medical Research Council Compartmentalised combinatorial chemistry by microfluidic control
GB0307403D0 (en) 2003-03-31 2003-05-07 Medical Res Council Selection by compartmentalised screening
MXPA05010697A (en) 2003-04-04 2005-12-12 Pfizer Prod Inc Microfluidized oil-in-water emulsions and vaccine compositions.
US20100035254A1 (en) 2003-04-08 2010-02-11 Pacific Biosciences Of California, Inc. Composition and method for nucleic acid sequencing
EP1610888A2 (en) 2003-04-10 2006-01-04 President And Fellows Of Harvard College Formation and control of fluidic species
EP1629286A1 (en) 2003-05-16 2006-03-01 Global Technologies (NZ) Ltd. Method and apparatus for mixing sample and reagent in a suspension fluid
WO2004103565A2 (en) 2003-05-19 2004-12-02 Hans-Knöll-Institut für Naturstoff-Forschung e.V. Device and method for structuring liquids and for dosing reaction liquids into liquid compartments immersed in a separation medium
WO2004105734A1 (en) 2003-05-28 2004-12-09 Valorisation Recherche, Societe En Commandite Method of preparing microcapsules
AU2004250131A1 (en) 2003-06-13 2004-12-29 The General Hospital Corporation Microfluidic systems for size based removal of red blood cells and platelets from blood
GB2403475B (en) 2003-07-01 2008-02-06 Oxitec Ltd Stable integrands
GB0315438D0 (en) 2003-07-02 2003-08-06 Univ Manchester Analysis of mixed cell populations
EP2918595B1 (en) 2003-07-05 2019-12-11 The Johns-Hopkins University Method and compositions for detection and enumeration of genetic variations
BRPI0414004A (en) 2003-08-27 2006-10-24 Harvard College electronic control of fluidic species
JP4353945B2 (en) 2003-09-22 2009-10-28 独立行政法人理化学研究所 Efficient DNA inverted repeat structure preparation method
WO2005069001A1 (en) 2003-09-25 2005-07-28 Toyama Prefecture Microwell array chip and its manufacturing method
EP1691792A4 (en) 2003-11-24 2008-05-28 Yeda Res & Dev Compositions and methods for in vitro sorting of molecular and cellular libraries
US8071364B2 (en) 2003-12-24 2011-12-06 Transgenrx, Inc. Gene therapy using transposon-based vectors
US20050181379A1 (en) 2004-02-18 2005-08-18 Intel Corporation Method and device for isolating and positioning single nucleic acid molecules
CA2557841A1 (en) 2004-02-27 2005-09-09 President And Fellows Of Harvard College Polony fluorescent in situ sequencing beads
KR100552706B1 (en) 2004-03-12 2006-02-20 삼성전자주식회사 Method and apparatus for nucleic acid amplification
US20050221339A1 (en) 2004-03-31 2005-10-06 Medical Research Council Harvard University Compartmentalised screening by microfluidic control
US20060020371A1 (en) 2004-04-13 2006-01-26 President And Fellows Of Harvard College Methods and apparatus for manipulation and/or detection of biological samples and other objects
US7799553B2 (en) 2004-06-01 2010-09-21 The Regents Of The University Of California Microfabricated integrated DNA analysis system
US7700281B2 (en) 2004-06-30 2010-04-20 Usb Corporation Hot start nucleic acid amplification
US7968085B2 (en) 2004-07-05 2011-06-28 Ascendis Pharma A/S Hydrogel formulations
CN100481111C (en) 2004-07-21 2009-04-22 索尼株式会社 Content reproducing device, content processing apparatus, content distribution server, content reproducing method, and content processing method
CN1648671B (en) 2005-02-06 2012-09-26 成都夸常医学工业有限公司 Detecting method for multiple reactor analytic chip and analytic chip and detector
US7608434B2 (en) 2004-08-04 2009-10-27 Wisconsin Alumni Research Foundation Mutated Tn5 transposase proteins and the use thereof
US20080268431A1 (en) 2004-09-14 2008-10-30 Jin-Ho Choy Information Code System Using Dna Sequences
US7892731B2 (en) 2004-10-01 2011-02-22 Radix Biosolutions, Ltd. System and method for inhibiting the decryption of a nucleic acid probe sequence used for the detection of a specific nucleic acid
US7968287B2 (en) 2004-10-08 2011-06-28 Medical Research Council Harvard University In vitro evolution in microfluidic systems
US9492400B2 (en) 2004-11-04 2016-11-15 Massachusetts Institute Of Technology Coated controlled release polymer particles as efficient oral delivery vehicles for biopharmaceuticals
US20080004436A1 (en) 2004-11-15 2008-01-03 Yeda Research And Development Co. Ltd. At The Weizmann Institute Of Science Directed Evolution and Selection Using in Vitro Compartmentalization
US7329493B2 (en) 2004-12-22 2008-02-12 Asiagen Corporation One-tube nested PCR for detecting Mycobacterium tuberculosis
WO2006078841A1 (en) 2005-01-21 2006-07-27 President And Fellows Of Harvard College Systems and methods for forming fluidic droplets encapsulated in particles such as colloidal particles
EP1841879A4 (en) 2005-01-25 2009-05-27 Population Genetics Technologi Isothermal dna amplification
US7393665B2 (en) 2005-02-10 2008-07-01 Population Genetics Technologies Ltd Methods and compositions for tagging and identifying polynucleotides
US7407757B2 (en) 2005-02-10 2008-08-05 Population Genetics Technologies Genetic analysis by sequence-specific sorting
ATE538213T1 (en) 2005-02-18 2012-01-15 Canon Us Life Sciences Inc DEVICE AND METHOD FOR IDENTIFYING GENOMIC DNA OF ORGANISMS
JP4649621B2 (en) 2005-02-21 2011-03-16 国立大学法人 鹿児島大学 Purification method of biodiesel fuel
CA2599683A1 (en) 2005-03-04 2006-09-14 President And Fellows Of Harvard College Method and apparatus for forming multiple emulsions
US9040237B2 (en) 2005-03-04 2015-05-26 Intel Corporation Sensor arrays and nucleic acid sequencing applications
US20070054119A1 (en) 2005-03-04 2007-03-08 Piotr Garstecki Systems and methods of forming particles
JP2006289250A (en) 2005-04-08 2006-10-26 Kao Corp Micro mixer and fluid mixing method using the same
SG162795A1 (en) 2005-06-15 2010-07-29 Callida Genomics Inc Single molecule arrays for genetic and chemical analysis
JP2006349060A (en) 2005-06-16 2006-12-28 Ntn Corp Ball screw
WO2007002490A2 (en) 2005-06-22 2007-01-04 The Research Foundation Of State University Of New York Massively parallel 2-dimensional capillary electrophoresis
US20070154903A1 (en) 2005-06-23 2007-07-05 Nanosphere, Inc. Selective isolation and concentration of nucleic acids from complex samples
EP1921140B1 (en) 2005-07-05 2011-12-14 Juridical Foundation The Chemo-Sero-Therapeutic Research Institute Mutant transposon vector and use thereof
JP5051490B2 (en) 2005-07-08 2012-10-17 独立行政法人産業技術総合研究所 Inorganic microcapsule encapsulating macro-biomaterial and method for producing the same
US20070020640A1 (en) 2005-07-21 2007-01-25 Mccloskey Megan L Molecular encoding of nucleic acid templates for PCR and other forms of sequence analysis
FR2888912B1 (en) 2005-07-25 2007-08-24 Commissariat Energie Atomique METHOD FOR CONTROLLING COMMUNICATION BETWEEN TWO ZONES BY ELECTROWETTING, DEVICE COMPRISING ZONES ISOLABLE FROM EACH OTHER AND METHOD FOR PRODUCING SUCH DEVICE
US20080228695A1 (en) 2005-08-01 2008-09-18 Technorati, Inc. Techniques for analyzing and presenting information in an event-based data aggregation system
DK1924704T3 (en) 2005-08-02 2011-09-05 Rubicon Genomics Inc Compositions and Methods for Processing and Multiplying DNA, including Using Multiple Enzymes in a Single Reaction
KR20050094360A (en) 2005-08-16 2005-09-27 김태완 A wall recycling Ground water, No pollution installation casing, And Electric generator Arganization, Earth gravity Use, No power Water pumping installation, And installation method.
WO2007024840A2 (en) 2005-08-22 2007-03-01 Critical Therapeutics, Inc. Method of quantitating nucleic acids by flow cytometry microparticle-based array
JP2007074967A (en) 2005-09-13 2007-03-29 Canon Inc Identifier probe and method for amplifying nucleic acid by using the same
CN101523156A (en) 2005-09-16 2009-09-02 加利福尼亚大学董事会 A colorimetric bio-barcode amplification assay for analyte detection
US7960104B2 (en) 2005-10-07 2011-06-14 Callida Genomics, Inc. Self-assembled single molecule arrays and uses thereof
WO2007120265A2 (en) 2005-11-14 2007-10-25 Applera Corporation Coded molecules for detecting target analytes
US7932037B2 (en) 2007-12-05 2011-04-26 Perkinelmer Health Sciences, Inc. DNA assays using amplicon probes on encoded particles
EP2363205A3 (en) 2006-01-11 2014-06-04 Raindance Technologies, Inc. Microfluidic Devices And Methods Of Use In The Formation And Control Of Nanoreactors
WO2007087310A2 (en) 2006-01-23 2007-08-02 Population Genetics Technologies Ltd. Nucleic acid analysis using sequence tokens
US7537897B2 (en) 2006-01-23 2009-05-26 Population Genetics Technologies, Ltd. Molecular counting
DE602007009811D1 (en) 2006-01-27 2010-11-25 Harvard College COALESCENCE OF FLUIDIC DROPLETS
HUE030215T2 (en) 2006-02-02 2017-04-28 Univ Leland Stanford Junior Non-invasive fetal genetic screening by digital analysis
WO2007092538A2 (en) 2006-02-07 2007-08-16 President And Fellows Of Harvard College Methods for making nucleotide probes for sequencing and synthesis
GB0603251D0 (en) 2006-02-17 2006-03-29 Isis Innovation DNA conformation
CN101432439B (en) 2006-02-24 2013-07-24 考利达基因组股份有限公司 High throughput genome sequencing on DNA arrays
SG10201405158QA (en) 2006-02-24 2014-10-30 Callida Genomics Inc High throughput genome sequencing on dna arrays
US20070231823A1 (en) 2006-03-23 2007-10-04 Mckernan Kevin J Directed enrichment of genomic DNA for high-throughput sequencing
JP4921829B2 (en) 2006-03-30 2012-04-25 株式会社東芝 Fine particle production apparatus, emulsifier holding part, fine particle production method, and molecular film production method
WO2007114794A1 (en) 2006-03-31 2007-10-11 Nam Trung Nguyen Active control for droplet-based microfluidics
ITTO20060259A1 (en) 2006-04-07 2007-10-08 Elbi Int Spa DEVICE FOR DISTRIBUTING A WASHING AGENT IN A WASHING MACHINE, IN PARTICULAR A DISHWASHER MACHINE.
US7656144B2 (en) 2006-04-07 2010-02-02 Qualcomm, Incorporated Bias generator with reduced current consumption
AU2007237909A1 (en) 2006-04-19 2007-10-25 Applied Biosystems, Llc. Reagents, methods, and libraries for gel-free bead-based sequencing
US7811603B2 (en) 2006-05-09 2010-10-12 The Regents Of The University Of California Microfluidic device for forming monodisperse lipoplexes
EP2530168B1 (en) 2006-05-11 2015-09-16 Raindance Technologies, Inc. Microfluidic Devices
JP5081232B2 (en) 2006-05-22 2012-11-28 ナノストリング テクノロジーズ, インコーポレイテッド System and method for analyzing nanoreporters
EP2636755A1 (en) 2006-05-26 2013-09-11 AltheaDx Incorporated Biochemical analysis of partitioned cells
FR2901717A1 (en) 2006-05-30 2007-12-07 Centre Nat Rech Scient METHOD FOR TREATING DROPS IN A MICROFLUIDIC CIRCUIT
EP4108780A1 (en) 2006-06-14 2022-12-28 Verinata Health, Inc. Rare cell analysis using sample splitting and dna tags
US8715934B2 (en) 2006-06-19 2014-05-06 The Johns Hopkins University Single-molecule PCR on microparticles in water-in-oil emulsions
WO2008005675A2 (en) 2006-06-30 2008-01-10 Applera Corporation Emulsion pcr and amplicon capture
EP1878501A1 (en) 2006-07-14 2008-01-16 Roche Diagnostics GmbH Instrument for heating and cooling
WO2008021123A1 (en) 2006-08-07 2008-02-21 President And Fellows Of Harvard College Fluorocarbon emulsion stabilizing surfactants
ITLO20060004A1 (en) 2006-08-08 2008-02-09 River Pharma Srl "LAPILLE" IS A NEW INVENTION HAVING AS ITS OBJECT A NEW STABLE CHEMICAL COMBINATION FOR COSMETIC AND PHARMACEUTICAL USE CONTAINING AS ACTIVE INGREDIENTS ALPHA-LIPOIC ACID AND DIMETHYL SULFOXIDE, ABLE TO IMPROVE THE ABSORPTION, THE BIO
KR101329658B1 (en) 2006-09-25 2013-11-14 아처 다니엘 미드랜드 캄파니 Superabsorbent surface-treated carboxyalkylated polysaccharides and process for producing same
US7935518B2 (en) 2006-09-27 2011-05-03 Alessandra Luchini Smart hydrogel particles for biomarker harvesting
US8841116B2 (en) 2006-10-25 2014-09-23 The Regents Of The University Of California Inline-injection microdevice and microfabricated integrated DNA analysis system using same
US7910354B2 (en) 2006-10-27 2011-03-22 Complete Genomics, Inc. Efficient arrays of amplified polynucleotides
DK2518162T3 (en) 2006-11-15 2018-06-18 Biospherex Llc Multi-tag sequencing and ecogenomic analysis
US20080242560A1 (en) 2006-11-21 2008-10-02 Gunderson Kevin L Methods for generating amplified nucleic acid arrays
US8598328B2 (en) 2006-12-13 2013-12-03 National University Corporation Nagoya University Tol1 factor transposase and DNA introduction system using the same
US7844658B2 (en) 2007-01-22 2010-11-30 Comcast Cable Holdings, Llc System and method for providing an application to a device
US20080176768A1 (en) 2007-01-23 2008-07-24 Honeywell International Hydrogel microarray with embedded metal nanoparticles
WO2008093098A2 (en) 2007-02-02 2008-08-07 Illumina Cambridge Limited Methods for indexing samples and sequencing multiple nucleotide templates
US8003312B2 (en) 2007-02-16 2011-08-23 The Board Of Trustees Of The Leland Stanford Junior University Multiplex cellular assays using detectable cell barcodes
FI20075124A0 (en) 2007-02-21 2007-02-21 Valtion Teknillinen Method and test kit for detection of nucleotide variations
MX2009009541A (en) 2007-03-07 2009-09-16 Alantos Pharm Holding Metalloprotease inhibitors containing a heterocyclic moiety.
WO2008109176A2 (en) 2007-03-07 2008-09-12 President And Fellows Of Harvard College Assays and other reactions involving droplets
US20080228288A1 (en) 2007-03-13 2008-09-18 Ronald Harry Nelson Composite Prosthetic Foot
US20080228268A1 (en) 2007-03-15 2008-09-18 Uluru, Inc. Method of Formation of Viscous, Shape Conforming Gels and Their Uses as Medical Prosthesis
CN102014871A (en) 2007-03-28 2011-04-13 哈佛大学 Emulsions and techniques for formation
WO2008134153A1 (en) 2007-04-23 2008-11-06 Advanced Liquid Logic, Inc. Bead-based multiplexed analytical methods and instrumentation
CN101293191B (en) 2007-04-25 2011-11-09 中国科学院过程工程研究所 Agarose gelatin microsphere preparation method
JP2010528608A (en) 2007-06-01 2010-08-26 454 ライフ サイエンシーズ コーポレイション System and method for identifying individual samples from complex mixtures
US20100255556A1 (en) 2007-06-29 2010-10-07 President And Fellows Of Harvard College Methods and apparatus for manipulation of fluidic species
WO2009011808A1 (en) 2007-07-13 2009-01-22 President And Fellows Of Harvard College Droplet-based selection
WO2009015296A1 (en) 2007-07-24 2009-01-29 The Regents Of The University Of California Microfabricated droplet generator
US20130084243A1 (en) 2010-01-27 2013-04-04 Liliane Goetsch Igf-1r specific antibodies useful in the detection and diagnosis of cellular proliferative disorders
US8563527B2 (en) 2007-08-20 2013-10-22 Pharmain Corporation Oligonucleotide core carrier compositions for delivery of nucleic acid-containing therapeutic agents, methods of making and using the same
US8268564B2 (en) 2007-09-26 2012-09-18 President And Fellows Of Harvard College Methods and applications for stitched DNA barcodes
WO2009048532A2 (en) 2007-10-05 2009-04-16 President And Fellows Of Harvard College Formation of particles for ultrasound application, drug release, and other uses, and microfluidic methods of preparation
US20090099040A1 (en) 2007-10-15 2009-04-16 Sigma Aldrich Company Degenerate oligonucleotides and their uses
US20100086914A1 (en) 2008-10-03 2010-04-08 Roche Molecular Systems, Inc. High resolution, high throughput hla genotyping by clonal sequencing
EP2053132A1 (en) * 2007-10-23 2009-04-29 Roche Diagnostics GmbH Enrichment and sequence analysis of genomic regions
WO2009061372A1 (en) 2007-11-02 2009-05-14 President And Fellows Of Harvard College Systems and methods for creating multi-phase entities, including particles and/or fluids
US8592150B2 (en) 2007-12-05 2013-11-26 Complete Genomics, Inc. Methods and compositions for long fragment read sequencing
US20110008775A1 (en) 2007-12-10 2011-01-13 Xiaolian Gao Sequencing of nucleic acids
US7771944B2 (en) 2007-12-14 2010-08-10 The Board Of Trustees Of The University Of Illinois Methods for determining genetic haplotypes and DNA mapping
US9797010B2 (en) 2007-12-21 2017-10-24 President And Fellows Of Harvard College Systems and methods for nucleic acid sequencing
US9034580B2 (en) 2008-01-17 2015-05-19 Sequenom, Inc. Single molecule nucleic acid sequence analysis processes and compositions
US9262594B2 (en) 2008-01-18 2016-02-16 Microsoft Technology Licensing, Llc Tamper evidence per device protected identity
JP5468271B2 (en) 2008-02-08 2014-04-09 花王株式会社 Method for producing fine particle dispersion
US8034568B2 (en) 2008-02-12 2011-10-11 Nugen Technologies, Inc. Isothermal nucleic acid amplification methods and compositions
WO2009138408A2 (en) 2008-05-14 2009-11-19 INSERM (Institut National de la Santé et de la Recherche Médicale) Methods and kits for the diagnosis of rheumatoid arthritis
US9068181B2 (en) 2008-05-23 2015-06-30 The General Hospital Corporation Microfluidic droplet encapsulation
GB0810051D0 (en) 2008-06-02 2008-07-09 Oxford Biodynamics Ltd Method of diagnosis
KR20110042050A (en) 2008-06-05 2011-04-22 프레지던트 앤드 펠로우즈 오브 하바드 칼리지 Polymersomes, colloidosomes, liposomes, and other species associated with fluidic droplets
WO2010003132A1 (en) 2008-07-02 2010-01-07 Illumina Cambridge Ltd. Using populations of beads for the fabrication of arrays on surfaces
CA2730292C (en) 2008-07-11 2016-06-14 Eth Zurich Degradable microcapsules
EP4047367A1 (en) 2008-07-18 2022-08-24 Bio-Rad Laboratories, Inc. Method for detecting target analytes with droplet libraries
US20100062494A1 (en) 2008-08-08 2010-03-11 President And Fellows Of Harvard College Enzymatic oligonucleotide pre-adenylation
US8383345B2 (en) 2008-09-12 2013-02-26 University Of Washington Sequence tag directed subassembly of short sequencing reads into long sequencing reads
WO2010033200A2 (en) 2008-09-19 2010-03-25 President And Fellows Of Harvard College Creation of libraries of droplets and related species
US20120252015A1 (en) 2011-02-18 2012-10-04 Bio-Rad Laboratories Methods and compositions for detecting genetic material
US8663920B2 (en) 2011-07-29 2014-03-04 Bio-Rad Laboratories, Inc. Library characterization by digital assay
US8709762B2 (en) 2010-03-02 2014-04-29 Bio-Rad Laboratories, Inc. System for hot-start amplification via a multiple emulsion
US9156010B2 (en) 2008-09-23 2015-10-13 Bio-Rad Laboratories, Inc. Droplet-based assay system
MX2011002936A (en) 2008-09-25 2011-04-11 Cephalon Inc Liquid formulations of bendamustine.
US8361299B2 (en) 2008-10-08 2013-01-29 Sage Science, Inc. Multichannel preparative electrophoresis system
US9080211B2 (en) 2008-10-24 2015-07-14 Epicentre Technologies Corporation Transposon end compositions and methods for modifying nucleic acids
PL2963709T3 (en) 2008-10-24 2017-11-30 Epicentre Technologies Corporation Transposon end compositions and methods for modifying nucleic acids
US20100113296A1 (en) 2008-11-05 2010-05-06 Joel Myerson Methods And Kits For Nucleic Acid Analysis
US8748103B2 (en) 2008-11-07 2014-06-10 Sequenta, Inc. Monitoring health and disease status using clonotype profiles
US8748094B2 (en) 2008-12-19 2014-06-10 President And Fellows Of Harvard College Particle-assisted nucleic acid sequencing
KR101065807B1 (en) 2009-01-23 2011-09-19 충남대학교산학협력단 Preparation method for micro-capsule using a microfluidic chip system
JP5457222B2 (en) 2009-02-25 2014-04-02 エフ.ホフマン−ラ ロシュ アーゲー Miniaturized high-throughput nucleic acid analysis
US9347092B2 (en) 2009-02-25 2016-05-24 Roche Molecular Systems, Inc. Solid support for high-throughput nucleic acid analysis
EP2406003A2 (en) 2009-03-13 2012-01-18 President and Fellows of Harvard College Scale-up of flow-focusing microfluidic devices
WO2010104604A1 (en) 2009-03-13 2010-09-16 President And Fellows Of Harvard College Method for the controlled creation of emulsions, including multiple emulsions
EP3415235A1 (en) 2009-03-23 2018-12-19 Raindance Technologies Inc. Manipulation of microfluidic droplets
EP3002337B1 (en) 2009-03-30 2018-10-24 Illumina, Inc. Gene expression analysis in single cells
CN102439177B (en) 2009-04-02 2014-10-01 弗卢伊蒂格姆公司 Multi-primer amplification method for barcoding of target nucleic acids
FR2945819B1 (en) 2009-05-19 2011-06-17 Commissariat Energie Atomique DEVICE AND METHOD FOR ISOLATING BIOLOGICAL OR CHEMICAL TARGETS
US8574835B2 (en) 2009-05-29 2013-11-05 Life Technologies Corporation Scaffolded nucleic acid polymer particles and methods of making and using
US9524369B2 (en) * 2009-06-15 2016-12-20 Complete Genomics, Inc. Processing and analysis of complex nucleic acid sequence data
AU2010260088B2 (en) 2009-06-15 2016-02-11 Complete Genomics, Inc. Methods and compositions for long fragment read sequencing
US9757698B2 (en) 2009-06-26 2017-09-12 President And Fellows Of Harvard College Fluid injection
US20110028412A1 (en) 2009-08-03 2011-02-03 Cappellos, Inc. Herbal enhanced analgesic formulations
US20110033548A1 (en) 2009-08-05 2011-02-10 E.I. Du Pont De Nemours And Company Degradable crosslinked aminated dextran microspheres and methods of use
WO2011021102A2 (en) 2009-08-20 2011-02-24 Population Genetics Technologies Ltd Compositions and methods for intramolecular nucleic acid rearrangement
EP2473263B1 (en) 2009-09-02 2022-11-02 President and Fellows of Harvard College Multiple emulsions created using jetting and other techniques
CA2767056C (en) 2009-09-02 2018-12-04 Bio-Rad Laboratories, Inc. System for mixing fluids by coalescence of multiple emulsions
US9625454B2 (en) 2009-09-04 2017-04-18 The Research Foundation For The State University Of New York Rapid and continuous analyte processing in droplet microfluidic devices
GB0918564D0 (en) 2009-10-22 2009-12-09 Plasticell Ltd Nested cell encapsulation
EP2493619B1 (en) 2009-10-27 2018-12-19 President and Fellows of Harvard College Droplet creation techniques
US10207240B2 (en) 2009-11-03 2019-02-19 Gen9, Inc. Methods and microfluidic devices for the manipulation of droplets in high fidelity polynucleotide assembly
EP2496700B1 (en) 2009-11-04 2017-03-01 The University Of British Columbia Nucleic acid-containing lipid particles and related methods
JP2013511991A (en) 2009-11-25 2013-04-11 クアンタライフ, インコーポレイテッド Methods and compositions for detecting genetic material
GB2485850C (en) 2009-11-25 2019-01-23 Bio Rad Laboratories Methods and compositions for detecting copy number and chromosome aneuploidy by ligation probes and partitioning the ligated products prior to amplification
US9023769B2 (en) 2009-11-30 2015-05-05 Complete Genomics, Inc. cDNA library for nucleic acid sequencing
US8835358B2 (en) 2009-12-15 2014-09-16 Cellular Research, Inc. Digital counting of individual molecules by stochastic attachment of diverse labels
CN102656279A (en) 2009-12-17 2012-09-05 凯津公司 Restriction enzyme based whole genome sequencing
EP2517025B1 (en) 2009-12-23 2019-11-27 Bio-Rad Laboratories, Inc. Methods for reducing the exchange of molecules between droplets
US8716467B2 (en) 2010-03-03 2014-05-06 Gen9, Inc. Methods and devices for nucleic acid synthesis
US8342338B2 (en) 2010-03-22 2013-01-01 Hydro International Plc Separator for separating solids from an influent
CA2767182C (en) 2010-03-25 2020-03-24 Bio-Rad Laboratories, Inc. Droplet generation for droplet-based assays
US20120000777A1 (en) 2010-06-04 2012-01-05 The Regents Of The University Of California Devices and methods for forming double emulsion droplet compositions and polymer particles
US8703493B2 (en) 2010-06-15 2014-04-22 Src, Inc. Location analysis using fire retardant-protected nucleic acid-labeled tags
WO2012003374A2 (en) 2010-07-02 2012-01-05 The Board Of Trustees Of The Leland Stanford Junior University Targeted sequencing library preparation by genomic dna circularization
WO2012012037A1 (en) 2010-07-19 2012-01-26 New England Biolabs, Inc. Oligonucleotide adaptors: compositions and methods of use
US20130203675A1 (en) 2010-09-16 2013-08-08 Joseph M. DeSimone Asymmetric biofunctional silyl monomers and particles thereof as prodrugs and delivery vehicles for pharmaceutical, chemical and biological agents
EP2623613B8 (en) 2010-09-21 2016-09-07 Population Genetics Technologies Ltd. Increasing confidence of allele calls with molecular counting
JP6114694B2 (en) 2010-10-04 2017-04-12 ジナプシス インコーポレイテッド Systems and methods for automated reusable parallel biological reactions
US9999886B2 (en) 2010-10-07 2018-06-19 The Regents Of The University Of California Methods and systems for on demand droplet generation and impedance based detection
CA2814049C (en) 2010-10-08 2021-07-13 President And Fellows Of Harvard College High-throughput single cell barcoding
EP2635679B1 (en) 2010-11-05 2017-04-19 Illumina, Inc. Linking sequence reads using paired code tags
US8829171B2 (en) 2011-02-10 2014-09-09 Illumina, Inc. Linking sequence reads using paired code tags
US9074251B2 (en) 2011-02-10 2015-07-07 Illumina, Inc. Linking sequence reads using paired code tags
US9009487B2 (en) 2010-11-19 2015-04-14 International Business Machines Corporation Device archiving of past cluster binding information on a broadcast encryption-based network
DK2652155T3 (en) 2010-12-16 2017-02-13 Gigagen Inc Methods for Massive Parallel Analysis of Nucleic Acids in Single Cells
US8765455B2 (en) 2011-01-27 2014-07-01 Lawrence Livermore National Security, Llc Chip-based droplet sorting
ES2762866T3 (en) 2011-01-28 2020-05-26 Illumina Inc Nucleotide replacement by doubly-labeled and directional libraries
US8785380B2 (en) 2011-02-01 2014-07-22 M/S Akay Flavours & Aromatics Pvt Ltd. Formulation containing curcuminoids exhibiting enhanced bioavailability
US10457936B2 (en) 2011-02-02 2019-10-29 University Of Washington Through Its Center For Commercialization Massively parallel contiguity mapping
US9150852B2 (en) 2011-02-18 2015-10-06 Raindance Technologies, Inc. Compositions and methods for molecular labeling
GB2489714B (en) 2011-04-05 2013-11-06 Tracesa Ltd Fluid Identification Method
EP2694489B1 (en) 2011-04-07 2017-09-06 Merck Sharp & Dohme Corp. C5-c6 oxacyclic-fused thiadiazine dioxide compounds as bace inhibitors, compositions, and their use
JP2014512826A (en) 2011-04-25 2014-05-29 バイオ−ラド ラボラトリーズ インコーポレイテッド Methods and compositions for nucleic acid analysis
KR102151656B1 (en) 2011-04-28 2020-09-03 더 보드 어브 트러스티스 어브 더 리랜드 스탠포드 주니어 유니버시티 Identification of polynucleotides associated with a sample
EP3072977B1 (en) 2011-04-28 2018-09-19 Life Technologies Corporation Methods and compositions for multiplex pcr
JP6100685B2 (en) 2011-05-16 2017-03-22 地方独立行政法人 大阪府立病院機構 Method for assessing progression of malignant neoplasia by quantitative detection of blood DNA
WO2012162296A2 (en) 2011-05-23 2012-11-29 President And Fellows Of Harvard College Control of emulsions, including multiple emulsions
US9005935B2 (en) 2011-05-23 2015-04-14 Agilent Technologies, Inc. Methods and compositions for DNA fragmentation and tagging by transposases
US9617598B2 (en) 2011-05-27 2017-04-11 President And Fellows Of Harvard College Methods of amplifying whole genome of a single cell
US8841071B2 (en) 2011-06-02 2014-09-23 Raindance Technologies, Inc. Sample multiplexing
US9150916B2 (en) 2011-06-24 2015-10-06 Beat Christen Compositions and methods for identifying the essential genome of an organism
US8927218B2 (en) 2011-06-27 2015-01-06 Flir Systems, Inc. Methods and compositions for segregating target nucleic acid from mixed nucleic acid samples
US20130017978A1 (en) 2011-07-11 2013-01-17 Finnzymes Oy Methods and transposon nucleic acids for generating a dna library
US9605304B2 (en) 2011-07-20 2017-03-28 The Hong Kong Polytechnic University Ultra-stable oligonucleotide-gold and-silver nanoparticle conjugates and method of their preparation
US20130189700A1 (en) 2011-07-25 2013-07-25 Bio-Rad Laboratories, Inc. Breakage of an emulsion containing nucleic acid
US9249460B2 (en) 2011-09-09 2016-02-02 The Board Of Trustees Of The Leland Stanford Junior University Methods for obtaining a sequence
WO2013055955A1 (en) 2011-10-12 2013-04-18 Complete Genomics, Inc. Identification of dna fragments and structural variations
US20130109576A1 (en) 2011-10-28 2013-05-02 Anthony P. Shuber Methods for detecting mutations
EP2773262B1 (en) 2011-10-31 2022-11-16 University of Utah Research Foundation Evaluation of cardiac structure
CN104012011B (en) 2011-11-04 2018-11-13 英特尔公司 The signaling of configuration for downlink multipoint cooperation communication
EP2786019B1 (en) 2011-11-16 2018-07-25 International Business Machines Corporation Microfluidic device with deformable valve
US9938524B2 (en) 2011-11-22 2018-04-10 Active Motif, Inc. Multiplex isolation of protein-associated nucleic acids
US10689643B2 (en) 2011-11-22 2020-06-23 Active Motif, Inc. Targeted transposition for use in epigenetic studies
US10227639B2 (en) 2011-12-22 2019-03-12 President And Fellows Of Harvard College Compositions and methods for analyte detection
KR20130076978A (en) 2011-12-29 2013-07-09 삼성전자주식회사 Display device and color calibration method for the same
CN104245745B (en) 2012-02-09 2017-03-29 生命技术公司 hydrophilic polymer particle and preparation method thereof
EP2814984A4 (en) 2012-02-14 2015-07-29 Univ Johns Hopkins Mirna analysis methods
EP2814472A4 (en) 2012-02-15 2015-11-04 Wisconsin Alumni Res Found Dithioamine reducing agents
US10202628B2 (en) 2012-02-17 2019-02-12 President And Fellows Of Harvard College Assembly of nucleic acid sequences in emulsions
US9176031B2 (en) 2012-02-24 2015-11-03 Raindance Technologies, Inc. Labeling and sample preparation for sequencing
LT3305918T (en) 2012-03-05 2020-09-25 President And Fellows Of Harvard College Methods for epigenetic sequencing
NO2694769T3 (en) 2012-03-06 2018-03-03
EP2647426A1 (en) 2012-04-03 2013-10-09 Max-Planck-Gesellschaft zur Förderung der Wissenschaften e.V. Replication of distributed nucleic acid molecules with preservation of their relative distribution through hybridization-based binding
US20130296173A1 (en) 2012-04-23 2013-11-07 Complete Genomics, Inc. Pre-anchor wash
EP2852687A4 (en) 2012-05-21 2016-10-05 Scripps Research Inst Methods of sample preparation
CA2875695C (en) 2012-06-15 2022-11-15 The Board Of Regents Of The University Of Texas System High throughput sequencing of multiple transcripts of a single cell
US10221442B2 (en) 2012-08-14 2019-03-05 10X Genomics, Inc. Compositions and methods for sample processing
US20140378322A1 (en) 2012-08-14 2014-12-25 10X Technologies, Inc. Compositions and methods for sample processing
US20150005199A1 (en) 2012-08-14 2015-01-01 10X Technologies, Inc. Compositions and methods for sample processing
US9701998B2 (en) 2012-12-14 2017-07-11 10X Genomics, Inc. Methods and systems for processing polynucleotides
CA2881685C (en) 2012-08-14 2023-12-05 10X Genomics, Inc. Microcapsule compositions and methods
US20150005200A1 (en) 2012-08-14 2015-01-01 10X Technologies, Inc. Compositions and methods for sample processing
US9951386B2 (en) 2014-06-26 2018-04-24 10X Genomics, Inc. Methods and systems for processing polynucleotides
US20140378345A1 (en) 2012-08-14 2014-12-25 10X Technologies, Inc. Compositions and methods for sample processing
US20140378349A1 (en) 2012-08-14 2014-12-25 10X Technologies, Inc. Compositions and methods for sample processing
WO2014047561A1 (en) 2012-09-21 2014-03-27 The Broad Institute Inc. Compositions and methods for labeling of agents
JP2014067204A (en) 2012-09-26 2014-04-17 Brother Ind Ltd Panel control device, panel control method, and panel control program
US9644199B2 (en) 2012-10-01 2017-05-09 Agilent Technologies, Inc. Immobilized transposase complexes for DNA fragmentation and tagging
GB201217772D0 (en) 2012-10-04 2012-11-14 Base4 Innovation Ltd Sequencing method
CN107090491A (en) 2012-11-05 2017-08-25 鲁比康基因组学公司 Bar coding nucleic acid
CA2890441A1 (en) 2012-11-07 2014-05-15 Good Start Genetics, Inc. Methods and systems for identifying contamination in samples
CN105026576A (en) 2012-12-03 2015-11-04 以琳生物药物有限公司 Single-stranded polynucleotide amplification methods
EP3567116A1 (en) 2012-12-14 2019-11-13 10X Genomics, Inc. Methods and systems for processing polynucleotides
EP2752664A1 (en) 2013-01-07 2014-07-09 Max-Planck-Gesellschaft zur Förderung der Wissenschaften e.V. Label-free method for the detection of analytes
US9683230B2 (en) 2013-01-09 2017-06-20 Illumina Cambridge Limited Sample preparation on a solid support
EP2954065B1 (en) 2013-02-08 2021-07-28 10X Genomics, Inc. Partitioning and processing of analytes and other species
EP3418398B1 (en) 2013-03-08 2020-05-13 Bio-Rad Laboratories, Inc. Compositions for polymerase chain reaction assays
US10557133B2 (en) 2013-03-13 2020-02-11 Illumina, Inc. Methods and compositions for nucleic acid sequencing
US20140274729A1 (en) 2013-03-15 2014-09-18 Nugen Technologies, Inc. Methods, compositions and kits for generation of stranded rna or dna libraries
US9328382B2 (en) 2013-03-15 2016-05-03 Complete Genomics, Inc. Multiple tagging of individual long DNA fragments
WO2014144496A1 (en) 2013-03-15 2014-09-18 Bruker Nano, Inc. Chemical nano-identification of a sample using normalized near-field spectroscopy
CA2906076A1 (en) 2013-03-15 2014-09-18 Abvitro, Inc. Single cell bar-coding for antibody discovery
DK2971080T3 (en) 2013-03-15 2018-02-12 Expedeon S L METHODS FOR AMPLIFICATION AND SEQUENCE USING THERMOSTABLE TTHPRIMPOL
US9896717B2 (en) 2013-05-09 2018-02-20 Bio-Rad Laboratories, Inc. Magnetic immuno digital PCR assay
JP6618894B2 (en) 2013-05-23 2019-12-11 ザ ボード オブ トラスティーズ オブ ザ レランド スタンフォード ジュニア ユニバーシティー Transition to natural chromatin for individual epigenomics
US20160122753A1 (en) 2013-06-12 2016-05-05 Tarjei Mikkelsen High-throughput rna-seq
MX2015017312A (en) 2013-06-17 2017-04-10 Broad Inst Inc Delivery and use of the crispr-cas systems, vectors and compositions for hepatic targeting and therapy.
WO2014205296A1 (en) 2013-06-21 2014-12-24 The Broad Institute, Inc. Methods for shearing and tagging dna for chromatin immunoprecipitation and sequencing
MX361481B (en) 2013-06-27 2018-12-06 10X Genomics Inc Compositions and methods for sample processing.
WO2015006700A1 (en) 2013-07-12 2015-01-15 University Of South Alabama Minimal piggybac vectors for genome integration
EP3039158B1 (en) 2013-08-28 2018-11-14 Cellular Research, Inc. Massively parallel single cell analysis
CN105764490B (en) 2013-09-24 2020-10-09 加利福尼亚大学董事会 Encapsulated sensors and sensing systems for bioassays and diagnostics and methods of making and using the same
GB201317301D0 (en) 2013-09-30 2013-11-13 Linnarsson Sten Method for capturing and encoding nucleic acid from a plurality of single cells
US9937495B2 (en) 2013-10-28 2018-04-10 Massachusetts Institute Of Technology Hydrogel microstructures with immiscible fluid isolation for small reaction volumes
SG10201807112XA (en) 2013-12-30 2018-09-27 Atreca Inc Analysis of nucleic acids associated with single cells using nucleic acid barcodes
WO2015113725A1 (en) 2014-02-03 2015-08-06 Thermo Fisher Scientific Baltics Uab Method for controlled dna fragmentation
DK3110975T3 (en) 2014-02-27 2021-10-11 Jumpcode Genomics Inc METHODS FOR ANALYSIS OF SOMATIC MOBILE ELEMENTS AND USES THEREOF
AU2015243445B2 (en) 2014-04-10 2020-05-28 10X Genomics, Inc. Fluidic devices, systems, and methods for encapsulating and partitioning reagents, and applications of same
TWI682997B (en) 2014-04-15 2020-01-21 美商伊路米納有限公司 Modified transposases for improved insertion sequence bias and increased dna input tolerance
US20150298091A1 (en) 2014-04-21 2015-10-22 President And Fellows Of Harvard College Systems and methods for barcoding nucleic acids
EP3456846B1 (en) 2014-04-21 2022-06-22 President and Fellows of Harvard College Systems and methods for barcoding nucleic acid
US10975371B2 (en) 2014-04-29 2021-04-13 Illumina, Inc. Nucleic acid sequence analysis from single cells
CN107075509B (en) 2014-05-23 2021-03-09 数字基因公司 Haploid panel assay by digitizing transposons
US9534215B2 (en) 2014-06-11 2017-01-03 Life Technologies Corporation Systems and methods for substrate enrichment
US9657404B2 (en) 2014-06-27 2017-05-23 Wistron Neweb Corp. Method of forming metallic pattern on polymer substrate
CN114214314A (en) 2014-06-24 2022-03-22 生物辐射实验室股份有限公司 Digital PCR barcoding
CN106574298A (en) 2014-06-26 2017-04-19 10X基因组学有限公司 Methods and compositions for sample analysis
CN106536756A (en) 2014-06-26 2017-03-22 10X基因组学有限公司 Analysis of nucleic acid sequences
CN106795553B (en) 2014-06-26 2021-06-04 10X基因组学有限公司 Methods of analyzing nucleic acids from individual cells or cell populations
US10017759B2 (en) 2014-06-26 2018-07-10 Illumina, Inc. Library preparation of tagged nucleic acid
CN106575322B (en) 2014-06-26 2019-06-18 10X基因组学有限公司 The method and system of nucleic acid sequence assembly
US20160024558A1 (en) 2014-07-23 2016-01-28 10X Genomics, Inc. Nucleic acid binding proteins and uses thereof
WO2016040476A1 (en) 2014-09-09 2016-03-17 The Broad Institute, Inc. A droplet-based method and apparatus for composite single-cell nucleic acid analysis
EP3950944A1 (en) 2014-09-15 2022-02-09 AbVitro LLC High-throughput nucleotide library sequencing
SG11201703139VA (en) 2014-10-17 2017-07-28 Illumina Cambridge Ltd Contiguity preserving transposition
KR20170073667A (en) 2014-10-29 2017-06-28 10엑스 제노믹스, 인크. Methods and compositions for targeted nucleic acid sequencing
CN112126675B (en) 2015-01-12 2022-09-09 10X基因组学有限公司 Method and system for preparing nucleic acid sequencing library and library prepared by using same
AU2016215304B2 (en) 2015-02-04 2022-01-27 The Regents Of The University Of California Sequencing of nucleic acids via barcoding in discrete entities
EP3262188B1 (en) 2015-02-24 2021-05-05 10X Genomics, Inc. Methods for targeted nucleic acid sequence coverage
WO2016137973A1 (en) 2015-02-24 2016-09-01 10X Genomics Inc Partition processing methods and systems
KR20180008493A (en) 2015-05-18 2018-01-24 10엑스 제노믹스, 인크. Portable solid phase compositions for use in biochemical reactions and assays
CN107532218A (en) 2015-05-18 2018-01-02 10X基因组学有限公司 Stabilize reducing agent and its application method
WO2016187717A1 (en) 2015-05-26 2016-12-01 Exerkine Corporation Exosomes useful for genome editing
US20180087050A1 (en) 2015-05-27 2018-03-29 Jianbiao Zheng Methods of inserting molecular barcodes
KR102622307B1 (en) 2015-06-24 2024-01-05 옥스포드 바이오다이나믹스 피엘씨 Epigenetic chromosome interactions
US10894980B2 (en) 2015-07-17 2021-01-19 President And Fellows Of Harvard College Methods of amplifying nucleic acid sequences mediated by transposase/transposon DNA complexes
PL3334841T3 (en) 2015-08-12 2020-05-18 Cemm - Forschungszentrum Für Molekulare Medizin Gmbh Methods for studying nucleic acids
CA2999888A1 (en) 2015-09-24 2017-03-30 Abvitro Llc Affinity-oligonucleotide conjugates and uses thereof
US11092607B2 (en) 2015-10-28 2021-08-17 The Board Institute, Inc. Multiplex analysis of single cell constituents
SG11201803983UA (en) 2015-11-19 2018-06-28 10X Genomics Inc Transformable tagging compositions, methods, and processes incorporating same
US20170260584A1 (en) 2016-02-11 2017-09-14 10X Genomics, Inc. Cell population analysis using single nucleotide polymorphisms from single cell transcriptomes
WO2017156336A1 (en) 2016-03-10 2017-09-14 The Board Of Trustees Of The Leland Stanford Junior University Transposase-mediated imaging of the accessible genome
CN109923216A (en) 2016-08-31 2019-06-21 哈佛学院董事及会员团体 By the detection combination of biomolecule to the method for the single test using fluorescent in situ sequencing
DK3529357T3 (en) 2016-10-19 2022-04-25 10X Genomics Inc Methods for bar coding nucleic acid molecules from individual cells
EP3555290B1 (en) 2016-12-19 2022-11-02 Bio-Rad Laboratories, Inc. Droplet tagging contiguity preserved tagmented dna
US10011872B1 (en) 2016-12-22 2018-07-03 10X Genomics, Inc. Methods and systems for processing polynucleotides
WO2018140966A1 (en) 2017-01-30 2018-08-02 10X Genomics, Inc. Methods and systems for droplet-based single cell barcoding

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130130919A1 (en) * 2011-10-18 2013-05-23 The Regents Of The University Of California Long-Range Barcode Labeling-Sequencing
US20140120529A1 (en) * 2012-10-15 2014-05-01 Life Technologies Corporation Compositions, methods, systems and kits for target nucleic acid enrichment

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
Amini et al. (online publication date 19 October 2014) Nature Genetics vol. 46 no. 12 pp. 1343-1349 doi:10.1038/ng.3119 *
Bodi et al. (July 2013) Journal of Biomolecular Techniques 24:73-86. *
Frampton et al. (online publication 20 October 2013) Nature Biotechnol. 31:1023-1031 doi:10.1038/nbt.2696. *
Illumina TruSeq Custom Enrichment Kit (pp. 1-4, 2011-2012) *
Kaper, F. et al., PNAS, vol. 110, pp. 5552-5557 (April 2013). *

Cited By (16)

Publication number Priority date Publication date Assignee Title
US11180805B2 (en) 2016-12-22 2021-11-23 10X Genomics, Inc Methods and systems for processing polynucleotides
US10858702B2 (en) 2016-12-22 2020-12-08 10X Genomics, Inc. Methods and systems for processing polynucleotides
US10550429B2 (en) 2016-12-22 2020-02-04 10X Genomics, Inc. Methods and systems for processing polynucleotides
US10793905B2 (en) 2016-12-22 2020-10-06 10X Genomics, Inc. Methods and systems for processing polynucleotides
US10815525B2 (en) 2016-12-22 2020-10-27 10X Genomics, Inc. Methods and systems for processing polynucleotides
US10544413B2 (en) 2017-05-18 2020-01-28 10X Genomics, Inc. Methods and systems for sorting droplets and beads
US11660601B2 (en) 2017-05-18 2023-05-30 10X Genomics, Inc. Methods for sorting particles
US10610865B2 (en) 2017-08-22 2020-04-07 10X Genomics, Inc. Droplet forming devices and system with differential surface properties
US10766032B2 (en) 2017-08-22 2020-09-08 10X Genomics, Inc. Devices having a plurality of droplet formation regions
US10583440B2 (en) 2017-08-22 2020-03-10 10X Genomics, Inc. Method of producing emulsions
US10821442B2 (en) 2017-08-22 2020-11-03 10X Genomics, Inc. Devices, systems, and kits for forming droplets
US10549279B2 (en) 2017-08-22 2020-02-04 10X Genomics, Inc. Devices having a plurality of droplet formation regions
US10898900B2 (en) 2017-08-22 2021-01-26 10X Genomics, Inc. Method of producing emulsions
US10357771B2 (en) 2017-08-22 2019-07-23 10X Genomics, Inc. Method of producing emulsions
US11565263B2 (en) 2017-08-22 2023-01-31 10X Genomics, Inc. Droplet forming devices and system with differential surface properties
US11833515B2 (en) 2017-10-26 2023-12-05 10X Genomics, Inc. Microfluidic channel networks for partitioning

Also Published As

Publication number Publication date
US20240035066A1 (en) 2024-02-01
BR112017008877A2 (en) 2018-07-03
WO2016069939A1 (en) 2016-05-06
US11739368B2 (en) 2023-08-29
AU2015339148B2 (en) 2022-03-10
MX2017005267A (en) 2017-07-26
CN114807307A (en) 2022-07-29
CA2964472A1 (en) 2016-05-06
US20160281138A1 (en) 2016-09-29
JP2017532042A (en) 2017-11-02
US20210269852A1 (en) 2021-09-02
US10287623B2 (en) 2019-05-14
AU2015339148A1 (en) 2017-05-18
CN107002128A (en) 2017-08-01
EP3212807A1 (en) 2017-09-06
US20160281161A1 (en) 2016-09-29
EP3212807B1 (en) 2020-09-02
US20160281136A1 (en) 2016-09-29
US20160122817A1 (en) 2016-05-05
KR20170073667A (en) 2017-06-28

Similar Documents

Publication Publication Date Title
US11739368B2 (en) Methods and compositions for targeted nucleic acid sequencing
US20220403464A1 (en) Methods and Compositions for Targeted Nucleic Acid Sequence Coverage
US11873528B2 (en) Methods and compositions for nucleic acid analysis

Legal Events

Date Code Title Description
AS Assignment

Owner name: 10X GENOMICS, INC., CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:JAROSZ, MIRNA;SCHNALL-LEVIN, MICHAEL;SAXONOV, SERGE;AND OTHERS;SIGNING DATES FROM 20151103 TO 20151124;REEL/FRAME:038903/0259

STPP Information on status: patent application and granting procedure in general

Free format text: FINAL REJECTION MAILED

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION