WO2016161054A1 - Massive parallel primer dimer-mediated multiplexed single cell-based amplification for concurrent evaluation of multiple target sequences in complex cell mixtures - Google Patents

Info

Publication number
WO2016161054A1
Authority
WO
WIPO (PCT)
Prior art keywords
target
cells
primer
amplification
dimer
Prior art date
Application number
PCT/US2016/025124
Other languages
French (fr)
Inventor
Henricus Franciscus Petrus Maria SCHOENMAKERS
Original Assignee
Pharmacyclics Llc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Pharmacyclics Llc
Publication of WO2016161054A1

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/10 Processes for the isolation, preparation or purification of DNA or RNA
    • C12N15/102 Mutagenizing nucleic acids
    • C12N15/1031 Mutagenizing nucleic acids mutagenesis by gene assembly, e.g. assembly by oligonucleotide extension PCR
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6844 Nucleic acid amplification reactions
    • C12Q1/6853 Nucleic acid amplification reactions using modified primers or templates

Definitions

  • NGS means next generation sequencing
  • RNA sequencing approaches include those commercialized by Illumina, Life Technologies, and others, which utilize short (typically in the 50-200bp size range) fragments of nucleic acid attached to DNA adaptors and to a solid support (typically a slide or bead). Each fragment is amplified using PCR and the addition of each individual nucleotide is recorded, e.g. through parallel measurement of specific fluorescence signals or ion/proton release.
  • NGS / massive parallel sequencing has become a routine laboratory approach when assessing the nature and/or complexity of a mixture of nucleic acids.
  • the genomic (DNA) and transcriptomic (RNA) composition of individual cells is lost in conventional NGS studies, which typically analyze pooled DNA and/or RNA extracted from large numbers of cells or from specimens containing large numbers of (often heterogeneous) cells.
  • low abundance variations in cells will be largely concealed in the bulk signal, and can no longer be assigned to a specific cell or cell tag.
  • the ability to assess co-occurrence of variants or somatic mutations within individual cells is lost when using such a pooled nucleic acids approach.
  • the present invention provides methods and compositions for massive parallel primer dimer-mediated multiplexed single cell-based amplification for concurrent evaluation of multiple target sequences in cell mixtures, including complex cell mixtures.
  • the invention provides scalable methods and compositions for assessing at least two distinct targets in a single cell by generating a multi-target amplicon fusion product via primer dimer-mediated concatenation of individual distinct amplification products originating from the same cell.
  • Primer dimer-mediated amplification serves to join or concatenate at least two targets which are ordinarily distinct and separated into a single amplification product, providing a multi-target amplicon fusion product for analysis.
  • the method is typically conducted on a single cell basis across a sample population of cells and thereby provides data on the entire population of cells in both aggregated and individual (i.e. individual cell resolution) form.
  • a method for simultaneously evaluating at least two distinct target sequences from single cells across a population or mixture of cells wherein single cell-based multi-target amplicon fusion products are generated via primer dimer-mediated concatenation of at least two target sequences from each of said single cells, and the at least two target sequences are assessed on a single cell basis across the population or mixture of cells.
  • One aspect of the method comprises the steps of (a) disaggregating a population or mixture of cells into single cells; (b) manipulating the single cells such that nucleic acid therefrom is available and suitable for individual amplification; (c) amplifying at least two distinct nucleic acid target sequences using a combination of target-specific primers and concatenating nucleic acid target sequences via primer-dimer formation facilitated by unique primer dimer cassettes in target specific primers to amplify multi-target amplicon fusion product(s); and (d) analyzing the multi-target amplicon fusion product(s) from each single cell across the population or mixture of cells.
  • the invention provides methods and compositions for assessing a population or complex mixture of cells for co-occurrence in single cells of variants or mutations in at least two distinct target sequences, whereby single cell-based multi-target amplicon fusion products are generated via primer dimer-mediated concatenation of the at least two target sequences, and are evaluated for the variants or mutations in the at least two distinct target sequences across the population or mixture of cells.
  • the population or mixture of cells may comprise at least 100 cells, at least 1,000 cells, at least 10,000 cells, at least 100,000 cells, or at least 1,000,000 cells.
  • the at least two distinct target sequences are distinct cancer target genes, or are distinct oncogenes, tumor suppressor genes, or differentiation genes.
  • the population or mixture of cells can be from a non-cancer patient, a normal individual, including a healthy individual, a patient or a non-patient, and the like.
  • the population or mixture of cells are cells derived from a patient with an immunological condition or autoimmune disease.
  • the population or mixture of cells are cells derived from a transplant patient.
  • the cells can be cells of an adult, child, infant, or fetus.
  • the amplification is conducted in individual droplets or chambers for each cell in a population or mixture of cells.
  • At least one of the primers of (ii), (iii), (iv) or (v) further comprises additional intervening nucleotides between the dimerization cassette and target complementary sequences, which comprise a linking domain comprising a barcode sequence.
  • the invention includes a method for linking at least two distinct target sequences from single cells in an amplification product for analysis across a population or mixture of cells comprising amplifying nucleic acids from single cells in a population or mixture of cells with the oligonucleotide amplification primer composition of the invention, wherein at least two distinct target sequences are concatenated in a multi-target amplicon fusion product.
  • 5' dimerization cassette sequences lack functionally relevant complementarity to cellular, genomic, or cDNA sequences.
  • in the primers of (ii) and (iii), the dimerization cassette may be directly linked to the target complementary sequences (Target A or B complementary domain) or may be separated from the 3' target complementary sequence (Target A or B complementary domain) by a linking domain.
  • in the primers of the invention that contain a dimerization cassette, the cassette may be directly linked to the target complementary sequences or may be separated from the 3' target complementary sequence by a linking domain.
  • the linking domain comprises at least 2 nucleotides.
  • the linking domain comprises at least 5 nucleotides.
  • the linking domain may be random nucleotides or specific nucleotides.
  • the linking domain may contain a barcode sequence or other sequence for recognition, amplification or isolation.
  • Dimerization cassettes and primer dimer sequences are designed or selected to effectively and specifically form primer dimers, with minimal or no significant complementarity to target sequences or other off-target sequences.
  • in designing dimerization cassettes, several design variables are considered while aiming to establish specific dimers: an increased length allows one to incorporate additional functionality, such as bar codes or internal (sequencing) primer binding sites, whereas the GC percentage of the cassettes (also in combination with an altered length) allows one to optimize reaction kinetics, particularly the formation of desired primer-dimers either early on or later during amplification.
  • Dimerization cassettes should be designed in such a way that they do not support any off-target binding and subsequent primer extension.
  • primer design software can be used during the optimization of specificity of dimerization cassette design.
  • primer design software and successful designs are known in the art and may include Primer3, DNASTAR, OligoPrimer, OligoPerfect, PHUSER and FastPCR (Olsen, LR et al (2011) Nucl Acids Res 39(Web Server Issue):W61-W67; Kalendar R et al (2014) Methods Mol Biol 1116:271-302)
  • additional targets are evaluated in accordance with the method.
  • additional targets may be incorporated in the concatenation of target amplification products.
  • target fusion amplification products are generated for up to 3, 4, 5, 6 or more targets in a single, concatenated amplification product or multi-target amplicon fusion product using overlapping PCR and primer dimers to join each target with another in series.
  • a label may include a fluorescent label, a radioactive label, or a small molecule label.
  • the nucleic acid base may include a label or may be attached to a label.
  • a label may be attached directly or indirectly.
  • a label is attached via a linker, which may include one or more additional component and/or may include one or more modified base.
  • isotopes such as 3H, 14C, 32P, 35S, 36Cl, 51Cr, 57Co, 58Co, 59Fe, 90Y, 125I, 131I, and 186Re may be utilized.
  • the small molecule label is biotin.
  • FIGURES 1A and 1B depict an amplification scheme for two targets - a first target, denoted Target A or "set A", and a second target, denoted Target B or "set B".
  • Set A target primers are denoted P1 and P2.
  • P1 and P2 primers comprise Target A complementary sequences for specific amplification of Target A.
  • P2 is an elongated primer having a first, 5' region for dimerization (the "dimerization cassette" or dimer domain) followed by a second, 3' region which is complementary to Target A (the "Target A domain").
  • Set B target primers are denoted P3 and P4.
  • P3 and P4 comprise Target B complementary sequences for specific amplification of Target B.
  • FIGURES 2A and 2B depict an amplification scheme for three targets - a first target, denoted Target A or "set A", a second target, denoted Target B or "set B", and a third target, denoted Target C or "set C".
  • Set A target primers are denoted P1 and P2.
  • P1 and P2 comprise Target A complementary sequence for specific amplification of Target A.
  • P2 is an elongated primer having a first, 5' region for dimerization (the "dimerization cassette" or dimer domain) followed by a second, 3' region which is complementary to Target A (the "Target A domain").
  • Set B target primers are denoted P3 and P4.
  • the dimer domain of set B primers i.e. the 5' tail region and dimer domain of P4
  • the dimer domain or tail of set C primers i.e. the 5' tail region and dimer domain of P5
  • End Target is linked to a first Intermediate Target via primer pairs wherein one primer comprises a specific 5' first primer dimerization cassette and 3' Intermediate Target complementary sequence, and the other primer comprises 5' sequence complementary to the End Target and 3' primer dimerization cassette complementary to the first dimerization cassette.
  • Intermediate Targets are fused via primer pairs comprising Target complementary sequence and complementary primer dimerization cassettes.
  • Target complementary sequence primers without primer dimerization cassettes are utilized at the 5' and 3' ends of the multi-target amplification product to complete the product amplification via complementarity to end target sequences.
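
The chaining layout described in the preceding items (plain target primers at the two outer ends, and cassette-bearing primers at each junction whose 5' cassettes are complementary to one another) can be sketched in a few lines. This is only an illustration under stated assumptions: the function names, the cassette sequences, and the target-domain strings are hypothetical placeholders and do not come from the patent, and each junction primer is modeled as a 5' cassette directly followed by the 3' target domain, with no linking domain.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def build_primer_chain(targets, cassettes):
    """Lay out primers for joining targets in series.

    targets:   list of (name, forward_domain, reverse_domain) tuples
               (hypothetical target-complementary sequences).
    cassettes: one hypothetical dimerization cassette per junction.
    Returns a list of (primer_name, sequence) tuples written 5' to 3'.
    """
    if len(cassettes) != len(targets) - 1:
        raise ValueError("need exactly one cassette per junction between targets")
    primers = []
    last = len(targets) - 1
    for i, (name, fwd, rev) in enumerate(targets):
        # Forward primer: tailed with the complement of the previous junction
        # cassette, except at the outer 5' end of the chain.
        fwd_tail = reverse_complement(cassettes[i - 1]) if i > 0 else ""
        # Reverse primer: tailed with this junction's cassette, except at the
        # outer 3' end of the chain.
        rev_tail = cassettes[i] if i < last else ""
        primers.append((f"{name}_fwd", fwd_tail + fwd))
        primers.append((f"{name}_rev", rev_tail + rev))
    return primers
```

With three targets and two cassettes, the six primers returned correspond in order to P1 through P6 of a three-target scheme: P1 and P6 carry no cassette, P2 and P3 share the first junction, and P4 and P5 share the second.
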
  • the term "distinct", as applied to target sequences particularly, means that the target sequences are not closely associated physically relative to one another, not closely associated spatially in a nucleic acid or genomic DNA context, are not present on a same RNA or mRNA (or cDNA derived therefrom), or are not closely linked spatially.
  • the distinct target sequences may be linked in a physiologically or clinically relevant sense but are not closely linked in a physical sense in terms of location on genomic DNA, RNA or chromosomal location. Sequences are distinct if they cannot be amplified on a single amplicon in a practical PCR-based manner. Thus, while they may be mapped to the same chromosome, they are thousands of bases apart and could not be amplified together, or could not be cloned together using standard means and methods.
  • primer refers to an oligonucleotide capable of acting as a point of initiation of DNA synthesis under suitable conditions. Such conditions include those in which synthesis of a primer extension product complementary to a nucleic acid strand is induced in the presence of up to four different nucleoside triphosphates and an agent for extension (e.g., a DNA polymerase or reverse transcriptase) in an appropriate buffer and at a suitable temperature.
  • a primer may be a single-stranded nucleic acid, particularly DNA, but may include suitable alternative nucleic acids, such as peptide nucleic acids (PNA).
  • the appropriate length of a primer depends on the intended use of the primer but typically ranges from 6 to 50 nucleotides, in certain embodiments from 10-40 nucleotides, from 10-30 nucleotides, from 15-35 nucleotides, and/or from 20-40 nucleotides. Short primer molecules, or those containing a low GC percentage, generally require lower temperatures to form sufficiently stable hybrid complexes with their template.
  • a primer need not reflect the exact sequence of the template nucleic acid, but must be sufficiently complementary to specifically hybridize with the intended template.
  • a primer is "specific" for a target sequence if, when used in an amplification reaction under appropriate or sufficiently stringent conditions, the primer hybridizes primarily to the target nucleic acid, particularly the pre-selected and specific target sequence.
  • a primer is specific for a target sequence if the primer-target duplex stability is greater than the stability of a duplex formed between the primer and any other sequence found in the sample.
  • a primer may consist entirely of target-specific complementary nucleic acids and thereby hybridize to target along its entire length.
  • a primer may comprise target-specific complementary nucleic acids and additional nucleic acids.
  • the additional nucleic acids may include nucleic acids complementary to another target, may include nucleic acids complementary to another primer, may include nucleic acids that are designed to tag or identify the primer's amplification product, may include barcode sequence, etc.
  • the primers may have additional sequences added (e.g., nucleotides that may not be the same as, or complementary to, the target), such as restriction enzyme recognition sites, adaptor sequences for sequencing, barcode sequences, and the like.
  • the primers may have additional chemical groups added, including ones which do not react with nucleic acids, such as biotin. Therefore, the length of the primers may be longer, such as 55, 56, 57, 58, 59, 60, 65, 70, or 75 nucleotides in length, or more, depending on the specific use or need.
  • the primers herein are selected to be "substantially" complementary to different strands of a particular target DNA sequence.
  • the primers must be sufficiently complementary to hybridize with their respective strands in a specific and effective manner.
  • the relative amount or extent, length, etc. of complementary sequence should be sufficient to provide specific and stable hybridization to the target or selected sequence, in the presence of non-target or non-selected sequence, for the primers to be substantially complementary.
  • the primer sequence need not reflect the exact sequence of the template.
  • a non-complementary nucleotide fragment may be attached to the 5' end of the primer, with the remainder of the primer sequence being complementary to the strand.
  • non-complementary bases or longer sequences can be interspersed into the primer, provided that the primer sequence has sufficient complementarity with the sequence of the strand to hybridize therewith and thereby form the template for the synthesis of the extension product.
  • "restriction endonucleases" and "restriction enzymes" refer to bacterial enzymes, each of which cuts double-stranded DNA at or near a specific nucleotide sequence.
  • Two DNA sequences are "substantially homologous" when at least about 75% (preferably at least about 80%, and most preferably at least about 90 or 95%) of the nucleotides match over the defined length of the DNA sequences. Sequences that are substantially homologous can be identified by comparing the sequences using standard software available in sequence data banks, or in a Southern hybridization experiment under, for example, stringent conditions as defined for that particular system. Defining appropriate hybridization conditions is within the skill of the art.
  • pg means picogram
  • ng means nanogram
  • ug means microgram
  • mg means milligram
  • ul or "µl" means microliter
  • ml means milliliter
  • l means liter.
  • chemiluminescent group refers to a group which emits light as a result of a chemical reaction without the addition of heat.
  • luminol (5-amino-2,3-dihydro-1,4-phthalazinedione) reacts with oxidants like hydrogen peroxide (H2O2) in the presence of a base and a metal catalyst to produce an excited state product (3-aminophthalate, 3-APA).
  • detectable label refers to a label which is observable using analytical techniques including, but not limited to, fluorescence, chemiluminescence, electron-spin resonance, ultraviolet/visible absorbance spectroscopy, mass spectrometry, nuclear magnetic resonance, magnetic resonance, and electrochemical methods.
  • fluorophore refers to a molecule which upon excitation emits photons and is thereby fluorescent.
  • the term "probe” refers to a compound or molecule for the detection of a target, including detection of one or more multi-target amplification product.
  • the probe may comprise complementary nucleic acids specific for an amplification product.
  • the probe may comprise an agent, a linker, a label, or any combination thereof.
  • the probe may comprise an agent and a linker, or an agent and a label, or a label.
  • the probe may comprise an agent and a label.
  • the probe may comprise a binder which recognizes a label on the target product.
  • peripheral blood mononuclear cells e.g., lymphocytes, monocytes and macrophages
  • lymphocytes e.g., B cells, T cells or K cells
  • B cells of the sample may be isolated from other cell types of the sample prior to use in the methods provided.
  • cancer cells may be isolated from normal cells of the sample prior to use in the methods provided.
  • a sample may comprise complex populations of cells, which can be assayed as a population, or separated into sub-populations.
  • methods, assays and compositions which generate target specific information for single individual cells across a population of cells, providing specific cell by cell information assessing two or more targets, including target mutations or variants.
  • the methods and compositions of the invention enable the evaluation of at least two targets simultaneously on a single cell basis, permitting evaluation of co-occurrence at the individual cell level across an entire population of cells in a given sample.
  • the present invention provides a novel approach that combines the power of single cell manipulation, such as via existing micro-fluidics and micro-droplet technology, with mutation(s)-specific PCR-based target fusion DNA hybrid synthesis.
  • a multi-target amplification product is generated and evaluated at the single cell level in a scalable fashion. Scalability enables multiplexing analysis and determination of co-occurrence and multi-target mutation assessment at the individual cell level across a population of cells in a sample, particularly wherein at least 10,000, at least 50,000, at least 100,000, at least 500,000, at least 1,000,000, at least 5,000,000, or at least 10,000,000 cells are evaluated.
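
The difference between pooled and single cell evaluation of co-occurrence can be illustrated with a small worked example. The sketch below is not part of the described method: the function names and the two toy populations are hypothetical, and each cell is reduced to a pair of booleans stating whether a mutation was called in Target A and in Target B.

```python
from collections import Counter


def cooccurrence_table(cells):
    """Count cells by (A mutant?, B mutant?) status; one pair per cell."""
    return Counter((bool(a), bool(b)) for a, b in cells)


def pooled_frequencies(cells):
    """Mutant fractions for A and B as a pooled (bulk) assay would see them."""
    cells = list(cells)
    n = len(cells)
    return (sum(a for a, _ in cells) / n, sum(b for _, b in cells) / n)


# Two hypothetical 100-cell populations with identical pooled frequencies.
# In the first, both mutations sit in the same 10 cells; in the second,
# they sit in two separate groups of 10 cells.
linked = [(True, True)] * 10 + [(False, False)] * 90
exclusive = [(True, False)] * 10 + [(False, True)] * 10 + [(False, False)] * 80
```

Both populations give pooled frequencies of 10% for each target, yet the per-cell table shows 10 double-mutant cells in the first and none in the second, which is the information that is lost when nucleic acids are pooled.
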
  • single cells are made available and separated for analysis.
  • Single cell isolates may be generated using any number of available techniques or technologies, provided that the single cells are stable and suitable for individual cell-based nucleic acid amplification.
  • Exemplary methods and approaches for single cell preparation include but are not limited to micropipetting, single cell fluorescence-activated cell sorting (FACS), laser capture microdissection (Frumkin, D et al. (2008) BMC Biotechnology 8: 17; Boone, DR et al. (2013) J Vis Exp doi: 10.3791/50308), and micro-fluidics.
  • micro droplets, including but not limited to water-based droplets in oil, are generated such that on average each contains only one cell.
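
What "one cell per droplet on average" implies for individual droplets can be estimated under an assumption the text does not state: that cells enter droplets independently and at random, so that the number of cells per droplet follows a Poisson distribution. The sketch below computes the expected fractions of empty, single-cell, and multi-cell droplets for a given mean loading; the function names and the parameter `mean_cells` are hypothetical.

```python
import math


def poisson_pmf(k, mean_cells):
    """Probability of exactly k cells in a droplet under Poisson loading."""
    return math.exp(-mean_cells) * mean_cells ** k / math.factorial(k)


def droplet_occupancy(mean_cells):
    """Expected fractions of empty, single-cell and multi-cell droplets."""
    empty = poisson_pmf(0, mean_cells)
    single = poisson_pmf(1, mean_cells)
    return {"empty": empty, "single": single, "multiple": 1.0 - empty - single}
```

Under this assumption, a mean of exactly one cell per droplet leaves a substantial share of droplets empty or holding more than one cell, while a lower mean loading reduces multi-cell droplets at the cost of more empty ones. Whether a given droplet system follows this model is something to verify for the instrument in use.
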
  • cells do not need to be viable, as long as they are sufficiently intact to maintain their original nucleic acid content, and the nucleic acid is preserved and suitable for amplification, including of sufficient quality to allow amplification.
  • mutations identified or characterized in accordance with the analysis provided herein may then be correlated back to sorted phenotypes, markers or cell characteristics.
  • FIGURES 1A and 1B depict an amplification scheme for two targets - a first target
  • P3 and P4 comprise Target B complementary sequence for specific amplification of Target B.
  • P3 is an elongated primer having a first 5' region for dimerization (the "dimerization cassette” or dimer domain) and a second 3' region which is complementary to Target B (the “Target B domain”).
  • the P3 dimerization cassette is complementary to and forms a primer dimer with the P2 dimerization cassette.
  • P4 is also an elongated primer having a first 5' region for dimerization (the "dimerization cassette” or dimer domain) and a second 3' region which is complementary to Target B (the "Target B domain”).
  • the P4 dimerization cassette is complementary to and forms a primer dimer with the P5 dimerization cassette.
  • PD means primer dimers
  • amplification e.g. PCR
  • primers are designed in such a way that primer dimer formation during the actual PCR reaction is avoided. Short regions of complementarity in PCR reactions can result in formation of untoward primer dimers.
  • primer dimer sequences suitable for use in the present invention, including based on the teaching and exemplification provided herein.
  • primer design programs and web-based tools are known and available to the skilled artisan including, but not limited to, BLAST, BLAT, Primer3, Primer Design (bioinformatics org) and Primer-Blast.
  • the length of overlap and percent GC content will influence the annealing of the primer and the stability of the primer hybridization, including the temperature at which it breaks apart and denatures.
  • the melting temperature (Tm) can readily be calculated and estimated, including using available and known programs and web-based tools including, but not limited to, Oligo Calculator (idtdna).
  • an overlap of the primer dimer of about 10-20 or about 20-30 base pairs is recommended.
  • a GC content of about 50% is also recommended.
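
The overlap length and GC content recommendations above lend themselves to a quick screening check. The sketch below is a rough illustration, not a substitute for the design tools named in the text: the function names and the GC tolerance of 10 percentage points are assumptions, and the melting temperature uses the simple rule of thumb Tm = 2(A+T) + 4(G+C) in degrees Celsius, which is only a crude estimate and ignores salt and oligonucleotide concentration.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a DNA string."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def rule_of_thumb_tm(seq):
    """Crude Tm estimate in degrees C: 2 per A/T base plus 4 per G/C base."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def check_overlap(cassette, min_len=10, max_len=30, gc_target=0.5, gc_tolerance=0.1):
    """Return a list of problems; an empty list means the overlap passes."""
    problems = []
    if not min_len <= len(cassette) <= max_len:
        problems.append(f"length {len(cassette)} outside {min_len}-{max_len}")
    gc = gc_fraction(cassette)
    if abs(gc - gc_target) > gc_tolerance:
        problems.append(f"GC fraction {gc:.2f} not near {gc_target:.2f}")
    return problems
```

A candidate overlap that passes this screen would still need to be checked for off-target complementarity with a dedicated tool before use.
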
  • Illumina methods and adapters as well as sequencing systems are known, described, and commercially available (for example WO2006/063437, WO2008/041002, WO1998/53300; US Patents 6,355,431, 6,406,848, 6,831,994, 8,361,713, 8,486,625, each and all incorporated herein by reference).
  • the PCR products and target fusions may be used for standard deep sequencing, which will provide the specific sequence and/or mutation or mutational status of the pre-selected targets, including one or more targets, e.g. denoted target A and target B.
  • tails or adaptors on one or more primers of the invention can serve a variety of purposes in downstream processing.
  • specific modifications or additions including additional nucleic acids, chemicals, small molecules, etc. are contemplated.
  • such modifications or additions serve to facilitate or enable downstream analysis or applications, support specific instrumentation or aid in the optimization or normalization of yields of target fusion products.
  • an adapter enables specific capture of the desired amplification product, for example utilizing biotin which can be captured using avidin/streptavidin. Capture may include the use of beads, including magnetic beads. Capture may enable quantitation of the amount of product in each single cell.
  • one or more adaptors are employed for identification (such as a barcode) or for sequence characterization (such as sequencing tails).
  • adding a specific combination of fluorescent dyes can enable rapid characterization or sorting, for example using flow-through counting or sorting including FACS-mediated applications.
  • Adaptors can be utilized to facilitate the complexity of hybrid target fusion molecules that will be generated during amplification, for example in generating a 2-target, 3-target, 4-target, etc. fusion molecule.
  • Enzyme labels are useful, and can be detected by any of the presently utilized colorimetric, spectrophotometric, fluorospectrophotometric, amperometric or gasometric techniques.
  • the enzyme is conjugated to the selected particle by reaction with bridging molecules such as carbodiimides, diisocyanates, glutaraldehyde and the like.
  • Many enzymes which can be used in these procedures are known and can be utilized including peroxidase, β-glucuronidase, β-D-glucosidase, β-D-galactosidase, urease, glucose oxidase plus peroxidase and alkaline phosphatase.
  • enzyme or tag is added at the conclusion of amplification.
  • labels include, but are not limited to, chemical, biochemical, biological, colorimetric, enzymatic, fluorescent, luminescent labels, chemiluminescent labels, and electrochemiluminescent labels.
  • the label may be a dye, a photocrosslinker, a cytotoxic compound, a drug, an affinity label, a photoaffinity label, a reactive compound, an antibody or antibody fragment, a biomaterial, a nanoparticle, a spin label, a fluorophore, a metal-containing moiety, a radioactive moiety, a novel functional group, a group that covalently or noncovalently interacts with other molecules, a photocaged moiety, an actinic radiation excitable moiety, a ligand, a photoisomerizable moiety, biotin, a biotin analogue, a moiety incorporating a heavy atom, a chemically cleavable group, a photocleavable group, a redox-active agent
  • fluorescent labels include fluorescein isothiocyanate (FITC), DyLight Fluors, fluorescein, rhodamine (tetramethyl rhodamine isothiocyanate, TRITC), coumarin, Lucifer Yellow, and
  • the label is a fluorophore.
  • fluorophores include, but are not limited to, indocarbocyanine (C3), indodicarbocyanine (C5), Cy3, Cy3.5, Cy5, Cy5.5,
  • the fluorescent label may be a green fluorescent protein (GFP), red fluorescent protein (RFP), yellow fluorescent protein, or a phycobiliprotein (e.g., allophycocyanin, phycocyanin, phycoerythrin, and phycoerythrocyanin).
  • one or more multi-target amplification product(s) are directly captured to a solid support, for example by direct binding to a plate, filter, or bead.
  • Direct capture may be mediated by a label or tag on the amplification product, such as for selection of the multi-target amplification product.
  • the plate, filter, or bead may be coated or pre-coated with a binder or other reagent that recognizes or binds a label or tag on the amplification product.
  • the bead may be a magnetic bead.
  • the methods or assays provided herein may comprise a solid support.
  • a solid support comprises any solid platform to which a probe or binder can be attached.
  • a solid support may comprise a bead, plate, an array or a bead attached to a plate.
  • plates include, but are not limited to, MSD multi-array plates, MSD Multi-Spot® plates, microplate, ProteOn microplate, AlphaPlate, DELFIA plate, IsoPlate, and LumaPlate.
  • Next-generation sequencers have sufficient power to simultaneously analyze nucleic acids and DNAs from many different specimens, a practice known as multiplexing. Multiplexing schemes rely on the ability to associate each sequence read with the specimen from which it was derived. Molecular barcoding is an essential tool in optimally applying high throughput next generation sequencing platforms in studies involving more than one sample. Various strategies have been developed for barcode-mediated multiplexing, in which samples are uniquely tagged with short identifying sequences or barcodes, pooled, and then sequenced together. The resulting combined sequence data are subsequently sorted by barcode before bioinformatics analysis.
  • certain embodiments contemplate designing oligonucleotide/primer sequences to contain short signature sequences (for example barcodes) that permit unambiguous identification of the amplification product into which they are incorporated without having to sequence the entire amplification product.
  • barcodes are placed in primers at locations where they are not found naturally, with barcodes comprising nucleotide sequences that are distinct from any naturally occurring oligonucleotide sequences that may be found in the vicinity of the sequences adjacent to which the barcodes are situated.
  • Barcodes may comprise a sequence of 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 25, 30, 35, 40, 45, 50 or more contiguous nucleotides (including all integer values in between).
  • each barcode sequence may uniquely identify the target fusion product or may identify an aspect of the target (e.g., a mutation or variant of the target sequence). Examples of the design and implementation of oligonucleotide barcode sequence identification strategies will be known and available to one skilled in the art (de Cárcer et al. (2011) Adv Env Microbiol 77:6310; Parameswaran et al. (2007) Nucl Acids Res 35(19):e130; Roh et al. (2010) Trends Biotechnol 28:291).
  • primers should be designed such that a suitably minimum size amplicon containing the target fusion region of interest results.
  • the amplicon and target fusion should be of sufficient target length in each instance to provide relevant target loci and mutational data, but short enough to be efficiently and effectively amplified and in instances where analysis of product includes sequencing, short enough to be effectively sequenced.
  • the target fusion product is not greater than 3,000 bp.
  • the target fusion product is less than 200 bp, less than 1,000 bp, preferably 200-1000 bp, 200-500 bp, less than 500 bp, less than 300 bp, less than 200 bp. In instances where sequencing of the product will not be conducted, longer amplification products can be generated and utilized.
  • the amplification reactions to generate target fusion products can be performed in a continuous system (such as using water droplets in oil), for example a flow system, including a continuous flow system.
  • Exemplary flow systems suitable for single cell analysis and/or amplification are known and available to one skilled in the art. Examples include Raindance Technologies' systems, which are commercially available and are described in, among others, US2014/323317, US2015/027892, and US patent 8,841,071, each of which is incorporated herein by reference.
  • Sequencing may be performed using any of a variety of available high-throughput single molecule sequencing machines and systems.
  • Illustrative sequencing systems include sequence-by-synthesis systems such as the Illumina Genome Analyzer and associated instruments (Illumina, Inc., San Diego, Calif.), Helicos Genetic Analysis System (Helicos Biosciences Corp., Cambridge, Mass.), Pacific Biosciences PacBio RS (Pacific Biosciences, Menlo Park, Calif.), or other systems having similar capabilities.
  • sequencing can be achieved using a set of sequencing oligonucleotides that hybridize to a defined region within the amplified DNA molecules.
  • EXAMPLE 1 In an application of the methods of the invention, cells are evaluated on a cell-by-cell basis for T cell receptor repertoire as a first target, in combination with a clinically relevant target for T cell mediated cancer(s). T cell receptors may be assessed as a first target (set A) in combination with a second target (set B). While there are sequence based approaches known and available, including commercially available, for assessment of T cell antigen receptors, current approaches evaluate the TCR repertoire and provide information regarding adaptations or changes in T cells across a multicellular sample and in a quantitatively general fashion. Specific single cell relevant information is not derived or provided with presently available systems.
  • Adaptive Biotechnologies' (AB's) immunoSEQ assay uses PCR to amplify the CDR3 region of the T cell receptor, spanning the variable region formed by the junction of the V, D and J segments and their associated non-templated insertions.
  • the resulting nucleotide sequence may be used as a unique identifier or tag for a particular clone across different samples to track clonal expansions and contractions over time in the same patient.
  • only the relative amount of one or more clones can be determined. Specific information regarding other aspects of that clone, including other relevant T cell markers, is not available and cannot be derived.
  • the AB approach, and other massive parallel sequencing approaches like it, do not provide any means for assessing the co-existence of a second target (or any further third, fourth, etc. target(s)) within individual cells and/or the co-occurrence of mutations within a single amplification product.
  • the availability of multi-target information, particularly TCR single cell characterization evaluating a second target (or more targets) for co-occurrence on a single cell basis enables evaluation of cancer status, immune adaptation, clonal response, and mutational drift, in a single amplification product.
  • the single cell-based multi-target amplicon fusion product of the present invention provides clinical and target relevant information for individual cells in a population of cells in a scalable, streamlined and accurate manner.
  • T cell receptor (TCR)-specific primers, for example those implemented by Adaptive Biotechnologies for unbiased amplification of TCR signature sequences, may be altered to serve as one or more primers in accordance with the invention.
  • a primer dimerization cassette is further incorporated in primers incorporating T cell receptor (TCR)-specific primer sequence(s) so as to enable generation of specific, pre-designed target fusions wherein the first target, denoted Target A or set A, is a TCR and the second target, denoted Target B or set B, is linked via primer dimer-mediated concatenation, in a single multi-target amplicon fusion product for single cell analysis of at least
  • primers including TCR primers or sequences thereof, are modified for multi-target evaluation via primer dimer mediated concatenation as provided herein.
  • the AB system oligonucleotide amplification primer composition comprises a first oligonucleotide amplification primer set comprising forward oligonucleotide sequences of a general formula U1-B1-V1, and reverse oligonucleotide sequences of a general formula U2-B2-
  • U1 and U2 comprise a first and optional second universal adaptor oligonucleotide for NGS sequencing, B1 and B2 are identical or unique barcode sequences, and wherein V1 is a primer for TCR variable region (V) sequence and J1 is a primer for TCR joining region (J) sequence. Alternatively or additionally, J1 region sequence may be replaced with TCR constant region (C) sequence.
  • Primer sequences for implementation in the present invention and to be combined with primer dimerization cassettes may include V1, J1 and/or C sequence for Target
  • a pre-screen to identify the clonally relevant TCRs in a tumor or sample population is conducted. It is recognized by those skilled in the art that during the acute phase of B- or T-cell disease, an overwhelming percentage of the cancer cells will often have the same B- or T-cell receptor rearrangement. Consequently, and solely based on the frequency at which the same rearrangement is detected in the NGS reads, the B- or T-cell receptor signature corresponding to the malignant clone can be identifiable. The rest of the signatures from the normal cells can therefore be avoided using clone-specific primers.
  • one or more TCR specific primer sequences, such as one selected from V region, J region and/or C region sequence as described above, is tailed, particularly at the 5' end, with a specifically designed primer-dimer sequence, or dimerization cassette, so as to provide a primer comprising TCR V, J, or C region target sequence and a dimerization cassette, corresponding to P2 in FIGURE 1A.
  • the primer dimer sequence is complementary to and stably hybridizes with the complementary primer dimer sequence which is a tail on, e.g. P3 of Target set B primer (refer to FIGURE 1A).
  • TCR primers comprising dimerization cassettes are introduced in the single cell system of the present invention
  • a target fusion of TCR with a secondary locus (denoted Target B, set B) is generated and TCR along with Target B or set B are evaluated for mutational and clonal status in a single concatenated amplification product.
  • a droplet may contain a single cell for simultaneous TCR analysis (as Target A) with Target B/set B evaluation.
  • the exact amount and relative abundance of PCR primers that co-exist in the fused reaction droplets can be designed in such a way that they will facilitate the amplification of the T-cell receptor target (denoted Target A or "set A") as well as the second target sequence (denoted Target B or "set B").
  • the primers of the present invention are tailed with dimer domains or dimerization cassettes as described and are dosed in such a way that the formation of pre-designed primer dimers during the PCR reaction is promoted to result in linking or concatenating the targets in a multi-target fusion amplification product.
  • the primer dimers serve to link or concatenate the T-cell target nucleic acid present in a single cell with specific target B sequences that correspond to the additional target or target mutation of interest and are present in the same cell.
  • T-ALL T cell acute lymphoblastic leukemia
  • One or more of these gene targets are suitable for co-occurrence and/or combination target analysis, including in combination with T cell receptor assessment.
  • T-ALL may be associated with mutations in one or more of the IL-7R/Janus activated kinase (JAK), CNOT3, RPL5, RPL10, PTPN2 and STAT5 genes (Schochat, C. et al (2011) J Exp Med 208(5):901-908; DeKeersmaecker, K. et al (2013) Nat Genet 45(2):186-190; Bandapalli, O.R. et al (2014) Haematologica 99(10):e188-192).
  • T cell receptor as Target A may be combined with Target B wherein Target B is selected from IL-7R, JAK3, CNOT3, RPL5, RPL10, PTPN2, and STAT5.
  • the mutation STAT5 N642H can be specifically evaluated.
  • primer combinations are designed to provide target fusion amplification products of T cell receptor as Target A plus Target B and Target C in a single target fusion amplification product.
  • Target B and C may be selected from IL-7R, JAK3, CNOT3, RPL5, RPL10, PTPN2, and STAT5.
  • cells are evaluated on a cell-by-cell basis for B cell Ig repertoire as a first target, denoted Target A or set A, in combination with a clinically relevant target for B cell mediated cancer(s).
  • B cell Ig is analyzed as a first target (set A) in combination with a second target (set B). All measurable mutations, genomic variants, including SNPs that are of clinical relevance in the context of the clinical course or therapeutic decision making for any B-cell malignancy are suitable as Target B targets (or in combination as Target B, Target C, Target D etc).
  • recognized and known B cell Ig specific primers or sequences thereof may be utilized in target specific primers such as P1 and in the target complementary sequence aspect of P2 as shown in FIGURE 1A.
  • Target B may be selected from a clinically relevant target for cancers originating from or otherwise involving the B-cell lineage.
  • Btk Bruton's tyrosine kinase
  • BCR cell surface B-cell receptor
  • Btk is expressed in all hematopoietic cell types except T lymphocytes and natural killer cells, and participates in a number of TLR and cytokine receptor signaling pathways including lipopolysaccharide (LPS) induced TNF-α production in macrophages, suggesting a general role for Btk in immune regulation.
  • A potent irreversibly acting small molecule inhibitor of Btk, PCI-32765, has demonstrated promising clinical activity in patients with B-cell NHL.
  • PCI-32765 inhibits BCR signaling downstream of Btk, selectively blocks B-cell activation, and is efficacious in animal models of arthritis, lupus, and B-cell lymphoma (Honigberg LA et al. (2010) PNAS 107(20):13075-13080).
  • Irreversible inhibitor PCI-32765 is one of a series of Btk inhibitors that bind covalently to cysteine residue Cys-481 in the active site leading to irreversible inhibition of Btk enzymatic activity (Pan Z et al (2007) Chem Med Chem 2:58-61).
  • BTK such as C481S
  • Cys481 directed inhibitors such as PCI-32765 (Imbruvica or Ibrutinib).
  • cells are evaluated on a cell-by-cell basis for B cell Ig repertoire as a first target, in combination with a clinically relevant target for autoimmune disease.
  • autoimmune diseases include but are not limited to rheumatoid arthritis, systemic lupus erythematosus, multiple sclerosis, celiac sprue disease, pernicious anemia, vitiligo, scleroderma, psoriasis, and inflammatory bowel disease.
  • Prominent hematopoietic cancers include chronic myelogenous leukemia (CML) and acute lymphocytic leukemia (ALL). These are often associated with T or B cell receptor gene rearrangements.
  • the efficacy of existing therapeutic agents is hindered, however, by development of drug resistance or existing reduced drug sensitivity by virtue of alternative mutations in genes or proteins that alter the patient's response to the agent(s). Monitoring of the existence or development of one or more such mutations, particularly at the single cell level in the case of MRD could have a significant clinical impact in understanding and addressing disease.
  • These mutations are suitable for incorporation in the present invention methods and compositions as Targets.
  • Imatinib (Gleevec or STI571) is a tyrosine kinase inhibitor (TKI) for treatment of cancers, approved for Philadelphia-chromosome positive chronic myelogenous leukemia (CML) and acute lymphocytic leukemia (ALL), myelodysplastic/myeloproliferative diseases associated with PDGFR gene rearrangements, and gastrointestinal stromal tumors. Imatinib is much less effective in patients harboring a D816V mutant of c-KIT.
  • Important resistance relevant BCR-ABL gene mutations include T315I, Y253H, and E255K (Ravandi F (2011) Clin Lymphoma Myeloma Leuk 11:198-203; Bhamidipati PK et al (2013) Ther Adv Hematol 4(2):103-117).
  • the T315I mutation also confers resistance to TKIs used as alternatives to imatinib, including nilotinib (Tasigna, Novartis) and dasatinib (Sprycel, Bristol Myers Squibb) (Jabbour E et al (2006) Leukemia 20:1767-1773).
  • BCR-ABL independent mechanisms of resistance to imatinib include increased efflux of the drug by increased expression of P-glycoprotein efflux pumps (Bixby D and Talpaz M (2009) Hematology Am Soc Hematol Educ Program 461-476; Jabbour E et al (2011) Emerg Cancer Ther 2:239-258; Kotake M et al (2003) Cancer Letters 199:61-68), such as P-170 (Chu E and DeVita V (2010) Cancer Chemotherapy Drug Manual, Sudbury, MA: Jones and Bartlett, Publishers), decreased drug uptake due to decreased expression of the drug transporter human organic cation transporter 1 (hOCT1) (Thomas J et al (2004) Blood 104:3739-3745; Wang L et al (2008) Clin Pharmacol Ther 83:258-264), sequestration of drug due to increased serum protein α1 acid glycoprotein (Widmer N et al (2006) Br J Clin Pharmacol 62:97-112), and alternative
  • tyrosine kinases including epidermal growth factor receptor (EGFR) contribute to the development of cancer and these mutated tyrosine kinase (TK) enzymes often provide a target or sensitivity for selective and specific cancer therapy.
  • Somatic mutations in the tyrosine kinase domains of the EGFR gene are associated with sensitivity of lung cancers to certain tyrosine kinase inhibitors (TKIs) including gefitinib (compound ZD1839, Iressa) and erlotinib (compound OSI-774, Tarceva).
  • the present approach provides a means for rapid, accurate, sensitive and specific evaluation and monitoring at the single cell or clonal level.
  • target combinations for evaluation in accordance with the methods and compositions of the invention and cancer relevant targets may be selected from targets identified in cancer mutation monitoring studies (see e.g. Cui Q (2010) PLoS ONE 5(10):e13180 "A Network of Co-Occurring and Anti-Co-Occurring Mutations"). Recognized clinically relevant targets include but are not limited to EGFR, EGFR and Src, EGFR and Kras, Pten and EGFR, Kit, BRAF and Ras, MAPK, Bcr/Abl.
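The pre-screen described above, in which the receptor signature of the malignant clone is identified solely from how often the same rearrangement appears among the NGS reads, might be sketched as follows. This is only an illustration under stated assumptions: each read is taken to have already been reduced to a rearrangement signature string, the signature labels are hypothetical placeholders, and the `min_fraction` cutoff is an arbitrary illustrative threshold, not a value given in the text.

```python
from collections import Counter

def dominant_clone(signatures, min_fraction=0.5):
    """Return (signature, fraction) for the most frequent rearrangement,
    or None if no signature reaches min_fraction of all reads.

    `signatures` holds one rearrangement signature per read (hypothetical
    input format); `min_fraction` is an illustrative cutoff only.
    """
    if not signatures:
        return None
    counts = Counter(signatures)
    signature, n_reads = counts.most_common(1)[0]
    fraction = n_reads / len(signatures)
    return (signature, fraction) if fraction >= min_fraction else None

# Placeholder labels stand in for real receptor rearrangement signatures.
reads = ["sig_A"] * 8 + ["sig_B", "sig_C"]
print(dominant_clone(reads))
```

A signature selected this way could then guide the choice of clone-specific primers, so that signatures from normal cells are avoided.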

Abstract

The present invention provides methods and compositions for parallel single cell-based multiplexed primer dimer-mediated amplification of multiple target sequences in complex cell mixtures. The invention provides primer dimer-mediated concatenation of multiple targets in multi-target fusion amplicon products for evaluation of two or more target sequences as present in individual cells across a population or mixture of cells.

Description

MASSIVE PARALLEL PRIMER DIMER-MEDIATED MULTIPLEXED SINGLE CELL-BASED AMPLIFICATION FOR CONCURRENT EVALUATION OF MULTIPLE TARGET SEQUENCES IN COMPLEX CELL MIXTURES
FIELD OF THE INVENTION
[0001] The present invention relates generally to methods and compositions for parallel single cell-based multiplexed primer dimer-mediated amplification of multiple target sequences in complex cell mixtures.
BACKGROUND OF THE INVENTION
[0002] Next generation sequencing (NGS), also referred to by some as massive parallel sequencing, enables rapid and cost-effective characterization of DNA and RNA through the production of vast numbers of short sequencing reads in a single stroke and allowing up to hundreds of gigabases of DNA to be read in a single run. NGS sequencing approaches include those commercialized by Illumina, Life Technologies, and others, which utilize short (typically in the 50-200bp size range) fragments of nucleic acid attached to DNA adaptors and to a solid support (typically a slide or bead). Each fragment is amplified using PCR and the addition of each individual nucleotide is recorded, e.g. through parallel measurement of specific fluorescence signals or ion/proton release. NGS allows rapid sequencing of whole genomes, assessment of target regions by deep sequencing, identification of RNA variants and splice sites with RNA sequencing, relative quantification of mRNAs in gene expression analysis, etc. Multiplex sequencing combined with DNA barcode tags enables simultaneous sequencing of multiple, mixed, barcoded samples, for instance while profiling microbial diversity or while interrogating a relatively low complexity (i.e. very targeted) interval in a large number of individual samples in parallel.
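As a rough illustration of how barcode tags allow pooled reads to be traced back to their samples, the following sketch bins reads by barcode. It assumes, purely for illustration, that each read begins with a fixed-length barcode at its 5' end and that barcodes are matched exactly; the barcode sequences, sample names, and function name are hypothetical.

```python
from collections import defaultdict

def demultiplex(reads, barcode_to_sample, barcode_len):
    """Sort pooled reads into per-sample bins using a leading barcode.

    Assumes (illustratively) a fixed-length barcode at the start of each
    read and exact matching; unmatched reads go to "unassigned".
    """
    bins = defaultdict(list)
    for read in reads:
        barcode, insert = read[:barcode_len], read[barcode_len:]
        sample = barcode_to_sample.get(barcode, "unassigned")
        bins[sample].append(insert)  # keep the read with the barcode trimmed
    return dict(bins)

# Hypothetical barcodes and reads.
barcodes = {"ACGT": "sample_1", "TGCA": "sample_2"}
pooled = ["ACGTGGGAAA", "TGCACCCTTT", "ACGTTTTCCC", "NNNNAAAA"]
sorted_reads = demultiplex(pooled, barcodes, barcode_len=4)
```

Real demultiplexing tools typically also tolerate sequencing errors in the barcode; exact matching is used here only to keep the sketch short.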
[0003] Rapid and effective approaches to sequence analysis, particularly mutational and/or variability (such as single nucleotide polymorphism or SNP) assessments, are critical for evaluating and monitoring diseases, clinical and patient responsiveness, susceptibility to disease, and anticipated response or resistance to therapy. Applications include cancer evaluations and monitoring, assessing immune response, evaluating inflammatory response, and understanding the impact of genetic variability. Inherited mutations and genomic heterogeneity or somatic variation lie at the root of many disorders, including neurological diseases and cancer.
[0004] Over the last decade, NGS / massive parallel sequencing has become a routine laboratory approach when assessing the nature and/or complexity of a mixture of nucleic acids. However, the genomic (DNA) and transcriptomic (RNA) composition of individual cells is lost in conventional NGS sequencing studies, which typically analyze pooled DNA and/or RNA extracted from large numbers of cells or specimens containing large numbers of (often heterogeneous) cells. In these studies, low abundance variations in cells will be largely concealed in the bulk signal, and can no longer be assigned to a specific cell or cell tag. More importantly, when assessing the status of multiple amplification targets in parallel (i.e. multiplexing), the ability to assess the co-occurrence of variants or somatic mutations within individual cells is lost when using such a pooled nucleic acids approach.
[0005] Clear insights into many biological processes— from normal development to tumor evolution— will only be gained from a detailed understanding of co-occurring genomic and transcriptional variation at the single-cell level. Furthermore, some cell types are so rare that single-cell approaches become paramount to their identification and characterization, for example in early identification of minimal residual disease (MRD) or resistant clones. Current systems and approaches are limited by the fact that they only provide aggregated quantitative snapshots of the specific targets interrogated at any given point in time, often utilizing total isolated DNA from a multi-cellular sample as a starting analyte.
[0006] Therefore, in view of the aforementioned deficiencies attendant with prior art methods of sequencing and cell sample analysis, it should be apparent that there still exists a need in the art for approaches, methods and compositions for evaluation of multiple targets and co-occurrence analysis at the single cell resolution level, particularly approaches and methods which are scalable for multiplexing and cell population analysis.
[0007] The citation of references herein shall not be construed as an admission that such is prior art to the present invention.
SUMMARY OF THE INVENTION
[0008] The present invention provides methods and compositions for massive parallel primer dimer-mediated multiplexed single cell-based amplification for concurrent evaluation of multiple target sequences in cell mixtures, including complex cell mixtures.
[0009] In a first aspect, the invention provides scalable methods and compositions for assessing at least two distinct targets in a single cell by generating a multi-target amplicon fusion product via primer dimer-mediated concatenation of individual distinct amplification products originating from the same cell. Primer dimer-mediated amplification serves to join or concatenate at least two targets which are ordinarily distinct and separated into a single amplification product, providing a multi-target amplicon fusion product for analysis. The method is typically conducted on a single cell basis across a sample population of cells and thereby provides data on the entire population of cells both in an aggregated as well as individual (i.e. individual cell resolution) form.
[00010] In accordance with the present invention, methods, assays and compositions are provided which generate target-specific information for single/individual cells across a population of cells, providing specific cell-by-cell information assessing two or more targets, including target mutations and/or variants, including those resulting in amino acid changes, substitutions, deletions, additions, SNPs, splice variants, etc. Thus, the methods and compositions of the invention enable the evaluation of at least two targets simultaneously on a single cell basis, permitting evaluation of their possible co-occurrence at the individual cell level across an entire population of cells in a given sample.
[00011] The present invention provides a scalable approach which combines single cell analysis and targeted amplification techniques to evaluate specific DNA and/or RNA targets, particularly at least two targets simultaneously and in tandem, in a single cell. The scalable approach provided herein permits rapid, specific and effective parallel evaluation of defined sets of targets in single cells and is applicable to pools of up to millions of individual cells without the need for aggregate whole genome analysis. The invention provides individual cell multi-target information and is distinct from alternative methods wherein aggregated, averaged, or population level information is generated for a population of cells.
[00012] The present invention particularly enables assessment of co-occurrence of select nucleic sequences of specific, disease- or therapy-related interest on a cell-by-cell basis. Thus, single nucleotide polymorphisms (SNPs) or somatic mutations, including physiological somatic mutations involving genomic rearrangements, as is the case in B- or T-cell receptor rearrangements, or non-physiological somatic mutations, as is the case in certain disease-related genomic changes, can be evaluated, thereby providing details of cell clonal alterations. In a general aspect, the present approach uses the combined power of single cell manipulation and primer dimer-mediated amplification to generate concatenated target fusion amplification products for subsequent analysis. Single cell manipulation is conducted by any of various means, including but not limited to via available micro-fluidics approaches or micro-droplet technologies. Analysis of the target-fusion amplification product may include massive parallel next-generation sequencing (NGS) to provide a comprehensive dataset at the nucleotide resolution level across a large number of individual cells.
[00013] In an aspect of the invention, a method is provided for simultaneously evaluating at least two distinct target sequences from single cells across a population or mixture of cells, wherein single cell-based multi-target amplicon fusion products are generated via primer dimer-mediated concatenation of at least two target sequences from each of said single cells, and the at least two target sequences are assessed on a single cell basis across the population or mixture of cells.
[00014] One aspect of the method comprises the steps of (a) disaggregating a population or mixture of cells into single cells; (b) manipulating the single cells such that nucleic acid therefrom is available and suitable for individual amplification; (c) amplifying at least two distinct nucleic acid target sequences using a combination of target-specific primers and concatenating nucleic acid target sequences via primer-dimer formation facilitated by unique primer dimer cassettes in target specific primers to amplify multi-target amplicon fusion product(s); and (d) analyzing the multi-target amplicon fusion product(s) from each single cell across the population or mixture of cells.
[00015] In an aspect, the primer dimer cassettes comprise sequences of at least 5 nucleotides which lack functionally significant complementarity to any of the target sequences. In an aspect, primer dimer cassettes facilitate overlap extension PCR to concatenate amplification products representative of at least two target sequences. In an aspect, primer dimer cassettes facilitate overlap extension PCR to concatenate amplification products from at least three target sequences. In an aspect, primer dimer cassettes facilitate overlap extension PCR to concatenate amplification products from at least two target sequences to generate a multi-target amplicon fusion product.
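The two cassette constraints stated above (a length of at least 5 nucleotides and no functionally significant complementarity to any target sequence) could be screened computationally along the following lines. This is a minimal sketch, not the method of the invention: "functionally significant complementarity" is approximated here, as an assumption, by the cassette or its reverse complement sharing any exact k-mer with a target, and the function names, the default `k`, and the example sequences are all hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def kmers(seq, k):
    """Set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def cassette_is_acceptable(cassette, targets, min_len=5, k=5):
    """Accept a cassette that is long enough and shares no k-mer, on
    either strand, with any target sequence.

    The shared-k-mer test is an assumed stand-in for "functionally
    significant complementarity"; the text does not define a threshold.
    """
    cassette = cassette.upper()
    if len(cassette) < min_len:
        return False
    probes = kmers(cassette, k) | kmers(reverse_complement(cassette), k)
    return not any(probes & kmers(target.upper(), k) for target in targets)

# Hypothetical target and candidate cassettes.
targets = ["AAAAACCCCCGGGGG"]
print(cassette_is_acceptable("ACGTACGTAC", targets))
print(cassette_is_acceptable("TTCCCCCGGTT", targets))
```

In practice a thermodynamic check (for example, predicted duplex stability) would be more informative than exact k-mer matching; the k-mer test is used only because it is simple to verify.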
[00016] In an embodiment of the method of the invention, at least three distinct target sequences are simultaneously evaluated, and amplicon fusion product is generated through concatenation of at least three target sequences via intermediate primer dimer formation. In an embodiment of the method of the invention, three distinct target sequences are simultaneously evaluated, and amplicon fusion product is generated through concatenation of three target sequences via intermediate primer dimer formation.
[00017] The invention provides methods and compositions for assessing a population or complex mixture of cells for co-occurrence in single cells of variants or mutations in at least two distinct target sequences, whereby single cell-based multi-target amplicon fusion products are generated via primer dimer-mediated concatenation of the at least two target sequences, and are evaluated for the variants or mutations in the at least two distinct target sequences across the population or mixture of cells.
[00018] The invention provides methods and compositions for assessing a population or complex mixture of cells for co-occurrence in single cells of variants or mutations in at least three distinct target sequences, whereby single cell-based multi-target amplicon fusion products are generated via primer dimer-mediated concatenation of the at least three target sequences, and are evaluated for the variants or mutations in the at least three distinct target sequences across the population or mixture of cells.
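To illustrate why per-cell fusion products matter for co-occurrence analysis, the sketch below tallies variant combinations cell by cell and contrasts them with the per-target totals that a pooled analysis would yield. The input format (one tuple of variant calls per cell, one entry per target) and all cell, clone, and variant labels are hypothetical assumptions made for the example.

```python
from collections import Counter

def cooccurrence(cell_calls):
    """Count how many cells carry each combination of per-target calls.

    `cell_calls` maps a cell identifier to a tuple of variant calls, one
    per target, as read from that cell's fusion product (assumed format).
    """
    return Counter(cell_calls.values())

def marginals(cell_calls):
    """Per-target counts only, i.e. what a pooled analysis would report."""
    if not cell_calls:
        return []
    n_targets = len(next(iter(cell_calls.values())))
    return [Counter(calls[i] for calls in cell_calls.values())
            for i in range(n_targets)]

# Hypothetical calls for Target A (receptor clone) and Target B (mutation).
calls = {
    "cell_1": ("clone_X", "mutant"),
    "cell_2": ("clone_X", "mutant"),
    "cell_3": ("clone_Y", "wild_type"),
    "cell_4": ("clone_X", "wild_type"),
}
table = cooccurrence(calls)
pooled = marginals(calls)
```

The pooled counts show only that three cells carry clone_X and two carry the mutation; the per-cell table additionally shows that both mutant cells belong to clone_X. Because the calls are tuples of any length, the same tally applies to three or more targets.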
[00019] In accordance with the present invention, the population or mixture of cells may comprise at least 100 cells, at least 1,000 cells, at least 10,000 cells, at least 100,000 cells, or at least 1,000,000 cells.
[00020] In an aspect of the invention, small numbers or clusters of cells, such as in a small sample of cells or in instances where multiple biopsies or samples are taken to evaluate a condition, are evaluated in accordance with the present methods. Such small numbers or clusters of cells may include at least 30 cells, at least 50 cells, fewer than 100 cells, about 100-1000 cells, etc. In an aspect, multiple samples, including samples of small numbers or clusters of cells, are evaluated in parallel. This might be applicable in an aspect, for instance, where multiple biopsies are taken from a patient, such as, for instance, in the diagnosis of prostate cancer. In one such instance, each multi-cellular biopsy may be analyzed in parallel.
[00021] In an aspect, the at least two distinct target sequences are selected from (1) a T cell receptor or immunoglobulin region and (2) at least one T cell cancer target gene. In an aspect, the cancer target gene may be an oncogene, a tumor suppressor, or a differentiation gene.
[00022] In an aspect, the at least two distinct target sequences are selected from (1) a B cell receptor or immunoglobulin region and (2) at least one B cell cancer target gene. In an aspect, the cancer target gene may be an oncogene, a tumor suppressor, or a differentiation gene.
[00023] In an aspect, the at least two distinct target sequences are distinct cancer target genes, or are distinct oncogenes, tumor suppressor genes, or differentiation genes.
[00024] In an aspect, the target sequences include non-human sequences. In one such aspect, the target sequences may be pathogen sequences, including bacterial, viral, or fungal sequences.
[00025] In an aspect, single-cell based amplification is conducted in micro droplets, chambers, micro fluidic compartments, or any such other separated or sequestered manner whereby nucleic acid from a single cell is provided in a separated, closed, distinct or otherwise sequestered manner for specific amplification. In an aspect, a high throughput micro fluidic device or micro droplet device is utilized.
[00026] In an aspect, the population or mixture of cells is composed of somatic cells. In an aspect, the population or mixture of cells is composed of neoplastic cells. In an aspect, the population or mixture of cells is derived from a biopsy sample. In an aspect, the population or mixture of cells are cells derived from a cancer patient for evaluation or assessment of minimal residual disease. The population or mixture of cells can be from a non-cancer patient, a normal individual, including a healthy individual, a patient or a non-patient, and the like. In an aspect, the population or mixture of cells are cells derived from a patient with an immunological condition or autoimmune disease. In a further aspect, the population or mixture of cells are cells derived from a transplant patient. In an aspect, the cells can be cells of an adult, child, infant, or fetus.
[00027] In one aspect of the methods of the invention, the amplification is conducted in individual droplets or chambers for each cell in a population or mixture of cells.
[00028] The invention provides an oligonucleotide amplification primer composition for generating single cell-based multi-target amplicon fusion products for evaluating at least two distinct targets in single cells comprising (i) at least two target primer sets, each complementary to and specific for a distinct combination of at least two targets of interest, denoted Target A and Target B; (ii) at least one primer having a 5' dimerization cassette that forms a primer dimer with the dimer cassette of the primer of (iii) and a 3' portion complementary to a first target, denoted Target A; and (iii) at least one primer having a 5' dimerization cassette that forms a primer dimer with the dimer cassette of the primer of (ii) and a 3' portion complementary to a second target, denoted Target B.
[00029] In an aspect, the invention provides an oligonucleotide amplification primer composition for generating single cell-based products for evaluating at least three distinct targets in single cells comprising (i) at least three target primer sets each complementary to a distinct target of interest, denoted Target A, Target B and Target C; (ii) at least one primer having a 5' dimerization cassette that forms a primer dimer with the dimer cassette of the primer of (iii) and a 3' portion complementary to a first target, denoted Target A; (iii) at least one primer having a 5' dimerization cassette that forms a primer dimer with the dimer cassette of the primer of (ii) and a 3' portion complementary to a second target, denoted Target B; (iv) at least one primer having a 5' dimerization cassette that forms a primer dimer with the dimer cassette of the primer of (v) and a 3' portion complementary to a third target, denoted Target C; and (v) at least one primer having a 5' dimerization cassette that forms a primer dimer with the dimer cassette of the primer of (iv) and a 3' portion complementary to a second target, denoted Target B at a region distinct from that of primer (iii).
[00030] In an aspect, one or more primer, including of one or more target primer set, further comprises a tail, particularly additional sequence at the 5' end, which comprises distinct and/or unrelated or random sequence. In an aspect, such distinct, unrelated or random sequence may serve as complementary sequence for a multi-target amplicon fusion product-directed primer for further amplification of the multi-target amplicon or of portions thereof. Such further amplification may serve to boost or otherwise increase yields of the multi-target amplicon fusion product, may serve to amplify portions of the multi-target amplicon fusion product for further analysis, etc.
[00031] In an aspect, the primer dimerization cassettes comprise sequences of at least 5 nucleotides which lack functionally relevant complementarity to any of the target sequences. In an aspect, the primer dimerization cassettes facilitate overlap extension PCR to concatenate initially amplified products from at least two target sequences.
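By way of illustration only, the two cassette criteria stated above (a length of at least 5 nucleotides and no relevant complementarity to the target sequences) can be screened computationally. The Python sketch below is not part of the disclosed method. The function names, the example sequences, and the use of a shared 5-mer as a crude stand-in for "functionally relevant complementarity" are all assumptions made for the example; a BLAST-type search would likely be used in practice.

```python
# Hypothetical screen for a candidate dimerization cassette (illustrative only).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def shares_kmer(cassette: str, target: str, k: int = 5) -> bool:
    """True if any k-mer of the cassette, or of its reverse complement,
    occurs in the target (assumed proxy for unwanted complementarity)."""
    target = target.upper()
    for probe in (cassette.upper(), reverse_complement(cassette)):
        for i in range(len(probe) - k + 1):
            if probe[i:i + k] in target:
                return True
    return False


def cassette_ok(cassette: str, targets: list, min_len: int = 5, k: int = 5) -> bool:
    """Accept a cassette that is long enough and shares no k-mer with any target."""
    if len(cassette) < min_len:
        return False
    return not any(shares_kmer(cassette, t, k) for t in targets)


if __name__ == "__main__":
    print(cassette_ok("GATTACAGG", ["CCCCCCCCCCCC", "TTTTTTTTTTTT"]))
```

The threshold `k` is a tunable assumption: a smaller value rejects more candidates, a larger value tolerates longer chance matches.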
[00032] In one aspect, the at least one of the primers of (ii), (iii), (iv) or (v) further comprises additional intervening nucleotides between the dimerization cassette and target complementary sequences which comprises a linking domain of at least 2 nucleotides.
[00033] In a further aspect, at least one of the primers of (ii), (iii), (iv) or (v) further comprises additional intervening nucleotides between the dimerization cassette and target complementary sequences which comprises a linking domain comprising barcode sequence.
[00034] The invention includes a method for linking at least two distinct target sequences from single cells in an amplification product for analysis across a population or mixture of cells comprising amplifying nucleic acids from single cells in a population or mixture of cells with the oligonucleotide amplification primer composition of the invention, wherein at least two distinct target sequences are concatenated in a multi-target amplicon fusion product.
[00035] The invention further includes a method for generating single cell-based multi-target amplicon fusion products for evaluating at least two distinct targets in single cells across a population or mixture of cells comprising amplifying at least two distinct nucleic acid target sequences using a combination of target specific primers and concatenating the nucleic acid target sequences via primer-dimer formation facilitated by unique primer dimer cassettes in target specific primers.
[00036] In an aspect of the invention, a multi-target amplicon fusion product is generated wherein Target A and Target B, or Target A, Target B and Target C sequences are fused in a single amplification product.
[00037] In accordance with the present invention, a method is provided comprising (a) separating cells from a multicellular sample to individual cells and introducing the cells on an individual basis each to a separate compartment; (b) manipulating the separated cells such that nucleic acid is available and suitable for amplification; (c) amplifying at least two distinct nucleic acid targets using multiple sets of target-specific primers designed in a way that facilitates the formation of at least one primer dimer to form at least one multi-target amplicon fusion product. In an aspect of the method the method comprises further analyzing the target fusion amplification products of the separated cells to determine the genotype of the targets.
[00038] In embodiments of the method, the step of (a) separating cells may include any suitable method or means whereby individual cells can be obtained from a multicellular sample or set of cells. Separating cells may encompass disaggregation of a population or mixture of cells. Disaggregation may include, but is not limited to, collagenase-based (or, more generally, enzyme-based) disaggregation, or physical disaggregation (such as by pushing a sample through a mesh screen, optionally followed by filtration to selectively isolate single cells). The separate compartment may include any chamber, droplet, fluid component, microwell, or other such compartment whereby each cell can remain separated and can be manipulated individually and its nucleic acid amplified and evaluated individually.
[00039] In embodiments of the invention manipulating the cells such that nucleic acid is available and suitable for amplification includes any method or means whereby nucleic acid is maintained and suitable for analysis. In one such aspect, nucleic acid, including DNA and/or RNA, is released from its normal physiological context so it can be copied to DNA (in the instance of RNA) and/or amplified by PCR or any equivalent approach. Manipulation may be specific or particular for the nucleic acid of interest, for example methods for DNA may be different from that for RNA. For example, in instances where RNA will be amplified after first being converted to cDNA, it may be desirable to first remove DNA or eliminate DNA, for example by using DNAse, followed by inactivation of the DNAse. Similarly, when assessing RNA, the cellular RNA should be suitable for RNA to cDNA conversion. Following RNA to cDNA conversion, the cDNA is then amplified and conditions suitable for cDNA amplification are relevant. Manipulation may include permeabilization, disruption, sonication or any other means of lysing of the cells.
[00040] In an aspect, a proteinase step, for example using proteinase K, may be incorporated, including to eliminate protein and maintained nucleic acid. In such an instance, the proteinase is then inactivated prior to amplification.
[00041] In accordance with the invention and amplification (c), a set of primers comprising at least the following are provided: (i) at least two target primer sets, each complementary to and specific for a distinct combination of targets of interest, denoted Target A and Target B; (ii) at least one primer having a 5' dimerization cassette that forms a primer dimer with the dimerization cassette of the primer of (iii) and a 3' portion complementary to a first target, such as Target A; and (iii) at least one primer having a 5' dimerization cassette that forms a primer dimer with the dimerization cassette of the primer of (ii) and a 3' portion complementary to a second target, such as Target B. In an aspect of the invention, the dimerization cassette containing primers, such as the primers (ii) and (iii), contain a 3' region sequence complementary to target sequence, particularly wherein the length of complementary sequence is sufficient to provide specificity to target sequence. In an aspect, the 3' region sequence complementary to target sequence hybridizes via complementarity to target under conditions wherein the 5' dimerization cassette sequences hybridize to form primer dimers. In a particular aspect, the 5' dimerization cassette sequences are not functionally complementary to any target sequence. In a particular aspect, the 5' dimerization cassette sequences lack functionally relevant complementarity to cellular, genomic, or cDNA sequences.
[00042] In one aspect, the primer of (ii) may additionally have a further 5' portion complementary to Target B. In one aspect, the primer of (iii) may additionally have a further 5' portion complementary to Target A. The primers of (ii) and (iii) serve, in any aspect of design, to link, or concatenate, at least two targets, denoted Target A and Target B, together in a multi-target amplicon fusion product.
[00043] In aspects of the invention, in the primers of (ii) and (iii) the dimerization cassette may be directly linked to the target complementary sequences (Target A or B complementary domain) or may be separated from the 3' target complementary sequence (Target A or B complementary domain) by a linking domain. Thus, in an aspect, one or more of the primers of the invention that contain a dimerization cassette may be directly linked to the target complementary sequences or may be separated from the 3' target complementary sequence by a linking domain. In an aspect, the linking domain comprises at least 2 nucleotides. In an aspect, the linking domain comprises at least 5 nucleotides. The linking domain may be random nucleotides or specific nucleotides. In one embodiment, the linking domain may contain a barcode sequence or other sequence for recognition, amplification or isolation.
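The primer architecture described above (a 5' dimerization cassette, an optional linking domain, and a 3' target complementary domain) can be represented as a simple data structure. The following Python sketch is illustrative only; the class name, field names and example sequences are hypothetical, and the check that a linking domain, if present, has at least 2 nucleotides reflects only one of the aspects recited above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CassettePrimer:
    """Hypothetical model of a dimerization cassette-containing primer."""
    cassette: str        # 5' dimerization cassette
    target_domain: str   # 3' target complementary sequence
    linker: str = ""     # optional linking domain, e.g. a barcode

    def __post_init__(self) -> None:
        # One recited aspect: a linking domain comprises at least 2 nucleotides.
        if self.linker and len(self.linker) < 2:
            raise ValueError("linking domain must have at least 2 nucleotides")

    def sequence(self) -> str:
        """Full primer written 5' to 3': cassette, then linker, then target domain."""
        return self.cassette + self.linker + self.target_domain


if __name__ == "__main__":
    direct = CassettePrimer(cassette="ACCTGAGTCA", target_domain="ACGTACGTAC")
    barcoded = CassettePrimer(cassette="ACCTGAGTCA", target_domain="ACGTACGTAC", linker="TTGC")
    print(direct.sequence())
    print(barcoded.sequence())
```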
[00044] In accordance with the present invention, at least two targets are fused via primer dimer-mediated concatenation to generate a multi-target amplicon fusion product. In embodiments of the invention, including as depicted in FIGURE 3, targets comprising End Target(s) and Intermediate Target(s) are concatenated via sets of distinct intervening primer dimer pairs. Primer dimerization proceeds via primer dimerization cassettes, utilizing a unique and specific primer dimerization cassette for each target concatenation. In an aspect of the invention, amplification to generate a multi-target amplicon fusion product comprising End Target(s) and Intermediate Target(s) is conducted utilizing: (a) End Target complementary primers and (b) End Target/Intermediate Target primers each comprising (i) End Target or Intermediate Target complementary sequence and (ii) complementary primer dimerization cassettes. In an aspect of the invention, additional Intermediate Targets are concatenated wherein amplification is conducted further comprising (c) Intermediate Target primers comprising Intermediate Target complementary sequence and complementary primer dimerization cassettes. In each concatenation step, primers comprising primer dimerization cassette pairs and specific Target complementary sequence are utilized such that linkage of designated and specific Targets is promoted and mediated. The concatenated amplicon fusion product forms by amplification from the 5' end across the dimer-mediated overlaps.
[00045] In an aspect of the invention, single cell-based amplification of at least two targets is conducted to generate multi-target amplicon fusion products wherein sequence of End Targets and at least one Intermediate Target are concatenated. Amplification is conducted with primers comprising: (a) End Target complementary primers and (b) End Target/Intermediate Target primers each comprising (i) End Target or Intermediate Target complementary sequence, and (ii) complementary primer dimerization cassettes. In an aspect, amplification is conducted further comprising (c) Intermediate Target primers comprising (i) Intermediate Target complementary sequence and (ii) complementary primer dimerization cassettes.
[00046] Dimerization cassettes and primer dimer sequences are designed or selected to effectively and specifically form primer dimers, with minimal or absence of any significant complementarity to target sequences or other off target sequences. When designing dimerization cassettes, several design variables are considered while aiming at establishing specific dimers: an increased length allows one to incorporate additional functionality, such as, for instance, bar codes or internal (sequencing) primer binding sites, whereas the GC percentage of the cassettes (also in combination with an altered length) allows one to optimize reaction kinetics, particularly the formation of desired primer-dimers either early on, or later during amplification. Dimerization cassettes should be designed in such a way, that they do not support any off-target binding and subsequent primer extension. Various established algorithms (including, but not being limited to BLAST or BLAT searches) can be used during the optimization of specificity of dimerization cassette design. Examples of primer design software and successful designs are known in the art and may include Primer3, DNASTAR, OligoPrimer, OligoPerfect, PHUSER and FastPCR (Olsen, LR et al (2011) Nucl Acids Res 39(Web Server Issue):W61-W67; Kalendar R et al (2014) Methods Mol Biol 1116:271-302).
[00047] In another aspect, additional targets are evaluated in accordance with the method. In one such aspect, additional targets may be incorporated in the concatenation of target amplification products. In accordance with this aspect, and amplification (c), a set of primers comprising at least the following are provided: (i) at least three target primer sets each complementary to a distinct target of interest, denoted Target A, Target B and Target C; (ii) at least one primer having a 5' dimerization cassette that forms a primer dimer with the dimer portion of the primer of (iii) and a 3' portion complementary to a first target, denoted Target A; (iii) at least one primer having a 5' dimerization cassette that forms a primer dimer with the dimerization cassette of the primer of (ii) and a 3' portion complementary to a second target, denoted Target B; (iv) at least one primer having a 5' dimerization cassette that forms a primer dimer with the dimerization cassette of the primer of (v) and a 3' portion complementary to a third target, denoted Target C; and (v) at least one primer having a 5' dimerization cassette that forms a primer dimer with the dimerization cassette of the primer of (iv) and a 3' portion complementary to a second target, denoted Target B at a region distinct from that of primer (iii). In this aspect a multi-target amplicon fusion product is generated wherein nucleic acids of interest from three targets, denoted Target A, Target B and Target C, are fused or concatenated in a single amplification product, a multi-target amplicon fusion.
[00048] Aspects and methods to fuse additional targets are contemplated, whereby target fusion amplification products are generated for up to 3, 4, 5, 6 or more targets in a single, concatenated amplification product or multi-target amplicon fusion product using overlapping PCR and primer dimers to join each target with another in series.
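The serial joining of targets can be summarized as a primer layout in which the two outermost primers lack dimerization cassettes and each adjacent pair of targets shares one unique cassette pair. The Python sketch below is illustrative only: the function name and the tuple format are hypothetical, and the primer numbering (P1, P2, and so on) simply extends the numbering used for the two- and three-target schemes.

```python
from typing import List, Optional, Tuple

# (primer name, target it amplifies, index of its cassette pair or None)
PrimerSpec = Tuple[str, str, Optional[int]]


def primer_layout(targets: List[str]) -> List[PrimerSpec]:
    """Assign cassette pairs so that consecutive targets are joined in series.

    Cassette pair i links target i to target i+1 (1-based). The first and
    last primers carry no cassette, so they can drive amplification of the
    full-length fusion product.
    """
    n = len(targets)
    if n < 2:
        raise ValueError("at least two targets are needed for concatenation")
    layout: List[PrimerSpec] = []
    for i, target in enumerate(targets):
        upstream_pair = i if i > 0 else None           # links to the previous target
        downstream_pair = i + 1 if i < n - 1 else None  # links to the next target
        layout.append((f"P{2 * i + 1}", target, upstream_pair))
        layout.append((f"P{2 * i + 2}", target, downstream_pair))
    return layout


if __name__ == "__main__":
    for name, target, pair in primer_layout(["Target A", "Target B", "Target C"]):
        print(name, target, "no cassette" if pair is None else f"cassette pair {pair}")
```

For three targets this reproduces the arrangement in which P2 pairs with P3, P4 pairs with P5, and P1 and P6 lack cassettes.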
[00049] In an important aspect of the methods of the invention, the method is scalable to evaluate and simultaneously assess target genotypes in all or a majority or a significant fraction of individual cells in a multicellular sample. Thus, the method can evaluate multiple targets on an individual cell basis in samples comprising thousands of cells, or comprising millions of cells. In an aspect, the method is applied for analysis of samples of at least 2 cells, at least 30 cells, at least 50 cells, at least 100 cells, at least 1,000 cells, at least 5,000 cells, at least 10,000, at least 50,000, at least 100,000, at least 500,000, at least 1,000,000, at least 5,000,000 cells. In an aspect, the method is applied for analysis of samples of at least 1,000 cells, at least 5,000 cells, at least 10,000, at least 50,000, at least 100,000, at least 500,000, at least 1,000,000, at least 5,000,000 cells. In an aspect the method is applied for analysis of samples of at least 50 cells of interest, at least 100 cells, at least 1,000 cells of interest in a complex mixture of cells.
[00050] The invention provides a method of evaluating co-occurrence of at least two mutations of interest in at least two targets in a sample of cells on an individual cell basis. In accordance with the method co-occurrence of two or more target mutations are evaluated in a sample of at least 2 cells, at least 30 cells, at least 50 cells, at least 100 cells, at least 1,000 cells, at least 5,000 cells, at least 10,000 cells, at least 50,000, at least 100,000, at least 500,000, at least 1,000,000, at least 5,000,000 cells. In accordance with the method co-occurrence of two or more target mutations are evaluated in a sample of at least 1,000 cells, at least 5,000 cells, at least 10,000 cells, at least 50,000, at least 100,000, at least 500,000, at least 1,000,000 cells.
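The value of evaluating co-occurrence on an individual cell basis can be illustrated with a small, entirely hypothetical calculation. In the Python sketch below, two invented samples have identical per-target mutation frequencies but differ completely in how often both mutations occur in the same cell. The data, the function name and the output keys are assumptions made for the example and do not represent results.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

# One (Target A mutant?, Target B mutant?) call per cell.
CellCall = Tuple[bool, bool]


def summarize(cells: Iterable[CellCall]) -> Dict[str, float]:
    """Per-target mutant fractions and the fraction of double-mutant cells."""
    cells = list(cells)
    n = len(cells)
    counts = Counter(cells)
    return {
        "freq_a": sum(a for a, _ in cells) / n,
        "freq_b": sum(b for _, b in cells) / n,
        "double_mutant": counts[(True, True)] / n,
    }


if __name__ == "__main__":
    # Hypothetical sample 1: both mutations always in the same cells.
    same_clone = [(True, True)] * 50 + [(False, False)] * 50
    # Hypothetical sample 2: the mutations are in different cells.
    separate_clones = [(True, False)] * 50 + [(False, True)] * 50
    print(summarize(same_clone))
    print(summarize(separate_clones))
```

Both invented samples give a mutant fraction of 0.5 for each target, so only the per-cell linkage distinguishes them.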
[00051] In accordance with the invention, a method is provided for evaluating co-occurrence of at least two gene or splice site mutations of interest in at least two targets in a sample of cells from a patient on an individual cell basis. In an aspect, combinations of one or more somatic mutations, variants, SNP, and/or splice variants are evaluated. In accordance with the invention, a method is provided for evaluating co-occurrence of at least two cancer gene mutations of interest in at least two targets in a sample of cells from a cancer patient on an individual cell basis. The sample of cells may include a tumor biopsy, blood sample, bone marrow sample, or any suitable sample taken from a patient for evaluation of the status of cancer, evaluation of sensitivity to a cancer drug, evaluation of drug resistance, and/or evaluation of minimal residual disease. In an aspect of the invention, a method is provided for evaluating co-occurrence of at least two gene or splice site mutations of interest on an individual cell basis in at least two targets in a sample of cells from a patient with an autoimmune disease or condition, an inflammatory condition, or an immune-mediated disorder.
[00052] In an embodiment, the presence or absence of a target amplification product, particularly the multi-target amplicon fusion product, may be diagnostic of the presence or co-occurrence of a mutation in a single cell. In an aspect of the invention, one or more target primers are specific for a target mutation, such that a target fusion amplification product is not produced in the absence of the mutation and therein, the presence of the target fusion amplification product is indicative of the mutation.
[00053] In an aspect, the multi-target amplicon fusion product is separated or isolated from any partial or incomplete amplification products or amplicons prior to and/or for analysis.
[00054] The invention includes an assay system for evaluation of drugs potentially effective to modulate target activity by evaluating the genotype of one or more target in single cells of a patient sample. In an aspect, the method provides an assay for determining sensitivity to one or more drugs, or a combination of drugs, directed to one or more targets being evaluated. Analysis of the target fusion amplification product may thus provide information regarding co-occurrence of certain mutations in two or more targets of interest and thereby sensitivity of the cells to target directed drugs based on the presence or absence of such mutation.
[00055] In an aspect, one or more primers used in the method is labeled. A label may include a fluorescent label, a radioactive label, or a small molecule label; the nucleic acid base may include a label or may be attached to a label. A label may be attached directly or indirectly. In an aspect, a label is attached via a linker, which may include one or more additional component and/or may include one or more modified base. In the instance where a radioactive label is used, isotopes such as 3H, 14C, 32P, 35S, 36Cl, 51Cr, 57Co, 58Co, 59Fe, 90Y, 125I, 131I, and 186Re may be utilized. In an aspect, the small molecule label is biotin.
[00056] Other objects and advantages will become apparent to those skilled in the art from a review of the following description which proceeds with reference to the following illustrative drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
[00057] FIGURES 1A and 1B depict an amplification scheme for two targets - a first target, denoted Target A or "set A", and a second target, denoted Target B or "set B". Set A target primers are denoted P1 and P2. P1 and P2 primers comprise Target A complementary sequences for specific amplification of Target A. P2 is an elongated primer having a first, 5' region for dimerization (the "dimerization cassette" or dimer domain) followed by a second, 3' region which is complementary to Target A (the "Target A domain"). Set B target primers are denoted P3 and P4. P3 and P4 comprise Target B complementary sequences for specific amplification of Target B. P3 is an elongated primer having a first, 5' region for dimerization (the "dimerization cassette" or dimer domain) and a second, 3' region which is complementary to target B (the "target B domain"). P1 and P4 lack dimerization cassettes. Initial amplification (Figure 1A) proceeds to separately amplify Target A or set A, and Target B or set B. During subsequent rounds of amplification, the PCR products start acting as primers themselves and primer dimer formation via the (complement of the initial) dimerization cassettes occurs (Figure 1B). The dimer domain of set A primers (i.e. the 5' tail region and dimer domain of P2) is complementary to the dimer domain or tail of set B primers (i.e. the 5' tail region and dimer domain of P3), so that the complementary strands of P2 and P3 hybridize to form desired dimers which serve to link or concatenate Target A and Target B in a single, concatenated amplification product. Target A and B are linked via amplification from 5' to 3' across the dimerization cassettes. The concatenated product forms by amplification from the 5' end across the dimer-mediated overlaps. Following dimer formation, amplification proceeds to generate a target fusion amplification product wherein Target A and Target B are linked and form a concatenated multi-target amplicon fusion product as indicated. It is notable that, by adding primers P1 and P4 in a higher concentration, or relatively larger amount, to the reaction mixture, generation of a concatenated Set A - Set B fusion fragments is favored. Thus, as primers P2 and P3 may particularly be present in limiting or lower relative amounts, and P1 and P4 may be present in more abundant amounts, PCR products (black and blue in Figure 1B) start acting as primers themselves via primer dimer formation through the complements of the dimerization cassettes. This serves to concatenate, or link, Target A and Target B in a single amplification product. The multi-target amplicon fusion product enables subsequent simultaneous analysis of two distinct selected targets in a single molecule.
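The two-target scheme can be modeled, in a highly simplified way, as string operations on the top strands of the initial amplification products. The Python sketch below is offered only as an illustration of the overlap logic: it ignores hybridization thermodynamics, primer concentrations and strand bookkeeping beyond the top strand, and all function names and sequences are hypothetical. It assumes that the P3 cassette is the exact reverse complement of the P2 cassette, so that the 3' end of the set A top strand and the 5' end of the set B top strand carry the same sequence and can prime across one another.

```python
from typing import Optional

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def fuse(top_a: str, top_b: str, min_overlap: int) -> Optional[str]:
    """Join two top strands whose 3' end (A) and 5' end (B) share an overlap.

    Returns the fused top strand, or None if no overlap of at least
    min_overlap nucleotides exists.
    """
    longest = min(len(top_a), len(top_b))
    for n in range(longest, min_overlap - 1, -1):
        if top_a[-n:] == top_b[:n]:
            return top_a + top_b[n:]
    return None


def two_target_fusion(a_core: str, b_core: str, p2_cassette: str) -> Optional[str]:
    """Model the Set A - Set B fusion product.

    a_core / b_core: top strands of the target regions as amplified without
    tails (hypothetical inputs). P2 is the reverse primer of set A, so its
    cassette appears as a reverse complement at the 3' end of the set A top
    strand. P3 is the forward primer of set B and is assumed to carry the
    complementary cassette, which sits at the 5' end of the set B top strand.
    """
    tail = reverse_complement(p2_cassette)
    top_a = a_core + tail
    top_b = tail + b_core
    return fuse(top_a, top_b, min_overlap=len(p2_cassette))


if __name__ == "__main__":
    print(two_target_fusion("AAAACCCC", "GGGGTTTT", "ACCTGAGTCA"))
```

In this model the fused strand contains the Target A region, a single copy of the cassette-derived junction, and the Target B region, in that order.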
[00058] FIGURES 2A and 2B depict an amplification scheme for three targets - a first target, denoted Target A or "set A", a second target, denoted Target B or "set B", and a third target, denoted Target C or "set C". Set A target primers are denoted P1 and P2. P1 and P2 comprise Target A complementary sequence for specific amplification of Target A. P2 is an elongated primer having a first, 5' region for dimerization (the "dimerization cassette" or dimer domain) followed by a second, 3' region which is complementary to Target A (the "Target A domain"). Set B target primers are denoted P3 and P4. P3 and P4 comprise Target B complementary sequence for specific amplification of Target B. P3 is an elongated primer having a first, 5' region for dimerization (the "dimerization cassette" or dimer domain) followed by a second, 3' region which is complementary to Target B (the "Target B domain"). The P3 dimerization cassette is complementary to and is capable of forming a primer dimer with the P2 dimerization cassette. P4 is also an elongated primer having a first, 5' region for dimerization (the "dimerization cassette" or dimer domain) and a second, 3' region which is complementary to target B (the "target B domain"). The P4 dimerization cassette is complementary to and forms a primer dimer with the P5 dimerization cassette. P5 and P6 comprise Target C complementary sequence for specific amplification of Target C. P5 is an elongated primer having a first, 5' region for dimerization (the "dimerization cassette" or dimer domain) followed by a second, 3' region which is complementary to target C (the "target C domain"). By virtue of intermediary primer dimer formation via (complements of) P2 and P3 sequence and P4 and P5 sequence, concatenation of Target A with Target B with Target C occurs. Initial amplification (Figure 2A) proceeds to separately amplify Target A or set A, Target B or set B, and Target C or set C. After multiple rounds of amplification, the PCR products will act as primers themselves and primer dimer formation via the dimerization cassettes occurs (Figure 2B). The dimer domain of set A primers (i.e. the 5' tail region and dimer domain of P2) is complementary to the dimer domain or tail of set B primers (i.e. the 5' tail region and dimer domain of P3), so that the complementary strands of P2 and P3 hybridize to form desired dimers which serve to link or concatenate Target A and Target B in a single amplification product. The dimer domain of set B primers (i.e. the 5' tail region and dimer domain of P4) is complementary to the dimer domain or tail of set C primers (i.e. the 5' tail region and dimer domain of P5), so that the complementary strands of P4 and P5 hybridize to form desired primer dimers which serve to link or concatenate Target B and Target C. Following primer dimer formation, amplification proceeds to generate a target fusion amplification product wherein Target A, Target B, and Target C are linked and form a concatenated multi-target amplicon fusion product as indicated. It is notable that, by adding primers P1 and P6 (primers with Target complementary sequence and lacking dimerization cassettes) in a higher concentration, or relatively larger amount to the reaction mixture, generation of a concatenated Set A - Set B - Set C fusion fragment is favored. Thus, as primers P2, P3, P4 and P5 are present in limiting or lower relative amounts, and P1 and P6 are in more abundant relative amounts, PCR products start acting as primers themselves via dimer formation through the (complements of their) dimerization cassettes. This serves to concatenate or link Target A and Target B and also Target C into a single amplification product. The target fusion amplification product enables subsequent simultaneous analysis of three distinct selected targets in a single molecule. Additional Targets, such as Target D, Target E, etc. can be concatenated via intermediary and overlapping primer dimer dimerization cassettes in a manner similar to that shown in the figures for three targets. With increasing length of the fully concatenated product, PCR conditions may need to be adjusted (with extended elongation times allowing for the synthesis of longer molecules, just to give one example). Dimerization cassettes facilitate linking or concatenation of initial amplification products to form a multi-target amplicon fusion product wherein multiple target sequences are concatenated in a single nucleic acid for analysis and evaluation of multiple targets simultaneously from a single separate cell sample nucleic acid. In a virtual example where one would concatenate many targets (greater than 6 for instance, up to 9 for instance, more than 9 for instance), one could envision adding longer tails (containing a sequencing primer binding site) after an arbitrary number of targets (for instance: after every three), so NGS (with its limited read lengths) can be used to specifically query specific targets or target subsets.
[00059] FIGURE 3 depicts the overall scheme for primer dimer-mediated concatenation to generate a multi-target amplicon fusion product in single cells. The figure depicts concatenation of four targets, however, the scheme is applicable to more targets or fewer targets, similar as to what is shown in Figures 1 and 2. A multi-target amplification product comprises End Targets and Intermediate Target(s). Intermediate Targets are concatenated via primers comprising target complementary sequence (depicted as straight lines in the primers) and primer dimerization cassette sequence (depicted as dashed lines in the primers). The primer dimerization cassette in each instance is specific for a primer dimer pair, thus specifically concatenating Target sequences in an ordered amplification product. Therefore, as depicted in the figure, End Target is linked to a first Intermediate Target via primer pairs wherein one primer comprises a specific 5' first primer dimerization cassette and 3' Intermediate Target complementary sequence, and the other primer comprises 5' sequence complementary to the End Target and 3' primer dimerization cassette complementary to the first dimerization cassette. Intermediate Targets are fused via primer pairs comprising Target complementary sequence and complementary primer dimerization cassettes. Target complementary sequence primers without primer dimerization cassettes are utilized at the 5' and 3' ends of the multi-target amplification product to complete the product amplification via complementarity to end target sequences.
DETAILED DESCRIPTION
[00060] In accordance with the present invention there may be employed conventional molecular biology, microbiology, and recombinant DNA techniques within the skill of the art. Such techniques are explained fully in the literature. See, e.g., Sambrook et al, "Molecular Cloning: A Laboratory Manual" (1989); "Current Protocols in Molecular Biology" Volumes I-III [Ausubel, R. M., ed. (1994)]; "Cell Biology: A Laboratory Handbook" Volumes I-III [J. E. Celis, ed. (1994)]; "Current Protocols in Immunology" Volumes I-III [Coligan, J. E., ed. (1994)]; "Oligonucleotide Synthesis" (M.J. Gait ed. 1984); "Nucleic Acid Hybridization" [B.D. Hames & S.J. Higgins eds. (1985)]; "Transcription And Translation" [B.D. Hames & S.J. Higgins, eds. (1984)]; "Animal Cell Culture" [R.I. Freshney, ed. (1986)]; "Immobilized Cells And Enzymes" [IRL Press, (1986)]; B. Perbal, "A Practical Guide To Molecular Cloning" (1984); PCR Technology: Principles and Applications for DNA Amplification (Breakthroughs in Molecular Biology) (Henry A. Erlich, ed. (1993)).
[00061] Therefore, if appearing herein, the following terms shall have the definitions set out below.
[00062] Nucleic acids may comprise and include any recognized, known or acceptable nucleotide bases or nucleotides, modified nucleotides, etc.
[00063] An "RNA molecule" refers to the polymeric form of ribonucleotides (adenine, guanine, uracil, and/or cytosine), particularly including its single stranded form, and including mRNA.
[00064] A "DNA molecule" refers to the polymeric form of deoxyribonucleotides (adenine, guanine, thymine, and/or cytosine) either in its single stranded form, or a double-stranded helix. This term refers only to the primary and secondary structure of the molecule, and does not limit it to any particular tertiary forms. Thus, this term includes double-stranded DNA found, inter alia, in linear DNA molecules (e.g., restriction fragments), viruses, plasmids, and chromosomes. DNA may include cDNA.
[00065] A DNA "coding sequence" is a double-stranded DNA sequence which is transcribed and translated into a polypeptide in vivo when placed under the control of appropriate regulatory sequences. The boundaries of the coding sequence may be determined by a start codon at the 5' (amino) terminus and a translation stop codon at the 3' (carboxyl) terminus. A coding sequence can include, but is not limited to, prokaryotic sequences, cDNA from eukaryotic mRNA, genomic DNA sequences from eukaryotic (e.g., mammalian) DNA, and even synthetic DNA sequences. In eukaryotes, a polyadenylation signal and transcription termination sequence will usually be located 3' to the coding sequence.
[00066] An "amplicon" refers to a piece of DNA or RNA that is the source and/or product of natural or artificial amplification or replication events. An amplicon can be formed using various methods including polymerase chain reactions (PCR), ligase chain reactions (LCR), or natural gene duplication.
[00067] The term "multi-target amplicon fusion", "multi-target amplicon", "multi-target amplicon fusion product", "multi-target amplification fusion product(s)", may be used interchangeably, and refer to a product in accordance with the invention wherein sequences from at least two targets, particularly defined or pre-selected targets, are linked, joined or otherwise concatenated in a single nucleic acid or stretch of nucleic acids. In particular, concatenation to form a multi-target amplification fusion product is mediated via dimerization cassettes and primer dimer formation to link, via amplification across the dimers, initial single target specific amplification products in a multi-target product. In an aspect, the target sequences are not ordinarily spatially associated or otherwise near or linked with one another. In an aspect, the target sequences are distinct.
[00068] The term "distinct" as it refers to target sequences particularly, means that the target sequences are not closely associated physically relative to one another, not closely associated spatially in a nucleic acid or genomic DNA context, are not present on a same RNA or mRNA (or cDNA derived therefrom), or are not closely linked spatially. The distinct target sequences may be linked in a physiologically or clinically relevant sense but are not closely linked in a physical sense in terms of location on genomic DNA, RNA or chromosomal location. Sequences are distinct if they cannot be amplified on a single amplicon in a practical PCR-based manner. Thus, while they may be mapped to the same chromosome, they are thousands of bases apart and could not be amplified together, or could not be cloned together using standard means and methods.
[00069] The term "oligonucleotide," refers to a molecule comprised of two or more nucleotides, preferably at least 5, particularly more than five, particularly at least 10, particularly more than ten. Its exact or optimal size will depend upon many factors which, in turn, depend upon the ultimate function and use of the oligonucleotide. Oligonucleotides (e.g., primers) can be prepared by any suitable method, including direct chemical synthesis by a method such as the phosphotriester method of Narang et al, 1979, Meth. Enzymol. 68:90-99; the phosphodiester method of Brown et al, 1979, Meth. Enzymol. 68: 109-151; the diethylphosphoramidite method of Beaucage et al, 1981, Tetrahedron Lett. 22: 1859-1862; and the solid support method of U.S. Pat. No. 4,458,066, each incorporated herein by reference. A review of synthesis methods of conjugates of oligonucleotides and modified nucleotides is provided in Goodchild, 1990, Bioconjugate Chemistry 1(3): 165-187, incorporated herein by reference.
[00070] The term "primer," as used herein, refers to an oligonucleotide capable of acting as a point of initiation of DNA synthesis under suitable conditions. Such conditions include those in which synthesis of a primer extension product complementary to a nucleic acid strand is induced in the presence of up to four different nucleoside triphosphates and an agent for extension (e.g., a DNA polymerase or reverse transcriptase) in an appropriate buffer and at a suitable temperature.
[00071] A primer may be a single-stranded nucleic acid, particularly DNA, but may include suitable alternative nucleic acids, such as peptide nucleic acids (PNA). The appropriate length of a primer depends on the intended use of the primer but typically ranges from 6 to 50 nucleotides, in certain embodiments from 10-40 nucleotides, from 10-30 nucleotides, from 15-35 nucleotides, and/or from 20-40 nucleotides. Short primer molecules, or those containing a low GC percentage, generally require lower temperatures to form sufficiently stable hybrid complexes with their template. A primer need not reflect the exact sequence of the template nucleic acid, but must be sufficiently complementary to specifically hybridize with the intended template. The design of suitable primers for the amplification of a given target sequence is known in the art and described in the literature including as cited herein.
[00072] As described herein, primers can incorporate additional features which allow for the detection or immobilization of the primer but do not alter the basic property of the primer, that of acting as a point of initiation of DNA synthesis. For example, primers may contain an additional nucleic acid sequence at the 5' end which does not hybridize to the initial target nucleic acid, but which facilitates isolation, cloning, detection, or sequencing of the amplified product. The region of the primer which is sufficiently complementary to template or target, particularly selected target, to hybridize may be referred to herein as the hybridizing region.
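By way of illustration only, the observation that short or low-GC primers require lower temperatures can be sketched with the Wallace rule of thumb, Tm ≈ 2(A+T) + 4(G+C) °C. This rule is not taken from the present disclosure; it is a commonly used rough approximation for short oligonucleotides and ignores salt and primer concentration. The two primer sequences below are hypothetical.

```python
def gc_fraction(primer: str) -> float:
    """Fraction of G and C bases in the primer."""
    s = primer.upper()
    return (s.count("G") + s.count("C")) / len(s)


def wallace_tm(primer: str) -> int:
    """Rough melting temperature in degrees C by the Wallace rule.

    Assumption: 2 degrees per A/T and 4 degrees per G/C; only a crude
    estimate, intended for short oligonucleotides.
    """
    s = primer.upper()
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    return 2 * at + 4 * gc


# Hypothetical primers, for illustration only.
short_low_gc = "ATTATAATTAAT"            # 12 nt, 0% GC
longer_high_gc = "GCGGCCAGCTGGCCGCATGC"  # 20 nt, 80% GC

for p in (short_low_gc, longer_high_gc):
    print(p, len(p), round(gc_fraction(p), 2), wallace_tm(p))
```

Under these assumptions the short, AT-rich primer receives a much lower estimate than the longer, GC-rich primer, consistent with the statement above.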
[00073] As used herein, a primer is "specific" for a target sequence if, when used in an amplification reaction under appropriate or sufficiently stringent conditions, the primer hybridizes primarily to the target nucleic acid, particularly the pre-selected and specific target sequence. Typically, a primer is specific for a target sequence if the primer-target duplex stability is greater than the stability of a duplex formed between the primer and any other sequence found in the sample. One of skill in the art will recognize that various factors, such as salt conditions as well as uniqueness of the base composition of the primer and the location of the mismatches, will affect the specificity of the primer, and that routine experimental confirmation of the primer specificity can be conducted. Thus, it is known in the art that perfect homologies are more critical towards the 3' end of a primer, with the outermost 3' base being critical. Hybridization and amplification conditions can be chosen under which the primer forms stable duplexes only with a target sequence. Thus, the use of target-specific primers under suitably stringent amplification conditions enables the selective amplification of those target sequences which contain the target primer binding sites. It is notable that a primer may consist entirely of target-specific complementary nucleic acids and thereby hybridize to target along its entire length. Alternatively, a primer may comprise target-specific complementary nucleic acids and additional nucleic acids. The additional nucleic acids may include nucleic acids complementary to another target, may include nucleic acids complementary to another primer, may include nucleic acids that are designed to tag or identify the primer's amplification product, may include barcode sequence, etc.
[00074] In certain particular embodiments, target primers for use in the methods described herein comprise or consist of a nucleic acid of at least about 15 nucleotides long that has the same sequence as, or is complementary to, a 15 nucleotide long contiguous sequence of the target. Longer primers, e.g., those of about 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 45, or 50, nucleotides long that have the same sequence as, or sequence complementary to, a contiguous sequence of the target, will also be of use in certain embodiments. All intermediate lengths of the aforementioned primers are contemplated for use herein. As would be recognized by one skilled in the art, the primers may have additional sequences added (e.g., nucleotides that may not be the same as, or complementary to, the target), such as restriction enzyme recognition sites, adaptor sequences for sequencing, barcode sequences, and the like. The primers may have additional chemical groups added, including ones which do not react with nucleic acids, such as biotin. Therefore, the length of the primers may be longer, such as 55, 56, 57, 58, 59, 60, 65, 70, 75 nucleotides, or greater than 75 nucleotides, in length or more, depending on the specific use or need.
[00075] The primers herein are selected to be "substantially" complementary to different strands of a particular target DNA sequence. In the instance of substantially complementary versus complementary, the primers must be sufficiently complementary to hybridize with their respective strands in a specific and effective manner. Thus, the relative amount or extent, length etc of complementary sequence should be sufficient to provide specific and stable hybridization to target or selected sequence via complementarity among non-target or non-selected sequence, for the primers to be substantially complementary. In an aspect, the primer sequence need not reflect the exact sequence of the template. For example, a non-complementary nucleotide fragment may be attached to the 5' end of the primer, with the remainder of the primer sequence being complementary to the strand. Alternatively, non-complementary bases or longer sequences can be interspersed into the primer, provided that the primer sequence has sufficient complementarity with the sequence of the strand to hybridize therewith and thereby form the template for the synthesis of the extension product.
[00076] As used herein, the terms "restriction endonucleases" and "restriction enzymes" refer to bacterial enzymes, each of which cut double-stranded DNA at or near a specific nucleotide sequence.
[00077] Two DNA sequences are "substantially homologous" when at least about 75% (preferably at least about 80%, and most preferably at least about 90 or 95%) of the nucleotides match over the defined length of the DNA sequences. Sequences that are substantially homologous can be identified by comparing the sequences using standard software available in sequence data banks, or in a Southern hybridization experiment under, for example, stringent conditions as defined for that particular system. Defining appropriate hybridization conditions is within the skill of the art.
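By way of illustration only, the percentage-match criterion above can be sketched as follows. The sketch assumes the two sequences have already been aligned without gaps over the same defined length; the function names and example sequences are hypothetical, and real comparisons would normally use alignment software.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of matching nucleotides over the defined length.

    Assumption: both sequences are pre-aligned, gap-free and equal in length.
    """
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def substantially_homologous(seq_a: str, seq_b: str, threshold: float = 75.0) -> bool:
    """True when at least `threshold` percent of the nucleotides match."""
    return percent_identity(seq_a, seq_b) >= threshold


# Hypothetical 10-nt sequences, for illustration only.
reference = "ACGTACGTAC"
print(percent_identity(reference, "ACGTACGTTT"))          # 8 of 10 positions match
print(substantially_homologous(reference, "ACGTACGTTT"))  # meets the 75% threshold
print(substantially_homologous(reference, "ACGTAAAAAA"))  # 6 of 10, below threshold
```

The default threshold of 75% may be raised to 80%, 90% or 95% to reflect the preferred ranges stated above.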
[00078] A "heterologous" region of the DNA construct is an identifiable segment of DNA within a larger DNA molecule that is not found in association with the larger molecule in nature. Thus, when the heterologous region encodes a mammalian gene, the gene will usually be flanked by DNA that does not flank the mammalian genomic DNA in the genome of the source organism. Another example of a heterologous coding sequence is a construct where the coding sequence itself is not found in nature (e.g., a cDNA where the genomic coding sequence contains introns, or synthetic sequences having codons different than the native gene). Allelic variations or naturally-occurring mutational events do not give rise to a heterologous region of DNA as defined herein.
[00079] As used herein, "pg" means picogram, "ng" means nanogram, "ug" or "μg" mean microgram, "mg" means milligram, "ul" or "μl" mean microliter, "ml" means milliliter, "l" means liter.
[00080] The term "chemiluminescent group," as used herein, refers to a group which emits light as a result of a chemical reaction without the addition of heat. By way of example only, luminol (5-amino-2,3-dihydro-1,4-phthalazinedione) reacts with oxidants like hydrogen peroxide (H2O2) in the presence of a base and a metal catalyst to produce an excited state product (3-aminophthalate, 3-APA).
[00081] The term "chromophore," as used herein, refers to a molecule which absorbs light of visible wavelengths, UV wavelengths or IR wavelengths.
[00082] The term "detectable label," as used herein, refers to a label which is observable using analytical techniques including, but not limited to, fluorescence, chemiluminescence, electron-spin resonance, ultraviolet/visible absorbance spectroscopy, mass spectrometry, nuclear magnetic resonance, magnetic resonance, and electrochemical methods.
[00083] The term "dye," as used herein, refers to a soluble, coloring substance which contains a chromophore.
[00084] The term "fluorophore," as used herein, refers to a molecule which upon excitation emits photons and is thereby fluorescent.
[00085] In some embodiments, the term "label," as used herein, refers to a substance which is incorporated into a compound and is readily detected, whereby its physical distribution is detected and/or monitored.
[00086] The term "probe" refers to a compound or molecule for the detection of a target, including detection of one or more multi-target amplification product. The probe may comprise complementary nucleic acids specific for an amplification product. The probe may comprise an agent, a linker, a label, or any combination thereof. The probe may comprise an agent and a linker, or an agent and a label, or a label. In another instance, the probe may comprise an agent and a label. The probe may comprise a binder which recognizes a label on the target product.
[00087] Suitable samples for any of the methods and assays provided herein contain cells for analysis and may comprise, but are not limited to, a whole blood sample, peripheral blood sample, lymph sample, tissue sample, tumor biopsy sample, bone marrow sample, or other cellular sample. The sample may contain one or more cell types, derived from a whole blood sample, peripheral blood sample, peripheral blood mononuclear cell (PBMC) sample, lymph sample, tissue sample, tumor biopsy sample, bone marrow sample, or other cellular sample. Cells of the sample may be isolated from other components of the sample prior to use in the methods provided. Particular cell types of the sample may be isolated from other cell types of the sample prior to use in the methods provided. In some embodiments, peripheral blood mononuclear cells (PBMCs, e.g., lymphocytes, monocytes and macrophages) of a blood sample are isolated from other cell types of the blood sample prior to use in the methods provided. For example, in some embodiments, lymphocytes (e.g., B cells, T cells or NK cells) of the sample are isolated from other cell types of the sample prior to use in the methods provided. B cells of the sample may be isolated from other cell types of the sample prior to use in the methods provided. In some embodiments, cancer cells may be isolated from normal cells of the sample prior to use in the methods provided. A sample may comprise complex populations of cells, which can be assayed as a population, or separated into sub-populations. Such cellular samples can be separated by centrifugation, elutriation, density gradient separation, apheresis, affinity selection, panning, FACS, filtration, or centrifugation with Hypaque, or by using antibodies specific for markers identified with particular cell types. Alternatively, a heterogeneous cell population can be used. Once a sample is obtained, it can be used directly, frozen, or maintained in appropriate culture medium for short periods of time. Methods to isolate one or more cells for use according to the methods of this invention are performed according to standard techniques and protocols well-established in the art.
[00088] In an aspect, the invention provides scalable methods and compositions for assessing at least two distinct targets in a single cell by generating a multi-target fusion product via primer dimer-mediated amplification. Primer dimer-mediated amplification serves to join at least two targets, which are ordinarily distinct and separate, in a single amplification product for analysis. The method is conducted on a single cell basis across a sample population of cells and thereby provides data on the entire population of cells individually.
[00089] In accordance with the present invention, methods, assays and compositions are provided which generate target specific information for single individual cells across a population of cells, providing specific cell by cell information assessing two or more targets, including target mutations or variants. Thus, the methods and compositions of the invention enable the evaluation of at least two targets simultaneously on a single cell basis, permitting evaluation of co-occurrence at the individual cell level across an entire population of cells in a given sample.
[00090] Thus, the present invention provides a novel approach that combines the power of single cell manipulation, such as via existing micro-fluidics and micro-droplet technology, with mutation(s)-specific PCR-based target fusion DNA hybrid synthesis. In accordance with the methods herein, and the associated compositions, a multi-target amplification product is generated and evaluated at the single cell level in a scalable fashion. Scalability enables multiplexing analysis and determination of co-occurrence and multi-target mutation assessment at the individual cell level across a population of cells in a sample, particularly wherein at least 10,000, at least 50,000, at least 100,000, at least 500,000, at least 1,000,000, at least 5,000,000, or at least 10,000,000 cells are evaluated.
[00091] Existing approaches to target analysis in cells evaluate a whole population in an aggregated fashion. While an emerging clone representing 10% or thereabouts of the cell population might be recognized, current approaches largely fail to readily identify or recognize rare clones (<10%, particularly <1%) and cannot assess multiple targets simultaneously on a cell-by-cell basis. For example, current approaches cannot evaluate target 1 co-occurrence with an oncogenically or drug resistance significant target 2 mutation.
[00092] In accordance with the present invention, single cell analysis and amplification techniques are combined with primer compositions which utilize primer-dimer formation to link two or more targets of interest in a single amplification product. Once an amplification product is generated in a single cell, each cell product can be evaluated, for example utilizing sequencing or other such techniques to provide a target-relevant readout or mutation status at each target for every individual cell in the sample. In an embodiment of the invention, select cells in a sample are evaluated and are first selected by phenotype or other characteristic, including the presence of a surface antigen, such as by FACS analysis.
[00093] The presently available systems, including NGS applications, are typically unable to establish the possible co-occurrence (or lack thereof) of specific, clinically significant somatic mutation(s) within a single cell, for example clinically relevant somatic mutation(s) in a given B- or T-cell clone, when applied to a large sample size (millions of cells in parallel). Scaling up to analysis of millions of cells to provide cell-by-cell information in parallel still remains a technological hurdle. Evaluating all cells in total in a homogenized tumor biopsy sample fails to identify low prevalence genetic alterations and/or emerging clones. A therapeutic challenge in effectively battling cancer, including achieving sustained remission, lies in tumor heterogeneity, and in instances of metastasis or recurrence, in emergence of mutations and clones with reduced sensitivity or resistance to cancer agents. Thus, being able to see which mutations co-occur in the same tumor cell (clone), would be a tremendous step forward in the process of NGS-driven assessments and clinical decision making and in the design and adaptation of novel therapies and therapeutic strategies.
[00094] Advances in techniques for the isolation of single cells (including micro-pipetting, fluorescence-activated cell sorting (FACS) and microfluidics), whole genome or transcriptome amplification, and genome-wide analysis platforms and NGS devices have paved the way for high-resolution analysis of the genome or transcriptome from single cells. A diploid human cell contains approximately 7 pg genomic DNA. This necessitates amplification prior to microarray or NGS based analysis. Current whole-genome amplification (WGA) principles are based on Multiple Displacement Amplification (MDA), Polymerase Chain Reaction (PCR), or a combination of both. Unfortunately, various imperfections in whole genome amplification can considerably affect the interpretation of the readout, including the breadth of genomic coverage, amplification bias due to local differences in % GC-bias, the prevalence of allelic drop outs, preferential allelic amplifications, and chimeric DNA molecules. Nucleotide copy errors can vary significantly between different WGA approaches, making some methods better suited than others for detecting specific classes of genetic variation (Kumar P, et al (2013) How to analyze a single blastomere? Application of whole-genome technologies: micro-arrays and next generation sequencing. In: Sermon K, Viville S, editors. Textbook of Human Reproductive Genetics. Cambridge: Cambridge University Press; Treff NR et al (2011) Mol Hum Reprod 17:335-343; Zong C et al (2012) Science 338:1622-1626; Voet T et al (2013) Nucleic Acids Res 41:6119-6138). Thus, a major issue with WGA and amplifying the full genetic material from a single cell is bias and error. This can be mitigated but not eliminated, even with separation of the amplification reaction, for instance into individual nanoliter-scale reactions (such as with microwell displacement amplification systems (MIDAS) (Gole, J et al (2013) Nature Biotechnology 31:1126-1132)) or into individual micro droplets.
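By way of illustration only, the figure of approximately 7 pg genomic DNA per diploid human cell can be checked with a short back-of-envelope calculation. The genome size (about 6.4 x 10^9 base pairs per diploid cell) and the average molar mass of a base pair (about 650 g/mol) are assumptions introduced here and are not values stated in this disclosure.

```python
AVOGADRO = 6.02214076e23            # molecules per mole

# Assumptions (not from the disclosure): approximate textbook values.
BP_PER_DIPLOID_GENOME = 6.4e9       # base pairs in a diploid human cell
GRAMS_PER_MOLE_PER_BP = 650.0       # average molar mass of one base pair


def genome_mass_pg(base_pairs: float = BP_PER_DIPLOID_GENOME,
                   g_per_mol_bp: float = GRAMS_PER_MOLE_PER_BP) -> float:
    """Mass of double-stranded DNA of the given length, in picograms."""
    grams = base_pairs * g_per_mol_bp / AVOGADRO
    return grams * 1e12  # 1 g = 1e12 pg


print(f"{genome_mass_pg():.1f} pg per diploid cell")
```

Under these assumptions the estimate is close to 7 pg, which illustrates why amplification is needed before array-based or NGS-based analysis of a single cell.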
[00095] In seeking to evaluate individual and relevant cellular changes, whole genome amplification and sequencing of a single cell presents challenges and provides off target information which may not be necessary or valuable. On the other hand, targeted or target specific information, when provided for a population of cells, and not on an individual cell basis, fails to identify specific cell-by-cell information and eliminates the possibility of assessing mutations, including co-occurrence, at the individual cell level. Therefore, we sought to develop a scalable approach which comprises single cell analysis and targeted PCR amplification techniques to evaluate specific DNA or RNA targets in tandem in a single cell.
[00096] The instant invention provides a scalable approach that permits rapid, specific and effective parallel evaluation of defined sets of targets or loci in single cells and is applied to pools of millions of individual cells without the need for whole genome analysis. The present technique particularly enables cataloging and assessment for co-occurrence of select mutations or alterations. Thus, single nucleotide polymorphisms (SNPs) or somatic mutations can be evaluated, thereby providing details of cell clonal alterations. The new approach uses the combined power of single cell manipulation, such as via existing micro-fluidics and micro-droplet technology, with mutation(s)-specific PCR-based target fusion DNA hybrid synthesis. Massive parallel sequencing, also commonly referred to as Next Generation Sequencing (NGS), can be implemented in the final analysis to provide a comprehensive dataset at the nucleotide resolution level.
[00097] In accordance with the present approach, a method or system is provided comprising single cell sampling and primer dimer-mediated amplification of multiple targets, wherein the multiple targets are linked in a single amplification product via primer-dimer formation. Thus, the invention provides a method wherein (a) single cells are provided for analysis, particularly wherein the single cells are suitable for nucleic acid, including DNA or RNA, amplification; (b) amplification proceeds utilizing primers directed to at least two targets of interest, wherein the targets are linked via primer dimer formation to provide one or more fusion amplification product which is a multi-target amplification product, a multi-target amplicon fusion product; and (c) analysis of the amplification product or products is conducted to evaluate the sequence and/or mutational status at each target.
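By way of illustration only, the requirement that each primer dimerization cassette be specific for its own primer dimer pair, so that targets are concatenated in a defined order, can be sketched as a simple check. The sketch treats two cassettes as able to dimerize only when one is the exact reverse complement of the other, which is a simplification; partial complementarity and thermodynamics are ignored. The cassette sequences and function names are hypothetical and are not taken from this disclosure.

```python
_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq.upper()))


def can_dimerize(cassette_a: str, cassette_b: str) -> bool:
    """Simplifying assumption: dimerization requires exact complementarity."""
    return cassette_b.upper() == reverse_complement(cassette_a)


def junctions_are_specific(junctions) -> bool:
    """Check a list of (cassette, partner_cassette) pairs, one per junction.

    Returns True when every cassette pairs with its own partner and with
    no partner belonging to a different junction.
    """
    for i, (cassette, partner) in enumerate(junctions):
        if not can_dimerize(cassette, partner):
            return False
        for j, (_, other_partner) in enumerate(junctions):
            if j != i and can_dimerize(cassette, other_partner):
                return False
    return True


# Four targets give three junctions; cassette sequences are hypothetical.
cassettes = ["GATTACAGGCTA", "CCTGAAGTCATG", "TTGACCGATAGC"]
junctions = [(c, reverse_complement(c)) for c in cassettes]
print(junctions_are_specific(junctions))
```

Reusing the same cassette at two junctions fails the check, reflecting that a shared cassette would permit targets to be joined out of order.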
[00098] Single Cell Sampling
[00099] In a first step, single cells are made available and separated for analysis. Single cell isolates may be generated using any number of available techniques or technologies, provided that the single cells are stable and suitable for individual cell-based nucleic acid amplification. Exemplary methods and approaches for single cell preparation include but are not limited to micropipetting, single cell fluorescence-activated cell sorting (FACS), laser capture microdissection (Frumkin, D et al. (2008) BMC Biotechnology 8:17; Boone, DR et al. (2013) J Vis Exp doi: 10.3791/50308), and micro-fluidics. In one aspect, micro droplets (for example including but not limited to water-based droplets in oil) are generated that on the average each contain only one cell. Available methods and approaches for generating and evaluating micro droplets include those described by Link and colleagues and Raindance Technologies (Kiss, MM et al. (2008) Anal Chem 80(23):8975-81; US2010/0137163; US2013/0217583; WO2014/165559; and U.S. Patents 8,528,589 and 8,857,462; incorporated herein by reference). Raindance Technologies droplets are applicable for use with single cells, single proteins, single molecules of nucleic acids, and wherein a single-plex PCR reaction is conducted inside of each droplet. The speed of droplet preparation is up to 10,000 per second, bringing 10-15 million droplets (i.e. 10^6 cells) in 1000-1500 seconds within a feasible range (i.e. less than half an hour to produce). An alternative droplet-to-digital approach and micro-fluidic platform has recently been described (Shih, SCC et al. (2015) Lab Chip 15:225-236).
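By way of illustration only, the droplet throughput figures above can be reproduced with simple arithmetic, and the expected droplet occupancy can be estimated. The occupancy estimate assumes that cells distribute among droplets at random according to a Poisson distribution; that model is a common assumption for droplet encapsulation and is not stated in this disclosure. Function names are hypothetical.

```python
import math


def generation_time_s(n_droplets: int, rate_per_s: float = 10_000) -> float:
    """Seconds needed to produce n_droplets at the stated rate."""
    return n_droplets / rate_per_s


def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k cells in a droplet for mean occupancy lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def occupancy(n_cells: int, n_droplets: int) -> dict:
    """Expected fractions of empty, single-cell and multi-cell droplets.

    Assumption: Poisson loading with mean n_cells / n_droplets.
    """
    lam = n_cells / n_droplets
    p_empty = poisson_pmf(0, lam)
    p_single = poisson_pmf(1, lam)
    return {"empty": p_empty, "single": p_single,
            "multiple": 1.0 - p_empty - p_single}


for droplets in (10_000_000, 15_000_000):
    seconds = generation_time_s(droplets)
    fractions = occupancy(1_000_000, droplets)
    print(droplets, seconds, {k: round(v, 4) for k, v in fractions.items()})
```

Under the Poisson assumption, spreading 10^6 cells over 10-15 million droplets leaves most droplets empty but keeps droplets holding more than one cell to a small fraction, which is the practical reason for using many more droplets than cells.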
[000100] It is notable that, in the present approach, the single cells can be either unsorted (i.e. when using, for instance, whole blood, or PBMCs, or a tumor or tissue biopsy sample), or the single cells can be obtained after specific isolation (sorting) of a subpopulation. For example, cells can be obtained through collagenase-mediated disaggregation of solid tumor samples. Cells may be pre-separated or pre-selected exploiting for example cell surface markers combined with magnetic bead capture technologies or conventional fluorescence-activated cell sorting (FACS).
[000101] In an aspect of the invention, cells do not need to be viable, as long as they are sufficiently intact to maintain their original nucleic acid content, and the nucleic acid is preserved and suitable for amplification, including of sufficient quality to allow amplification. In the case of pre-separation or pre-selection, mutations identified or characterized in accordance with the analysis provided herein may then be correlated back to sorted phenotypes, markers or cell characteristics.
[000102] Combination with amplification reagents
[000103] In a next step, suitable amplification reagents are introduced into the single cell samples or droplets. In one approach, single cell sample droplets are subsequently fused 1:1 with pre-formulated reaction droplets containing specific reagents needed for next steps of the procedure. This can be achieved using one of several established micro-fluidics approaches. In one such approach, the cell is permeabilized, sonicated, or lysed, or any other suitable method is used whereby nucleic acid is made available for subsequent amplification and assessment of specific target(s).
[000104] More specifically, the added reagents (for example reagents contained in "the second droplet" (or reaction droplet) or otherwise combined with the single cell sample or droplet) include a complex mixture of PCR amplification reagents. Reagents include suitable PCR buffer, nucleotides (which may or may not be fluorescently or otherwise labeled), polymerase, and a target region-specific, custom-designed set or mixture of primers. The primers include primers which specifically hybridize to each target region for amplification of at least two targets that can but do not necessarily need to be in close proximity on the human genome.
[000105] In one aspect of the invention DNA is evaluated and is amplified, particularly genomic DNA is evaluated and is amplified. In such an aspect, amplification reagents suitable and applicable for DNA amplification are utilized and added or otherwise combined with the single cell sample or droplet. In one such aspect amplification proceeds with only DNA amplification reagents.
[000106] In another aspect, RNA is evaluated. In the instance wherein RNA is evaluated, the RNA is first reverse transcribed to DNA. Methods and approaches suitable for transcribing RNA to DNA, particularly RNA to cDNA, including in RT-PCR applications, are known and available to one skilled in the art. Wherein RNA is evaluated, additional steps and/or reagents may be utilized, for instance, proteinase digestion, followed optionally by proteinase inactivation. In another aspect, RNA is inactivated or otherwise destroyed after cDNA synthesis, for instance utilizing RNase treatment of the sample. These steps and/or reagents may be utilized to enable more accurate and/or quantitative amplification or results, including for example to improve the yield and/or accuracy of cDNA generation.
[000107] Target-specific amplification
[000108] Specific amplification of at least two target sites in a single cell is then conducted. In an aspect of the invention, single cell amplification is conducted in individual chambers, micro-channels or droplets for each single cell. In one such aspect the chamber(s) or micro-channel(s) are heated and/or cooled to facilitate and/or control amplification. In one aspect of the invention, amplification may proceed via use of a continuous micro-fluidics device, including wherein amplification occurs while droplets pass through a temperature gradient wherein the temperature is suitable for amplification. Cycling of annealing, extension, and denaturation and then repeating same is a standard aspect of known and available amplification techniques and technologies and can be readily implemented and applied by one of skill in the art. In one aspect, amplification proceeds through initial denaturation at high temperature (about 94°C, range 85-95°C), annealing at lower range temperature (about 54°C, range 50-65°C), followed by extension at a mid-range temperature (about 72°C, range 65-75°C).
[000109] Cycling and amplification conditions may follow methods of those skilled in the art or known in the art. In an aspect, for example, conditions used in a PCR Express™ thermal cycler may be utilized (Hybaid, Ashford, UK), and the following cycling conditions may be used: 1 cycle at 95°C for 15 minutes, 25 to 40 cycles at 94°C for 30 seconds, 59°C for 30 seconds and 72°C for 1 minute, followed by one cycle at 72°C for 10 minutes. As recognized by the skilled person, thermal cycling conditions may be optimized, for example, by modifying annealing temperatures, annealing times, number of cycles and extension times. Also, the amount of primer and other PCR reagents used, as well as PCR parameters (e.g., annealing temperature, extension times and cycle numbers), may be optimized to achieve desired PCR amplification efficiency.
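By way of illustration only, the total hold time implied by the exemplary cycling conditions above can be computed as follows. The calculation counts only the programmed hold times and ignores temperature ramping between steps, which adds instrument-dependent time; the function name and parameter names are hypothetical.

```python
def total_run_minutes(cycles: int,
                      initial_min: float = 15,
                      denature_s: float = 30,
                      anneal_s: float = 30,
                      extend_s: float = 60,
                      final_min: float = 10) -> float:
    """Sum of programmed hold times, in minutes, excluding ramp times."""
    per_cycle_s = denature_s + anneal_s + extend_s
    return initial_min + cycles * per_cycle_s / 60 + final_min


# The stated range of 25 to 40 cycles.
print(total_run_minutes(25))  # 75.0
print(total_run_minutes(40))  # 105.0
```

Changing the annealing time, extension time or cycle number, as contemplated for optimization, changes the total accordingly.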
[000110] In an aspect of the invention, amplification may be performed utilizing any suitable method of amplification, including polymerase chain reaction (PCR), ligase chain reaction (Barany, F. (1991) Proc. Natl. Acad. Sci. 88: 189-193), rolling circle amplification (Lizardi, P.M. et al. (1998) Nature Genetics 19:225-232), strand displacement amplification (Walker, G.T. et al. (1992) Proc. Natl. Acad. Sci. 89:392-396) or alternatively any means or method whereby sufficient amounts of multi-target amplification product may be obtained, including for analysis or sequencing.
[000111] In certain aspects contemplated herein, "digital PCR" methods may be used, for example to quantitate the number of target-fusion amplification products or to evaluate the co-occurrence of targets across a population of single cells in a sample. In an application of digital PCR, the PCR reaction is performed in a multitude of more than 100 microcells or droplets, such that each droplet either amplifies (e.g., generation of an amplification product provides evidence of the presence of target-fusion amplification product in the microcell or droplet) or fails to amplify (evidence that the product was not present in a given microcell or droplet). By counting the number of positive microcells, it is possible to directly count the number of cells with target-fusion amplification product that are present in an input sample cell population. Digital PCR methods typically use an endpoint readout, rather than a conventional quantitative PCR signal that is measured after each cycle in the thermal cycling reaction (Pekin et al. (2011) Lab Chip 11(13):2156; Zhong et al. (2011) Lab Chip 11(13):2167; Tewhey et al. (2009) Nature Biotechnol 27:1025; Tewhey et al. (2010) Nature Biotechnol 28:178). Accordingly, any of the compositions and methods provided herein may be adapted for use in digital PCR methodology, for example, the ABI QuantStudio™ 12K Flex System (Life Technologies, Carlsbad, Calif.), the QuantaLife™ digital PCR system (BioRad, Hercules, Calif.) or the RainDance™ microdroplet digital PCR system (RainDance Technologies, Lexington, Mass.).
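The endpoint counting described above can be sketched in a few lines. This is a minimal illustration, assuming each microcell or droplet yields one numeric endpoint signal and that a single fixed threshold separates positive from negative partitions; the function names and the threshold are hypothetical and do not correspond to the software of any instrument named above.

```python
from typing import Sequence


def count_positive_partitions(endpoint_signals: Sequence[float],
                              threshold: float) -> int:
    """Count partitions whose endpoint signal reaches the threshold."""
    return sum(1 for signal in endpoint_signals if signal >= threshold)


def positive_fraction(endpoint_signals: Sequence[float],
                      threshold: float) -> float:
    """Fraction of partitions scored positive for the target fusion product."""
    if not endpoint_signals:
        raise ValueError("at least one partition is required")
    positives = count_positive_partitions(endpoint_signals, threshold)
    return positives / len(endpoint_signals)


if __name__ == "__main__":
    # Made-up endpoint readings for five droplets, for illustration only.
    signals = [0.10, 0.90, 0.80, 0.05, 0.20]
    print(count_positive_partitions(signals, threshold=0.5))
    print(positive_fraction(signals, threshold=0.5))
```

When each partition holds at most one cell, the positive count is a direct count of cells carrying the fusion product, as stated in the text; partitions that may hold more than one cell would need additional correction that is not modeled here.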
[000112] Each target is amplified using target specific primers. The amplification is conducted in a stepwise manner so that ultimately at least two target sites are linked in a single pre-designed amplification fusion product having target relevant sequence information for at least two targets joined in an amplification product. The target-concatenated amplification product (a target fusion) can be subjected to further analysis including sequencing using next generation sequencing or other suitable approaches. The target fusion enables characterization and evaluation of at least two targets for singular analysis, thus providing co-occurrence data on a single cell basis in one amplification product. Exemplary target amplification schemes for two or three targets are depicted in FIGURES 1A and 1B and FIGURES 2A and 2B, respectively.
[000113] FIGURES 1A and 1B depict an amplification scheme for two targets - a first target "set A" and a second target "set B". The exact amount and relative abundance of PCR primers that co-exist in the reaction droplets can be designed in such a way that they will facilitate the amplification of a first target [denoted "set A"] as well as a second target sequence [denoted "set B"]. Set A target primers are denoted P1 and P2. P2 is an elongated primer having a first 5' region for dimerization (the dimerization cassette or "dimer domain") and a second 3' region which is complementary to target A (the "target A domain"). The dimer domain or dimerization cassette may be directly linked to the target A domain or be separated from the target A domain by a linking domain (e.g. a linking domain of at least 5 nucleotides). The linking domain may be random nucleotides or specific nucleotides (e.g. it may but does not necessarily need to contain a barcode sequence). Set B target primers are denoted P3 and P4. P3 is an elongated primer having a first 5' region for dimerization (the "dimer domain") and a second 3' region which is complementary to target B (the "target B domain"). The dimer domain may be directly linked to the target B domain or be separated from the target B domain by a linking domain (e.g. a linking domain of at least 5 nucleotides). The linking domain may be random nucleotides or specific nucleotides (e.g. it may but does not necessarily need to contain a barcode sequence). In an important aspect, the dimer domain or tail of set A primers (i.e. the tail and dimer domain of P2) is complementary to the dimer domain or tail of set B primers (i.e. the tail and dimer domain of P3), so that P2 and P3 form desired primer dimers.
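The layout of the elongated primers P2 and P3 can be illustrated with a short sketch. It assumes that "complementary" dimer domains are modeled as exact reverse complements of one another and that each primer is written 5' to 3' as dimer domain, optional linking domain, then target domain. All sequences below are invented placeholders, not sequences from this disclosure, and the helper names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def tailed_primer(dimer_domain: str, target_domain: str, linker: str = "") -> str:
    """Assemble a primer 5'->3': dimer domain, optional linker, target domain."""
    return (dimer_domain + linker + target_domain).upper()


def tails_can_dimerize(tail_a: str, tail_b: str) -> bool:
    """True if the two tails are exact reverse complements (a simplification)."""
    return reverse_complement(tail_a) == tail_b.upper()


if __name__ == "__main__":
    # Placeholder sequences for illustration only.
    dimer_a = "GCATCGGACTTGCAGCCTAG"          # tail of P2
    dimer_b = reverse_complement(dimer_a)      # tail of P3
    p2 = tailed_primer(dimer_a, "ACGTTGCAACGTTGCA", linker="NNNNN")
    p3 = tailed_primer(dimer_b, "TTGACCGTAAGGCTAA")
    print(p2)
    print(p3)
    print(tails_can_dimerize(dimer_a, dimer_b))
```

Exact reverse complementarity over the full tail is the simplest case; partially overlapping tails, as discussed later in connection with overlap length, would need a different check.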
[000114] After generating set A and set B single target PCR products in the initial round(s) of amplification (as shown in FIGURE 1A), the set A and set B specific amplification products will fuse by virtue of primer dimer formation between P2 and P3 (and P2 complementary and P3 complementary - referred to as "P2c" and "P3c" in FIGURE 1B) ends on the set A and set B products (FIGURE 1B). P2/P3 (or P2c and P3c) primer dimer formation and amplification of the primer dimer pairs results in generation of a target fusion amplification product, thereby linking target A and target B in a multi-target PCR amplification product fusion that can subsequently be used for downstream analysis by, for instance, a NGS, digital PCR, or other approach.
[000115] An exemplary scheme for amplification of three targets is depicted in Figure 2A and 2B - a first target, denoted Target A or "set A", a second target, denoted Target B or "set B", and a third target, denoted Target C or "set C". Set A target primers are denoted P1 and P2. Again, P1 and P2 comprise Target A complementary sequence for specific amplification of Target A. P2 is an elongated primer having a first 5' region for dimerization (the "dimerization cassette" or dimer domain) and a second 3' region which is complementary to Target A (the "Target A domain"). Set B target primers are denoted P3 and P4. P3 and P4 comprise Target B complementary sequence for specific amplification of Target B. P3 is an elongated primer having a first 5' region for dimerization (the "dimerization cassette" or dimer domain) and a second 3' region which is complementary to Target B (the "Target B domain"). The P3 dimerization cassette is complementary to and forms a primer dimer with the P2 dimerization cassette. In this instance, P4 is also an elongated primer having a first 5' region for dimerization (the "dimerization cassette" or dimer domain) and a second 3' region which is complementary to Target B (the "Target B domain"). The P4 dimerization cassette is complementary to and forms a primer dimer with the P5 dimerization cassette. P5 and P6 comprise Target C complementary sequence for specific amplification of Target C. P5 is an elongated primer having a first 5' region for dimerization (the "dimerization cassette" or dimer domain) and a second 3' region which is complementary to Target C (the "Target C domain"). By virtue of intermediary primer dimer formation via P2 and P3 sequence and P4 and P5 sequence, concatenation of Target A with Target B with Target C occurs. Initial amplification (Figure 2A) proceeds to separately amplify Target A or set A, Target B or set B, and Target C or set C. After multiple rounds of amplification, the PCR products act as primers themselves and primer dimer formation via the dimerization cassettes occurs (Figure 2B). Following primer dimer formation, amplification proceeds to generate a target fusion amplification product wherein Target A, Target B, and Target C are linked and form a concatenated multi-target amplicon fusion product as indicated.
[000116] In addition to the specific primers being "tailed" with dimer domains or dimerization cassettes, in an aspect of the invention, the primers are dosed in such a way that the formation of pre-designed primer dimers (or PCR fragment "fusions") during the PCR reaction is promoted. Thus, the relative concentrations of primers are such that primers including a dimerization cassette, P2 and P3 in Figure 1A, and all of P2, P3, P4 and P5 in Figure 2A, are present in limiting amounts and primers lacking dimerization cassettes, P1 and P4 in Figure 1A and P1 and P6 in Figure 2A, are present in abundant amounts. In accordance with the invention, the primer dimerization cassette-containing primers serve to link the first target "set A" in a single cell (for example in the droplet) with specific target "set B" sequences (for example a second target mutation of interest) present in the same cell, and may also link additional targets (Target C, set C, etc.) to provide a target fusion or target concatenated amplification product, combining first and second target information, first and second and third target information, etc. into the same artificial concatenated DNA molecule. Also, in an aspect, the primer target complementary sequence and the target primers may be designed and the amplified Target regions selected (such as Target A, Target B, etc.) so as to avoid, and preferably eliminate, any functionally relevant and/or alternative primer dimer formation, thus ensuring preferential and specific primer dimer formation via the dimerization cassette sequences.
[000117] As described herein, including as demonstrated in FIGURES 1 and 2, the novel approach provided herein makes use of primer dimers (PD) to generate an artificially linked or concatenated amplification product which provides a target fusion (two or more target nucleic acids are fused in a multi-target amplicon fusion product). It is notable that in a typical amplification (e.g. PCR) design, primers are designed in such a way that primer dimer formation during the actual PCR reaction is avoided. Short regions of complementarity in PCR reactions can result in formation of untoward primer dimers. DNA polymerase amplifies the PD, leading to competition for PCR reagents (including the primers themselves), thereby often limiting the specific amplification of the DNA sequence targeted for PCR amplification. In quantitative PCR, PDs may interfere with accurate quantification. Therefore amplification primers are specifically designed to minimize the formation of primer dimers, and various methods and approaches that seek to mitigate the generation of PDs are known and available. However, in the present approach, primer dimers are desired, and a key aspect is to provide a means to specifically link two distinct targets in a target fusion amplification product.
[000118] In an aspect of the invention, the primer dimer may be a unique and unrelated sequence. In an aspect of the invention, the primer dimer may comprise a unique unrelated length of sequence and may further comprise overlap sequences from one or more targets at one or more ends, thereby providing seamless and directional connection of PCR products from the one or more targets. For example, in an aspect of the invention, the primer dimer containing oligonucleotide may comprise Target A complementary sequence at its 5' end that allows for fusion of Target A initial amplification product with Target B amplification product, for instance utilizing an overlap extension PCR approach. In one aspect, the target complementary sequence is at least 15 bp in length, preferably at least 20 bp, preferably at least 30 bp, or preferably 30-40 bp. Overlap extension PCR to splice DNA fragments, including multiple DNA fragments, using PCR has been described and implemented using various strategies (Horton, RN et al. (1989) Gene 77(1):61-68; Bruskin AV and Matsumura I (2010) BioTechniques 48:463-465; Spiliotis M (2012) PLosONE 7(4):e35407; Luo W-G et al. (2013) Biological Proc Online 15:9; You, C et al. (2012) Appl Environ Microbiology 78(5):1593-1595).
[000119] One skilled in the art can design and utilize primer dimer sequences suitable for use in the present invention, including based on the teaching and exemplification provided herein. For example, primer design programs and web-based tools are known and available to the skilled artisan including, but not limited to, BLAST, BLAT, Primer3, Primer Design (bioinformatics org) and Primer-Blast. The length of overlap and percent GC content will influence the annealing of the primer and the stability of the primer hybridization, including the temperature at which it breaks apart and denatures. The melting temperature (Tm) can readily be calculated and estimated, including using available and known programs and web-based tools including, but not limited to, Oligo Calculator (idtdna). The higher the % GC content, the more stable the primer dimer will be. A primer dimer overlap stretch of at least 5 base pairs should be utilized, with a suitable GC content. In addition, minimal homology to other genome sequences is an important aspect of primer dimer sequence design, and can be assessed by sequence searches, BLAST, etc. In certain aspects, an overlap of the primer dimer of about 10-20 or about 20-30 base pairs is recommended. A GC content of about 50% is also recommended.
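The overlap length and GC content guidance above lends itself to a quick screening calculation. The sketch below is an illustration only: it uses the simple Wallace rule, Tm = 2(A+T) + 4(G+C) in °C, which is a rough approximation intended for short oligonucleotides and is not the calculation performed by the tools named above. The acceptance window around 50% GC (`gc_low`, `gc_high`) and the function names are assumptions made for the example.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a DNA sequence."""
    s = seq.upper()
    if not s:
        raise ValueError("empty sequence")
    return (s.count("G") + s.count("C")) / len(s)


def wallace_tm(seq: str) -> int:
    """Rough melting temperature in deg C by the Wallace rule (short oligos)."""
    s = seq.upper()
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    return 2 * at + 4 * gc


def overlap_looks_reasonable(seq: str, min_len: int = 10, max_len: int = 30,
                             gc_low: float = 0.4, gc_high: float = 0.6) -> bool:
    """Check length (about 10-30 bp per the text) and GC near 50% (assumed window)."""
    return min_len <= len(seq) <= max_len and gc_low <= gc_fraction(seq) <= gc_high


if __name__ == "__main__":
    overlap = "GCATGCATGC"  # placeholder sequence, not from this disclosure
    print(gc_fraction(overlap), wallace_tm(overlap), overlap_looks_reasonable(overlap))
```

A candidate overlap that passes such a screen would still need the homology search against other genome sequences that the text calls for, and a nearest-neighbor Tm estimate from a dedicated tool for any final design.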
[000120] Thus, in an aspect of the invention, primer design serves to further enhance the formation and stability of primer dimers via the dimerization cassettes (for example P2/P3 and P2c/P3c as shown in FIGURE 1B). Examples include but are not limited to enhanced GC content or GC tails, locked nucleic acids, and/or use of modified nucleotides. In addition, the amplification reagents and/or droplet solutions can also contain Tm lowering chemicals, such as, for instance, DMSO, to facilitate annealing and primer dimer stability.
[000121] In the present methods, separate single cell PCR reactions take place in each single cell reaction compartment (for example in each droplet or a single cell micro well or chamber) thereby preventing the mixing of DNA originating from different cells (such as for example different T cells, B cells, tumor cells). The target fusion product in each reaction is thus a readout of the genotype of each of the two or more targets (such as target A and target B genotype), including any pre-selected mutation or variant sequence thereof, in a single cell. The target A status can be evaluated in specific combination with the target B status on a cell-by-cell basis. In another embodiment, the target A status can be evaluated in specific combination with the target B status and target C status on a cell-by-cell basis. This permits assessing co-occurrence, for example of two or more cancer gene targets, or of a TCR with a cancer gene target, or of a B cell Ig specificity with an immune marker, etc. as indicative and non-limiting examples.
[000122] In an aspect of the invention, the amplification is self-limiting so the amount of target fusion hybrid PCR product formed in each droplet or chamber is in some range in terms of quantity. The quantity may reflect the time or duration of the amplification step(s). The quantity may be set to be suitable or sufficient for direct sequencing or other analysis. The amplification may be designed to be quantitative in some instances, for example if RNA is being amplified and thereby the expression of a target protein is being evaluated in single cells.
[000123] In an aspect of the invention, the multi-target amplicon fusion products may be selectively characterized, for example by modifying target primers, such as P1 or P4 as depicted in Figure 1A, or P1 and P6 as depicted in Figure 2A, with a label, including a capture label, such as biotin which can be captured with (strept-)avidin. In one such aspect, a specific barcode or sequencing primer may be alternatively or additionally incorporated in a modified target primer, such as P1 or P4 or P1 and P6, such that only the pre-designed multi-target amplicon fusion products are sequenced. In one aspect, one target primer P1 or P4 is modified with a capture label and the other target primer P4 or P1 is modified with a sequencing primer for combined specific capture and sequencing.
[000124] In an alternative aspect of the method, the set A and set B specific primers, primers P1 and P4 as depicted in FIGURE 1A, or the set A and set C specific primers P1 and P6 depicted in Figure 2A, can be tailed or tagged as well, so that a final self-limiting amplification with an additional set of primers binding to these P1 and/or P4 and/or P6 tails can be additionally performed. In one such alternative approach, P1 and P4 primers, or P1 and P6 primers, are tailed with universal adapters to facilitate automated high throughput sequencing. One such example is the sequencing platform specific adapters of Illumina (San Diego, CA). Illumina methods and adapters as well as sequencing systems (HiSeq, MiSeq) are known, described, and commercially available (for example WO2006/063437, WO2008/041002, WO1998/53300; US Patents 6,355,431, 6,406,848, 6,831,994, 8,361,713, 8,486,625, each and all incorporated herein by reference). In accordance with this aspect, after amplification is concluded, the PCR products and target fusions may be used for standard deep sequencing, which will provide the specific sequence and/or mutation or mutational status of the pre-selected targets, including one or more targets, e.g. denoted target A and target B.
[000125] In an aspect of the invention, specific or custom designed tails or adaptors on one or more primers of the invention can serve a variety of purposes in downstream processing. Thus, specific modifications or additions, including additional nucleic acids, chemicals, small molecules, etc. are contemplated. In an aspect, such modifications or additions serve to facilitate or enable downstream analysis or applications, support specific instrumentation or aid in the optimization or normalization of yields of target fusion products. In one aspect, an adapter enables specific capture of the desired amplification product, for example utilizing biotin which can be captured using avidin/streptavidin. Capture may include the use of beads, including magnetic beads. Capture may enable quantitation of the amount of product in each single cell. In another aspect, one or more adaptors are employed for identification (such as a barcode) or for sequence characterization (such as sequencing tails). In another aspect, adding a specific combination of fluorescent dyes can enable rapid characterization or sorting, for example using flow-through counting or sorting including FACS mediated applications. Adaptors can be utilized to facilitate the complexity of hybrid target fusion molecules that will be generated during amplification, for example in generating a 2-target, 3-target, 4-target, etc. fusion molecule.
[000126] Labels employed may include radioactive elements, enzymes, chemicals which fluoresce when exposed to ultraviolet light, and others. A number of fluorescent materials are known and can be utilized as labels. These include, for example, fluorescein, rhodamine, auramine, Texas Red, AMCA blue and Lucifer Yellow. In an aspect, one or more primers can be labeled with a radioactive element or with an enzyme. The radioactive isotope may be selected from 3H, 14C, 32P, 35S, 36Cl, 51Cr, 57Co, 58Co, 59Fe, 90Y, 125I, 131I, and 186Re. Enzyme labels are useful, and can be detected by any of the presently utilized colorimetric, spectrophotometric, fluorospectrophotometric, amperometric or gasometric techniques. The enzyme is conjugated to the selected particle by reaction with bridging molecules such as carbodiimides, diisocyanates, glutaraldehyde and the like. Many enzymes which can be used in these procedures are known and can be utilized including peroxidase, β-glucuronidase, β-D-glucosidase, β-D-galactosidase, urease, glucose oxidase plus peroxidase and alkaline phosphatase. In an aspect, the enzyme or tag is added at the conclusion of amplification.
[000127] Examples of labels include, but are not limited to, chemical, biochemical, biological, colorimetric, enzymatic, fluorescent, luminescent labels, chemiluminescent labels, and electrochemiluminescent labels. The label may be a dye, a photocrosslinker, a cytotoxic compound, a drug, an affinity label, a photoaffinity label, a reactive compound, an antibody or antibody fragment, a biomaterial, a nanoparticle, a spin label, a fluorophore, a metal-containing moiety, a radioactive moiety, a novel functional group, a group that covalently or noncovalently interacts with other molecules, a photocaged moiety, an actinic radiation excitable moiety, a ligand, a photoisomerizable moiety, biotin, a biotin analogue, a moiety incorporating a heavy atom, a chemically cleavable group, a photocleavable group, a redox-active agent, an isotopically labeled moiety, a biophysical probe, a phosphorescent group, a chemiluminescent group, an electron dense group, a magnetic group, an intercalating group, a chromophore, an energy transfer agent, a biologically active agent, a detectable label, or a combination thereof. The label may be a chemical label. Examples of chemical labels can include, but are not limited to, biotin and radioisotopes (e.g., iodine, carbon, phosphate, hydrogen). In some embodiments, the methods, assays and kits disclosed herein comprise a biological label. Biological labels comprise metabolic labels, including, but not limited to, bioorthogonal azide-modified amino acids, sugars, and other compounds. Enzymatic labels can include, but are not limited to horseradish peroxidase (HRP), alkaline phosphatase (AP), glucose oxidase, and β-galactosidase. In some embodiments, the enzymatic label is luciferase. A fluorescent label may be an organic dye (e.g., FITC), biological fluorophore (e.g., green fluorescent protein), or quantum dot. A non-limiting list of fluorescent labels includes fluorescein isothiocyanate (FITC), DyLight Fluors, fluorescein, rhodamine (tetramethyl rhodamine isothiocyanate, TRITC), coumarin, Lucifer Yellow, and BODIPY. In some embodiments, the label is a fluorophore. Exemplary fluorophores include, but are not limited to, indocarbocyanine (C3), indodicarbocyanine (C5), Cy3, Cy3.5, Cy5, Cy5.5, Cy7, Texas Red, Pacific Blue, Oregon Green 488, Alexa Fluor®-355, Alexa Fluor 488, Alexa Fluor 532, Alexa Fluor 546, Alexa Fluor-555, Alexa Fluor 568, Alexa Fluor 594, Alexa Fluor 647, Alexa Fluor 660, Alexa Fluor 680, JOE, Lissamine, Rhodamine Green, BODIPY, fluorescein isothiocyanate (FITC), carboxy-fluorescein (FAM), phycoerythrin, rhodamine, dichlororhodamine (dRhodamine), carboxy tetramethylrhodamine (TAMRA), carboxy-X-rhodamine (ROX™), LIZ™, VIC™, NED™, PET™, SYBR, PicoGreen, RiboGreen. The fluorescent label may be a green fluorescent protein (GFP), red fluorescent protein (RFP), yellow fluorescent protein, phycobiliproteins (e.g., allophycocyanin, phycocyanin, phycoerythrin, and phycoerythrocyanin).
[000128] In one aspect, a smart ("labeled" or tagged) primer design can be used to ensure that only target fusion hybrid PCR products are being sequenced. For example primers P1 and/or P4 may be tagged or labeled with a selectable marker. In one approach, the tag or label may be biotin, which will permit specific selection or capture of the target fusion products with streptavidin or avidin. In an alternative approach the P1 and P4 primers may be tagged with a poly-dA sequence, which can be captured with poly-dT. In an aspect of the invention, primers and/or amplification products may be labeled or tagged with a barcode - a specific and unique sequence that serves as a marker or identifier.
[000129] In an aspect of the methods and assays of the invention, one or more multi-target amplification product(s) are directly captured to a solid support, for example by direct binding to a plate, filter, or bead. Direct capture may be mediated by a label or tag on the amplification product, such as for selection of the multi-target amplification product. The plate, filter, or bead may be coated or pre-coated with a binder or other reagent that recognizes or binds a label or tag on the amplification product. The bead may be a magnetic bead. In some embodiments, the methods or assays provided herein may comprise a solid support. A solid support comprises any solid platform to which a probe or binder can be attached. A solid support may comprise a bead, plate, an array or a bead attached to a plate. Examples of plates include, but are not limited to, MSD multi-array plates, MSD Multi-Spot® plates, microplate, ProteOn microplate, AlphaPlate, DELFIA plate, IsoPlate, and LumaPlate. Examples of beads include streptavidin beads, agarose beads, magnetic beads, Dynabeads®, MACS® microbeads, antibody conjugated beads, protein A conjugated beads, protein G conjugated beads, protein A or G conjugated beads, protein L conjugated beads, oligo-dT conjugated beads, silica beads, silica-like beads, anti-biotin microbead, anti-fluorochrome microbead, and BcMag™ CarboxyTerminated Magnetic Beads.
[000130] Next-generation sequencers have sufficient power to simultaneously analyze nucleic acids and DNAs from many different specimens, a practice known as multiplexing. Multiplexing schemes rely on the ability to associate each sequence read with the specimen from which it was derived. Molecular barcoding is an essential tool in optimally applying high throughput next generation sequencing platforms in studies involving more than one sample. Various strategies have been developed for barcode-mediated multiplexing, in which samples are uniquely tagged with short identifying sequences or barcodes, pooled, and then sequenced together. The resulting combined sequence data are subsequently sorted by barcode before bioinformatics analysis.
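The sorting of pooled reads by barcode described above can be sketched as follows. This is a minimal illustration, assuming the barcode occupies the first `barcode_len` bases of each read and that only exact matches are assigned; real pipelines typically tolerate sequencing errors and may read the barcode from a separate index read. The function name, the sample names and the sequences are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def demultiplex(reads: Iterable[str],
                barcode_to_sample: Dict[str, str],
                barcode_len: int) -> Dict[str, List[str]]:
    """Group reads by their leading barcode; unmatched reads go to 'unassigned'."""
    bins: Dict[str, List[str]] = defaultdict(list)
    for read in reads:
        barcode = read[:barcode_len]
        sample = barcode_to_sample.get(barcode)
        if sample is None:
            bins["unassigned"].append(read)             # keep the full read
        else:
            bins[sample].append(read[barcode_len:])     # strip the barcode
    return dict(bins)


if __name__ == "__main__":
    # Placeholder barcodes and reads for illustration only.
    barcodes = {"ACGTAC": "patient_1", "TGCATG": "patient_2"}
    pooled = ["ACGTACGGGTTT", "TGCATGAAACCC", "ACGTACTTTGGG", "NNNNNNCCCCCC"]
    for sample, seqs in demultiplex(pooled, barcodes, barcode_len=6).items():
        print(sample, seqs)
```

The binned reads can then be passed to downstream bioinformatics analysis one sample at a time, which is the step the paragraph refers to as sorting by barcode before analysis.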
[000131] Various barcoding strategies allow for the incorporation of short recognition sequences into samples and sequencing libraries, either by ligation or amplification, particularly amplification via PCR. Several approaches for incorporating a known, sample-specific nucleotide sequence, or barcode, in DNA fragments have been reported and implemented (Craig, D.W. et al. (2008) Nat Methods 5:887-893; Parameswaran, P. et al. (2007) Nucleic Acids Res 35:e130; Rigola, D. et al. (2009) Plos One 4:e4761; Vigneault, F. et al. (2008) Nat Methods 5:777-779; Buermans, H.P. et al. (2010) BMC Genomics 11:716). PCR amplification of a pool of DNA molecules with different nucleotide compositions, especially near priming sites, however, can result in quantitative bias because some DNA species are amplified more efficiently than others (Schutze, T. et al (2011) Anal Biochem 410:155-157; Meyer, S.U. et al (2010) Biotechnol Lett 32:1777-1788; Lopez-Barragan, M.J. et al (2011) Mol Biochem Parasitol 176:64-67; Linsen, S.E.V. et al (2009) Nat Methods 6:474-476).
[000132] Nonetheless, many currently published methods and kits introduce a barcode in the nucleic acid sample or library before or during PCR-based library amplification (Vigneault, F. et al (2008) Nat Methods 5:777-779; Buermans, H.P. et al. (2010) BMC Genomics 11:716). The Illumina TruSeq small RNA protocol introduces the barcode during the PCR step using differentially barcoded primers, while the TruSeq DNA (or messenger RNA converted to double stranded DNA) protocol introduces the barcode before the PCR step by ligation of differentially barcoded double stranded adaptors. Methods place the barcodes within the adapters, downstream or within the PCR primer binding site or introduce the barcode during PCR. Ligation of the barcode after PCR amplification has also been described (Nieuwerburgh, F.V. (2011) PLoSONE 6(10):e26969).
[000133] The current practice of appending molecular barcodes prior to pooling, however, is only practical for parallel analysis of up to several or many dozen samples. Barcode-mediated strategies permitting simultaneous analysis of tens of thousands of specimens have been reported or contemplated; however, they rely on the use of combinatorial pooling strategies in which pools rather than individual specimens are assigned barcodes. Thus, the identity of each specimen is encoded within the pooling pattern rather than by its association with a particular sequence tag (Erlich, Y et al. (2009) Genome Research 19:1243-1253). Such pooling precludes obtaining any single cell or individual cell relevant data or information.
[000134] In an aspect of the invention, certain embodiments contemplate designing oligonucleotide/primer sequences to contain short signature sequences (for example barcodes) that permit unambiguous identification of the amplification product into which they are incorporated without having to sequence the entire amplification product. Typically, barcodes are placed in primers at locations where they are not found naturally, with barcodes comprising nucleotide sequences that are distinct from any naturally occurring oligonucleotide sequences that may be found in the vicinity of the sequences adjacent to which the barcodes are situated. Barcodes may comprise a sequence of 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 25, 30, 35, 40, 45, 50 or more contiguous nucleotides (including all integer values in between). In an aspect of the invention, each barcode sequence may uniquely identify the target fusion product or may identify an aspect of the target (e.g., a mutation or variant of the target sequence). Examples of the design and implementation of oligonucleotide barcode sequence identification strategies will be known and available to one skilled in the art (de Carcer et al. (2011) Adv Env Microbiol 77:6310; Parameswaran et al. (2007) Nucl Acids Res 35(19):330; Roh et al. (2010) Trends Biotechnol 28:291).
[000135] In an aspect, a barcode may comprise a first barcode oligonucleotide, for example of length in range of 5-15 nucleotides, that uniquely identifies an oligonucleotide primer, and optionally in certain embodiments a second barcode oligonucleotide, for example of length in range of 5-15 nucleotides, that uniquely identifies mutated or variant target sequence, to provide barcodes of, without limitation, 5-30 nucleotides in length. Barcode oligonucleotides may comprise oligonucleotide sequences of any length, so long as a minimum barcode length is obtained that precludes occurrence of a given barcode sequence in two or more product polynucleotides having otherwise distinct sequences.
[000136] In an aspect, barcoding allows one to pool samples from different origins, for example samples from different patients or different sites within a patient, for combinatorial analysis.
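The minimum barcode length mentioned above has a simple combinatorial floor: with four bases, a barcode of length L can take at most 4^L distinct values, so distinguishing n items requires at least the smallest L with 4^L >= n. The sketch below computes only that floor. It is an illustrative assumption about how one might size barcodes, with a hypothetical function name, and it ignores the extra length that practical designs add so that barcodes differ at several positions and avoid naturally occurring neighboring sequence.

```python
def min_barcode_length(n_items: int) -> int:
    """Smallest L such that 4**L >= n_items (pure counting bound)."""
    if n_items < 1:
        raise ValueError("n_items must be at least 1")
    length = 1
    capacity = 4  # number of distinct barcodes of the current length
    while capacity < n_items:
        length += 1
        capacity *= 4
    return length


if __name__ == "__main__":
    for n in (4, 5, 96, 10_000):
        print(n, "items need at least", min_barcode_length(n), "nt")
```

For example, 96 pooled samples need at least 4 nucleotides and 10,000 need at least 7 by this count alone, which is consistent with the 5-15 nucleotide ranges given in the text leaving room for error tolerance.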
[000137] To facilitate subsequent data acquisition by NGS, and/or to facilitate higher complexity assays (for example wherein 3 or 4 or more target loci are evaluated), primers should be designed such that a suitably minimum size amplicon containing the target fusion region of interest results. In other words, the amplicon and target fusion should be of sufficient target length in each instance to provide relevant target loci and mutational data, but short enough to be efficiently and effectively amplified and in instances where analysis of product includes sequencing, short enough to be effectively sequenced. In an aspect of the invention the target fusion product is not greater than 3,000 bp. In an aspect, the target fusion product is less than 200 bp, less than 1,000 bp, preferably 200-1000 bp, 200-500 bp, less than 500 bp, less than 300 bp, less than 200 bp. In instances where sequencing of the product will not be conducted, longer amplification products can be generated and utilized.
[000138] In an optional and additional aspect, the amplification reactions to generate target fusion products can be performed in a continuous system (such as using water droplets in oil), for example a flow system, including a continuous flow system. Exemplary flow systems suitable for single cell analysis and/or amplification are known and available to one skilled in the art. Examples include Raindance Technologies' systems, which are commercially available and are described, for example, in US2014/323317, US2015/027892, and US patent 8,841,071, each and all incorporated herein by reference.
[000139] Sequencing may be performed using any of a variety of available high-throughput single molecule sequencing machines and systems. Illustrative sequencing systems include sequence-by-synthesis systems such as the Illumina Genome Analyzer and associated instruments (Illumina, Inc., San Diego, Calif.), Helicos Genetic Analysis System (Helicos Biosciences Corp., Cambridge, Mass.), Pacific Biosciences PacBio RS (Pacific Biosciences, Menlo Park, Calif.), or other systems having similar capabilities. In one aspect, sequencing can be achieved using a set of sequencing oligonucleotides that hybridize to a defined region within the amplified DNA molecules.
[000140] The invention may be better understood by reference to the following non-limiting Examples, which are provided as exemplary of the invention. The following examples are presented in order to more fully illustrate the preferred embodiments of the invention and should in no way be construed, however, as limiting the broad scope of the invention.
EXAMPLE 1
[000141] In an application of the methods of the invention, cells are evaluated on a cell-by-cell basis for T cell receptor repertoire as a first target, in combination with a clinically relevant target for T cell mediated cancer(s). T cell receptors may be assessed as a first target (set A) in combination with a second target (set B). While there are sequence based approaches known and available, including commercially available, for assessment of T cell antigen receptors, current approaches evaluate the TCR repertoire and provide information regarding adaptations or changes in T cells across a multicellular sample and in a quantitatively general fashion. Specific single cell relevant information is not derived or provided with presently available systems.
[000142] The deficiency of single cell-based T cell receptor target information available from current commercial systems is evident in systems such as massive parallel sequencing-based interrogation platforms, including that of Adaptive Biotechnologies (AB) (described for example in one or more of Robins, H et al. (2009) Blood 114:4099; Robins, H et al. (2010) Sci Translat Med. 2:47ra64; Robins, H et al. (2012) J Immunol Meth 375(1-2):14-19; Sherwood et al. (2011) Sci Translat Med 3(90):90ra61; US Pub. No. 2012/0058902, US Pub. No. 2010/0330571, US Pub. No. 2014/0322716, WO2010/151416, WO2011/106738, WO2012/027503, WO2013/059725, WO2013/188831, WO2014/145992, each incorporated herein by reference). Using the AB system, sequence information is not provided on a single cell basis but in an aggregated form for the entire population of cells interrogated. At best, data is provided for the T cell repertoire on a relative basis across a population of cells.
[000143] To evaluate the T cell compartment, AB's immunoSEQ assay uses PCR to amplify the CDR3 region of the T cell receptor, spanning the variable region formed by the junction of the V, D and J segments and their associated non-templated insertions. The resulting nucleotide sequence may be used as a unique identifier or tag for a particular clone across different samples to track clonal expansions and contractions over time in the same patient. However, in each instance only the relative amount of one or more clones can be determined. Specific information regarding other aspects of that clone, including other relevant T cell markers, is not available and cannot be derived.
[000144] Thus, in addition to failing to provide single cell analysis information, the AB approach, and other massive parallel sequencing approaches like it, does not provide any means for assessment of the co-existence of a second target (or any further additional third, fourth etc target(s)) within individual cells and/or for co-occurrence of mutations with a single amplification product. The availability of multi-target information, particularly TCR single cell characterization evaluating a second target (or more targets) for co-occurrence on a single cell basis, enables evaluation of cancer status, immune adaptation, clonal response, and mutational drift, in a single amplification product. The single cell-based multi-target amplicon fusion product of the present invention provides clinical and target relevant information for individual cells in a population of cells in a scalable, streamlined and accurate manner.
[000145] In an aspect of the present invention, T cell receptor (TCR)-specific primers, for example those implemented by Adaptive Biotechnologies for unbiased amplification of TCR signature sequences, may be altered to serve as one or more primer in accordance with the invention. In an aspect a primer dimerization cassette is further incorporated in primers incorporating T cell receptor (TCR)-specific primer sequence(s) so as to enable generation of specific, pre-designed target fusions wherein the first target, denoted Target A or set A, is a TCR and the second target, denoted Target B or set B, is linked via primer dimer-mediated concatenation, in a single multi-target amplicon fusion product for single cell analysis of at least 2 target loci simultaneously across a population of cells, including a mixture of cells.
[000146] Thus, in accordance with an aspect of the present invention, primers, including TCR primers or sequences thereof, are modified for multi-target evaluation via primer dimer mediated concatenation as provided herein. In the AB system, for example as described in US2014/322716, the AB system oligonucleotide amplification primer composition comprises a first oligonucleotide amplification primer set comprising forward oligonucleotide sequences of a general formula U1-B1-V1, and reverse oligonucleotide sequences of a general formula U2-B2-J1, wherein U1 and U2 comprise a first and optional second universal adaptor oligonucleotide for NGS sequencing, B1 and B2 are identical or unique barcode sequences, and wherein V1 is a primer for TCR variable region (V) sequence and J1 is a primer for TCR joining region (J) sequence. Alternatively or additionally J1 region sequence may be replaced with TCR constant region (C) sequence. Primer sequences for implementation in the present invention and to be combined with primer dimerization cassettes may include V1, J1 and/or C sequence for Target TCR. Exemplary V1 and J1 sequences may be selected from published or described sequences, including as set out in one or more of Robins, H et al. (2009) Blood 114:4099; Robins, H et al. (2010) Sci Translat Med. 2:47ra64; Robins, H et al. (2011) J Immunol Meth doi: 10.1016/j.jim.2011.09.001; Sherwood et al (2011) Sci Translat Med 3:90ra61; US Pub. No. 2012/0058902, US Pub. No. 2010/0330571, US Pub 2014/0322716, WO2010/151416, WO2011/106738, WO2012/027503, WO2013/059725, WO2013/188831, WO2014/145992, each incorporated herein by reference.
[000147] In one aspect of the present invention, a pre-screen to identify the clonally relevant TCRs in a tumor or sample population is conducted. It is recognized by those skilled in the art that during the acute phase of B- or T-cell disease, an overwhelming percentage of the cancer cells will often have the same B- or T-cell receptor rearrangement. Consequently, and solely based on the frequency at which the same rearrangement is detected in the NGS reads, the B- or T-cell receptor signature corresponding to the malignant clone can be identifiable. The rest of the signatures from the normal cells can therefore be avoided using clone-specific primers. In addition, after the initial identification of the B- or T-cell receptor present in the malignant clone, specific ("personalized") primer-sets for the specific amplification of the B- or T-cell receptor of the malignant clone in a given patient can be designed and utilized for the T- or B- cell receptor target. This enables specific evaluation of cancer cells in a sample known or possibly suspected to be mixed with normal cells.
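For illustration, the frequency-based identification of the malignant clone described in this paragraph can be sketched as follows. This is a minimal sketch under stated assumptions: each NGS read has already been reduced to a receptor rearrangement string, the function name dominant_clonotype is hypothetical, and the 0.5 cutoff is an arbitrary placeholder rather than a value taken from the disclosure.

```python
from collections import Counter


def dominant_clonotype(rearrangements, min_fraction=0.5):
    """Return (sequence, fraction) for the most frequent rearrangement.

    `rearrangements` is an iterable of receptor rearrangement strings, one
    per read. Returns None if no rearrangement reaches `min_fraction` of the
    reads (the threshold is an illustrative assumption).
    """
    counts = Counter(rearrangements)
    total = sum(counts.values())
    if total == 0:
        return None
    sequence, n = counts.most_common(1)[0]
    fraction = n / total
    return (sequence, fraction) if fraction >= min_fraction else None


if __name__ == "__main__":
    # Toy placeholder strings, not real receptor sequences.
    reads = ["AAA"] * 8 + ["CCC", "GGG"]
    print(dominant_clonotype(reads))
```

The rearrangement returned by such a pre-screen would then be the candidate for designing the clone-specific ("personalized") primer set referred to above.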
[000148] In accordance with an aspect of the present methods and compositions, one or more TCR specific primer sequence, such as one selected from V region, J region and/or C region sequence including as described above, is tailed, particularly at the 5' end, with a specifically designed primer-dimer sequence, or dimerization cassette, so as to provide a primer comprising TCR V, J, or C region target sequence and a dimerization cassette, corresponding to P2 in FIGURE 1A. The primer dimer sequence is complementary to and stably hybridizes with the complementary primer dimer sequence which is a tail on, e.g., P3 of the Target set B primer (refer to FIGURE 1A). When the TCR primers comprising dimerization cassettes are introduced in the single cell system of the present invention, a target fusion of TCR with a secondary locus (denoted Target B, set B) is generated and TCR along with Target B or set B are evaluated for mutational and clonal status in a single concatenated amplification product.
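The tailing of target-specific primers with complementary dimerization cassettes described in this paragraph can be illustrated in silico. The following is a minimal sketch, assuming DNA sequences written 5' to 3' and assuming that the cassette on one primer is the exact reverse complement of the cassette on its partner; the function names and the cassette and target-specific sequences are hypothetical placeholders and are not sequences from the disclosure.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def tail_primer(cassette: str, target_specific: str) -> str:
    """Place the dimerization cassette 5' of the target-specific sequence."""
    return cassette.upper() + target_specific.upper()


def cassettes_can_dimerize(cassette_a: str, cassette_b: str) -> bool:
    """True if the two cassettes are exact reverse complements."""
    return cassette_a.upper() == reverse_complement(cassette_b)


if __name__ == "__main__":
    cassette_p2 = "ACGTTGCAAG"                      # placeholder cassette
    cassette_p3 = reverse_complement(cassette_p2)   # partner cassette
    p2 = tail_primer(cassette_p2, "GGGGCCCC")       # placeholder Target A 3' portion
    p3 = tail_primer(cassette_p3, "TTTTAAAA")       # placeholder Target B 3' portion
    print(p2, p3, cassettes_can_dimerize(cassette_p2, cassette_p3))
```

An actual design would additionally need to confirm that the cassettes lack complementarity to the target sequences and have suitable melting behavior, which this sketch does not attempt.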
[000149] Thus, in accordance with the present method, a droplet may contain a single cell for simultaneous TCR analysis (as Target A) with Target B/set B evaluation. As described, the exact amount and relative abundance of PCR primers that co-exist in the fused reaction droplets can be designed in such a way that they will facilitate the amplification of the T-cell receptor target (denoted Target A or "set A") as well as the second target sequence (denoted Target B or "set B").
[000150] In addition, in an aspect of the invention the primers of the present invention are tailed with dimer domains or dimerization cassettes as described and are dosed in such a way that the formation of pre-designed primer dimers during the PCR reaction is promoted to result in linking or concatenating the targets in a multi-target fusion amplification product. The primer dimers serve to link or concatenate the T-cell target nucleic acid present in a single cell with specific target B sequences that correspond to the additional target or target mutation of interest and are present in the same cell.
[000151] After amplification is concluded, the PCR products, particularly the multi-target amplification products, may be used for sequencing, including standard deep sequencing or NGS, which will characterize and/or reveal the specific T-cell receptor Target A sequence genotype, together with the precise mutational status of one or more pre-selected genes/targets Target B, Target C etc. As provided above and in FIGURES 1A and 1B, a smart or labeled design can be used to ensure that only hybrid PCR products are being sequenced.
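As a complement to the design-based selection of hybrid products mentioned above, sequenced reads could also be checked computationally for the expected fusion layout. The following is a minimal sketch of such a downstream check, assuming that a fused read consists of Target A sequence, then the known dimerization junction, then Target B sequence; the function name split_hybrid_read and the junction sequence are hypothetical, and this check is an illustration rather than the labeled design of FIGURE 1.

```python
def split_hybrid_read(read: str, junction: str):
    """Split a read at the first occurrence of the dimerization junction.

    Returns (target_a_part, target_b_part) when the junction is present and
    is flanked by sequence on both sides; otherwise returns None, so that
    non-hybrid (single-target) reads can be discarded.
    """
    idx = read.find(junction)
    if idx <= 0:
        return None  # junction absent, or nothing upstream of it
    end = idx + len(junction)
    if end >= len(read):
        return None  # nothing downstream of the junction
    return read[:idx], read[end:]


if __name__ == "__main__":
    junction = "ACGTTGCAAG"  # placeholder junction sequence
    print(split_hybrid_read("GGGG" + junction + "TTTT", junction))
    print(split_hybrid_read("GGGGTTTT", junction))
```

The two returned parts could then be passed separately to Target A and Target B genotyping steps while remaining linked to the same single cell.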
[000152] Several gene alterations or mutations have been identified and described that are relevant to assessment and monitoring of T cell acute lymphoblastic leukemia (T-ALL). One or more of these gene targets are suitable for co-occurrence and/or combination target analysis, including in combination with T cell receptor assessment. Thus, it has been described that T-ALL may be associated with mutations in one or more of IL-7R/Janus activated kinase (JAK), CNOT3, RPL5, RPL10, PTPN2 and STAT5 genes (Schochat, C. et al (2011) J Exp Med 208(5):901-908; DeKeersmaecker, K. et al (2013) Nat Genet 45(2):186-190; Bandapalli, O.R. et al (2014) Hematologica 99(10):e188-192).
[000153] In embodiments for T cell cancer analysis, particularly T-ALL, T cell receptor as Target A may be combined with Target B wherein Target B is selected from IL-7R, JAK3, CNOT3, RPL5, RPL10, PTPN2, and STAT5. In particular the mutation STAT5 N642H can be specifically evaluated. Also, primer combinations are designed to provide target fusion amplification products of T cell receptor as Target A plus Target B and Target C in a single target fusion amplification product. Target B and C may be selected from IL-7R, JAK3, CNOT3, RPL5, RPL10, PTPN2, and STAT5.
EXAMPLE 2
[000154] In one application of the methods of the invention, cells are evaluated on a cell-by-cell basis for B cell Ig repertoire as a first target, denoted Target A or set A, in combination with a clinically relevant target for B cell mediated cancer(s). In the present application and example, B cell Ig is analyzed as a first target (set A) in combination with a second target (set B). All measurable mutations, genomic variants, including SNPs that are of clinical relevance in the context of the clinical course or therapeutic decision making for any B-cell malignancy are suitable as Target B targets (or in combination as Target B, Target C, Target D etc).
[000155] In accordance with the invention, recognized and known B cell Ig specific primers or sequences thereof may be utilized in target specific primers such as P1 and in the target complementary sequence aspect of P2 as shown in FIGURE 1A. In addition to Target A B cell Ig status, Target B may be selected from a clinically relevant target for cancers originating from or otherwise involving the B-cell lineage.
[000156] Bruton's tyrosine kinase (Btk), a member of the Tec family of non-receptor tyrosine kinases, plays an essential role in the B-cell signaling pathway linking cell surface B-cell receptor (BCR) stimulation to downstream intracellular responses. Btk is expressed in all hematopoietic cell types except T lymphocytes and natural killer cells, and participates in a number of TLR and cytokine receptor signaling pathways including lipopolysaccharide (LPS) induced TNF-α production in macrophages, suggesting a general role for Btk in immune regulation. Small molecule Btk inhibitors have been developed with anticipated therapeutic benefit in the treatment of lymphoma and autoimmune diseases. A potent irreversibly acting small molecule inhibitor of Btk, PCI-32765, has demonstrated promising clinical activity in patients with B-cell NHL. PCI-32765 inhibits BCR signaling downstream of Btk, selectively blocks B-cell activation, and is efficacious in animal models of arthritis, lupus, and B-cell lymphoma (Honigsberg LA et al. (2010) PNAS 107(20):13075-13080). Irreversible inhibitor PCI-32765 is one of a series of Btk inhibitors that bind covalently to cysteine residue Cys-481 in the active site leading to irreversible inhibition of Btk enzymatic activity (Pan Z et al (2007) Chem Med Chem 2:58-61).
[000157] Mutations in BTK such as C481S are known to render tumor cells resistant to Cys481 directed inhibitors such as PCI-32765 (Imbruvica or Ibrutinib). Thus, the status of BTK and particularly Cys-481 residue in single cells, including in patients with B-cell lymphoma, is relevant to assess a patient's or their disease's susceptibility to Cys-481 directed therapies.
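For illustration, the Cys-481 status of BTK recovered from a fusion product read can be called from the codon at position 481 using the standard genetic code. The following is a minimal sketch, assuming the caller has already located the in-frame codon for residue 481 within the Target B portion of a read; the function name call_btk_481 and its return labels are hypothetical.

```python
# Standard genetic code entries for cysteine and serine (DNA alphabet).
CYS_CODONS = {"TGT", "TGC"}
SER_CODONS = {"TCT", "TCC", "TCA", "TCG", "AGT", "AGC"}


def call_btk_481(codon: str) -> str:
    """Classify the codon at BTK residue 481.

    Returns "wild-type" for cysteine, "C481S" for serine, and "other" for
    any other codon. Locating the codon in the read is left to the caller
    (an assumption of this sketch).
    """
    codon = codon.upper().replace("U", "T")
    if len(codon) != 3 or set(codon) - set("ACGT"):
        raise ValueError("codon must be three unambiguous nucleotides")
    if codon in CYS_CODONS:
        return "wild-type"
    if codon in SER_CODONS:
        return "C481S"
    return "other"


if __name__ == "__main__":
    for c in ("TGC", "AGC", "TGG"):
        print(c, call_btk_481(c))
```

Because the call is made on a read that also carries the B cell Ig sequence of the same cell, each C481S call remains attributable to a specific clone.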
[000158] In another application of the methods of the invention, cells are evaluated on a cell-by-cell basis for B cell Ig repertoire as a first target, in combination with a clinically relevant target for autoimmune disease. Relevant autoimmune diseases include but are not limited to rheumatoid arthritis, systemic lupus erythematosus, multiple sclerosis, celiac sprue disease, pernicious anemia, vitiligo, scleroderma, psoriasis, and inflammatory bowel disease.
EXAMPLE 3
[000159] Prominent hematopoietic cancers include chronic myelogenous leukemia (CML) and acute lymphocytic leukemia (ALL). These are often associated with T or B cell receptor gene rearrangements. The efficacy of existing therapeutic agents is hindered, however, by development of drug resistance or existing reduced drug sensitivity by virtue of alternative mutations in genes or proteins that alter the patient's response to the agent(s). Monitoring of the existence or development of one or more such mutations, particularly at the single cell level in the case of minimal residual disease (MRD), could have a significant clinical impact in understanding and addressing disease. These mutations are suitable for incorporation in the present invention methods and compositions as Targets.
[000160] Imatinib (Gleevec or STI571) is a tyrosine kinase inhibitor (TKI) for treatment of cancers, approved for Philadelphia-chromosome positive chronic myelogenous leukemia (CML) and acute lymphocytic leukemia (ALL), myelodysplastic/myeloproliferative diseases associated with PDGFR gene rearrangements, and gastrointestinal stromal tumors. Imatinib is much less effective in CML patients harboring a D186V mutant of c-KIT. Further, resistance to Imatinib in CML occurs through selection for tumor cells having BCR-Abl kinase point mutations, many clustered in the ATP binding region and interfering with drug binding (Gorre, ME et al (2001) Science 293:876). Numerous BCR-ABL mutations have been identified (von Bubnoff, N et al (2002) Lancet 359:487; Branford S et al (2002) Blood 99:3472; Roche-Lestienne, C et al (2002) Blood 100:1014; Shah, NP et al (2002) Cancer Cell 2:117; Hochhaus, C et al (2002) Leukemia 16:2190; Al-Ali, HK et al (2004) Hematol. J. 5:55). Important resistance relevant BCR-ABL gene mutations include T315I, Y253H, and F255K (Ravandi F (2011) Clin Lymphoma Myeloma Leuk 11:198-203; Bhamidipati PK et al (2013) Ther Adv Hematol 4(2):103-117). The T315I mutation also confers resistance to alternative TKIs, including nilotinib (Tasigna, Novartis) and dasatinib (Sprycel, Bristol Myers Squibb) (Jabbour E et al (2006) Leukemia 20:1767-1773).
[000161] BCR-ABL independent mechanisms of resistance to imatinib include increased efflux of the drug by increased expression of P-glycoprotein efflux pumps (Bixby D and Talpaz M (2009) Hematology Am Soc Hematol Educ Program 461-476; Jabbour E et al (2011) Emerg Cancer Ther 2:239-258; Kotake M et al (2003) Cancer Letters 199:61-68), such as P-170 (Chu E and DeVita V (2010) Cancer Chemotherapy Drug Manual, Sudbury, MA: Jones and Bartlett, Publishers), decreased drug uptake due to decreased expression of the drug transporter human organic cation transporter 1 (hOCT1) (Thomas J et al (2004) Blood 104:3739-3745; Wang L et al (2008) Clin Pharmacol Ther 83:258-264), sequestration of drug due to increased serum protein α1 acid glycoprotein (Widmer N et al (2006) Br J Clin Pharmacol 62:97-112), and alternative signaling pathway activation through Ras/Raf/MEK kinase, STAT, Erk2, or SFK phosphorylation of BCR-ABL (Bixby and Talpaz M (2009) Hematology Am Soc Hematol Educ Program 461-476). Elevated transcript levels of cyclooxygenase 1, which metabolizes imatinib, have been associated with primary imatinib resistance (Zhang W et al (2009) J Clin Oncol 27:3642-3649).
[000162] The above targets and mutations are suitable for incorporation in the present invention methods and compositions as Targets.
EXAMPLE 4
[000163] When overexpressed or activated by mutations, tyrosine kinases including epidermal growth factor receptor (EGFR) contribute to the development of cancer and these mutated tyrosine kinase (TK) enzymes often provide a target or sensitivity for selective and specific cancer therapy. Somatic mutations in the tyrosine kinase domains of the EGFR gene are associated with sensitivity of lung cancers to certain tyrosine kinase inhibitors (TKIs) including gefitinib (compound ZD1839, Iressa) and erlotinib (compound OSI-774, Tarceva). In frame EGFR deletions in exon 19 (del L747-S752) and frequent point mutations in codon 858 (exon 21) (L858R) have been identified in non-small cell lung cancers and adenocarcinomas and associated with sensitivity to the TKIs gefitinib and erlotinib (Lynch TJ et al (2004) N Engl J Med 350:2129-2139; Paez JG et al (2004) Science 304:1497-1500; Pao W et al (2004) PNAS 101(36):13306-13311).
[000164] Acquired resistance to chemotherapy or targeted cancer therapy, mediated by secondary resistance or compensatory mutations is an ongoing challenge. Tumors that are sensitive to TKIs, including either gefitinib or erlotinib, eventually progress despite continued treatment with the TKIs. A secondary mutation at position 790 of EGFR (T790M) has been identified in tumor biopsy of relapsed and resistant patients (Kobayashi S et al (2005) N Engl J Med 352(8):786-792). This mutation is predicted to lead to steric hindrance of inhibitor binding in the ATP-kinase-binding pocket.
[000165] Thus, EGFR mutation evaluation and EGFR as Target, including one or more specific kinase domain variant, are an aspect of the present invention.
[000166] Inasmuch as, in certain instances of cancer, more than one mutation in a single gene may be relevant for evaluation, such as for co-occurrence in cells, the multi-target amplicon fusion product may include concatenated sequences from distinct regions or mutations for a single Target gene. In one aspect of the invention, at least two unique EGFR mutations, or at least two BCR/ABL mutations, may be evaluated simultaneously. Thus, one target specific primer set may be specific or directed to a first relevant EGFR mutation of interest, and a second target specific primer set may be directed to a second relevant EGFR mutation of interest. Similarly, one target specific primer set may be specific or directed to a first relevant BCR/ABL mutation of interest, and a second target specific primer set may be directed to a second relevant BCR/ABL mutation of interest.
EXAMPLE 5
Cancer Gene Co-Occurrence Analysis
[000167] The methods are applicable for assessing and evaluating cancer patients, including characterizing tumors, monitoring for minimal residual disease, and profiling clinical tumor samples for prediction and evaluation of therapeutic agents and drug response and sensitivity. The present methods and approach may be applied to replace or to supplement more complex and cumbersome oncogene and tumor suppressor analyses, including those such as OncoMap (MacConaill LE et al (2009) PLoS ONE 4(11):e7887) wherein multiple gene mutations need to be analyzed. In one aspect and in the instance where a secondary oncogene mutation results in reduced sensitivity or resistance to a cancer agent, or in clonal development of drug-resistant secondary disease, the present approach provides a means for rapid, accurate, sensitive and specific evaluation and monitoring at the single cell or clonal level.
[000168] Thus, target combinations for evaluation in accordance with the methods and compositions of the invention and cancer relevant targets may be selected from targets identified in cancer mutation monitoring studies (see e.g. Cui Q (2010) PLoS ONE 5(10):e13180 "A Network of Co-Occurring and Anti-Co-Occurring Mutations"). Recognized clinically relevant targets include but are not limited to EGFR, EGFR and Src, EGFR and Kras, Pten and EGFR, Kit, BRAF and Ras, MAPK, Bcr/Abl.
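For illustration, once mutation calls have been obtained per single cell from the fusion products, co-occurrence of targets can be tallied directly. The following is a minimal sketch, assuming each cell is represented by the set of targets found mutated in that cell; the function name pairwise_cooccurrence is hypothetical and the input shown is toy data, not experimental results.

```python
from collections import Counter
from itertools import combinations


def pairwise_cooccurrence(cells):
    """Count, for every pair of targets, the cells in which both are mutated.

    `cells` is an iterable in which each element is the collection of
    target names called mutant in one single cell.
    """
    counts = Counter()
    for mutated in cells:
        # Sort so each unordered pair is always recorded under one key.
        for pair in combinations(sorted(set(mutated)), 2):
            counts[pair] += 1
    return counts


if __name__ == "__main__":
    # Toy per-cell calls for illustration only.
    cells = [{"EGFR", "KRAS"}, {"EGFR"}, {"EGFR", "KRAS", "PTEN"}, set()]
    for pair, n in sorted(pairwise_cooccurrence(cells).items()):
        print(pair, n)
```

A bulk measurement of the same toy population would report only that EGFR is mutated in three of four cells and KRAS in two of four, without indicating whether the mutations reside in the same cells; the per-cell tally makes that linkage explicit.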
[000169] This invention may be embodied in other forms or carried out in other ways without departing from the spirit or essential characteristics thereof. The present disclosure is therefore to be considered as in all aspects illustrative and not restrictive, the scope of the invention being indicated by the appended Claims, and all changes which come within the meaning and range of equivalency are intended to be embraced therein.
[000170] Various references are cited throughout this Specification, each of which is incorporated herein by reference in its entirety.

Claims

WHAT IS CLAIMED IS:
1. A method for simultaneously evaluating at least two distinct target sequences from single cells across a population or mixture of cells, wherein single cell-based multi-target amplification fusion products are generated via primer dimer-mediated concatenation of at least two target sequences from each of said single cells, and the at least two target sequences are assessed on a single cell basis across the population or mixture of cells.
2. The method of claim 1 comprising (a) disaggregating a population or mixture of cells to single cells; (b) manipulating the single cells in order that nucleic acid therefrom is available and suitable for individual amplification; (c) amplifying at least two distinct nucleic acid target sequences using a combination of target specific primers and concatenating the nucleic acid target sequences via primer-dimer formation facilitated by unique primer dimer cassettes in target specific primers to amplify multi-target amplification fusion product(s); and (d) analyzing the multi-target amplification fusion product(s) from each single cell across the population or mixture of cells.
3. The method of claim 2 wherein the primer dimer cassettes comprise sequences of at least 5 nucleotides which lack complementarity to any of the target sequences.
4. The method of claim 2 wherein the primer dimer cassettes facilitate overlap extension PCR to concatenate initially amplified products from at least two target sequences.
5. The method of claim 1 wherein at least three distinct target sequences are simultaneously evaluated and amplification fusion product is generated concatenating three target sequences via intermediate primer dimer formation.
6. The method of claim 1 or 2 for assessing a population or complex mixture of cells for co-occurrence in single cells of variants or mutations in at least two distinct target sequences, whereby single cell-based multi-target amplification fusion products are generated via primer dimer-mediated concatenation of the at least two target sequences, and are evaluated for the variants or mutations in the at least two distinct target sequences across the population or mixture of cells.
7. The method of any of claims 1-6 wherein the population or mixture of cells comprises at least 1,000 cells.
8. The method of any of claims 1-6 wherein the population or mixture of cells comprises at least 10,000 cells.
9. The method of any of claims 1-6 wherein the population or mixture of cells comprises at least 100,000 cells.
10. The method of any of claims 1-6 wherein the population or mixture of cells comprises at least 1,000,000 cells.
11. The method of any of claims 1-6 wherein the at least two distinct target sequences are selected from a T cell receptor or immunoglobulin region and at least one T cell cancer target gene or oncogene.
12. The method of any of claims 1-6 wherein the at least two distinct target sequences are selected from a B cell receptor or immunoglobulin region and at least one B cell cancer target gene or oncogene.
13. The method of any of claims 1-6 wherein the at least two distinct target sequences are distinct cancer target genes or oncogenes.
14. The method of any of claims 1-6 wherein the population or mixture of cells are somatic cells.
15. The method of any of claims 1-6 wherein the population or mixture of cells are neoplastic cells.
16. The method of claim 1 wherein the population or mixture of cells are from a biopsy sample.
17. The method of claim 1 wherein the population or mixture of cells are cells derived from a cancer patient for evaluation or assessment of minimal residual disease.
18. The method of claim 1 wherein the population or mixture of cells are cells derived from a patient with an immunological condition or autoimmune disease.
19. The method of claim 1 wherein the population or mixture of cells are cells derived from a transplant patient.
20. The method of any of claims 1-19 wherein the amplification is conducted in individual droplets or chambers for each cell in a population or mixture of cells.
21. An oligonucleotide amplification primer composition for generating single cell-based multi-target amplification fusion products for evaluating at least two distinct targets in single cells comprising (i) at least two target primer sets, each complementary to and specific for a distinct combination of at least two targets of interest, denoted Target A and Target B; (ii) at least one primer having a 5' dimerization cassette that forms a primer dimer with the dimer cassette of the primer of (iii) and a 3' portion complementary to a first target, denoted Target A; and (iii) at least one primer having a 5' dimerization cassette that forms a primer dimer with the dimer cassette of the primer of (ii) and a 3' portion complementary to a second target, denoted Target B.
22. The oligonucleotide amplification primer composition of claim 21 for generating single cell-based multi-target amplification fusion products for evaluating at least three distinct targets in single cells comprising (i) at least three target primers, each complementary to a distinct target of interest, denoted Target A, Target B and Target C; (ii) at least one primer having a 5' dimerization cassette that forms a primer dimer with the dimer cassette of the primer of (iii) and a 3' portion complementary to a first target, denoted Target A; (iii) at least one primer having a 5' dimerization cassette that forms a primer dimer with the dimer cassette of the primer of (ii) and a 3' portion complementary to a second target, denoted Target B; (iv) at least one primer having a 5' dimerization cassette that forms a primer dimer with the dimer cassette of the primer of (v) and a 3' portion complementary to a third target, denoted Target C; and (v) at least one primer having a 5' dimerization cassette that forms a primer dimer with the dimer cassette of the primer of (iv) and a 3' portion complementary to a second target, denoted Target B at a region distinct from that of primer (iii).
23. The composition of claim 21 or 22 wherein the primer dimerization cassettes comprise sequences of at least 5 nucleotides which lack complementarity to any of the target sequences.
24. The composition of claim 21 or 22 wherein the primer dimerization cassettes facilitate overlap extension PCR to concatenate initially amplified products from at least two target sequences.
25. The composition of claim 21 or 22 wherein at least one of the primers of (ii), (iii), (iv) or (v) further comprises additional intervening nucleotides between the dimerization cassette and target complementary sequences which comprises a linking domain of at least 2 nucleotides.
26. The composition of claim 21 or 22 wherein at least one of the primers of (ii), (iii), (iv) or (v) further comprises additional intervening nucleotides between the dimerization cassette and target complementary sequences which comprises a linking domain comprising barcode sequence.
27. A method for linking at least two distinct target sequences from single cells in an amplification product for analysis across a population or mixture of cells comprising amplifying nucleic acids from single cells in a population or mixture of cells with the oligonucleotide amplification primer composition of claim 21 or 22, wherein at least two distinct target sequences are concatenated in a multi-target amplification fusion product.
28. A method for generating single cell-based multi-target amplification fusion products for evaluating at least two distinct targets in single cells across a population or mixture of cells comprising amplifying at least two distinct nucleic acid target sequences using a combination of target specific primers and concatenating the nucleic acid target sequences via primer-dimer formation facilitated by unique primer dimer cassettes in target specific primers.
29. The method of claim 28 wherein the amplifying is conducted with the oligonucleotide amplification primer composition of claim 21 or 22.
30. The method of claim 29 wherein a multi-target amplification fusion product is generated wherein Target A and Target B, or Target A, Target B and Target C sequences are fused in a single amplification product.
PCT/US2016/025124 2015-04-01 2016-03-31 Massive parallel primer dimer-mediated multiplexed single cell-based amplification for concurrent evaluation of multiple target sequences in complex cell mixtures WO2016161054A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201562141433P 2015-04-01 2015-04-01
US62/141,433 2015-04-01

Publications (1)

Publication Number Publication Date
WO2016161054A1 true WO2016161054A1 (en) 2016-10-06

Family

ID=57004619

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2016/025124 WO2016161054A1 (en) 2015-04-01 2016-03-31 Massive parallel primer dimer-mediated multiplexed single cell-based amplification for concurrent evaluation of multiple target sequences in complex cell mixtures

Country Status (1)

Country Link
WO (1) WO2016161054A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110612355A (en) * 2017-06-20 2019-12-24 深圳华大智造科技有限公司 Composition for quantitative PCR amplification and application thereof
WO2022182649A1 (en) * 2021-02-23 2022-09-01 The Broad Institute, Inc. High-throughput assessment of exogenous polynucleotide- or polypeptide-mediated transcriptome perturbations

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2011106738A2 (en) * 2010-02-25 2011-09-01 Fred Hutchinson Cancer Research Center Use of tcr clonotypes as biomarkers for disease
US20140322716A1 (en) * 2012-06-15 2014-10-30 Adaptive Biotechnologies Corporation Uniquely Tagged Rearranged Adaptive Immune Receptor Genes in a Complex Gene Set

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
QUIN CHOU ET AL.: "Prevention of pre-PCR mis-priming and primer dimerization improves low-copy-number amplifications", NUCLEIC ACIDS RESEARCH, vol. 20, no. 7, 1992, pages 1717 - 1723, XP002190821 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110612355A (en) * 2017-06-20 2019-12-24 深圳华大智造科技有限公司 Composition for quantitative PCR amplification and application thereof
CN110612355B (en) * 2017-06-20 2024-01-12 深圳华大智造科技股份有限公司 Composition for quantitative PCR amplification and application thereof
WO2022182649A1 (en) * 2021-02-23 2022-09-01 The Broad Institute, Inc. High-throughput assessment of exogenous polynucleotide- or polypeptide-mediated transcriptome perturbations

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16774155

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16774155

Country of ref document: EP

Kind code of ref document: A1