WO2019204661A1 - Methods for assessing specificity of cell engineering tools - Google Patents

Methods for assessing specificity of cell engineering tools

Info

Publication number
WO2019204661A1
WO2019204661A1 PCT/US2019/028200 US2019028200W WO2019204661A1 WO 2019204661 A1 WO2019204661 A1 WO 2019204661A1 US 2019028200 W US2019028200 W US 2019028200W WO 2019204661 A1 WO2019204661 A1 WO 2019204661A1
Authority
WO
WIPO (PCT)
Prior art keywords
cell
gene
nucleic acid
probe
protein
Prior art date
Application number
PCT/US2019/028200
Other languages
French (fr)
Inventor
Fyodor Urnov
John A. Stamatoyannopoulos
Vivek Nandakumar
Pavel Zrazhevskiy
Shreeram Akilesh
Original Assignee
Altius Institute For Biomedical Sciences
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Altius Institute For Biomedical Sciences
Priority to EP19788289.7A (EP3781704A4)
Priority to CA3098427A (CA3098427A1)
Priority to US17/047,456 (US20210147922A1)
Publication of WO2019204661A1

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/10: Processes for the isolation, preparation or purification of DNA or RNA
    • C12N15/102: Mutagenizing nucleic acids
    • C12N15/1024: In vivo mutagenesis using high mutation rate "mutator" host strains by inserting genetic material, e.g. encoding an error prone polymerase, disrupting a gene for mismatch repair
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/01: Preparation of mutants without inserting foreign genetic material therein; Screening processes therefor
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00: Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14: Hydrolases (3)
    • C12N9/16: Hydrolases (3) acting on ester bonds (3.1)
    • C12N9/22: Ribonucleases RNAses, DNAses
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6813: Hybridisation assays
    • C12Q1/6841: In situ hybridisation
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2310/00: Structure or type of the nucleic acid
    • C12N2310/10: Type of nucleic acid
    • C12N2310/20: Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPRs]
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2800/00: Nucleic acids vectors
    • C12N2800/80: Vectors containing sites for inducing double-stranded breaks, e.g. meganuclease restriction sites

Definitions

  • Methods to assess the specificity of cell engineering tools disclosed herein measure the differential response of a cell to a cellular perturbation by a cell engineering tool by quantifying the change in the load of protein relevant to such a response, relative to the background load of the same protein in untreated reference cells, and, in some cases, normalized by the predicted magnitude of response to perturbation by a target-specific cell engineering tool. The degree of deviation of the change in protein load beyond that expected for a target-specific cell engineering tool is used as an indicator of additional off-target activity by the cell engineering tool, which might be undesirable.
  • the cell engineering tool might be optimized to achieve an increased target-specific response using the analytical workflow disclosed herein.
  • the present disclosure provides a method of quantifying a protein load, the method comprising quantifying a protein that accumulates in a primary cell in response to a cellular perturbation on a per allele per cell basis.
  • the present disclosure provides a method of quantifying a protein load, the method comprising quantifying a protein that accumulates in a plurality of cells in response to a cellular perturbation in less than 24 hours on a per allele per cell basis.
  • the present disclosure provides a method of screening a plurality of cell engineering tools for specificity, the method comprising quantifying a protein load in an intact cell in less than 24 hours and determining the specificity of the cell engineering tool for a target genomic locus based on the protein load.
  • the present disclosure provides a method of producing a potent and specific cell engineering tool, the method comprising: a) administering a cell engineering tool to a cell; b) determining specificity, activity, or a combination thereof of the cell engineering tool for a target genomic locus by quantifying a protein load; c) quantifying potency of the cell engineering tool by measuring gene editing efficiency, activation of gene expression, or repression of gene expression; and d) adjusting a parameter of the cell engineering tool to increase specificity for the target genomic locus.
  • the protein accumulates in response to a cellular perturbation.
  • the method further comprises quantifying the protein load on a per allele per cell basis.
  • the intact cell comprises an intact primary cell.
  • the cell comprises an intact primary cell.
  • the cellular perturbation comprises administering a cell engineering tool.
  • the method further comprises determining specificity of the cell engineering tool for a target genomic locus. In some aspects, the method further comprises quantifying gene editing efficiency, activation of gene expression, or repression of gene expression. In some aspects, the plurality of cells comprises at least 5 cells, at least 10 cells, at least 20 cells, at least 50 cells, at least 100 cells, at least 200 cells, at least 500 cells, or at least 1000 cells.
  • the protein indicates a cellular response.
  • the cellular response comprises a double strand break, activation of transcription, repression of transcription, or chromosome translocation.
  • the cell or intact cell comprises an immortalized cell.
  • the cell engineering tool comprises a genome editing complex or a gene regulator.
  • the gene regulator comprises a gene activator or a gene repressor.
  • the protein comprises phosphorylated 53BP1 (p53BP1), γH2AX, 53BP1,
  • the method further comprises staining the cell for the protein.
  • the staining the cell for the protein comprises labeling with a primary antibody against the protein and a secondary antibody conjugated to a first fluorophore.
  • the staining the cell for the protein comprises direct labeling with a primary antibody conjugated to a first fluorophore.
  • the method further comprises imaging the cell for one or more protein foci comprising the first fluorophore.
  • the method further comprises image analysis of the cell for the one or more protein foci comprising the first fluorophore.
  • the method further comprises quantifying the protein load from the one or more protein foci comprising the first fluorophore.
  • the protein load comprises a number of protein foci, total protein content within the nucleus, spatial localization pattern, or any combination thereof
  • the cell engineering tool further comprises a polypeptide tag.
  • the polypeptide tag is a FLAG tag.
  • the method further comprises staining the cell for the cell engineering tool.
  • the staining the cell for the cell engineering tool comprises staining with a primary antibody against the polypeptide tag and a secondary antibody conjugated to a second fluorophore.
  • the staining the cell for the cell engineering tool comprises direct labeling with a primary antibody conjugated to a second fluorophore.
  • the staining of the cell for the cell engineering tool comprises staining with a primary antibody against the nuclease and a secondary antibody conjugated to a second fluorophore.
  • the staining the cell for the cell engineering tool comprises direct labeling with a primary antibody conjugated to a second fluorophore.
  • the method further comprises imaging the cell for one or more cell engineering tool foci comprising the second fluorophore. In some aspects, the method further comprises image analysis of the cell for the one or more cell engineering tool foci comprising the second fluorophore. In some aspects, the method further comprises quantifying cell engineering tool load from the one or more cell engineering tool foci comprising the second fluorophore. In some aspects, the cell engineering tool load comprises a number of cell engineering tool foci, total content of the cell engineering tool within the nucleus, spatial localization pattern, or any combination thereof
  • the method further comprises hybridizing a probe set comprising a plurality of probes to the cell, wherein the probe set targets and binds to a target genomic locus.
  • each probe of the plurality of probes comprises a third fluorophore.
  • the probe set comprises an oligonucleotide probe set.
  • the method further comprises imaging the cell for one or more Nano-FISH foci comprising the third fluorophore.
  • the method further comprises image analysis of the cell for the one or more Nano-FISH foci comprising the third fluorophore.
  • co-localization of signal from the first fluorophore and the third fluorophore indicates that the cellular perturbation occurs at the target genomic locus.
  • the method further comprises hybridizing a second probe set comprising a second plurality of probes to the cell, wherein the second probe set targets and binds to an off-target genomic locus.
  • each probe of the second plurality of probes comprises a fourth fluorophore.
  • the second probe set comprises a second oligonucleotide probe set.
  • the method further comprises imaging the cell for one or more Nano-FISH foci comprising the fourth fluorophore.
  • the method further comprises image analysis of the cell for the one or more Nano-FISH foci comprising the fourth fluorophore.
  • co-localization of signal from the first fluorophore, the third fluorophore, and the fourth fluorophore indicates a chromosome translocation.
  • imaging the cell comprises acquiring images of the cell by a microscopy mode selected from the group consisting of: epifluorescence, widefield, confocal, selective plane illumination, tomography, holography, super-resolution, and synthetic aperture optics (SAO).
  • the method further comprises processing the acquired images to identify regions of interest (ROIs) comprising cell nuclei, protein marker foci, sites of cell engineering tool localization, or a combination thereof.
  • the method further comprises processing the ROIs to extract a plurality of features selected from the group consisting of: count, spatial location, size (area/volume), shape (circularity/sphericity, eccentricity, irregularity (concavity/convexity)), diameter, perimeter/surface area, quantitative measures of image texture that are pixel-based or region-based over a tunable length scale, nuclear diameter, nuclear area, nuclear volume, perimeter, surface area, DNA content, DNA texture measures, number of protein marker foci, size of protein marker foci, shape of protein marker foci, amount of protein marker per cell, spatial location and localization pattern of protein marker foci, number of nuclease per cell, amount of nuclease per cell, nuclease localization or texture, number of cell engineering tool foci, size of cell engineering tool foci, shape of cell engineering tool foci, amount of cell engineering tool foci per cell, spatial location and localization pattern of cell engineering tool foci, number of Nano-FISH foci, size of Nano-FISH
  • the method further comprises processing the extracted plurality of features to measure a degree of co-localization between the one or more Nano-FISH foci and the one or more protein marker foci, thereby determining specificity of the genome editing complex or the gene regulator.
  • the method further comprises applying a machine learning predictor to the extracted plurality of features to evaluate performance of cell engineering tools by predicting a distinction capability of nucleases.
  • the genome editing complex comprises a DNA binding domain and a nuclease.
  • the genome editing complex further comprises a linker.
  • the gene activator comprises a DNA binding domain and an activation domain.
  • the gene activator further comprises a linker.
  • the gene repressor comprises a DNA binding domain and a repressor domain.
  • the gene repressor further comprises a linker.
  • the DNA binding domain comprises a transcription activator-like effector (TALE) protein, a zinc finger protein (ZFP), or a single guide RNA (sgRNA).
  • the genome editing complex is a TALEN, a ZFN, a CRISPR/Cas9, a megaTAL, or a meganuclease.
  • the nuclease comprises FokI.
  • FokI has at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, or at least 99% sequence identity to SEQ ID NO: 1062.
  • the linker comprises the naturally occurring C-terminus of a TALE protein or any truncation thereof. In some aspects, the linker comprises 0-15 residues of glycine, methionine, aspartic acid, alanine, lysine, serine, leucine, threonine, tryptophan, or any combination thereof.
  • the activation domain comprises VP16, VP64, p65, p300 catalytic domain, TET1 catalytic domain, TDG, Ldb1 self-associated domain, SAM activator (VP64, p65, HSF1), or VPR (VP64, p65, Rta).
  • the repressor domain comprises KRAB, Sin3a, LSD1, SUV39H1, G9A (EHMT2), DNMT1, DNMT3A-DNMT3L, DNMT3B, KOX, TGF-beta-inducible early gene (TIEG), v-erbA, SID, MBD2, MBD3, Rb, or MeCP2.
  • a parameter of the genome editing complex or the gene regulator is adjusted to improve specificity.
  • the parameter is a sequence of the DNA binding domain or length of the DNA binding domain.
  • the protein load is quantified in at least 50 to 100,000 cells. In some aspects, the protein load is quantified in no more than 1000, no more than 500, no more than 100, or no more than 50 cells.
  • the cell comprises a hematopoietic stem cell (HSC), a T cell, or a chimeric antigen receptor T cell (CAR T cell). In other aspects, the cell is from a normal solid tissue or a tumorigenic solid tissue.
  • the target genomic locus is within a PDCD1 gene, a CTLA4 gene, a LAG3 gene, a TET2 gene, a BTLA gene, a HAVCR2 gene, a CCR5 gene, a CXCR4 gene, a TRA gene, a TRB gene, a B2M gene, an albumin gene, a HBB gene, a HBA1 gene, a TTR gene, a NR3C1 gene, a CD52 gene, an erythroid specific enhancer of the BCL11A gene, a CBLB gene, a TGFBR1 gene, a SERPINA1 gene, a HBV genomic DNA in infected cells, a CEP290 gene, a DMD gene, a CFTR gene, an IL2RG gene, or a combination thereof.
  • a chimeric antigen receptor (CAR) alpha-L iduronidase (IDUA), iduronate-2-sulf
  • a live cell comprises contacting a live cell with a cell engineering tool comprising a DNA binding domain and a nuclease domain, a gene repressor, or a gene activator, wherein the live cell comprises genomic DNA comprising a target genomic locus for the DNA binding domain of the cell engineering tool; fixing the cell and contacting the fixed cell with a plurality of nucleic acid probes complementary to the target genomic locus and assaying for presence of a protein indicative of cellular response to the contacting; and assaying for colocalization of the probes and the protein, wherein detection of the colocalization indicates activity of the cell engineering tool at the target genomic locus and absence of the colocalization indicates activity of the cell engineering tool at an off-target site.
  • assaying for colocalization comprises imaging the cell at 40X or higher magnification.
  • the fixing of the cell is performed within 24 hours or less of the contacting.
  • the cell engineering tool may include a DNA binding domain and a nuclease domain.
  • the nuclease domain induces a double strand break in the genomic DNA and where the protein indicative of cellular response to the contacting comprises a DNA repair protein.
  • the DNA repair protein may be p53BP1, γH2AX, MRE-11, BRCA1, RAD-51, phospho-ATM, or MDC1.
  • the cell engineering tool may include a DNA binding domain and a gene repressor.
  • the gene repressor may be KRAB, Sin3a, LSD1, SUV39H1, G9A (EHMT2), DNMT1, DNMT3A-DNMT3L, DNMT3B, KOX, TGF-beta-inducible early gene (TIEG), v-erbA, SID, MBD2, MBD3, Rb, or MeCP2.
  • the cell engineering tool may include a DNA binding domain and a gene activator.
  • the gene activator may be VP16, VP64, p65, p300 catalytic domain, TET1 catalytic domain, TDG, Ldb1 self-associated domain, SAM activator (VP64, p65, HSF1), or VPR (VP64, p65, Rta).
  • the DNA binding domain may be a transcription activator-like effector (TALE) protein, a zinc finger protein (ZFP), or a single guide RNA (sgRNA).
  • the cell may be any cell of interest, including the cells as provided herein, e.g., primary cells.
  • the cell may be a hematopoietic stem cell (HSC), a T cell, or a chimeric antigen receptor T cell (CAR T cell).
  • the cell may be from a normal solid tissue or a tumorigenic solid tissue.
  • the cell may be an immortalized cell.
  • the target genomic locus may be within a PDCD1 gene, a CTLA4 gene, a LAG3 gene, a TET2 gene, a BTLA gene, a HAVCR2 gene, a CCR5 gene, a CXCR4 gene, a TRA gene, a TRB gene, a B2M gene, an albumin gene, a HBB gene, a HBA1 gene, a TTR gene, a NR3C1 gene, a CD52 gene, an erythroid specific enhancer of the BCL11A gene, a CBLB gene, a TGFBR1 gene, a SERPINA1 gene, a HBV genomic DNA in infected cells, a CEP290 gene, a DMD gene, a CFTR gene, or an IL2RG gene, e.g., in the open reading frame, intron, promoter, regulatory elements, and the like of the gene.
  • the assaying for the colocalization comprises imaging the cell by a microscopy mode selected from epifluorescence, widefield, confocal, selective plane illumination, tomography, holography, super-resolution, and synthetic aperture optics (SAO).
  • the plurality of nucleic acid probes may be 30-60 bases in length and may include 20-200 probes having distinct sequences.
  • the plurality of nucleic acid probes may bind to a 1 kilobase (kb) to 5 kb region comprising the target genomic locus.
  • when the absence of colocalization is detected, the method further comprises adjusting a parameter of the genome editing tool to improve specificity.
  • the parameter may be a sequence of the DNA binding domain or length of the DNA binding domain.
  • the parameter may be an amount of the genome editing tool introduced into the cell.
  • the method may include contacting a live cell with a cell engineering tool comprising a DNA binding domain and a nuclease domain, a gene repressor, or a gene activator, wherein the live cell comprises genomic DNA comprising a target genomic locus for the DNA binding domain of the cell engineering tool; fixing the cell and assaying for presence of a measurable change in nuclear protein load of a protein indicative of cellular response to the contacting, wherein the measurement reflects the total activity of the cell engineering tool.
  • the method may further include contacting the fixed cell with a plurality of nucleic acid probes complementary to the target genomic locus; and assaying for colocalization of the probes and the protein indicative of cellular response, wherein detection of the colocalization indicates activity of the cell engineering tool at the target genomic locus and absence of the colocalization indicates activity of the cell engineering tool at an off-target site.
  • Assaying for the change in nuclear protein load comprises imaging the cell by a microscopy mode selected from the group consisting of: epifluorescence, widefield, confocal, selective plane illumination, tomography, holography, super-resolution, and synthetic aperture optics (SAO) and comparing to nuclear protein load in a reference cell not contacted with the cell engineering tool.
  • when a measured change in protein load above an application-specific baseline level is detected, the method further comprises adjusting a parameter of the genome editing tool to improve specificity.
  • FIG. 1 shows a brief summary of the assay workflow including the steps of nuclease transfection in cells, immunolabeling, imaging, processing raw images by deconvolution, optional enhancement, deconvolution or reconstruction and segmentation, feature
  • FIG. 2 shows further details on image analysis including the steps of obtaining a microscopy image, deconvolution, delineation/segmentation of nuclei, p53BP1 foci, and nuclease protein, morphological data estimation, and informatics/analysis as described in FIG. 1.
  • FIGS. 3A and 3B illustrate dose response assessments of GA7 TALENs (XXX) in primary CD34+ hematopoietic stem cells.
  • FIG. 3A shows the number of p53BP1 foci per cell for CD34+ primary cells treated with a blank transfection control, 0.5 µg GA7 per TALEN monomer, 1 µg GA7 per TALEN monomer, 2 µg GA7 per TALEN monomer, and 4 µg GA7 per TALEN monomer.
  • FIG. 3B shows the total p53BP1 content (fluorescence intensity) per nucleus normalized by the nuclear size versus total FLAG tag content per nucleus normalized by the nuclear size indicative of a nuclease for CD34+ primary cells treated with a blank transfection control, 0.5 µg GA7 per TALEN monomer, 1 µg GA7 per TALEN monomer, 2 µg GA7 per TALEN monomer, and 4 µg GA7 per TALEN monomer.
  • FIGS. 4A and 4B illustrate dose response assessments of GA6 TALENs in immortalized K562 cells.
  • FIG. 4A shows the number of p53BP1 foci per cell for immortalized K562 cells treated with a blank transfection control, 0.5 µg GA6 per TALEN monomer, 1 µg GA6 per TALEN monomer, 2 µg GA6 per TALEN monomer, and 4 µg GA6 per TALEN monomer.
  • FIG. 4B shows the total p53BP1 content (fluorescence intensity) per nucleus normalized by the nuclear size versus total FLAG tag content per nucleus normalized by the nuclear size indicative of a nuclease for immortalized K562 cells treated with a blank transfection control, 0.5 µg GA6 per TALEN monomer, 1 µg GA6 per TALEN monomer, 2 µg GA6 per TALEN monomer, and 4 µg GA6 per TALEN monomer.
  • FIGS. 5A and 5B illustrate dose response assessments of AAVS1 TALENs in immortalized K562 cells.
  • FIG. 5A shows the number of p53BP1 foci per cell for immortalized K562 cells treated with a blank transfection control, 0.5 µg AAVS1 per TALEN monomer, 1 µg AAVS1 per TALEN monomer, 2 µg AAVS1 per TALEN monomer, and 4 µg AAVS1 per TALEN monomer.
  • FIG. 5B shows the total p53BP1 content (fluorescence intensity) per nucleus normalized by the nuclear size versus total FLAG tag content per nucleus normalized by the nuclear size indicative of a nuclease for immortalized K562 cells treated with a blank transfection control, 0.5 µg AAVS1 per TALEN monomer, 1 µg GA6, 2 µg AAVS1 per TALEN monomer, and 4 µg AAVS1 per TALEN monomer.
  • FIG. 6 shows a graph of the number of p53BP1 foci per K562 cell at 6 hours, 12 hours, 24 hours, 48 hours, and 72 hours post transfection of AAVS1 as compared to a control at each time point.
  • FIGS. 7A-7E show the results of control transfection and AAVS1-targeting TALEN transfection in various cell types.
  • FIG. 7A shows the number of p53BP1 foci in adherent immortalized A549 cells transfected with a control and with an AAVS1-targeting TALEN 24 hours post-transfection.
  • FIG. 7B shows the number of p53BP1 foci in suspension immortalized K562 cells transfected with a control and with an AAVS1-targeting TALEN 24 hours post-transfection.
  • FIG. 7C shows the number of p53BP1 foci in primary CD34+ progenitor cells transfected with a control and with an AAVS1-targeting TALEN 24 hours post-transfection.
  • FIG. 7D shows the number of p53BP1 foci in primary CD4+ T cells transfected with a control and with an AAVS1-targeting TALEN 24 hours post-transfection.
  • FIG. 7E shows representative images of cells treated with AAVS1 TALENs versus untreated controls.
  • Cells were stained for p53BPl with an antibody and are visualized in green.
  • TALENs were stained with a FLAG tag and are visualized in red.
  • Nuclei were stained with DAPI and are visualized in grey.
  • the scale bar indicates a size of 5 µm.
  • FIGS. 8A-8B illustrate assessment of nuclease specificity in K562 cells for TALENs and Cas9 nucleases targeting the AAVS1 genomic locus.
  • FIG. 8A illustrates the number of p53BP1 foci per cell for K562 cells transfected with Cas9 protein along with AAVS1 guide RNAs as compared to a blank transfection control.
  • FIG. 8B illustrates the number of p53BP1 foci per cell for K562 cells transfected with AAVS1-targeting TALENs as compared to a blank transfection control.
  • FIGS. 9A-9B show the DNA damage response, as measured by p53BP1 foci quantification, in CD34+ cells and T cells with TALENs targeting various genomic loci.
  • FIG. 9A shows the number of p53BP1 foci per cell in primary CD34+ progenitor cells after transfection with GA6-targeting TALENs, AAVS1-targeting TALENs, GA7-targeting TALENs, GA6-EK-targeting TALENs, and GA7-targeting TALENs. Controls include blank transfection controls.
  • FIG. 9B shows the number of p53BP1 foci per cell in primary stimulated CD4+ T cells after transfection with TP150-targeting TALENs, AAVS1-targeting TALENs, and TP171-targeting TALENs.
  • Controls include non-electroporated naive T cells, non-electroporated stimulated T cells, and untreated blank transfection control stimulated T cells.
  • FIG. 10 shows the number of p53BP1 foci per cell in K562 cells transfected with GA6 L14, GA6 L17, and GA6 L19.
  • FIG. 11 shows the number of p53BP1 foci per cell in K562 cells transfected with GA6 L, GA6 R, GA6 LR versus untreated control cells.
  • FIG. 12 shows the number of p53BP1 foci per cell in K562 cells transfected with GA6 or GA6 EK TALENs.
  • FIG. 13 shows fluorescence microscopy images of control cells and AAVS1-targeting TALEN treated cells.
  • A DAPI stain (gray) was used to visualize nuclei, p53BP1 is shown in green and the AAVS1 oligonucleotide Nano-FISH probe was visualized in red. Imaging showed that in cells transfected with AAVS1-targeting TALEN, spots indicative of double stranded breaks (indicated by p53BP1 foci) co-localized with AAVS1 oligonucleotide Nano-FISH probe spots.
  • FIGS. 14A-14C show histograms of the proportion of pairwise distances between AAVS1 Nano-FISH spots and p53BP1 foci.
  • FIG. 14A shows histograms of control and AAVS1 TALEN treated cells at pairwise distances of 0.1 to 0.5.
  • FIG. 14B shows histograms of control and AAVS1 TALEN treated cells at pairwise distances of 0 to 0.025.
  • FIG. 14C shows histograms of control and AAVS1 TALEN treated cells at pairwise distances of 0 - -0.08.
  • FIGS. 15A-15C show evaluation of nuclease specificity by counting p53BP1 foci in cells transfected with AAVS1-targeting TALENs.
  • FIG. 15A illustrates the number of p53BP1 foci on the x-axis versus the proportion of cells with p53BP1 foci on the y-axis in cells transfected with AAVS1-targeting TALENs and imaged, in 3D, on a Nikon widefield fluorescence microscope with a 60x magnification lens using oil immersion contact techniques.
  • “Ref” samples indicate control cells that were not transfected with TALENs.
  • Biological replicates are shown for control and transfected cells (indicated by set x). The number of cells analyzed in each sample is indicated by “n.”
  • FIG. 15B illustrates the number of p53BP1 foci on the x-axis versus the proportion of cells with p53BP1 foci on the y-axis in cells transfected with AAVS1-targeting TALENs and imaged, in 3D, on a Nikon widefield fluorescence microscope with a 40x magnification lens using non-contact techniques.
  • “Ref” samples indicate control cells that were not transfected with TALENs. Biological replicates are shown for control and transfected cells. The number of cells analyzed in each sample is indicated by “n.”
  • FIG. 15C illustrates the number of p53BP1 foci on the x-axis versus the proportion of cells with p53BP1 foci on the y-axis in cells transfected with AAVS1-targeting TALENs and imaged on a Stellar-Vision (SV) fluorescence microscope using non-contact techniques.
  • “Ref” samples indicate control cells that were not transfected with TALENs. Biological replicates are shown for control and transfected cells. The number of cells analyzed in each sample is indicated by “n.”
  • FIG. 16 shows a graph of the number of p53BP1 foci per CD4+ T cell at 24 hours and 48 hours post-transfection with AAVS1-targeting TALENs as compared to blank transfection controls at each time point.
  • FIG. 17 shows an assay workflow for microscopy on a Stellar-Vision microscope. Images were captured on the Stellar-Vision microscope, reconstructed, and segmented for regions of interest such as cell nuclei, p53BP1 foci, and nuclease localization; features were then computed (such as count, size, diameter, area, volume, perimeter length, circularity, irregularity, eccentricity, etc.). The measured per-cell feature information was statistically analyzed to produce quantitative specificity metrics for the tested nuclease(s).
  • FIG. 18 depicts a method for estimating nuclease specificity based on p53BPl foci characteristics.
  • FIG. 19 depicts a method for estimating nuclease specificity based on p53BPl foci counts.
  • FIG. 20 shows a comparison of off- target activity estimated using Guide- Seq vs. p53BPl imaging assay.
  • FIG. 21 illustrates use of the number of p53BPl foci as a read out for improved nuclease specificity.
  • FIG. 22 illustrates use of the number of p53BPl foci as a read out for improved nuclease specificity.
  • FIG. 23A illustrates the use of immunoNanoFISH and p53BPl staining for per-allele per-cell on/off-target activity estimation in K562 cells.
  • FIG. 23B illustrates the use of immunoNanoFISH and p53BPl staining for per-allele per-cell on/off-target activity estimation in CD34+ cells.
  • FIG. 24A illustrates the use of p53BPl imaging for identifying nucleases suitable for targeting TCR-alpha locus.
  • FIG. 24B illustrates the use of p53BPl imaging for identifying nucleases suitable for targeting PDCD- l .
  • FIG. 25 illustrates the use of p53BPl imaging for dose titration of a lead TALEN.
  • FIG. 26 illustrates the use of p53BPl imaging for screening nucleases for specificity and potency.
  • FIG. 27 shows that double strand break (DSB) repair protein serve as markers for evaluating nuclease specificity.
  • compositions and methods for image-based analysis of cells that elicit a cellular response comprising accumulation of a moiety, such as a domain or a protein, in response to a cellular perturbation can allow for quantification of a protein load in a cell, wherein the protein accumulates as part of the cellular response to the cellular perturbation.
  • the cellular response can be accumulation of a protein at the site of a double strand break.
  • the cellular response can be active or passive accumulation of a protein, which participates in activating or repressing translational machinery.
  • the cellular perturbation comprises administration of a cell engineering tool.
  • Examples of cell engineering tools include a genome editing complex or a gene regulator (e.g., an epigenetic repressor or activator).
  • the genome editing complex or gene regulator can be designed to edit or regulate a target genomic locus. Modification of the target genomic locus can have therapeutic value.
  • modification of the target genomic locus can include introduction of a gene encoding a functional protein, knocking out a gene encoding a protein, or repressing expression of a protein for, e.g., treatment of indications that would benefit from the modification of the target genomic locus, such as, an indication that results from aberrant protein expression.
  • the methods and compositions disclosed herein include an image-based assay for quantitation of foci within the nucleus of the cell.
  • the image-based assay can allow for visualization of fluorescent foci within the cell nucleus.
  • the fluorescent foci may indicate accumulation of a protein.
  • the protein can be labeled with any detectable agent disclosed herein. Upon accumulation within the nucleus, said detectable agent-labeled protein can be visualized as agglomerations or spots, also referred to as “foci.”
  • the present disclosure also describes foci representing other detectable agents.
  • disclosed herein are foci of fluorescently labeled cell engineering tools (e.g., genome editing complex or gene regulator such as an epigenetic repressor or activator).
  • Cell engineering tools (e.g., genome editing complex or gene regulator such as an epigenetic repressor or activator) can be labeled with a second fluorophore different from the fluorophore conjugated to the protein. This can allow for simultaneous imaging and image analysis of the cell engineering tool (e.g., genome editing complex or gene regulator such as an epigenetic repressor or activator) and a protein, which accumulates during a cellular response.
  • foci of a fluorescently labeled genomic locus wherein the genomic locus is visualized by labeled oligonucleotide Nano-FISH probe sets, which have a third
  • the genomic locus can be a target or off-target genomic locus.
  • two separate Nano-FISH probe sets can be used, each with a different detectable agent.
  • the methods and compositions disclosed herein include an image-based assay for quantifying a protein that accumulates during a cellular response to a cellular perturbation caused by a cell engineering tool (e.g., genome editing complex or gene regulator such as an epigenetic repressor or activator), thereby serving as a marker of specificity and/or activity of the cell engineering tool.
  • the image-based methods can quantify a protein load, wherein the protein load is number of protein foci or total protein content per nucleus.
  • the image-based methods described herein can also quantify a cell engineering tool load, wherein the cell engineering tool load can be a number of cell engineering tool foci or total cell engineering tool content per nucleus.
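  • By way of a non-limiting illustration, the two load definitions above (number of foci per nucleus and total content per nucleus) can be sketched as follows. The record layout, the key names `foci` and `pixel_intensities`, and the function name are assumptions made for this sketch and are not part of the disclosure; segmentation of nuclei and foci is assumed to have been performed upstream.

```python
def protein_load(nuclei):
    """Compute per-nucleus load from already-segmented image data.

    `nuclei` maps a nucleus identifier to a record with two hypothetical keys:
      "foci"              - list of focus centroids found inside the nucleus
      "pixel_intensities" - background-subtracted intensities of the pixels
                            (or voxels) inside the nuclear mask
    Returns {nucleus_id: {"foci_count": int, "total_content": float}}.
    """
    loads = {}
    for nucleus_id, record in nuclei.items():
        loads[nucleus_id] = {
            # Load as a number of foci in this nucleus.
            "foci_count": len(record["foci"]),
            # Load as total (integrated) signal in this nucleus.
            "total_content": sum(record["pixel_intensities"]),
        }
    return loads
```

  • Nuclei without foci are kept in the output with a count of zero, so that downstream per-cell statistics are not biased toward cells that show a response. The same function could be applied to a cell engineering tool channel to obtain a cell engineering tool load.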
  • a cellular perturbation comprising accumulation of a protein can be induced by a genome editing complex, which includes a DNA binding domain, a nuclease, and an optional linker.
  • Genome editing complexes can also be referred to simply as “nucleases.”
  • Specific genome editing complexes, whose cellular activity can be monitored, can include TALENs, megaTAL, a meganuclease, CAS nuclease (e.g., CRISPR/Cas9 systems), and zinc finger nucleases (ZFNs).
  • the cellular perturbation can be induced by a gene regulator, such as a gene repressor, which can include a DNA binding domain, a repressor domain, and, optionally, a linker.
  • the image-based analysis of this disclosure allows for quantification of spots in a cell or a subcellular compartment, such as the nucleus, which are indicative of protein accumulation in response to a cellular perturbation.
  • the image-based assay allows for quantification of spots representing protein accumulation within the nucleus on a per allele per cell basis. For example, when cells are edited with a genome editing complex (e.g., a TALEN,
  • nucleases e.g., FokI or Cas9
  • a protein such as a DNA repair protein, e.g., phosphorylated (Ser1778) 53BP1 (p53BP1) or γH2AX, can accumulate at the site of the double strand break and is indicative of a DNA damage response.
  • p53BP1 serves as a surrogate marker of a double strand break.
  • the present disclosure provides methods for staining cells for p53BP1 with a detectable agent.
  • the detectable agent can comprise a primary antibody and a secondary antibody conjugated to a fluorophore.
  • the detectable agent can comprise a direct primary antibody conjugated to a fluorophore.
  • the number of p53BP1 foci can indicate the number of double strand breaks induced in a cell, and image analysis can thus serve to quantitatively resolve, spatially and temporally, the DNA damage process induced in each cell by a gene editing complex (e.g., a TALEN, CRISPR/Cas9, ZFN, megaTALs, or meganucleases).
  • Staining and visualizing p53BP1 foci within the nucleus of a cell, using the staining and image analysis techniques disclosed herein, can serve as a powerful tool to probe the specificity of a genome editing complex (e.g., a TALEN, CRISPR/Cas9, ZFN, megaTALs, or meganucleases) on a per allele per cell basis.
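  • By way of a non-limiting illustration, a per allele per cell readout can be sketched by asking, within one nucleus, which p53BP1 foci lie close to a labeled genomic locus spot. The function names, the coordinate representation, and the `max_distance` cutoff are assumptions made for this sketch; the disclosure does not prescribe a particular colocalization rule.

```python
import math


def classify_foci(p53bp1_foci, locus_spots, max_distance):
    """Label each p53BP1 focus in one nucleus by proximity to a locus spot.

    p53bp1_foci and locus_spots are lists of (x, y, z) centroids in the same
    units as max_distance. A focus within max_distance of any locus spot is
    labeled "on_target"; every other focus is labeled "elsewhere".
    """
    labels = []
    for focus in p53bp1_foci:
        near = any(math.dist(focus, spot) <= max_distance for spot in locus_spots)
        labels.append("on_target" if near else "elsewhere")
    return labels


def alleles_with_focus(p53bp1_foci, locus_spots, max_distance):
    """Count locus spots (alleles) that have at least one nearby p53BP1 focus."""
    return sum(
        1
        for spot in locus_spots
        if any(math.dist(focus, spot) <= max_distance for focus in p53bp1_foci)
    )
```

  • Foci labeled "elsewhere" in this sketch include background DNA damage as well as possible off-target breaks, so the counts would ordinarily be compared against untreated control cells before being interpreted.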
  • compositions and methods of the present disclosure can be a powerful tool for assessing the specificity and activity of cell engineering tools (e.g., genome editing complex or gene regulator such as an epigenetic repressor or activator). These methods can be used to screen at least 5, at least 10, at least 50, at least 100, at least 150, at least 200, at least 250, at least 300, at least 350, at least 400, at least 500, or at least 1000 cell engineering tools (e.g., genome editing complex or gene regulator such as an epigenetic repressor or activator).
  • These methods can be used to screen 5-10, 10-50, 50-100, 150-200, 200-250, 250-300, 300-350, 350-400, 400-450, 450-500, or 500-1000 cell engineering tools (e.g., genome editing complex or gene regulator such as an epigenetic repressor or activator) for lead candidates that exhibit potency (e.g., high gene editing efficiency or heightened or dampened gene expression) and specificity (low off-target cellular responses, i.e., responses not at the target genomic locus).
  • the methods of the present disclosure can also be used to produce a potent and specific cell engineering tool, by iteratively tuning a parameter of a cell engineering tool and testing for improved specificity.
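  • By way of a non-limiting illustration, selecting lead candidates that are both potent and specific can be sketched as a filter followed by a ranking. The dictionary keys, the thresholds `min_potency` and `max_excess_foci`, and the tie-breaking order are assumptions made for this sketch and are not criteria stated in the disclosure.

```python
def select_lead_candidates(candidates, min_potency, max_excess_foci):
    """Filter and rank candidate cell engineering tools.

    Each candidate is a dict with hypothetical keys:
      "name"        - label of the candidate
      "potency"     - e.g., fraction of edited alleles (0 to 1)
      "excess_foci" - mean p53BP1 foci per cell above the untreated control
    """
    passing = [
        c
        for c in candidates
        if c["potency"] >= min_potency and c["excess_foci"] <= max_excess_foci
    ]
    # Most specific first (fewest excess foci); ties broken by higher potency.
    return sorted(passing, key=lambda c: (c["excess_foci"], -c["potency"]))
```

  • In an iterative tuning loop, the same function could be re-run after each change to a parameter of the tool to check whether the modified tool moves up the ranking.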
  • compositions and methods of the present disclosure can be used to evaluate cell engineering tools for activity and/or specificity in primary cells.
  • immortalized cells can also be used with the compositions and methods of the present disclosure.
  • the primary cells and immortalized cell lines can be intact.
  • the present disclosure provides compositions and methods for probing the specificity of a genome editing complex (e.g., a TALEN, CRISPR/Cas9, ZFN, megaTALs, or meganucleases) by imaging and analyzing p53BP1 foci.
  • Genome editing complexes are a type of cell engineering tool and can be referred to herein as “nucleases.”
  • imaging and analyzing p53BP1 foci after administration of a genome editing complex (e.g., a TALEN, CRISPR/Cas9, ZFN, megaTALs, or meganucleases)
  • Genome editing complexes e.g., a TALEN, CRISPR/Cas9, and/or ZFN
  • Genome editing complexes can be administered to a cell by electroporation, lipofection, viral transduction, or another suitable delivery method.
  • the types of outcomes or readouts can be analyzed using image-based analysis of p53BP1 or γH2AX foci.
  • the methods can be used to quantify a protein (p53BP1) load, which can comprise the number of p53BP1 foci and/or total p53BP1 content within the nucleus.
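  • By way of a non-limiting illustration, per-cell foci counts can be summarized as the proportion of cells at each foci count (the kind of distribution plotted in FIGS. 15A-15C) and compared with untreated control cells. The function names and the use of a simple difference of means are assumptions made for this sketch.

```python
from collections import Counter


def foci_distribution(foci_counts):
    """Return {foci_count: proportion_of_cells} for a list of per-cell counts."""
    n_cells = len(foci_counts)
    if n_cells == 0:
        raise ValueError("at least one cell is required")
    tally = Counter(foci_counts)
    return {count: tally[count] / n_cells for count in sorted(tally)}


def mean_excess_foci(treated_counts, control_counts):
    """Mean foci per cell in treated cells minus the mean in control cells."""
    treated_mean = sum(treated_counts) / len(treated_counts)
    control_mean = sum(control_counts) / len(control_counts)
    return treated_mean - control_mean
```

  • A difference of means is only one possible summary; any other comparison of the treated and control distributions could be substituted without changing the rest of the sketch.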
  • a nuclease may comprise a Transcription Activator-Like Effector (TALE) sequence.
  • a TALE may comprise a DNA-binding module which includes a variable number of repeat units or repeat modules having about 33-35 amino acid residues. Each repeat unit recognizes one nucleotide through two adjacent amino acids (such as the amino acids at positions 12 and 13 of the repeat). In general, the amino acid sequence of each repeat unit does not vary significantly outside of positions 12 and 13. The amino acids at positions 12 and 13 of a repeat may also be referred to as the repeat-variable diresidue (RVD).
  • a TALE probe described herein may comprise between about 1 and about 50 TALE repeat modules.
  • a TALE probe described herein may comprise between about 5 and about 45, between about 8 and about 45, between about 10 and about 40, between about 12 and about 35, between about 15 and about 30, between about 20 and about 30, between about 8 and about 40, between about 8 and about 35, between about 8 and about 30, between about 10 and about 35, between about 10 and about 30, between about 10 and about 25, between about 10 and about 20, or between about 15 and about 25 TAL effector repeat modules.
  • a TALE probe described herein may comprise about 1, about 2, about 3, about 4, about 5, about 6, about 7, about 8, about 9, about 10, about 11, about 12, about 13, about 14, about 15, about 16, about 17, about 18, about 19, about 20, about 21, about 22, about 23, about 24, about 25, about 26, about 27, about 28, about 29, about 30, about 31, about 32, about 33, about 34, about 35, about 36, about 37, about 38, about 39, about 40, about 45, or about 50 TALE repeat modules.
  • a TALE probe described herein may comprise about 5 TALE repeat modules.
  • a TALE probe described herein may comprise about 10 TALE repeat modules.
  • a TALE probe described herein may comprise about 11 TALE repeat modules.
  • a TALE probe described herein may comprise about 12 TALE repeat modules.
  • a TALE probe described herein may comprise about 13 TALE repeat modules.
  • a TALE probe described herein may comprise about 14 TALE repeat modules.
  • a TALE probe described herein may comprise about 15 TALE repeat modules.
  • a TALE probe described herein may comprise about 16 TALE repeat modules.
  • a TALE probe described herein may comprise about 17 TALE repeat modules.
  • a TALE probe described herein may comprise about 18 TALE repeat modules.
  • a TALE probe described herein may comprise about 19 TALE repeat modules.
  • a TALE probe described herein may comprise about 20 TALE repeat modules.
  • a TALE probe described herein may comprise about 21 TALE repeat modules.
  • a TALE probe described herein may comprise about 22 TALE repeat modules.
  • a TALE probe described herein may comprise about 23 TALE repeat modules.
  • a TALE probe described herein may comprise about 24 TALE repeat modules.
  • a TALE probe described herein may comprise about 25 TALE repeat modules.
  • a TALE probe described herein may comprise about 26 TALE repeat modules.
  • a TALE probe described herein may comprise about 27 TALE repeat modules.
  • a TALE probe described herein may comprise about 28 TALE repeat modules.
  • a TALE probe described herein may comprise about 29 TALE repeat modules.
  • a TALE probe described herein may comprise about 30 TALE repeat modules.
  • a TALE probe described herein may comprise about 35 TALE repeat modules.
  • a TALE probe described herein may comprise about 40 TALE repeat modules.
  • a TALE probe described herein may comprise about 45 TALE repeat modules.
  • a TALE probe described herein may comprise about 50 TALE repeat modules.
  • a TAL effector repeat module may be a wild-type TALE DNA-binding module or a modified TALE DNA-binding repeat module enhanced for specific recognition of a nucleotide.
  • a TALE probe described herein may comprise one or more wild-type TALE DNA-binding modules.
  • a TALE probe described herein may comprise one or more modified TAL effector DNA-binding repeat modules enhanced for specific recognition of a nucleotide.
  • a modified TALE DNA-binding repeat module may comprise 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more mutations that may enhance the repeat module for specific recognition of a nucleic acid sequence (e.g., a target sequence). In some cases, a modified TALE DNA-binding repeat module is modified at amino acid position 2, 3, 4, 11, 12, 13, 21, 23, 24, 25,
  • a modified TALE DNA-binding repeat module is modified at amino acid positions 12 or 13.
  • a TALE repeat module may be a repeat module-like domain or RVD-like domain.
  • a RVD-like domain has a sequence different from a naturally occurring polynucleotidic repeat module comprising a RVD (RVD domain) but has a similar function and/or global structure.
  • Non-limiting examples of RVD-like domains include protein domains selected from Puf RNA binding protein or Ankyrin super-family.
  • a TALE repeat module may comprise a RVD of TABLE 1.
  • a TALE probe described herein may comprise one or more RVDs selected from TABLE 1.
  • a TALE probe described herein may comprise up to 1, up to 2, up to 3, up to 4, up to 5, up to 6, up to 7, up to 8, up to 9, up to 10, up to 11, up to 12, up to 13, up to 14, up to 15, up to 16, up to 17, up to
  • RVDs selected from TABLE 1.
  • a RVD may recognize or interact with one type of nucleotide (e.g., the RVD HD binds only to C).
  • a RVD may recognize or interact with more than one type of nucleotide
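  • By way of a non-limiting illustration, reading the RVD out of a repeat (positions 12 and 13) and looking up the nucleotide or nucleotides it recognizes can be sketched as follows. The mapping below lists only four commonly reported RVD specificities and is not a reproduction of TABLE 1; the function names and the example repeat sequence in the usage note are likewise assumptions made for this sketch.

```python
# Commonly reported RVD specificities (illustrative subset, not TABLE 1).
# An RVD mapped to more than one letter recognizes more than one nucleotide.
RVD_TO_BASES = {
    "HD": "C",
    "NI": "A",
    "NG": "T",
    "NN": "GA",
}


def rvd_of_repeat(repeat):
    """Return the two residues at positions 12 and 13 (1-based) of a repeat."""
    if len(repeat) < 13:
        raise ValueError("repeat is too short to contain positions 12 and 13")
    return repeat[11:13]


def recognized_bases(rvds):
    """Map each RVD in an ordered list to the nucleotide(s) it recognizes."""
    return [RVD_TO_BASES[rvd] for rvd in rvds]
```

  • For example, `rvd_of_repeat("LTPEQVVAIASHDGGKQALETVQRLLPVLCQAHG")` returns `"HD"` for this illustrative 34-residue repeat, and `recognized_bases(["HD", "NN"])` returns `["C", "GA"]`, reflecting a RVD that binds one nucleotide and a RVD that binds more than one.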
  • the efficiency of a RVD domain at recognizing a nucleotide is ranked as “strong”, “intermediate” or “weak”.
  • the ranking may be according to a ranking described in Streubel et al., “TAL effector RVD specificities and efficiencies,” Nature Biotechnology 30(7): 593-595 (2012).
  • the ranking of RVD may be as illustrated in TABLE 2, based on the ranking provided in Streubel et al. Nature Biotechnology 30(7): 593-595 (2012).
  • a TALE DNA-binding domain may further comprise a C-terminal truncated TALE DNA-binding repeat module, such as a shortened repeat unit, e.g., a half-repeat unit.
  • a C-terminal truncated TALE DNA-binding repeat module may be between about 15 and about 34 residues in length.
  • a C-terminal truncated TALE DNA-binding repeat module may be between about 15 and about 32, between about 18 and about 34, between about 18 and about 32, between about 24 and about 35, between about 28 and about 32, between about 25 and about 34, between about 25 and about 32, between about 25 and about 30, between about 28 and about 32, or between about 28 and about 30 residues in length.
  • a C-terminal truncated TALE DNA-binding repeat module may be at least 18, at least 19, at least 20, at least 21, at least 22, at least 23, at least 24, at least 25, at least 26, at least 27, at least 28, at least 29, at least 30, at least 31, at least 32, at least 33, up to 34 residues in length.
  • a C-terminal truncated TALE DNA-binding repeat module may be up to 15 residues, up to 18 residues, up to 19 residues, up to 20 residues, up to 21 residues, up to 22 residues, up to 23 residues, up to 24 residues, up to 25 residues, up to 26 residues, up to 27 residues, up to 28 residues, up to 29 residues, up to 30 residues, up to 31 residues, up to 32 residues, up to 33 residues, or up to 34 residues in length.
  • a C-terminal truncated TALE DNA-binding repeat module may include a
  • a TALE DNA-binding domain may further comprise an N-terminal cap.
  • An N-terminal cap may be a polypeptide sequence flanking the DNA-binding repeat module.
  • An N-terminal cap may be any length and may comprise from about 0 to about 136 amino acid residues in length.
  • An N-terminal cap may be about 5, about 10, about 15, about 20, about 25, about 30, about 35, about 40, about 45, about 50, about 60, about 70, about 80, about 90, about 100, about 110, about 120, or about 130 amino acid residues in length.
  • An N-terminal cap may modulate structural stability of the DNA-binding repeat modules.
  • An N-terminal cap may modulate nonspecific interactions.
  • An N-terminal cap may decrease nonspecific interaction.
  • An N-terminal cap may reduce off-target effect.
  • off-target effect refers to the binding of a DNA binding protein (e.g., a TALE protein) to a sequence that is not the target sequence of interest.
  • An N-terminal cap may further comprise a wild-type N-terminal cap sequence of a TALE protein or may comprise a modified N-terminal cap sequence of a TALE protein, such as a TALE protein from Xanthomonas.
  • a TALE DNA-binding domain may further comprise a C-terminal cap sequence.
  • a C-terminal cap sequence may be a polypeptide portion flanking the C-terminal truncated TALE DNA-binding repeat module.
  • a C-terminal cap may be any length and may comprise from about 0 to about 278 amino acid residues in length.
  • a C-terminal cap may be about 5, about 10, about 15, about 20, about 25, about 30, about 35, about 40, about 45, about 50, about 60, about 80, about 100, about 150, about 200, or about 250 amino acid residues in length.
  • a C-terminal cap may further comprise a wild-type C-terminal cap sequence of a TALE protein or may comprise a modified C-terminal cap sequence of a TALE protein, such as a TALE protein from Xanthomonas.
  • a nuclease domain may be linked to a TALE DNA-binding domain either directly or through a linker.
  • a linker may be between about 1 and about 50 amino acid residues in length.
  • a linker may be from about 5 to about 45, from about 5 to about 40, from about 5 to about 35, from about 5 to about 30, from about 5 to about 25, from about 5 to about 20, from about 5 to about 15, from about 10 to about 40, from about 10 to about 35, from about 10 to about 30, from about 10 to about 25, from about 10 to about 20, from about 12 to about 40, from about 12 to about 35, from about 12 to about 30, from about 12 to about 25, from about 12 to about 20, from about 14 to about 40, from about 14 to about 35, from about 14 to about 30, from about 14 to about 25, from about 14 to about 20, from about 14 to about 16, from about 15 to about 40, from about 15 to about 35, from about 15 to about 30, from about 15 to about 25, from about 15 to about 20, from about 15 to about 18, from about 18 to about 40, from about 18 to about 35
  • a nuclease domain fused to a TALE can be an endonuclease or an exonuclease.
  • An endonuclease can include restriction endonucleases and homing endonucleases.
  • An endonuclease can also include S1 nuclease, mung bean nuclease, pancreatic DNase I, micrococcal nuclease, or yeast HO endonuclease.
  • An exonuclease can include a 3’-5’ exonuclease or a 5’-3’ exonuclease.
  • An exonuclease can also include a DNA exonuclease or an RNA exonuclease.
  • Examples of exonucleases include exonucleases I, II, III, IV, V, and VIII; DNA polymerase I, RNA exonuclease 2, and the like.
  • a nuclease domain fused to a TALE can be a restriction endonuclease (or restriction enzyme).
  • a restriction enzyme cleaves DNA at a site removed from the recognition site and has separate binding and cleavage domains.
  • such restriction enzyme is a Type IIS restriction enzyme.
  • a nuclease domain fused to a TALE can be a Type IIS nuclease.
  • a Type IIS nuclease can be FokI or BfiI.
  • a nuclease domain fused to a TALE is FokI.
  • a nuclease domain fused to a TALE is BfiI.
  • FokI can be a wild-type FokI or can comprise one or more mutations. In some cases, FokI can comprise 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more mutations. A mutation can enhance cleavage efficiency. A mutation can abolish cleavage activity. In some cases, a mutation can modulate homodimerization. For example, FokI can have a mutation at one or more amino acid residue positions 446, 447, 479, 483, 484, 486, 487, 490, 491, 496, 498, 499, 500, 531, 534, 537, and 538 to modulate homodimerization.
  • a FokI cleavage domain is, for example, as described in Kim et al., “Hybrid restriction enzymes: Zinc finger fusions to Fok I cleavage domain,” PNAS 93:1156-1160 (1996), which is incorporated herein by reference in its entirety.
  • a FokI cleavage domain described herein is a FokI of
  • a FokI cleavage domain described herein is a FokI, for example, as described in U.S. Patent No. 8,586,526, which is incorporated herein by reference in its entirety.
  • a TALE probe can be designed to recognize each strand of a double-stranded segment of DNA by engineering the TALE to include a sequence of repeat-variable diresidue subunits that may comprise about 22, about 23, about 24, about 25, about 26, about 27, about 28, about 29, about 30, about 31, about 32, about 33, about 34, about 35, about 36, about 37, about 38, about 39, or about 40 amino acid repeats capable of associating with specific DNA sequences, such that the detectable label of the TALE probe is located at the target nucleic acid sequence.
  • megaTALs in which a TALE DNA binding domain is fused to a monomeric meganuclease, also referred to as a “homing endonuclease,” capable of binding and cleaving a target genomic locus of interest.
  • Image-based analysis methods and compositions described herein can be used to evaluate the specificity and/or activity of a megaTAL.
  • Meganucleases can include intron endonucleases and intein endonucleases. Meganucleases can be a LAGLIDADG endonuclease and can include I-CreI or I-SceI.
  • CRISPR-Cas9 (clustered regularly interspaced short palindromic repeats-associated Cas9) systems can also be engineered to target and edit a specific nucleic acid sequence.
  • a CRISPR-dCas9 can comprise multiple components in a ribonucleoprotein complex, which can include the Cas9 protein that can interact with a single-guide RNA (sgRNA), an optional linker, and a repressor domain.
  • the sgRNA can be made of a CRISPR RNA (crRNA) and a trans-activating crRNA (tracrRNA).
  • the CRISPR-Cas9s described herein can be used to modulate transcription of a target gene to which the sgRNA binds.
  • the CRISPR-Cas9s of the present disclosure can be used to repress expression of a target gene.
  • the sgRNA can comprise at least 18, at least 19, at least 20, at least 21, at least 22, at least 23, at least 24, or at least 25 nucleotides that are complementary to a target sequence of interest.
  • this portion of the sgRNA is analogous to the DNA binding domain described herein with respect to TALENs and ZFNs.
  • the portion of the sgRNA (e.g., the about 20 nucleotides within the sgRNA that bind to a target) can bind adjacent to a protospacer adjacent motif (PAM), which can comprise 2-6 nucleotides in the target sequence that is bound by Cas9.
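  • By way of a non-limiting illustration, locating candidate target sequences that sit immediately next to a PAM can be sketched with a regular-expression scan. The default PAM pattern `[ACGT]GG` (an NGG motif, as commonly used with Streptococcus pyogenes Cas9), the default protospacer length of 20, and the function name are assumptions made for this sketch; other Cas9 variants use other PAMs, and only the strand given as input is scanned.

```python
import re


def find_target_sites(dna, protospacer_len=20, pam_pattern="[ACGT]GG"):
    """Find protospacers immediately followed (on the 3' side) by a PAM.

    Returns a list of (start_index, protospacer, pam) tuples for the given
    strand. A zero-width lookahead is used so overlapping sites are reported.
    """
    dna = dna.upper()
    pattern = re.compile(
        r"(?=([ACGT]{%d})(%s))" % (protospacer_len, pam_pattern)
    )
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(dna)]
```

  • To cover both strands, the same function could be called a second time on the reverse complement of the input sequence.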
  • ZFN zinc-finger nuclease
  • a ZFN can comprise a zinc-finger DNA binding domain linked either directly or indirectly to a nuclease domain.
  • a zinc-finger DNA binding domain of a ZFN can comprise from about 1 to about 10 zinc finger motifs.
  • a zinc-finger DNA binding domain can comprise from about 1 to about 9, from about 2 to about 8, from about 2 to about 6 or from about 2 to about 4 zinc finger motifs.
  • a zinc-finger DNA binding domain can comprise at least 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 zinc finger motifs.
  • a zinc-finger DNA binding domain can comprise about 1 zinc finger motif.
  • a zinc-finger DNA binding domain can comprise about 2 zinc finger motifs.
  • a zinc-finger DNA binding domain can comprise about 3 zinc finger motifs.
  • a zinc-finger DNA binding domain can comprise about 4 zinc finger motifs.
  • a zinc-finger DNA binding domain can comprise about 5 zinc finger motifs.
  • a zinc-finger DNA binding domain can comprise about 6 zinc finger motifs.
  • a zinc-finger DNA binding domain can comprise about 7 zinc finger motifs.
  • a zinc-finger DNA binding domain can comprise about 8 zinc finger motifs.
  • a zinc-finger DNA binding domain can comprise about 9 zinc finger motifs.
  • a zinc-finger DNA binding domain can comprise about 10 zinc finger motifs.
  • a zinc finger motif can be a wild-type zinc finger motif or a modified zinc finger motif enhanced for specific recognition of a set of nucleotides.
  • a ZFN described herein can comprise one or more wild-type zinc finger motifs.
  • a ZFN described herein can comprise one or more modified zinc finger motifs enhanced for specific recognition of a set of nucleotides.
  • a modified zinc finger motif can comprise 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, or more mutations that can enhance the motif for specific recognition of a set of nucleotides.
  • one or more amino acid residues within the α-helix of a zinc finger motif are modified.
  • one or more amino acid residues at positions -1, +1, +2, +3, +4, +5, and/or +6 relative to the N-terminus of the α-helix of a zinc finger motif can be modified.
  • a nuclease domain linked to a zinc-finger DNA-binding domain can be an
  • An endonuclease can include restriction endonucleases and homing endonucleases.
  • An endonuclease can also include S1 nuclease, mung bean nuclease, pancreatic DNase I, micrococcal nuclease, or yeast HO endonuclease.
  • An exonuclease can include a 3’-5’ exonuclease or a 5’-3’ exonuclease.
  • An exonuclease can also include a DNA exonuclease or an RNA exonuclease. Examples of exonucleases include exonucleases I, II, III, IV, V, and VIII; DNA polymerase I, RNA exonuclease 2, and the like.
  • a nuclease domain fused to a zinc-finger DNA-binding domain can be a restriction endonuclease (or restriction enzyme).
  • a restriction enzyme cleaves DNA at a site removed from the recognition site and has separate binding and cleavage domains.
  • such restriction enzyme is a Type IIS restriction enzyme.
  • a nuclease domain fused to a zinc-finger DNA-binding domain can be a Type IIS nuclease.
  • a Type IIS nuclease can be FokI or BfiI.
  • a nuclease domain fused to a zinc-finger DNA-binding domain is FokI.
  • a nuclease domain fused to a zinc-finger DNA-binding domain is BfiI.
  • a nuclease domain can be linked to a zinc-finger DNA-binding domain either directly or through a linker.
  • a linker can be between about 1 and about 50 amino acid residues in length.
  • a linker can be from about 5 to about 45, from about 5 to about 40, from about 5 to about 35, from about 5 to about 30, from about 5 to about 25, from about 5 to about 20, from about 5 to about 15, from about 10 to about 40, from about 10 to about 35, from about 10 to about 30, from about 10 to about 25, from about 10 to about 20, from about 12 to about 40, from about 12 to about 35, from about 12 to about 30, from about 12 to about 25, from about 12 to about 20, from about 14 to about 40, from about 14 to about 35, from about 14 to about 30, from about 14 to about 25, from about 14 to about 20, from about 14 to about 16, from about 15 to about 40, from about 15 to about 35, from about 15 to about 30, from about 15 to about 25, from about 15 to about 20, from about 15 to about 18, from about 18 to about 40, from about 18 to
  • a linker for linking a nuclease domain to a zinc-finger DNA-binding domain can be about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26,
  • the present disclosure provides an image-based assay for quantification of protein (e.g., p53BP1 or γH2AX) load on a per cell basis after administration of any of the gene editing complexes disclosed herein (e.g., a TALEN, CRISPR/Cas9, ZFN, megaTALs, or meganucleases).
  • Protein load can be determined, for example, by quantification of number of p53BP1 foci or total p53BP1 content per nucleus.
  • Types of analyses that can be performed include identification of DNA damage response proteins as surrogates for nuclease activity, development of a reliable quantitative imaging assay to visualize the protein (e.g., p53BP1 or γH2AX), quantification of nuclease activity in each cell at its target genomic locus and elsewhere (for example, by measurement of indels), quantification of cell transfection efficiency and levels of nuclease expression, quantification of cytotoxicity resulting from nuclease activity, screening of nucleases in a high-throughput (96-well) format, and screening of gene editing complexes with high precision using as few as 50 cells to as many as 1000 cells or more.
  • Image-based analysis of p53BP1 for evaluating nuclease specificity can be performed across all nucleases (e.g., a TALEN, CRISPR/Cas9, ZFN, megaTALs, or meganucleases) and across all cell types including immortalized cells and primary cells.
  • the genome editing complex can be tagged, for example with a FLAG tag.
  • the image analysis methods of the present disclosure allow for co-quantification of genome editing complex amount by staining for the FLAG tag (e.g., antibody-based methods) and p53BP1 load (e.g., number of p53BP1 foci, total p53BP1 amount per nucleus), which serves as a measure of genome editing complex specificity.
  • genome editing complex-induced cytotoxicity can be measured by quantifying the fraction of apoptotic nuclei in transfected cells.
  • Genome editing complex specificity can be measured by evaluating dose response in cells using the image-based assay of the present disclosure and analyzing for p53BP1 load.
  • a genome editing complex with high specificity can induce a similar level of double strand breaks, as visualized by a similar p53BP1 load, regardless of the genome editing complex dose.
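One hedged way to summarize such a dose response is the least-squares slope of mean p53BP1 load against dose: a slope near zero is consistent with a load that does not change with dose. The function below is a minimal sketch. Its name and inputs are hypothetical, and the disclosure does not prescribe a particular statistic.

```python
def dose_response_slope(doses, mean_loads):
    """Ordinary least-squares slope of mean protein load versus dose.

    doses: nuclease doses (any consistent unit).
    mean_loads: mean p53BP1 load per nucleus measured at each dose.
    """
    n = len(doses)
    if n < 2 or n != len(mean_loads):
        raise ValueError("need at least two matched dose/load points")
    mean_x = sum(doses) / n
    mean_y = sum(mean_loads) / n
    sxx = sum((x - mean_x) ** 2 for x in doses)
    if sxx == 0:
        raise ValueError("doses must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(doses, mean_loads))
    return sxy / sxx
```

What counts as "near zero" would have to be set empirically, for example against the variability of untransfected control cells.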
  • genome editing complex specificity can be measured over time, for example up to 3 hours post-transfection, up to 6 hours post-transfection, up to 12 hours post-transfection, up to 24 hours post-transfection, up to 48 hours post-transfection, up to 60 hours post-transfection, 0 to 6 hours post-transfection, 3 to 60 hours post-transfection, 6 to 12 hours post-transfection, 24 to 48 hours post-transfection, 6 to 24 hours post-transfection, 48 hours to 5 days after transfection, 5 to 10 days after transfection, 10 to 15 days post-transfection, 15 to 20 days post-transfection, 20 to 25 days post-transfection, 25 to 30 days post-transfection, or 6 hours to 30 days post-transfection.
  • imaging p53BP1 foci for quantification of double strand breaks can be used to determine which component of a genome editing complex drives specificity versus off-target activity.
  • TALENs can comprise a left DNA binding domain coupled to FokI targeting a top DNA strand and a right DNA binding domain coupled to FokI targeting a bottom DNA strand. These can be referred to as a left TALEN monomer and a right TALEN monomer. Quantification of p53BP1 foci after administration of just one TALEN monomer can reveal which monomer leads to off-target enzymatic activity.
  • genome editing complexes can be iteratively improved upon by changing a parameter of the genome editing complex, testing for specificity by image analysis of p53BP1 load after administration in cells, and, optionally, further tuning the parameter of the genome editing complex and re-testing specificity.
  • a TALEN can include a DNA binding domain comprising a number of repeat units. As length of the DNA binding domain is increased, specificity for the target genomic locus can be increased.
  • TALENs can be iteratively designed by increasing the number of repeats within the DNA binding domain, administering said TALEN to a cell, evaluating specificity by imaging for p53BP1 foci and quantifying p53BP1 load, and, if needed, further increasing the number of repeats within the DNA binding domain.
  • visualization of DNA double strand breaks, induced by a genome editing complex, via staining for p53BP1 can be further combined with imaging of the target genomic locus of interest using oligonucleotide Nano-FISH probe sets and methods described further below.
  • cells can be transfected with a genome editing complex targeting a genomic locus of interest.
  • the nuclease enzyme (e.g., FokI) of the genome editing complex can be tagged (e.g., via a FLAG tag) and cells can be denatured and labeled with oligonucleotide Nano-FISH probes for the same genomic locus of interest.
  • DNA double strand breaks can be further imaged via staining for p53BP1 foci.
  • Co-localization of signal from p53BP1 foci with signal from oligonucleotide Nano-FISH probe foci indicates nuclease activity at the target genomic locus of interest, thus indicating specificity.
  • Signal from p53BP1 foci that are spatially separated from signal from oligonucleotide Nano-FISH probe foci can indicate off-target nuclease activity that may not be at the genomic locus of interest.
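The co-localization logic described above can be sketched as a nearest-neighbor test within a single nucleus. This is an illustrative sketch under stated assumptions: focus centroids are already available as coordinate tuples in a common unit, and the distance cutoff `max_distance` is a hypothetical parameter that would need to be chosen from the imaging resolution. It is not a value given in the disclosure.

```python
import math


def classify_foci(damage_foci, fish_foci, max_distance):
    """Split damage foci into on-target and off-target candidates.

    damage_foci: centroids of p53BP1 foci in one nucleus, e.g. (x, y, z).
    fish_foci: centroids of Nano-FISH probe foci in the same nucleus.
    max_distance: hypothetical co-localization cutoff, same unit as centroids.
    """
    on_target, off_target = [], []
    for focus in damage_foci:
        # Distance to the closest FISH spot; infinite if no spot was detected.
        nearest = min(
            (math.dist(focus, spot) for spot in fish_foci), default=math.inf
        )
        if nearest <= max_distance:
            on_target.append(focus)
        else:
            off_target.append(focus)
    return on_target, off_target
```

A nucleus with no detected Nano-FISH spot returns all of its damage foci as off-target candidates, so FISH detection efficiency should be considered before interpreting that fraction.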
  • High throughput analysis can involve analysis of greater than 1000, greater than 10,000, or greater than 100,000 cells in less than 24 hours or less than 48 hours. In some embodiments, high throughput analysis can involve analysis of more than 1 unique sample, more than 5 unique samples, more than 10 unique samples, or more than 100 unique samples within 24 hours. In other embodiments, cell populations less than 1000, less than 500, less than 100, or 50 or less can be analyzed.
  • image-based analysis of p53BP1 content in a cell after administration of a gene editing complex can be combined with measurements of gene editing efficiency (e.g., measuring indels at the target site).
  • the present disclosure allows assessment of genome editing complexes for potency and specificity, wherein potency is determined by measuring gene editing efficiency and specificity is measured via quantification of p53BP1 foci either alone or in combination with oligonucleotide Nano-FISH for the genomic locus of interest.
  • the present disclosure provides compositions and methods for probing the specificity of a gene regulator (e.g., a TALE-TF, CRISPR/dCas9, and/or ZFP-TF) by imaging and analyzing for protein accumulation at a target genomic locus.
  • Described below are several gene regulators (e.g., a TALE-TF, CRISPR/dCas9, and/or ZFP-TF), which can be used to activate expression of a target gene or repress expression of a target gene.
  • additional proteins are recruited to the target genomic locus and can serve as a marker for gene activation (e.g., H3K4me1, H3K4me2, H3K27ac) or gene repression (e.g., KAP1, H3K9me3, H3K27me3 or HP1). Further described below are the types of outcomes or readouts that can be analyzed using image-based analysis of gene repression.
  • TALE-TF Transcription Activator-like Effector - Transcription Factor
  • the present disclosure provides for a gene regulator or an engineered transcription factor, wherein the engineered transcription factor can be a transcription activator-like effector-transcription factor (TALE-TF).
  • A TALE-TF can include multiple components including the transcription activator-like effector (TALE) protein, an optional linker, and a repressor domain.
  • TALE-TFs described herein can be used to modulate transcription of a target gene to which the TALE protein binds.
  • the TALE-TFs of the present disclosure can be used to repress expression of a target gene.
  • the TAL effector can be any TAL effector described above.
  • a TALE-TF of the present disclosure can further include a transcription repressor domain.
  • the repressor domain can be a Krüppel-associated box (KRAB) protein, which induces transcriptional repression of polymerases (RNA pol I, II, and/or III) by binding to other corepressors.
  • the repressor domain can be any one of KOX, TGF-beta-inducible early gene (TIEG), v-erbA, SID, MBD2, MBD3, DNMT1, DNMT3A-L, DNMT3B, Rb, or MeCP2.
  • a TALE-TF of the present disclosure can further include a transcription activation domain.
  • the activation domain can comprise VP16, VP64, p65, p300 catalytic domain, TET1 catalytic domain, TDG, Ldb1 self-associated domain, SAM activator (VP64, p65, HSF1), or VPR (VP64, p65, Rta).
  • any one of the TALEs described herein can bind to a region of interest of any gene.
  • the TALEs described herein can bind upstream of the promoter region, upstream of the gene transcription start site, or downstream of the transcription start site.
  • the TALE protein binding region is no farther than 50 base pairs downstream of the transcription start site.
  • the TALE protein is designed to bind in proximity to the transcription start site (TSS). In other embodiments, the TALE can be designed to bind in the 5’ UTR region.
  • ZFP-TF Zinc Finger Protein - Transcription Factor
  • the present disclosure provides for an engineered transcription factor, wherein the engineered transcription factor can be a zinc-finger protein-transcription factor (ZFP-TF).
  • ZFP-TF can include multiple components including the zinc finger protein (ZFP), an optional linker, and a repressor domain.
  • the ZFP-TFs described herein can be used to modulate transcription of a target gene to which the ZFP binds.
  • the ZFP-TFs of the present disclosure can be used to repress expression of a target gene.
  • the repressor domain can be a Krüppel-associated box (KRAB) protein, which induces transcriptional repression of polymerases (RNA pol I, II, and/or III) by binding to other corepressors.
  • the repressor domain can be any one of Sin3a, LSD1, SUV39H1, G9A (EHMT2), DNMT1, DNMT3A-DNMT3L, DNMT3B, KOX, TGF-beta-inducible early gene (TIEG), v-erbA, SID, MBD2, MBD3, Rb, or MeCP2.
  • a ZFP-TF of the present disclosure can further include a transcription activation domain.
  • the activation domain can comprise VP16, VP64, p65, p300 catalytic domain, TET1 catalytic domain, TDG, Ldb1 self-associated domain, SAM activator (VP64, p65, HSF1), or VPR (VP64, p65, Rta).
  • the ZFP can also be referred to as a zinc finger DNA binding domain.
  • the zinc-finger DNA binding domain can comprise a set of zinc finger motifs.
  • Each zinc finger motif can be about 30 amino acids in length and can fold into a ββα structure in which the α-helix can be inserted into the major groove of the DNA double helix and can engage in sequence-specific interaction with the DNA site.
  • the sequence-specific recognition can span over 3 base pairs.
  • a single zinc finger motif can interact specifically with 1, 2 or 3 nucleotides.
  • CRISPR-dCas9 - Transcription Factor CRISPR-dCas9-TF
  • the present disclosure provides for an engineered transcription factor, wherein the engineered transcription factor can be a clustered regularly interspaced short palindromic repeats-associated deactivated Cas9 (CRISPR-dCas9).
  • CRISPR-dCas9 can comprise multiple components in a ribonucleoprotein complex, which can include the dCas9 protein that can interact with a single-guide RNA (sgRNA), an optional linker, and a repressor domain.
  • the sgRNA can be made of a CRISPR RNA (crRNA) and a trans-activating crRNA (tracrRNA).
  • the CRISPR-dCas9s described herein can be used to modulate transcription of a target gene to which the sgRNA binds.
  • the CRISPR-dCas9s of the present disclosure can be used to repress expression of a target gene.
  • the sgRNA can comprise at least 18, at least 19, at least 20, at least 21, at least 22, at least 23, at least 24, or at least 25 nucleotides that are complementary to a target sequence of interest.
  • this portion of the sgRNA is analogous to the DNA binding domain described above with respect to ZFPs and TALEs.
  • the portion of the sgRNA (e.g., the about 20 nucleotides within the sgRNA that bind to a target) can bind adjacent to a protospacer adjacent motif (PAM), which can comprise 2-6 nucleotides in the target sequence that is bound by dCas9.
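To make the relationship between the targeting portion of the sgRNA and the PAM concrete, the sketch below scans one strand of a DNA sequence for candidate sites. It assumes a 20-nucleotide targeting portion and an NGG PAM located immediately 3' of the site, as is commonly described for Streptococcus pyogenes Cas9. Neither that PAM nor the function name comes from the disclosure, which states only that a PAM can comprise 2-6 nucleotides.

```python
import re


def find_protospacers(target, spacer_len=20, pam_pattern="[ACGT]GG"):
    """List (start, protospacer, pam) for sites followed by a PAM.

    Only the given strand is scanned. pam_pattern is a regular expression;
    the default NGG is an assumption, not a value from the disclosure.
    """
    target = target.upper()
    # A lookahead is used so that overlapping candidate sites are all found.
    regex = re.compile(r"(?=([ACGT]{%d})(%s))" % (spacer_len, pam_pattern))
    return [(m.start(), m.group(1), m.group(2)) for m in regex.finditer(target)]
```

A complete search would also scan the reverse complement, since a target site can lie on either strand.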
  • the dCas9 can be generated from a wild-type Cas9 protein by mutating 2 residues.
  • the CRISPR-dCas9 ribonucleoprotein complex can repress a target gene by steric hindrance.
  • the CRISPR-dCas9 ribonucleoprotein complex can be further coupled to any repressor domain described herein (e.g., KRAB, Sin3a, LSD1, SUV39H1, G9A (EHMT2), DNMT1, DNMT3A-DNMT3L, DNMT3B, KOX, TGF-beta-inducible early gene (TIEG), v-erbA, SID, MBD2, MBD3, Rb, or MeCP2) to provide repression of a target gene.
  • a CRISPR-dCas9 ribonucleoprotein complex can be further coupled to a transcription activation domain.
  • the activation domain can comprise VP16, VP64, p65, p300 catalytic domain, TET1 catalytic domain, TDG, Ldb1 self-associated domain, SAM activator (VP64, p65, HSF1), or VPR (VP64, p65, Rta).
  • the present disclosure provides for imaging protein
  • a gene regulator e.g., TALE-TF, CRISPR-dCas9, or ZFP-TF.
  • Types of analyses that can be performed include identification of protein for repression of translation machinery, development of a reliable quantitative imaging assay to visualize the chosen surrogate protein, quantification of gene repression activity in each cell at its target genomic locus and elsewhere, quantification of cell transfection efficiency and levels of gene regulator expression, and screening of gene regulators in a high-throughput (96-well) format.
  • a TALE-TF comprising a DNA binding domain, a KRAB repressor domain and, optionally, a linker can be transfected into a cell of interest.
  • the cell can be an immortalized cell or a primary cell.
  • Upon binding to the target genomic locus, the KRAB repressor domain is capable of recruiting other co-repressors (e.g., KAP1). Staining can be performed against recruited co-repressors (e.g., KAP1) for evaluating repressor activity.
  • the staining can include a primary and secondary antibody-fluorophore conjugate or a primary antibody-fluorophore conjugate.
  • the TALE-TF can comprise a DNMT3A repressor domain.
  • the TALE-TF can comprise any repressor domain or activation domain described herein. Staining can then be performed for proteins accumulating at the site of gene activation (e.g., H3K4me1, H3K4me2, H3K27ac) or gene repression (e.g., KAP1, H3K9me3, H3K27me3 or HP1) to evaluate specificity of the gene regulator.
  • image-based analyses of proteins indicative of gene regulator activity can be performed across all gene regulators (e.g., TALE-TF, CRISPR/dCas9, ZFP-TFs) and across all cell types, including immortalized cells and primary cells.
  • the activation or repression domain can be tagged with a detectable agent, such as a fluorescent moiety.
  • the image analysis methods of the present disclosure allow for co-quantification of gene regulator amount and a protein (e.g., H3K4me1, H3K4me2, H3K27ac proteins for activation or KAP1, H3K9me3, H3K27me3 or HP1 proteins for repression) load, which serves as a measure of gene regulator activity.
  • protein load can include number of protein foci or total protein content per nucleus.
  • cytotoxicity induced by administration of gene regulators can be measured by quantifying the fraction of apoptotic nuclei in transfected cells.
  • Gene regulator specificity can be measured by evaluating dose response in cells using the image-based assay of the present disclosure and analyzing for foci comprising markers of gene activation (e.g., H3K4me1, H3K4me2, H3K27ac) or gene repression (e.g., KAP1, H3K9me3, H3K27me3 or HP1).
  • gene regulator specificity can be measured over time, for example 6 hours post-transfection, 12 hours post-transfection, 24 hours post-transfection, 48 hours post-transfection, 0-6 hours post-transfection, 6-12 hours post-transfection, 24-48 hours post-transfection, 48 hours to 5 days after transfection, 5-10 days after transfection, 10-15 days post-transfection, 15-20 days post-transfection, 20-25 days post-transfection, 25-30 days post-transfection, or 6 hours to 30 days post-transfection.
  • visualization of gene regulator activity, via staining for a protein that accumulates in response to gene activation (e.g., H3K4me1, H3K4me2, H3K27ac) or gene repression (e.g., KAP1, H3K9me3, H3K27me3 or HP1), can be further combined with imaging of the target genomic locus of interest using oligonucleotide Nano-FISH probe sets and methods described further below.
  • cells can be transfected with a gene regulator (e.g., TALE-TF, ZFP-TF, CRISPR/dCas9) targeting a genomic locus of interest.
  • Cells can be denatured and labeled with oligonucleotide Nano-FISH probes for the same genomic locus of interest.
  • Recruited protein that accumulates in response to gene activation (e.g., H3K4me1, H3K4me2, H3K27ac) or gene repression (e.g., KAP1, H3K9me3, H3K27me3 or HP1) can be further imaged via staining.
  • Co-localization of protein foci (e.g., H3K4me1, H3K4me2, H3K27ac for activators or KAP1, H3K9me3, H3K27me3 or HP1 for repressors) with signal from oligonucleotide Nano-FISH probes indicates activity of the gene regulator at the target genomic locus of interest.
  • Signal from protein foci that are spatially separated from signal from oligonucleotide Nano-FISH probes indicates off-target gene regulator activity that may not be at the genomic locus of interest.
  • the present disclosure involves imaging of a translocation event, such as chromosome translocation.
  • chromosome translocation can involve the generation of double strand breaks in two non-homologous regions of DNA, which can result in joining of the two non-homologous regions (translocation).
  • a genome editing complex (e.g., TALEN, ZFN, CRISPR/Cas9, megaTAL, meganuclease) can be administered to an immortalized or primary cell.
  • Cells can be stained for p53BP1 with a first detectable agent, subsequently or concurrently contacted with an oligonucleotide Nano-FISH probe set with a second detectable agent to hybridize to a target genomic locus, and contacted with a different oligonucleotide Nano-FISH probe set with a third detectable agent to hybridize to an off-target genomic locus.
  • Samples are imaged and analyzed using the techniques disclosed herein.
  • Foci of p53BP1 can be visualized by signal from the first detectable agent, indicating a double strand break and gene editing with the genome editing complex.
  • Foci of the oligonucleotide Nano-FISH probe set hybridized to a target genomic locus can be visualized by signal from the second detectable agent, indicating the target genomic locus.
  • Foci of the oligonucleotide Nano-FISH probe set hybridized to an off-target genomic locus can be visualized by signal from the third detectable agent, indicating the off-target genomic locus.
  • “hybridization” or “hybridizes” refers to a process in which a region of a nucleic acid strand anneals to and forms a stable duplex, either a homoduplex or a heteroduplex, under normal hybridization conditions with a complementary nucleic acid strand and does not form a stable duplex with unrelated (non-complementary) nucleic acid molecules under the same normal hybridization conditions.
  • the formation of a duplex is accomplished by annealing two complementary nucleic acids under hybridization conditions.
  • the hybridization condition can be made to be highly specific by adjustment of the conditions under which the hybridization reaction takes place, such that two nucleic acid strands will not form a stable duplex, e.g., a duplex that retains a region of double-strandedness under normal stringency conditions, unless the two nucleic acid strands contain a certain number of nucleotides in specific sequences which are substantially or completely complementary.
  • “Normal hybridization or normal stringency conditions” are readily determined for any given hybridization reaction. See, for example, Ausubel et al., Current Protocols in Molecular Biology, John Wiley & Sons, Inc., New York, or Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press.
  • the term “hybridizing” or “hybridization” refers to any process by which a strand of nucleic acid binds with a complementary strand through base pairing.
  • the image-based analysis of protein (e.g., p53BP1) of cellular perturbation (e.g., genome editing with a TALEN, CRISPR/Cas9, or ZFN) and/or Nano-FISH image analysis can be used to identify a lead genome editing complex for the purposes of genetic modification of a cell.
  • genome editing can be performed by fusing a nuclease of the present disclosure with a DNA binding domain for a particular genomic locus of interest.
  • Genetic modification can involve introducing a functional gene for therapeutic purposes, knocking out a gene for therapeutic purposes, or engineering a cell ex vivo (e.g., HSCs or CAR T cells) to be administered back into a subject in need thereof.
  • the genome editing complex can have a target site within a gene such as PDCD1, CTLA4, LAG3, TET2, BTLA, HAVCR2, CCR5, CXCR4, TRA, TRB, B2M, albumin, HBB, HBA1, TTR, NR3C1, CD52, erythroid specific enhancer of the BCL11A gene, CBLB, TGFBR1, SERPINA1, HBV genomic DNA in infected cells, CEP290, DMD, CFTR,
  • A“gene,” for the purposes of the present disclosure, includes a DNA region encoding a gene product, as well as all DNA regions which regulate the production of the gene product, whether or not such regulatory sequences are adjacent to coding and/or transcribed sequences. Accordingly, a gene includes, but is not necessarily limited to, promoter sequences, terminators, translational regulatory sequences such as ribosome binding sites and internal ribosome entry sites, enhancers, silencers, insulators, boundary elements, replication origins, matrix attachment sites and locus control region.
  • a genome editing complex can cleave double stranded DNA at a target site in order to insert a chimeric antigen receptor (CAR), alpha-L iduronidase (IDUA), iduronate-2-sulfatase (IDS), or Factor 9 (F9).
  • Cells, such as hematopoietic stem cells (HSCs) and T cells, can be engineered ex vivo with the genome editing complex.
  • genome editing complexes can be directly administered to a subject in need thereof.
  • Image-based analysis of protein (e.g., p53BP1) of said genome editing complexes can enable the development of highly specific genome editing complexes with less than 10 off-target double strand breaks, less than 5 off-target double strand breaks, less than 4 off-target double strand breaks, less than 3 off-target double strand breaks, less than 2 off-target double strand breaks, less than 1 off-target double strand breaks, or no off-target double strand breaks.
  • the subject receiving treatment can be suffering from a disease such as transthyretin amyloidosis (ATTR), HIV, glioblastoma multiforme, cancer, acute lymphoblastic leukemia, acute myeloid leukemia, beta-thalassemia, sickle cell disease, MPSI, MPSII, Hemophilia B, multiple myeloma, melanoma, sarcoma, Leber congenital amaurosis (LCA10), CD19 malignancies, BCMA-related malignancies, Duchenne muscular dystrophy (DMD), cystic fibrosis, alpha-1 antitrypsin deficiency, X-linked severe combined immunodeficiency (X-SCID), or Hepatitis B.
  • a Nano-FISH probe set can be designed for any genomic locus of interest described herein (e.g., PDCD1, CTLA4, LAG3, TET2, BTLA, HAVCR2, CCR5, CXCR4, TRA, TRB, B2M, albumin, HBB, HBA1, TTR, NR3C1, CD52, erythroid specific enhancer of the BCL11A gene, CBLB, TGFBR1, SERPINA1, HBV genomic DNA in infected cells, CEP290, DMD, CFTR, IL2RG, CS-1, or any combination thereof) to be used in combination with image-based analysis of protein (e.g., p53BP1) of cellular perturbation.
  • compositions and methods for image-based analysis of a surrogate marker for a cellular response induced by a cellular perturbation can be further combined with Nano-FISH.
  • Oligonucleotide Nano-FISH probe sets can be used to visualize a target genomic locus of interest.
  • the activity of a genome editing complex (e.g., a TALEN, CRISPR/Cas9, ZFN) or a gene regulator (e.g., a TALE-TF, ZFP-TF, CRISPR/dCas9), or a translocation event, can be visualized by combination imaging with Nano-FISH.
  • Compositions and methods for Nano-FISH are described in further detail below.
  • Described herein are methods of detecting a cellular regulatory element in situ utilizing a super-resolution microscopy technique to determine the presence, absence, and/or activity of a regulatory element. Also described herein are methods of detecting different types of regulatory elements simultaneously utilizing a heterogeneous set of detection agents, and translating the molecular information from the different types of regulatory elements to determine the activity state of a cell.
  • the activity state of a cell may correlate to a
  • Described herein are methods of determining the localization of a regulatory element and measuring the activity of a regulatory element. The methods provided herein may avoid the introduction of artifacts, such as biological stressors and perturbations, or the destruction of cellular architecture.
  • One or more methods described herein may detect different types of regulatory elements, distinguish between different types of regulatory elements, and/or generate a map of a regulatory element (e.g., chromatin).
  • a regulatory element may be labeled by one or more different types of detection agents.
  • the one or more different types of detection agents may include DNA detection agents, RNA detection agents, protein detection agents, or combinations thereof.
  • the detection agent may comprise a probe portion, which may interact (e.g., hybridize) to a target site within the regulatory element, and optionally comprise a detectable moiety.
  • the detectable moiety may include a fluorophore, such as a fluorescent dye or a quantum dot.
  • the detection agent may be an unlabeled probe which can be further conjugated to an additional labeled probe.
  • the regulatory element may be detected by a stochastic or deterministic super-resolution microscopy method.
  • the stochastic super-resolution microscopy method may be a synthetic aperture optics (SAO) method.
  • the SAO method may generate a detection profile, which can encompass fluorescent signal intensity, size, shape, or localization of the detection agent. Based on the detection profile, the activity state, the localization, expression level, and/or interaction state of the regulatory element may be determined.
  • a map based on the detection profile of the regulatory element may also be generated, and may be correlated to cell type identification (e.g., cancerous cell identification).
  • the regulatory element may be further analyzed in the presence of an exogenous agent or condition, such as a small molecule fragment or a drug, or under an environment such as a change in temperature, pH, nutrient, or a combination thereof.
  • the perturbation of the activity state of the regulatory element in the presence of the exogenous agent or condition may be measured.
  • a report may further be generated and provided to a user, such as a laboratory clinician or health care provider.
  • Nano-FISH fluorescence in situ hybridization methodology
  • Nano-FISH can utilize defined pools or sets of synthetic fluorescent dye-labeled oligonucleotides (probe pools or probe sets) to reliably detect small genomic regions in large numbers of adherent or suspension cells in situ.
  • Nano-FISH can be conducted utilizing conventional wide-field microscopic imaging. In other embodiments, Nano-FISH can be conducted using super-resolution imaging techniques.
  • Nano-FISH can be coupled with an automated image informatics pipeline to enable high-throughput detection and 2D and/or 3D spatial localization of small genomic DNA elements in situ in hundreds of, thousands of or more individual cells per experiment.
  • a scalable image analysis software suite can reliably identify and
  • Nano-FISH can allow detection of the precise localization of specific regulatory genomic elements in 2D or 3D nuclear space, the identification of small-scale structural genomic variations (such as sequence gains or losses), the quantitation of spatial interactions between regulatory elements and their putative target gene(s), or the detection of genomic conformational changes that induce stimulus-dependent gene expression.
  • Nano-FISH can allow the visualization of the precise localization of a target nucleic acid sequence.
  • the target nucleic acid sequence can be an endogenous nucleic acid sequence, a nucleic acid sequence derived from an exogenous source, or a combination thereof.
  • An exogenous nucleic acid sequence can be introduced into a first cell and can be further detected in progeny of the first cell.
  • An exogenous target nucleic acid sequence can be introduced to a cell through electroporation, lipofection, transfection, microinjection, viral transduction, or a gene gun.
  • vector systems that can be used to introduce a target nucleic acid sequence into a cell may include viral vector, episomal vector, naked RNA (recombinant or natural), naked DNA (recombinant or natural), bacterial artificial chromosome (BAC), and RNA/DNA hybrid systems used separately or in combination.
  • Vector systems can be used without additional reagents meant to aid in the incorporation and/or expression of desired mutations.
  • a non-limiting list of reagents meant to aid in the incorporation and/or expression of desired mutations can include Lipofectamine, FuGENE, FuGENE HD, calcium phosphate, HeLaMONSTER, Xtreme Gene.
  • Nano-FISH can allow the detection of the precise localization of exogenous nucleic acids inserted or integrated into a genome.
  • Nano-FISH can allow the detection of the precise localization of exogenous DNA inserted into a genome, as may be inserted by a genetic engineering technique or by viral infection or transduction.
  • Nano-FISH can allow the detection of an episomal nucleic acid sequence.
  • the systems and methods described herein can be useful in detecting or determining the presence, absence, identity, or quantity of a target nucleic acid sequence in a sample.
  • the methods, compositions, and systems described herein can be used to efficiently detect, identify, and quantify a target nucleic acid sequence that is a short nucleic acid sequence.
  • a short nucleic acid sequence that can be detected or quantified using the disclosures of the present application may be from 15 nucleotides in length to about 12 kb in length.
  • a short nucleic acid sequence can be less than 1 kb.
  • Methods for the detection, identification, and/or quantification of a short nucleic acid sequence of a sample can comprise contacting the short nucleic acid sequence with a probe comprising a detectable label and determining the presence, absence, or quantity of probes bound to the target nucleic acid sequence. Determination of the sequence position of the short nucleic acid sequence relative to other nucleotides or another short nucleic acid sequence (for instance, using a second probe capable of binding to a second target sequence of the nucleic acid) can be a step in the methods described herein. The methods described herein can also comprise determining the spatial position of the short nucleic acid sequence.
  • Nano-FISH can be used to measure the normalized inter-spot distance between a first short nucleic acid sequence encoding an enhancer or portion thereof and a second nucleic acid encoding a promoter of a gene or portion thereof which can be used to study changes in genome conformation that may be associated with gene function.
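A minimal sketch of such a measurement follows. The disclosure does not state what the inter-spot distance is normalized to, so dividing by an estimated nuclear radius is purely an assumption made for illustration, and the function and parameter names are hypothetical.

```python
import math


def normalized_inter_spot_distance(spot_a, spot_b, nuclear_radius):
    """Euclidean distance between two spot centroids divided by nuclear radius.

    spot_a, spot_b: centroids such as (x, y, z) for, e.g., enhancer and
                    promoter probe signals in the same nucleus.
    nuclear_radius: hypothetical normalizer, in the same unit as the centroids.
    """
    if nuclear_radius <= 0:
        raise ValueError("nuclear_radius must be positive")
    return math.dist(spot_a, spot_b) / nuclear_radius
```

Normalizing by a per-nucleus size estimate is one way to make distances comparable between cells of different sizes. Other normalizers, such as the distance between two control loci, would slot into the same function.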
  • the methods described herein can comprise comparing the presence, absence, spatial position, sequence position, or quantity of a short nucleic acid sequence of a sample to a reference value.
  • a non-limiting example of quantifying detection of a short nucleic acid sequence in a cell can comprise quantifying the number of copies of a nucleic acid sequence that has been incorporated into a modified cell (for example, a cell modified by the introduction of a nucleic acid sequence into the cell by genetic editing), which can be used as quality control for modified cells produced by cell engineering strategies.
  • identification, and quantification made possible by the methods, compositions, and systems of the present disclosure can enable the detection of viral nucleic acid sequences, which commonly range from about 1 kb in length to about 10 kb in length.
  • Also described herein are methods, compositions, and systems useful in characterizing and/or quantifying the presence, absence, position, or identity of a target nucleic acid sequence in a cell or sample derived therefrom relative to a reference nucleic acid sequence in the same cell or sample or relative to a control cell or sample.
  • improvements to the efficiency of detection and to a detection threshold can allow for the detection and characterization of short nucleic acid sequences (for instance, non-repeating nucleic acid sequence insertions) during analysis or validation of cell samples or cell lines.
  • a target nucleic acid sequence can be associated with the expression of a target protein.
  • a detectable label may be used to detect a target protein expression, which therefore can allow for the correlation between the presence, absence, or quantity of the target nucleic acid sequence and the expression of the target protein.
  • Nano-FISH methods as described herein can be used as a diagnostic for the detection, identification, and/or quantification of a short nucleic acid sequence of a sample.
  • Nano-FISH can be used as a diagnostic for HIV by detecting HIV nucleic acid sequences in a sample.
  • the Nano-FISH methods as described herein can be used with therapeutics by detecting, identifying, and/or quantifying a short nucleic acid sequence of a sample.
  • Nano-FISH can be used with therapeutics in which a short nucleic acid sequence is integrated into a cell’s DNA (e.g., chimeric antigen receptor T cell therapeutics) to detect, identify, and/or quantify the short nucleic acid sequence integration.
  • flow cell sorting or being laborious (single-cell cloning).
  • Nano-FISH is a significantly improved and distinct tool from
  • Nano-FISH probe sets of the present disclosure can be comprised of one or more short oligonucleotide Nano-FISH probes designed against a target, allowing for complete control over probe size.
  • one or more oligonucleotide Nano-FISH probes of exact size can be designed against a transfer plasmid backbone.
  • the oligonucleotide Nano-FISH probes of the present disclosure can be from 30 to 60 nucleotides in length.
  • the oligonucleotide Nano-FISH probes of the present disclosure can be 40 nucleotides in length.
  • conventional FISH techniques require the use of fosmids (varying in size from 40-50 kilobases), BACs (varying in size from 100-250 kilobases), or plasmids (varying in size from 5-10 kilobases), which are conventionally nick translated to incorporate hapten or fluorescently labeled dUTP (or other nucleotide).
  • the result of nick translating fosmids, BACs, and/or plasmids to obtain conventional FISH probes is the generation of a highly heterogeneous pool of probes of varying sizes.
  • Conventional FISH probes average around 500 nucleotides in length but exhibit a size distribution from 100 bases to anywhere around 1.5 kilobases, which is up to 50 times larger than an
  • oligonucleotide Nano-FISH probe can be generated by means of PCR with the incorporation of labeled nucleotides during the reaction.
  • oligonucleotide Nano-FISH probes of this disclosure there is poor control over the resulting probe size of nick translated conventional FISH probes made from fosmids, BACs, or plasmids.
  • the Nano-FISH probes of the present disclosure are precisely controlled to introduce an exact number of fluorescent dye molecules per probe.
  • each oligonucleotide Nano-FISH probe of the present disclosure can have exactly one detectable agent at the 3’ end.
  • the detectable agent can be any dye molecule, such as a Quasar Dye (e.g., Q570 and Q670).
  • Oligonucleotide Nano-FISH probes of the present disclosure may be synthesized from the 3’ to 5’ end, and the fluorophore may be included on the first nucleotide at the 3’ end.
  • an oligonucleotide Nano-FISH probe of the present disclosure can have 2 fluorescent dye molecules.
  • a Nano-FISH oligonucleotide probe of the present disclosure with a size of 55 to 60 nucleotides can have 2 fluorescent dye molecules.
  • the second dye molecule may be placed on an internal nucleotide or at the 5’ end.
  • the oligonucleotide Nano-FISH probes of the present disclosure directly incorporate a fluorophore at the 3’ end of each probe, the present disclosure provides a probe set that can be directly labeled and, thus, offers direct labeling and detection of a target nucleotide sequence without any need for signal
  • the Nano-FISH probes of the present disclosure are designed to precisely target a desired strand of a target (e.g., the Watson strand, the Crick strand, or both strands).
  • the oligonucleotide Nano-FISH probes of the present disclosure can be designed to overlap by at least 5 base pairs.
  • a first oligonucleotide Nano-FISH probe can be designed to target the Watson strand of a target sequence and a second oligonucleotide Nano-FISH probe can be designed to target an adjacent region on the Crick strand of a target sequence.
  • the first and second probe can overlap by at least 5 nucleotides, can be directly adjacent to each other, or can be spaced apart by at least several nucleotides.
  • the first and second probe can overlap by 5-20 nucleotides.
  • Overlapping probes on the plus and minus strands can allow for the design and hybridization of larger probe sets to target smaller nucleic acid sequences.
  • the oligonucleotide Nano-FISH probes of the present disclosure are designed and selected according to certain criteria in order to precisely target and detect an exogenous sequence (e.g., a viral nucleic acid sequence), while minimizing off-target binding that would increase the background noise during imaging.
  • a target can be selected and the hg38 coordinates can be determined.
  • a tiling density can be selected from all on one strand, a fixed 2 base pair spacing between adjacent oligonucleotide Nano-FISH probes, or a spacing of 30 base pairs on each DNA strand with a 5 base pair overlap between the top and bottom strands at each end.
  • oligonucleotide Nano-FISH probes of the present disclosure are tiled across a target to avoid steric hindrance between molecules.
  • oligonucleotide Nano-FISH probe sequences are tiled across regions of interest, such as the human genome or the human genome with an artificial extra chromosome representing the target (e.g., the CAR transfer plasmid).
  • a program can be used to tile oligonucleotide Nano-FISH probes across the region of interest.
  • a 40 base pair probe pool can be generated by tiling 40 base pair oligonucleotide probes at a predetermined spacing between oligonucleotides across a target sequence.
  • the tiled 40 base pair probe pool can be designed to provide a minimum spacing of 2 base pairs between each consecutive oligonucleotide Nano-FISH probe.
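The tiling step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the program used in the disclosure: probes are taken from a single strand, the function name `tile_probes` and its parameters are hypothetical, and the returned windows are target subsequences (an actual probe would be the complementary sequence).

```python
def tile_probes(target, probe_len=40, spacing=2):
    """Return (start, window) pairs for probe windows tiled along one strand.

    Consecutive windows are separated by `spacing` uncovered bases, so the
    step between window starts is probe_len + spacing.
    """
    if probe_len <= 0 or spacing < 0:
        raise ValueError("probe_len must be positive and spacing non-negative")
    step = probe_len + spacing
    windows = []
    # Stop once a full-length window no longer fits inside the target.
    for start in range(0, len(target) - probe_len + 1, step):
        windows.append((start, target[start:start + probe_len]))
    return windows
```

With the defaults, a 100-nucleotide target yields two 40-nucleotide windows starting at positions 0 and 42, leaving a 2-base gap between them.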
  • Each oligonucleotide Nano-FISH probe in the resulting probe pool can be compared to a 16-mer database of genomic sequences to identify partial matches of probes to genomic sequences that can result in off-target background staining, which would negatively affect the signal-to-noise ratio.
  • An oligonucleotide Nano-FISH probe that comprises a total of 24 matches or less to the 16-mer database may be considered to be unique in the human genome and, thus, can be selected to move forward.
  • a probe with more than 300 matches to the 16-mer database of genomic sequences can be discarded from consideration as it generates too many non-target hits.
  • the number of matches that an oligonucleotide Nano-FISH probe can have to the 16-mer database of genomic sequences may depend on the size of the probe. For example, a 30 base pair long oligonucleotide Nano-FISH probe that exhibits a total of 14 matches or less to the 16-mer database may be considered to be unique in the human genome and, thus, may be selected to move forward.
  • a 50 base pair long oligonucleotide Nano-FISH probe that exhibits a total of 34 matches or less to the 16-mer database may be considered to be unique in the human genome and, thus, may be selected to move forward.
  • a 60 base pair long oligonucleotide Nano-FISH probe that exhibits a total of 44 matches or less to the 16-mer database may be considered to be unique in the human genome and, thus, may be selected to move forward.
  • an oligonucleotide Nano-FISH probe of the present disclosure between 30 and 60 base pairs in length may exhibit 14 to 44 matches or less to the 16-mer database and be considered unique in the human genome.
  • Oligonucleotide Nano-FISH probes of the present disclosure have less than 300 matches to the 16-mer database of genomic sequences. Pools of at least 30 oligonucleotide Nano-FISH probes that satisfy all design criteria can be selected to carry forward. Additional selection criteria that can be applied when selecting the oligonucleotide Nano-FISH probes of the present disclosure include percent GC content. For example, oligonucleotide Nano-FISH probes can have a percent GC content of at least 25%.
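The 16-mer and GC-content screens described above might be sketched as below. The per-length limits (14, 24, 34, and 44 matches for 30, 40, 50, and 60 nucleotide probes) and the 25% GC floor come from the text; how matches are counted is not specified, so this sketch assumes that a probe's match total is the sum of database occurrence counts over its constituent 16-mers. The dictionary `kmer_counts` and all function names are hypothetical.

```python
K = 16
# Uniqueness limits by probe length, as listed in the text.
MAX_UNIQUE_MATCHES = {30: 14, 40: 24, 50: 34, 60: 44}
MIN_GC_PERCENT = 25.0

def kmer_matches(probe, kmer_counts, k=K):
    """Sum database occurrence counts over every k-mer contained in the probe."""
    return sum(kmer_counts.get(probe[i:i + k], 0)
               for i in range(len(probe) - k + 1))

def gc_percent(probe):
    """Percentage of G and C bases in the probe."""
    return 100.0 * sum(base in "GC" for base in probe.upper()) / len(probe)

def passes_screen(probe, kmer_counts):
    """True if the probe meets both the 16-mer limit and the GC floor."""
    limit = MAX_UNIQUE_MATCHES.get(len(probe))
    if limit is None:
        raise ValueError("no match limit is listed for this probe length")
    return (kmer_matches(probe, kmer_counts) <= limit
            and gc_percent(probe) >= MIN_GC_PERCENT)
```

In practice the database would be built from the reference genome; a plain dictionary stands in for it here so the logic can be followed end to end.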
  • oligonucleotide Nano-FISH probes of the present disclosure are selected for use if they have less than 5 hits, less than 4 hits, less than 3 hits, less than 2 hits, or less than 1 hit of at least a 50% contiguous homology elsewhere in the human genome (e.g., by a BLAT search of each oligo against the genome).
  • a BLAT search of each oligo against the genome may result in larger stretches of homology.
  • a probe that exhibits less than 50% ( ⁇ 20 bases) homology may be considered to be unique and, thus, may be selected to move forward.
  • the probe set can be designed to have a limited number of oligonucleotide Nano-FISH probes, such as 25-35 probes, that can be closely spaced.
  • the probe set can be designed to include from 100-150 probes.
  • oligonucleotide Nano-FISH probes of the present disclosure may be selected to not include a repetitive element.
  • a repetitive element may be short interspersed nuclear elements (SINE) including ALUs, long interspersed nuclear elements (LINE), long terminal repeat elements (LTR) including retroposons, DNA repeat elements, simple repeats (microsatellites), low complexity repeats, satellite repeats, RNA repeats such as RNA, tRNA, rRNA, snRNA, scRNA, or srpRNA, or other repeats such as the class rolling circle (RC).
  • a probe set is referred to herein as a “probe pool” or a “plurality of probes.”
  • an oligonucleotide Nano-FISH probe set can comprise from 20-200 oligonucleotide probes.
  • the probe set can comprise 20-200 oligonucleotide Nano-FISH probes.
  • the above described properties of the Nano-FISH probes of the present disclosure can lead to increased precision in detecting a target sequence, especially detection of small target sequences that are less than 5 kilobases, and lower background signals stemming from off target probe-DNA interactions, as compared to conventional FISH probes.
  • the Nano-FISH probes of the present disclosure can yield a better or higher signal-to-noise ratio than conventional FISH probes.
  • 9 oligonucleotide Nano-FISH probes of the present disclosure may be used to visualize insertions of an exogenous nucleic acid sequence in the nucleus at a signal-to-noise ratio of about 1.2-1.5 to 1.
  • 15 oligonucleotide Nano-FISH probes of the present disclosure may be used to visualize insertions of an exogenous nucleic acid sequence in the nucleus at a signal-to-noise ratio of about 1.5:1.
  • 30 oligonucleotide Nano-FISH probes of the present disclosure may be used to visualize insertions of an exogenous nucleic acid sequence in the nucleus at a signal-to-noise ratio of about 4-8 to 1.
  • 60 oligonucleotide Nano-FISH probes of the present disclosure may be used to visualize insertions of an exogenous nucleic acid sequence in the nucleus at a signal-to-noise ratio of about 5-10:1.
  • 90 oligonucleotide Nano-FISH probes of the present disclosure may result in at least one detected allele (in a triploid cell background) in about 98% of cells.
  • 60 oligonucleotide Nano-FISH probes of the present disclosure may result in at least one detected allele (in a triploid cell background) in about 92% of cells.
  • 30 oligonucleotide Nano-FISH probes of the present disclosure may result in at least one detected allele (in a triploid cell background) in about 89% of cells.
  • 15 oligonucleotide Nano-FISH probes of the present disclosure may result in at least one detected allele (in a triploid cell background) in about 34% of cells.
  • the target exogenous nucleic acid sequence does not need to be amplified prior to detection.
  • the exogenous nucleic acid sequences of the present disclosure are non-amplified exogenous nucleic acid sequences.
  • the signal from the oligonucleotide Nano-FISH probes of the present disclosure does not need to be amplified prior to detection.
  • the Nano-FISH methods of the present disclosure provide methods of non-signal amplified detection.
  • the Nano-FISH methods of the present disclosure provide methods of direct, non-amplified signal detection.
  • compositions and methods provided herein can also comprise a plurality of probe sets, wherein each probe set can contain any number of oligonucleotide Nano-FISH probes described above.
  • oligonucleotide Nano-FISH probes may all be labeled with the same fluorophore.
  • Each probe set in the plurality of probe sets may be labeled with different fluorophores.
  • Each probe set in the plurality of probe sets may further comprise oligonucleotide Nano-FISH probes for the detection of unique target sequences (e.g., exogenous or viral nucleic acid sequences).
  • a plurality of probe sets can be used to detect multiple target sequences simultaneously, with each target sequence being labeled with a unique fluorophore.
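The multiplexing arrangement described above, in which every probe within a set shares one fluorophore and each set uses a different one, can be sketched as a simple assignment. The function name and the target labels in the usage note are hypothetical; the dye names Q570 and Q670 are the Quasar dyes mentioned earlier in the text.

```python
def assign_fluorophores(targets, fluorophores):
    """Map each target sequence to its own fluorophore.

    Raises ValueError if there are not enough distinct fluorophores to give
    every target a unique label.
    """
    distinct = list(dict.fromkeys(fluorophores))  # drop repeats, keep order
    if len(targets) > len(distinct):
        raise ValueError("not enough distinct fluorophores for the targets")
    return dict(zip(targets, distinct))
```

For example, `assign_fluorophores(["target_1", "target_2"], ["Q570", "Q670"])` pairs the first target with Q570 and the second with Q670.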
  • a regulatory element may be DNA, RNA, a polypeptide, or a combination thereof.
  • a regulatory element may be DNA.
  • a regulatory element may be RNA.
  • a regulatory element may be a polypeptide.
  • a regulatory element may be any combination of DNA, RNA, and/or polypeptide (e.g., protein-protein complexes, protein-DNA/RNA complexes, and the like).
  • a regulatory element may be DNA.
  • a regulatory element may be a single-stranded DNA regulatory element, a double-stranded DNA regulatory element, or a combination thereof.
  • the DNA regulatory element may be single-stranded.
  • the DNA regulatory element may be double-stranded.
  • the DNA regulatory element may encompass a DNA fragment.
  • the DNA regulatory element may encompass a gene.
  • the DNA regulatory element may encompass a chromosome.
  • the DNA regulatory element may include endogenous DNA regulatory elements (e.g., endogenous genes).
  • the DNA regulatory element may include artificial DNA regulatory elements (e.g., foreign genes introduced into a cell).
  • a regulatory element may be RNA.
  • a regulatory element may be a single-stranded RNA regulatory element, a double-stranded RNA regulatory element, or a combination thereof.
  • the RNA regulatory element may be single-stranded.
  • the RNA regulatory element may be double-stranded.
  • the RNA regulatory element may include endogenous RNA regulatory elements.
  • the RNA regulatory element may include artificial RNA regulatory elements.
  • the RNA regulatory element may include microRNA (miRNA), transfer RNA (tRNA), ribosomal RNA (rRNA), messenger RNA (mRNA), pre-mRNA, transfer-messenger RNA (tmRNA), heterogeneous nuclear RNA (hnRNA), short interfering RNA (siRNA), or short hairpin RNA (shRNA).
  • An RNA regulatory element may be an enhancer RNA (eRNA).
  • An enhancer RNA may be a non-coding RNA molecule transcribed from an enhancer region of a DNA molecule, and may be from about 50 base pairs (bp) in length to about 3 kilobase pairs in length.
  • An enhancer RNA may be a 1D eRNA or an eRNA that may be unidirectionally transcribed.
  • An enhancer RNA may also be a 2D eRNA or an eRNA that may be bidirectionally transcribed.
  • a regulatory element may be a DNase I hypersensitive site (DHS).
  • DHS may be a region of chromatin unoccupied by transcription factors and which is sensitive to cleavage by the DNase I enzyme.
  • the presence of DHS regions within a chromatin may demarcate transcription factor occupancy at a nucleotide resolution.
  • the presence of DHS regions may further correlate with activation of cis-regulatory elements, such as an enhancer, promoter, silencer, insulator, or locus control region.
  • DHS variation may be correlated to variation in gene expression in healthy or diseased cells (e.g., cancerous cells) and/or correlated to phenotypic traits.
  • a DHS pattern may encode memory of prior cell fate decisions and exposures. For example, upon differentiation, a DHS pattern of a progeny may encode transcription factor occupancy of its parent. Further, a DHS pattern of a cell may encode an environmentally-induced transcription factor occupancy from an earlier time point.
  • a DHS pattern may encode cellular maturity.
  • An embryonic stem cell may encode a set of DHSs that may be transmitted combinatorially to a differentiated progeny, and this set of DHSs may be decreased with each cycle of differentiation. As such, the set of DHSs may be correlated with time, thereby allowing a DHS pattern to be correlated with cellular maturity.
  • a DHS pattern may also encode splicing patterns. Protein coding exons may be occupied by transcription factors, which may further be correlated with codon usage patterns and amino acid choice on evolutionary time scales and human fitness. Transcription factor occupancy may further modulate alternative splicing patterns, for example, by imposing sequence constraints at a splice junction. As such, a DHS pattern may encode transcription factor occupancy of one or more exons of interest and may provide additional information on alternative splicing patterns.
  • a DHS pattern may encode a cell type. For example, within each cell type, about 100,000 to about 250,000 DHSs may be detected. About 5% of the detected DHSs may be located within a transcription start site and the remaining DHSs may be detected at a distal site from the transcription start site. Each cell type may contain a distinct DHS pattern at the distal site and mapping the DHS pattern at the distal site may allow identification of a cell type. An overlap may further be present within two DHS patterns from two different cell types, for example, an overlap of a set of detected DHSs within the two DHS patterns. An overlap may be less than about 70 of the detected DHSs. The presence of an overlap may not affect the identification of a cell type.
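The comparison of DHS patterns between two cell types can be sketched with plain set operations. This is an illustration only: DHSs are represented by arbitrary hashable identifiers (an assumption), and the function name and returned keys are hypothetical.

```python
def compare_dhs_patterns(pattern_a, pattern_b):
    """Split two DHS patterns into shared and cell-type-specific sites."""
    a, b = set(pattern_a), set(pattern_b)
    shared = a & b
    return {
        "shared": shared,
        "only_a": a - b,
        "only_b": b - a,
        # Fraction of pattern A's sites that also appear in pattern B.
        "shared_fraction_of_a": len(shared) / len(a) if a else 0.0,
    }
```

The sites in `only_a` and `only_b` are the ones that distinguish the two patterns, which is consistent with the statement that an overlap need not prevent identification of a cell type.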
  • a regulatory element may be a polypeptide.
  • the polypeptide may be a protein or a polypeptide fragment.
  • a regulatory element may be a transcription factor, DNA-binding protein or functional fragment, RNA-binding protein or functional fragment, protein involved in chemical modification (e.g., involved in histone modification), or gene product.
  • a regulatory element may be a transcription factor.
  • a regulatory element may be a DNA or RNA-binding protein or functional fragment.
  • a regulatory element may be a product of a gene transcript.
  • a regulatory element may be a chromatin.
  • Described herein is a method of detecting a regulatory element.
  • the detection may encompass identification of the regulatory element, determining the presence or absence of the regulatory element, and/or determining the activity of the regulatory element.
  • a method of detecting a regulatory element may include contacting a cell sample with a detection agent, binding the detection agent to the regulatory element, and analyzing a detection profile from the detection agent to determine the presence, absence, or activity of the regulatory element.
  • the method may involve utilizing one or more intrinsic properties associated with a detection agent to aid in detection of the regulatory element.
  • the intrinsic properties may encompass the size of the detection agent, the intensity of the signal, and the location of the detection agent.
  • the size of the detection agent, which may include the length of the probe and/or the size of the detectable moiety (e.g., the size of a fluorescent dye molecule), may modulate the specificity of interaction with a regulatory element.
  • the intensity of the signal from the detection agent may correlate to the sensitivity of detection.
  • a detection agent with a molar extinction coefficient of about 0.5-5 x 10^6 M^-1 cm^-1 may have a higher intensity signal relative to a detection agent with a molar extinction coefficient outside of the 0.5-5 x 10^6 M^-1 cm^-1 range and may have lower attenuation due to scattering and absorption.
  • a detection agent with a longer excited state lifetime and a large Stokes shift may further improve the sensitivity of detection.
  • the location of the detection agent may, for example, provide the activity state of a regulatory element.
  • a combination of intrinsic properties of the detection agent may be used to detect a regulatory element of interest.
  • a detection agent may comprise a detectable moiety that is capable of generating a light, and a probe portion that is capable of hybridizing to a target site on a regulatory element.
  • a detection agent may include a DNA probe portion, an RNA probe portion, a polypeptide probe portion, or a combination thereof.
  • a DNA or RNA probe portion may be between 10 and about 100 nucleotides in length.
  • a DNA or RNA probe portion may be 10 to 100, or more nucleotides in length.
  • a DNA or RNA probe portion may be a TALEN probe, ZFN probe, or a CRISPR probe.
  • a DNA or RNA probe portion may be a padlock probe.
  • a polypeptide probe may comprise a DNA-binding protein, an RNA-binding protein, a protein involved in the transcription/translation process, a protein that detects the transcription/translation process, a protein that may detect an open or relaxed portion of a chromatin, or a protein interacting partner of a product of a regulatory element (e.g., an antibody or binding fragment thereof).
  • a detection agent may comprise a DNA or RNA probe portion which may be between about 10 and about 100 nucleotides in length.
  • a detection agent may comprise a DNA or RNA probe portion which may be about 10 to 100, or more nucleotides in length.
  • a set of detection agents may be used to detect a regulatory element.
  • the set of detection agents used for detection of a regulatory element may comprise 2 to 20, or more, detection agents.
  • a detection agent may comprise a polypeptide probe selected from a DNA-binding protein, an RNA-binding protein, a protein involved in the transcription/translation process, a protein that detects the transcription/translation process, a protein that may detect an open or relaxed portion of a chromatin, or a protein interacting partner of a product of a regulatory element (e.g., an antibody or binding fragment thereof).
  • a detectable moiety that is capable of generating a light may be directly conjugated or bound to a probe portion.
  • a detectable moiety may be indirectly conjugated or bound to a probe portion by a conjugating moiety.
  • a detectable moiety may be a small molecule (e.g., a dye) which may be directly conjugated or bound to a probe portion.
  • a detectable moiety may be a fluorescently labeled protein or molecule which may be attached to a conjugating moiety (e.g., a hapten group, an azido group, an alkyne group) of a probe.
  • a profile or a detection profile or signature may include the signal intensity, signal location, or size of the signal of the detection agent.
  • the profile or the detection profile may comprise about 100 image frames to 50,000 frames, or more frames.
  • Analysis of the profile or the detection profile may determine the activity of the regulatory element. The degree of activation may also be determined from the analysis of the profile or detection profile.
  • Analysis of the profile or the detection profile may further determine the optical isolation and localization of the detection agents, which may correlate to the localization of the regulatory element.
  • a detection agent may comprise a polypeptide probe selected from a DNA-binding protein, an RNA-binding protein, a protein that is involved in or detects the transcription/translation process, a protein that may detect an open or relaxed portion of a chromatin, or a protein interacting partner of a product of a regulatory element (e.g., an antibody or binding fragment thereof).
  • a detectable moiety that is capable of generating a light is directly conjugated or bound to a probe portion.
  • a detectable moiety is indirectly conjugated or bound to a probe portion by a conjugating moiety.
  • a detectable moiety may be a small molecule (e.g., a dye) which may be directly conjugated or bound to a probe portion.
  • a detectable moiety may be a fluorescently labeled protein or molecule which may be attached to a conjugating moiety (e.g., a hapten group, an azido group, an alkyne group) of a probe.
  • a profile or a detection profile or signature may include the signal intensity, signal location, or size of the signal of the detection agent.
  • the profile or the detection profile may comprise about 100 image frames to 50,000 image frames, or more.
  • Analysis of the profile or the detection profile may determine the activity of the regulatory element. In some cases, the degree of activation may also be determined from the analysis of the profile or detection profile. In additional cases, analysis of the profile or the detection profile may further determine the optical isolation and localization of the detection agents, which may correlate to the localization of the regulatory element.
  • a regulatory element may be DNA. Described herein is a method of detecting a DNA regulatory element, which may include contacting a cell sample with a detection agent, binding the detection agent to the DNA regulatory element, and analyzing a profile from the detection agent to determine the presence, absence, or activity of the DNA regulatory element.
  • a regulatory element may be RNA. Described herein is a method of detecting an RNA regulatory element, which may include contacting a cell sample with a detection agent, binding the detection agent to the RNA regulatory element, and analyzing a profile from the detection agent to determine the presence, absence, or activity of the RNA regulatory element.
  • a regulatory element may be an enhancer RNA (eRNA).
  • the presence of an eRNA may correlate to an activated regulatory element.
  • the production of an eRNA may correlate to the transcription of a target gene.
  • the detection of an eRNA element may indicate that a target gene downstream of the eRNA element may be activated.
  • a method of detecting an eRNA regulatory element may include contacting a cell sample with a detection agent, binding the detection agent to the eRNA regulatory element, and analyzing a profile from the detection agent to determine the presence, absence, or activity of the eRNA regulatory element.
  • Described herein is an in situ method of detecting an activated regulatory DNA site, which may include incubating a sample with a set of detection agents (e.g., fluorescently-labeled probes), hybridizing the set of detection agents to at least one enhancer RNA (eRNA), and analyzing a profile (e.g., a fluorescent profile) from the set of detection agents to determine the presence of an eRNA, in which the presence of eRNA correlates to an activated regulatory DNA site.
  • a regulatory element may be a DNase I hypersensitive site (DHS).
  • a DNase I hypersensitive site may be an inactivated DNase I hypersensitive site.
  • a DNase I hypersensitive site may be an activated DNase I hypersensitive site.
  • a method of detecting a DHS may include contacting a cell sample with a detection agent, binding the detection agent to the DHS, and analyzing a profile from the detection agent to determine the presence, absence, or activity of the DHS.
  • the DHS may be an active DHS and may further contain a single stranded DNA region.
  • the single stranded DNA region may be detected by S1 nuclease.
  • a method of detecting a DHS may further be extended to detect the presence of a single stranded DNA region within a DHS. Such a method, for example, may comprise contacting a cell sample with a detection agent, binding the detection agent to a single stranded region of a DHS, and analyzing a profile from the detection agent to determine the presence or absence of the single stranded region within a DHS.
  • Also described herein is a method of determining the activity level of a regulatory element, which may include incubating a cell sample with a set of detection agents (e.g., fluorescently labeled probes), in which each detection agent hybridizes to a DHS, measuring a signature (e.g., a fluorescent signature) from the set of detection agents, and based on the signature, determining a DHS profile, and comparing the DHS profile with a control, in which a correlation with the control indicates the activity level of the regulatory element in the cell sample.
  • the signature may further correlate to a signal intensity (or a peak height).
  • a set of signal intensities may be compiled into a DHS profile and compared with a control to generate a second DHS profile which comprises a set of relative signal intensities (or relative peak heights).
  • the set of relative signal intensities may correlate to the activity level of a regulatory element.
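The derivation of a second DHS profile of relative signal intensities can be sketched as a per-site ratio against the control. Treating "relative" as a simple sample-to-control ratio is an assumption, as are the function name and the dictionary representation of a profile (site identifier mapped to signal intensity).

```python
def relative_dhs_profile(sample, control):
    """Return per-site sample/control intensity ratios.

    Sites that are missing from the control, or whose control intensity is
    zero, are skipped because no ratio can be formed for them.
    """
    relative = {}
    for site, intensity in sample.items():
        reference = control.get(site)
        if reference:  # None (missing) and 0 are both skipped
            relative[site] = intensity / reference
    return relative
```

Under this convention a ratio above 1 indicates a stronger signal at that site than in the control, and a ratio below 1 a weaker one.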
  • Also described herein is a method of generating a DHS map, which may provide information on cell-to-cell variation in gene expression, memory of early developmental fate decisions which establish lineage hierarchies, quantitation of embryonic stem cell DHS sites which decreases with cell passage, and presence of oncogenic elements.
  • the location of a set of DHS sites may be correlated to a cell type. For example, the location of about 1 to 60, or more DHS may be used to determine a cell type.
  • the cell may be a normal cell or a cancerous cell. DHS variation may be used to determine the presence of cancerous cells in a sample.
  • a method of determining a cell type may include incubating a cell sample with a set of detection agents (e.g., fluorescently labeled probes), in which each detection agent hybridizes to a DHS, measuring a signature (e.g., a fluorescent signature) from the set of detection agents, and based on the signature, determining a DHS profile, and comparing the DHS profile with a control, in which a correlation with the control indicates the cell type of the sample.
  • a DHS site may be visualized through a terminal deoxynucleotidyl transferase (TdT) dUTP nick-end labeling (TUNEL) assay.
  • a fluorescent moiety may further be conjugated to dUTP.
  • a TUNEL assay may be utilized for visualization of a plurality of DHSs present in a cell.
  • the sequence of a DHS site may be detected in situ, by utilizing an in situ sequencing methodology.
  • the two ends of a padlock probe may be hybridized to a target regulatory element sequence and the two ends may be further ligated together by a ligase (e.g., T4 ligase) when bound to the target sequence.
  • the amplified product may be sequenced by ligation in situ using partition sequencing compatible primers and labeled probes (e.g., fluorescently labeled probes).
  • each target sequence within the amplified product may bind to a primer and probe set, resulting in a bright spot detectable by, e.g., immunofluorescence microscopy.
  • the labeled probe e.g., the fluorescent label on the probe
  • at least 1 to at least 20, or more rounds of ligation and detection may occur for detection of a DHS site.
  • a control as used herein may refer to a DHS profile generated from a regulatory element whose activity level is known.
  • a control may also refer to a DHS profile generated from an inactivated regulatory element.
  • a control may further refer to a DHS profile generated from an activated or inactivated regulatory element from a specific cell type.
  • the cell type may be an epithelial cell, connective tissue cell, muscle cell, or nerve cell type.
  • the cell may be a cell derived from heart, lung, kidney, stomach, intestines, liver, pancreas, brain, esophagus, and the like.
  • the cell type may be a hormone-secreting cell, such as a pituitary cell, a gut and respiratory tract cell, thyroid gland cell, adrenal gland cell,
  • the cell may be a blood cell or a blood progenitor cell.
  • the cell may be an immune system cell, e.g., a monocyte, dendritic cell, neutrophil granulocyte, eosinophil granulocyte, basophil granulocyte, hybridoma cell, mast cell, helper T cell, suppressor T cell, cytotoxic T cell, natural killer T cell, B cell, or natural killer cell.
  • a regulatory element may also be a chromatin.
  • a method of detecting a chromatin may include contacting a cell sample with a detection agent, binding the detection agent to the chromatin, and analyzing a profile from the detection agent to determine the activity state of the chromatin.
  • the activity level of a chromatin may be determined based on the presence or activity level of a nucleic acid of interest or the presence or absence of a chromatin associated protein.
  • the activity level of a chromatin may be determined based on DHS locations.
  • the one or more DHS locations on a chromatin may be used to map chromatin activity state.
  • one or more DHSs may be localized in a region and the surrounding chromatin may be decompacted and readily visualized relative to an inactive chromatin state when a DHS is not present.
  • the one or more DHSs within a localized region may further form a localized DHS set and a plurality of localized DHS sets may further provide a global map or pattern of chromatin activity (e.g., an activity pattern).
  • RNA regulatory elements e.g., eRNA
  • chromatin associated proteins or gene products or a combination thereof
  • the method of generating a chromatin map may be based on the pattern of DNase I hypersensitive sites.
  • the method may comprise generating a 3-dimensional map from a detection profile (or a 2-dimensional detection profile).
  • a chromatin map may provide information on the compaction of chromatin, the spatial structure, spacing of regulatory elements, and localization of the regulatory elements to globally map chromatin structure and accessibility.
  • a chromatin map for a cell type may also be generated, in which each cell type comprises a different chromatin pattern.
  • Each cell type may be associated with at least one unique marker.
  • the at least one unique marker (or fiduciary marker) may be a genomic sequence.
  • the at least one unique marker (or fiduciary marker) may be DHS.
  • a cell type may comprise about 5, about 10, about 15, about 20, about 25, about 30, about 35, about 40, about 45, about 50, about 60, or more unique markers (or fiduciary markers).
  • the cell type may be an epithelial cell, a connective tissue cell, a muscle cell, a nerve cell, a hormone-secreting cell, a blood cell, an immune system cell, or a stem cell type.
  • the cell type may be a cancerous cell type.
  • a chromatin profile (e.g., based on DHSs) in the presence of an exogenous agent or condition may also be generated.
  • the method may comprise incubating a cell sample with a set of fluorescently labeled probes specific to target sites (e.g., target DHSs) on a chromatin in the presence of an exogenous agent or condition; measuring a fluorescent signature of the set of fluorescently labeled probes; based on the fluorescent signature, generating a fluorescent profile of the chromatin; and comparing the fluorescent profile with a second fluorescent profile of a chromatin obtained from an equivalent sample incubated with an equivalent set of fluorescently labeled probes in the absence of the exogenous agent or condition, wherein a difference between the two fluorescent profiles indicates a change in the chromatin density (e.g., changes in the presence or activation of DHSs) induced by the exogenous agent or condition.
  • the exogenous agent or condition may comprise a small molecule or a drug.
  • the exogenous agent may be a small molecule, such as a steroid.
  • the exogenous agent or condition may comprise an environmental factor, such as a change in pH, temperature, nutrient, or a combination thereof.
  • the localization of a regulatory element may provide an activity state of the regulatory element.
  • the localization of a regulatory element may also provide an interaction state with at least one additional regulatory element.
  • the localization of a first regulatory element with respect to a second regulatory element may provide spatial coordinate and distance information between the two regulatory elements, and may further provide information regarding whether the two regulatory elements may interact with each other.
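The spatial coordinate and distance information mentioned above can be illustrated with a Euclidean distance between two localized elements. This is a sketch under stated assumptions: the coordinates, the nanometre units, and the `threshold_nm` cutoff used to flag a possible interaction are all hypothetical and are not taken from the disclosure.

```python
import math


def element_distance(first_xyz, second_xyz):
    """Euclidean distance between two localized regulatory elements."""
    return math.dist(first_xyz, second_xyz)


def may_interact(first_xyz, second_xyz, threshold_nm):
    """Flag a possible interaction when the elements lie within the cutoff.

    The cutoff is an illustrative assumption, not a value from the text.
    """
    return element_distance(first_xyz, second_xyz) <= threshold_nm


# Illustrative coordinates in nanometres.
first = (0.0, 0.0, 0.0)
second = (30.0, 40.0, 0.0)
distance = element_distance(first, second)
print(distance, may_interact(first, second, threshold_nm=100.0))
```

Proximity alone does not establish an interaction; a distance cutoff of this kind can only suggest candidate pairs for further analysis.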
  • the activity state of a regulatory element may include, for example, a transcription or translation initiation event, a translocation event, or an interaction event with one or more additional regulatory elements.
  • the regulatory element may comprise DNA, RNA, polypeptides, or a combination thereof.
  • the regulatory element may be DNA.
  • the regulatory element may be RNA.
  • the regulatory element may be an enhancer RNA (eRNA).
  • the regulatory element may be a DNase I hypersensitive site (DHS).
  • the DHS may be an inactive DHS or an active DHS.
  • the regulatory element may be a polypeptide.
  • the regulatory element may be chromatin.
  • the localization of a regulatory element may include contacting a regulatory element with a first set of detection agents, photobleaching the first set of detection agents for a first time point at a first wavelength to generate a second set of detection agents capable of generating a light at a second wavelength, detecting at least one burst generated by the second set of detection agents to generate a detection profile of the second set of detection agents, and analyzing the detection profile to determine the localization of the regulatory element.
  • a detection agent may comprise a detectable moiety that is capable of generating a light, and a probe portion that is capable of hybridizing to a target site on a regulatory element.
  • Each detection agent within the first set of detection agents may have the same or a different detectable moiety.
  • Each detection agent within the first set of detection agents may have the same detectable moiety.
  • a detectable moiety may comprise a small molecule (e.g., a fluorescent dye).
  • a detectable moiety may comprise a fluorescently labeled polypeptide, a fluorescently labeled nucleic acid probe, and/or a fluorescently labeled polypeptide complex.
  • a second set of detection agents may be generated from the first set of detection agents, in which the second set may include detection agents that are capable of generating a burst of light detectable at a second wavelength.
  • bleaching of the set of detection agents may lead to about 50%, about 60%, about 70%, about 80%, about 90%, or more of the detection agents within the set entering into an “OFF-state.”
  • An “OFF-state” may be a dark state in which the detectable moiety crosses from the singlet excited or ON-state to the triplet state or OFF-state, in which detection of light (e.g., fluorescence) may be low (e.g., less than 10%, less than 5%, less than 1%, or less than 0.5% of the light may be detected).
  • the remainder of the detection agents that have not entered into the OFF-state may generate bursts of light, or may cycle between a singlet excited state (or ON-state) and a singlet ground state.
  • bleaching of the set of detection agents may leave about 40%, about 30%, about 20%, about 10%, about 5%, or fewer of the detection agents within the set able to generate bursts of light.
  • the bursts of light may be detected stochastically, at a single burst level in which each burst of light correlates to a single detection agent.
  • a single wavelength may be used for photobleaching a set of detection agents. At least two wavelengths may be used for photobleaching a set of detection agents.
  • a wavelength of 491 nm may be used.
  • a wavelength at 405 nm may be used in combination with the wavelength at 491 nm.
  • the two wavelengths may be applied simultaneously to photobleach a set of detection agents. Alternatively, the two wavelengths may be applied sequentially to photobleach a set of detection agents.
  • the time for photobleaching a set of detection agents may be from about 10 seconds to about 4 hours, or more.
  • the concentration of the detection agents may be from about 5 nM to about 1 mM.
  • the bursts of light from the set of detection agents may generate a detection profile.
  • the detection profile may comprise about 100 image frames to about 50,000 frames, or more.
  • the detection profile may also include the signal intensity, signal location, or size of the signal. Analysis of the detection profile may determine the optical isolation and localization of the detection agents, which may correlate to the localization of the regulatory element.
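Localizing a single optically isolated burst from its signal intensity and location can be sketched with an intensity-weighted centroid. This is a deliberately simple stand-in chosen for illustration; the disclosure does not specify a localization algorithm, and the function name `burst_centroid` and the pixel values are hypothetical.

```python
def burst_centroid(frame):
    """Intensity-weighted centroid (x, y) of one image frame, in pixels.

    `frame` is a list of rows of non-negative pixel intensities that is
    assumed to contain a single isolated burst. Returns None when the
    frame holds no signal.
    """
    total = 0.0
    x_sum = 0.0
    y_sum = 0.0
    for y, row in enumerate(frame):
        for x, value in enumerate(row):
            total += value
            x_sum += x * value
            y_sum += y * value
    if total == 0:
        return None
    return (x_sum / total, y_sum / total)


# Illustrative 3 x 3 frame with a symmetric spot centred on pixel (1, 1).
frame = [
    [0, 1, 0],
    [1, 4, 1],
    [0, 1, 0],
]
print(burst_centroid(frame))
```

Repeating this over many frames and pooling the centroids is one way a set of stochastic bursts could be turned into a localization map; practical pipelines typically also subtract background and reject overlapping bursts.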
  • the detection profile may comprise a chromatic aberration correction.
  • the detection profile may comprise less than 5% chromatic aberration.
  • the detection profile may comprise 0% chromatic aberration.
  • More than one regulatory element may be detected at the same time. At least 2 to 20, or more regulatory elements may be detected at the same time. Each of the regulatory elements may be detected by a set of detection agents. The detectable moiety between the different set of detection agents may be the same. For example, two different sets of detection agents may be used to detect two different regulatory elements and the detectable moieties from the two sets of detection agents may be the same. As such, at least 2 to at least 20, or more regulatory elements may be detected at the same time at the same wavelength.
  • the detectable moiety between the different set of detection agents may also be different.
  • two different sets of detection agents may be used to detect two different regulatory elements and the detectable moiety from one set of detection agents may be detected at a different wavelength from the detectable moiety of the second set of detection agents.
  • at least 2 to 20, or more regulatory elements may be detected at the same time in which each of the regulatory elements may be detected at a different wavelength.
  • the regulatory element may comprise DNA, RNA, polypeptides, or a combination thereof.
  • the method may include detection of a regulatory element and one or more products of the regulatory element.
  • One or more products of the regulatory element may also include intermediate products or elements.
  • the method may comprise contacting a cell sample with a first set and a second set of detection agents, in which the first set of detection agents interact with a target regulatory element within the cell and the second set of detection agents interact with at least one product of the target regulatory element, and analyzing a detection profile from the first set and the second set of detection agents, in which the presence or the absence of the at least one product indicates the activity of the target regulatory element.
  • a detection agent may comprise a detectable moiety that is capable of generating a light, and a probe portion that is capable of hybridizing to a target site on a regulatory element.
  • Each detection agent within the first set of detection agents may have the same or a different detectable moiety.
  • Each detection agent within the first set of detection agents may have the same detectable moiety.
  • a detectable moiety may comprise a small molecule (e.g., a fluorescent dye).
  • a detectable moiety may comprise a fluorescently labeled polypeptide, a fluorescently labeled nucleic acid probe, and/or a fluorescently labeled polypeptide complex.
  • the method may also allow photobleaching of the first set and the second set of detection agents, thereby generating a subset of detection agents capable of generating a burst of light.
  • a detection profile may be generated from the detection of a set of light bursts, in which the presence or the absence of the at least one product may indicate the activity of the target regulatory element.
  • the regulatory element may comprise DNA, RNA, polypeptides, or a combination thereof.
  • the regulatory element may be DNA.
  • the regulatory element may be RNA.
  • the regulatory element may be an enhancer RNA (eRNA).
  • the presence of an eRNA may correlate with target gene transcription that is downstream of eRNA.
  • the regulatory element may be a DNase I hypersensitive site (DHS).
  • the DHS may be an activated DHS.
  • the pattern of the DHS on a chromatin may correlate to the activity of the chromatin.
  • the regulatory element may be a polypeptide, e.g., a transcription factor, a DNA or RNA-binding protein or binding fragment thereof, or a polypeptide that is involved in chemical modification.
  • the regulatory element may be chromatin.
  • a target nucleic acid sequence may be a nucleic acid sequence of interest or may encode a DNA, RNA, or protein of interest or a portion thereof.
  • a DNA, RNA, or protein of interest may be a DNA, RNA, or protein produced by a cell or contained within a cell.
  • a target nucleic acid sequence may be incorporated into a structure of a cell.
  • a target nucleic acid sequence may also be associated with a cell.
  • a target nucleic acid sequence may be in contact with the exterior of a cell.
  • a target nucleic acid sequence may be unassociated with a structure of a cell.
  • a target nucleic acid sequence may be a circulating nucleic acid sequence.
  • a target nucleic acid sequence or a portion thereof may be artificially constructed or modified.
  • a target nucleic acid sequence may be a natural biological product.
  • a target nucleic acid sequence may be a short nucleic acid sequence.
  • a target nucleic acid sequence may be a nucleic acid sequence that is from a source that is exogenous to a cell.
  • a target nucleic acid sequence may be an endogenous nucleic acid sequence.
  • a target nucleic acid sequence may be a nucleic acid sequence that comprises a combination of an endogenous nucleic acid sequence and a nucleic acid sequence from a source that is exogenous to a cell.
  • a target nucleic acid sequence may be a chromosomal nucleic acid sequence or fragment thereof.
  • a target nucleic acid sequence may be an episomal nucleic acid sequence or fragment thereof.
  • a target nucleic acid sequence may be a sequence resulting from somatic rearrangement or somatic hypermutation, such as a nucleic acid sequence from a T cell receptor, B cell receptor, or fragment thereof.
  • a nucleic acid of a cell or sample, which may comprise the target nucleic acid sequence, may comprise a deoxyribonucleic acid (DNA) or a ribonucleic acid (RNA), or a combination thereof.
  • a nucleic acid may be a chromosome, an oligonucleotide, a plasmid, an artificial chromosome, or a fragment or portion thereof.
  • a nucleic acid may comprise genomic DNA, episomal DNA, complementary DNA, mitochondrial DNA, recombinant DNA, cell-free DNA (cfDNA), messenger RNA (mRNA), pre-mRNA, microRNA (miRNA), transfer RNA (tRNA), transfer messenger RNA (tmRNA), ribosomal RNA (rRNA), heterogeneous nuclear RNA (hnRNA), short interfering RNA (siRNA), anti-sense RNA, or short hairpin RNA (shRNA).
  • a nucleic acid may be single-stranded, double-stranded, or a combination thereof.
  • a target nucleic acid sequence may comprise a naturally occurring nucleic acid sequence, an artificially constructed nucleic acid sequence (such as an artificially synthesized nucleic acid sequence), or a modified nucleic acid sequence (such as a naturally occurring nucleic acid sequence that has been altered or modified through a natural or artificial process).
  • a naturally occurring nucleic acid sequence may comprise a nucleic acid sequence present in a cellular sample.
  • a naturally occurring nucleic acid sequence may comprise a nucleic acid sequence present in an unfixed cell.
  • a naturally occurring nucleic acid sequence may comprise a nucleic acid sequence derived from a cellular sample.
  • a nucleic acid sequence may also be derived from a virus (such as a viral nucleic acid sequence from a lentivirus or adenovirus).
  • a naturally occurring nucleic acid sequence may comprise a nucleic acid sequence present in an acellular sample.
  • a naturally occurring nucleic acid sequence may comprise a nucleic acid sequence derived from an acellular sample.
  • a nucleic acid sequence may be a cell-free DNA sequence present in a bodily fluid (such as a sample of cerebrospinal fluid).
  • a nucleic acid may comprise a target nucleic acid sequence that is not endogenous to the source (exogenous) from which it was taken or in which it is analyzed.
  • a nucleic acid may be an artificially synthesized oligonucleotide.
  • a nucleic acid sequence may comprise one or more modifications.
  • a modification may be a post-translational modification of a nucleic acid sequence or an epigenetic modification of nucleic acid sequence (e.g., modification to the methylation of a nucleic acid sequence).
  • a modification may be a genetic modification.
  • a genetic modification to a nucleic acid sequence may be an insertion, a deletion, or a substitution of a nucleic acid sequence.
  • a nucleic acid sequence modification comprising an insertion may comprise transformation, transduction, or transfection of a sample.
  • a nucleic acid sequence modification comprising an insertion may result from infection or transduction of a cell with a virus and subsequent incorporation of a viral nucleic acid sequence into a nucleic acid sequence of the cells, such as the cell’s genomic DNA.
  • the integrated viral nucleic acid sequence (viral integrant) or fragment thereof may be the target nucleic acid sequence.
  • Modification of a nucleic acid sequence may be an artificial modification, resulting from, for instance, genetic engineering or intentional nucleic acid sequence modification during nucleic acid fabrication.
  • a nucleic acid sequence may be the result of somatic rearrangement.
  • a modification to a nucleic acid sequence comprising an insertion, deletion or substitution may comprise a difference between the nucleic acid sequence and a reference sequence.
  • a reference sequence may be a nucleic acid sequence in a database, an artificial nucleic acid, a viral nucleic acid sequence, a nucleic acid sequence of the same cell, a nucleic acid sequence of a cell from the tissue, a nucleic acid sequence from a different tissue of the same subject, or a nucleic acid sequence from a subject of a different species.
  • a modification to a nucleic acid sequence may comprise a difference of 1 nucleotide (a single nucleotide polymorphism, SNP), or of from 1 to 1,000 nucleotides.
  • Modification to a nucleic acid sequence comprising a difference in a plurality of nucleotides may comprise differences in two or more adjacent nucleotides or nucleotide sequences relative to a reference nucleic acid sequence.
  • Modifications to a nucleic acid sequence comprising a difference in a plurality of nucleotides may also comprise differences in two or more non-adjacent nucleotides or nucleotide sequences (such as two or more modifications to the nucleic acid sequence that are separated by at least one nucleotide) relative to a reference nucleic acid sequence.
  • a target sequence may be assayed in situ or it may be isolated and/or purified from a cellular or acellular sample.
  • a target sequence comprising a nucleic acid may comprise a portion (a region) of genomic DNA located in situ in the nucleus of a fixed (intact) cell.
  • a target sequence may comprise a nucleic acid sequence that is isolated from a sample (such as an aliquot of cerebrospinal fluid).
  • Detection agents may be utilized to detect nucleic acid sequence of interest.
  • a detection agent may comprise a probe portion.
  • the probe portion may include a probe, or a combination of probes.
  • the probe portion may comprise a nucleic acid molecule, a polypeptide, or a combination thereof.
  • the detection agents may further comprise a detectable moiety.
  • the detectable moiety may comprise a fluorophore.
  • a fluorophore may be a molecule that may absorb light at a first wavelength and transmit or emit light at a second wavelength.
  • the fluorophore may be a small molecule (such as a dye) or a fluorescent polypeptide.
  • the detectable moiety may be a fluorescent small molecule (such as a dye).
  • the detectable moiety may not contain a fluorescent polypeptide.
  • the detection agent may further comprise a conjugating moiety.
  • the conjugating moiety may allow attachment of the detection agent to a nucleic acid sequence of interest.
  • the detection agent may comprise a probe that is synthesized with direct dye incorporation at the 3’ end or 5’ end.
  • a detection agent may comprise a probe portion.
  • a probe portion may comprise a probe or a combination of probes.
  • a probe may be a nucleic acid probe, a polypeptide probe, or a combination thereof.
  • a probe portion may be an unconjugated probe that does not contain a detectable moiety.
  • a probe portion may be a conjugated probe which comprises a single probe with a detectable moiety, or two or more probes in which at least one probe may be an unconjugated probe bound to at least a second probe which comprises a detectable moiety.
  • a probe may be a nucleic acid probe.
  • the nucleic acid probe may be a DNA probe, an RNA probe, or a combination thereof.
  • the nucleic acid probe may be a DNA probe.
  • the nucleic acid probe may be a RNA probe.
  • the nucleic acid probe may be a double stranded nucleic acid probe, a single stranded nucleic acid probe, or may contain single-stranded and/or double stranded portions.
  • the nucleic acid probe may further comprise overhangs on one or both termini, may further comprise blunt ends on one or both termini, or may further form a hairpin.
  • the nucleic acid probe may be at least 10 to about 100 nucleotides in length.
  • TABLE 3 lists exemplary nucleotide sequences according to the present disclosure.
  • a nucleic acid probe may be a non-labeled probe, or a probe that does not contain a detectable moiety.
  • a non-labeled probe may further interact with a labeled probe (e.g., a labeled nucleic acid probe).
  • a non-labeled probe may hybridize with a labeled nucleic acid probe.
  • a non-labeled probe may also interact with a labeled polypeptide probe.
  • the labeled polypeptide probe may be a protein that recognizes a sequence within the non-labeled probe.
  • a labeled probe may include a nucleic acid portion and a polypeptide tag portion and the polypeptide tag portion may further interact with a molecule comprising a detectable moiety.
  • a non-labeled probe may be a nucleic acid probe comprising a streptavidin which may interact with a biotinylated molecule comprising a detectable moiety.
  • a nucleic acid probe may comprise about 95%, about 96%, about 97%, about 98%, about 99%, or about 100% sequence specificity or sequence complementarity to a target site of a regulatory element.
  • a nucleic acid probe may comprise about 95%, about 96%, about 97%, about 98%, about 99%, or about 100% sequence specificity or sequence
  • a nucleic acid probe may comprise about 95%, about 96%, about 97%, about 98%, about 99%, or about 100% sequence specificity or sequence complementarity to a target viral nucleic acid sequence.
  • the hybridization may be a high stringent hybridization condition.
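The percentage of sequence complementarity between a probe and its target site can be computed as sketched below. This is an illustrative calculation only: it assumes DNA sequences of equal length written 5' to 3', antiparallel Watson-Crick pairing, and no gaps, and the function name and example sequences are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def percent_complementarity(probe, target):
    """Percentage of probe bases that pair with the target site.

    Both sequences are written 5'->3'. The probe binds antiparallel, so
    probe position i is compared with the target read from its 3' end.
    """
    if not probe or len(probe) != len(target):
        raise ValueError("probe and target must be non-empty and equal in length")
    probe = probe.upper()
    target = target.upper()
    paired = sum(
        1 for p, t in zip(probe, reversed(target)) if COMPLEMENT.get(p) == t
    )
    return 100.0 * paired / len(probe)


# Hypothetical 10-nt target site and two candidate probes.
target = "AAGGCCTTAC"
perfect_probe = "GTAAGGCCTT"   # exact reverse complement of the target
mismatch_probe = "GTAAGGCCTA"  # one mismatch at the probe's 3' end
print(percent_complementarity(perfect_probe, target))
print(percent_complementarity(mismatch_probe, target))
```

By the same arithmetic, a 20-nucleotide probe with a single mismatch would sit at 95%, the lowest value listed above.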
  • a nucleic acid probe may hybridize with a genomic sequence that is present in low or single copy numbers (e.g., genomic sequences that are not repetitive elements).
  • The term “repetitive element” refers to a DNA sequence that is present in many identical or similar copies in the genome.
  • Repetitive elements are not intended to refer to a DNA sequence that is present on each copy of the same chromosome (e.g., a DNA sequence that is present only once, but is found on both copies of chromosome 11, would not be considered a repetitive element, and would be considered a sequence that is present in the genome as one copy).
  • the genome may consist of three broad sequence components: single copy or at least very low copy number DNA (approximately 60% of the human genome); moderately repetitive elements (approximately 30% of the human genome); and highly repetitive elements (approximately 10% of the human genome).
  • a nucleic acid probe may have reduced off-target interaction.
  • “Off-target” or “off-target interaction” may refer to an instance in which a nucleic acid probe against a given target hybridizes or interacts with another target site (e.g., a different DNA sequence, RNA sequence, or a cellular protein or other moiety).
  • a nucleic acid probe may further be cross-linked to a target site of a regulatory element.
  • the nucleic acid probe may be cross-linked by a photo-crosslinking means such as UV or by a chemical cross-linking means such as by formaldehyde, or through a reactive group within the nucleic acid probe.
  • Reactive groups may include sulfhydryl-reactive linkers such as bismaleimidohexane (BMH), and the like.
  • a nucleic acid probe may include natural or unnatural nucleotide analogues or bases or a combination thereof.
  • the unnatural nucleotide analogues or bases may comprise modifications at one or more of ribose moiety, phosphate moiety, nucleoside moiety, or a combination thereof.
  • the unnatural nucleotide analogues or bases may comprise 2'-O-methyl, 2'-O-methoxyethyl (2'-O-MOE), 2'-O-aminopropyl, 2'-deoxy, 2'-deoxy-2'-fluoro, 2'-O-aminopropyl (2'-O-AP), 2'-O-dimethylaminoethyl (2'-O-DMAOE), 2'-O-dimethylaminopropyl (2'-O-DMAP), 2'-O-dimethylaminoethyloxyethyl (2'-O-DMAEOE), or 2'-O-N-methylacetamido (2'-O-NMA).
  • a nucleic acid probe may be a locked nucleic acid probe (such as a labeled locked nucleic acid probe), a labeled or unlabeled peptide nucleic acid (PNA) probe, a labeled or unlabeled oligonucleotide, an oligopaint, an ECHO probe, a molecular beacon probe, a padlock (or molecular inversion probe), a labeled or unlabeled toe-hold probe, a labeled TALE probe, a labeled ZFN probe, or a labeled CRISPR probe.
  • a nucleic acid probe may be a labeled or unlabeled locked nucleic acid probe or a labeled or unlabeled peptide nucleic acid probe.
  • Locked nucleic acid probes and peptide nucleic acid probes are known to those of skill in the art and are described in Briones et al., Anal Bioanal Chem (2012) 402:3071-3089.
  • a nucleic acid probe may be a padlock (or molecular inversion probe).
  • a padlock probe may be hybridized to a target regulatory element sequence in which the two ends may correspond to the target sequence.
  • a padlock probe may be ligated together by a ligase (such as T4 ligase) when bound to the target sequence.
  • An amplification (such as a rolling circle amplification or RCA) may be performed utilizing, for example, φ29 polymerase, which may result in a single stranded DNA comprising multiple tandem copies of the target sequence.
  • a nucleic acid probe may be an oligopaint as described in U.S. Publication No. 2010/0304994; and in Beliveau et al., “Versatile design and synthesis platform for visualizing genomes with oligopaint FISH probes,” PNAS 109(52): 21301-21306 (2012). Oligopaint may refer to detectably labeled polynucleotides that have sequences
  • Oligopaints may be generated from synthetic probes and arrays that are, optionally, computationally patterned (rather than using natural DNA sequences and/or chromosomes as a template).
  • a nucleic acid probe can be a labeled or unlabeled toe-hold probe.
  • Toe-hold probes are known to those of skill in the art as described in Zhang et al., Optimizing the Specificity of Nucleic Acid Hybridization, Nature Chemistry 4: 208-214 (2012).
  • a nucleic acid probe may be a molecular beacon.
  • Molecular beacons may be hairpin shaped molecules with an internally quenched fluorophore whose fluorescence is restored when they bind to a target nucleic acid sequence.
  • Molecular beacons are known to those of skill in the art as described in Guo et al., Anal. Bioanal. Chem. (2012) 402:3115-3125.
  • a nucleic acid probe may be an ECHO probe.
  • ECHO probes may be sequence-specific, hybridization-sensitive, quencher-free fluorescent probes for RNA detection, which may be designed using the concept of fluorescence quenching caused by intramolecular excitonic interaction of fluorescent dyes.
  • ECHO probes are known to those of skill in the art as described in Kubota et al., PLoS ONE, Vol. 5, Issue 9, e13003 (2010); Okamoto, Chem. Soc. Rev., 2011, 40, 5815-5828; or Wang et al., RNA (2012), 18:166-175.
  • a probe may be a clustered regularly interspaced palindromic repeat (CRISPR) probe.
  • the CRISPR system may use a Cas9 protein to recognize DNA sequences, in which the target specificity may be solely determined by a small guide (sg) RNA and a protospacer adjacent motif (PAM).
  • the Cas9-sgRNA complex may generate a DNA double-stranded break.
  • a Cas9 protein may be replaced with an endonuclease-deactivated Cas9 (dCas9) protein.
  • imaging a cell such as by fluorescence in situ hybridization (FISH) may be achieved by synthesizing a dCas9 within the cell, synthesizing RNA within the cell to bind genomic DNA and to complex with the dCas9 forming a dCas9/RNA complex, labeling the dCas9/RNA complex, and imaging the labeled dCas9/RNA complex within the live cell bound to genomic DNA.
  • endonuclease-deactivated Cas9 may be synthesized in vivo by using an integrated construct, a transiently transfected construct, by injection into the cell of a syncytium of nuclei or via electroporation into cells and/or nuclei.
  • a probe may comprise an endonuclease-deactivated Cas9 (dCas9) protein as described in Chen et al., “Dynamic imaging of genomic loci in living human cells by an optimized CRISPR/Cas system,” Cell 155(7): 1479-1491 (2013); or Ma et al., “Multicolor CRISPR labeling of chromosomal loci in human cells,” PNAS 112(10): 3002-3007 (2015).
  • the dCas9 protein may be further labeled with a detectable moiety.
  • the RNA of the Cas9/RNA complex may be synthesized in vivo by using an integrated construct, a transiently transfected construct, by injection into the cell of a syncytium of nuclei or via electroporation into cells and/or nuclei.
  • the Cas9/RNA complex may be labeled by making a fusion protein that includes Cas9 and a reporter, by injection of RNA that has been attached to a reporter into the cell or by a syncytium of nuclei including RNA that has been attached to a reporter, by electroporation into cells or nuclei or by indirect labeling of the RNA by hybridization with a labeled secondary oligonucleotide.
  • the label may be a conditional reporter, based on the binding of Cas9/RNA to the target nucleic acid. The label may be quenched and may then be activated upon the Cas9/RNA complex binding to the target nucleic acid.
  • a probe may be a transcription activator-like effector nuclease (TALEN) probe or a zinc-finger nuclease (ZFN) probe.
  • a probe disclosed herein may be a polypeptide probe.
  • a polypeptide probe may include a protein or a binding fragment thereof that interacts with a target site (such as a nucleic acid target site or a protein target) of interest.
  • a polypeptide probe may comprise a DNA-binding protein, an RNA-binding protein, a protein that is involved in or detects the transcription/translation process, a protein that may detect an open or relaxed portion of a chromatin, or a protein interacting partner of a product of a regulatory element.
  • a polypeptide probe may be a DNA-binding protein.
  • the DNA-binding protein may be a transcription factor that modulates the transcription process, polymerases, or histones.
  • a DNA-binding protein may comprise a zinc finger domain, a helix-turn-helix domain, a leucine zipper domain (such as a basic leucine zipper domain), a high mobility group box (HMG-box) domain, and the like.
  • the DNA-binding protein may interact with a nucleic acid region in a sequence specific manner.
  • the DNA-binding protein may interact with a nucleic acid region in a sequence non-specific manner.
  • the DNA-binding protein may interact with single-stranded DNA.
  • the DNA-binding protein may interact with double-stranded DNA.
  • the DNA-binding protein probe may further comprise a detectable moiety.
  • a polypeptide probe may be a RNA-binding protein.
  • the RNA-binding protein may participate in forming ribonucleoprotein complexes.
  • the RNA-binding protein may modulate post-transcription such as in splicing, polyadenylation, mRNA stabilization, mRNA localization, or in translation.
  • a RNA-binding protein may comprise a RNA recognition motif (RRM), dsRNA binding domain, zinc finger domain, K-Homology domain (KH domain), and the like.
  • the RNA-binding protein may interact with single-stranded RNA.
  • the RNA-binding protein may interact with double-stranded RNA.
  • the RNA-binding protein probe may further comprise a detectable moiety.
  • a polypeptide probe may be a protein that may detect an open or relaxed portion of a chromatin.
  • the polypeptide probe may be a modified enzyme that lacks cleavage activity.
  • the modified enzyme may be an enzyme that recognizes DNA or RNA (double-stranded or single-stranded). Examples of modified enzymes may be obtained from oxidoreductases, transferases, hydrolases, lyases, isomerases, or ligases.
  • a modified enzyme may be an endonuclease (such as a deactivated restriction endonuclease such as the TALEN or CRISPR probes described herein).
  • a polypeptide probe may be an antibody or binding fragment thereof.
  • the antibody or binding fragment thereof may be a protein interacting partner of a product of a regulatory element.
  • the antibody or binding fragment thereof may comprise a humanized antibody or binding fragment thereof, murine antibody or binding fragment thereof, chimeric antibody or binding fragment thereof, monoclonal antibody or binding fragment thereof, monovalent Fab’, divalent Fab2, F(ab)'3 fragments, single-chain variable fragment (scFv), bis-scFv, (scFv)2, diabody, minibody, nanobody, triabody, tetrabody, disulfide stabilized Fv protein (dsFv), single-domain antibody (sdAb), Ig NAR, camelid antibody or binding fragment thereof, or a chemically modified derivative thereof.
  • the antibody or binding fragment thereof may further comprise a detectable moiety.
  • probes may be used together in a probe set to detect a nucleic acid sequence using Nano-FISH.
  • a probe set can also be referred to herein as a“probe pool.”
  • the probe set may be designed for the detection of the target nucleic acid sequence.
  • the probe set may be optimized for probes based on GC content, 16mer base matches (for determining binding specificity of the probe), and their predicted melting temperature when hybridized.
  • the 16mer base matches may have a total of 24 matches to the 16mer database. In some embodiments, probe sets with greater than 100 16mer database matches may be discarded.
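The screening criteria above (GC content, 16mer database matches, and predicted melting temperature) can be sketched as a simple filter. This is only an illustration under stated assumptions: the `CandidateProbe` type, the GC window of 40% to 60%, the per-probe application of the 100-match cutoff, and the use of the Wallace rule as a rough stand-in for a melting-temperature prediction are all hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateProbe:
    """Hypothetical record for one candidate probe."""
    sequence: str
    kmer16_matches: int  # assumed: precomputed count of 16mer database matches


def gc_content(sequence: str) -> float:
    """Fraction of G and C bases in the sequence (0.0 for an empty sequence)."""
    seq = sequence.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(sequence: str) -> float:
    """Crude melting temperature estimate in degrees C using the Wallace rule,
    2*(A+T) + 4*(G+C). Used here only as an illustrative placeholder."""
    seq = sequence.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2.0 * at + 4.0 * gc


def filter_probes(probes, gc_range=(0.40, 0.60), max_kmer_matches=100):
    """Keep probes whose GC content lies within gc_range (inclusive) and whose
    16mer match count does not exceed max_kmer_matches."""
    low, high = gc_range
    return [
        p for p in probes
        if low <= gc_content(p.sequence) <= high
        and p.kmer16_matches <= max_kmer_matches
    ]
```

A production pipeline would likely replace the Wallace rule with a nearest-neighbor thermodynamic model and would compute the 16mer match counts against a genome-wide index.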
  • Exemplary probe nucleotide sequences are shown in TABLE 3 for probe sets for different target sequences.
  • Some exemplary probe sequences may be target sequences located in the GREB1 promoter of chromosome 2, ER iDHS1 of chromosome 2, ER iDHS2 of chromosome 2, HBG1up of chromosome 11, HBG2up of chromosome 11, HS1 of chromosome 11, HS2 of chromosome 11, HS3 of chromosome 11, HS4 of chromosome 11, HS5 of chromosome 11, HS1 Lflank of chromosome 11, HS1 2flank of chromosome 11, HS2 3flank of chromosome 11, HS3 4flank of chromosome 11, HS4 5flank of chromosome 11, HS5 Rflank of chromosome 11, CCND1 SNP of chromosome 11, CCND1 CTL of chromosome 11, the
  • GREB1 is a gene that may be induced by estrogen stimulation of MCF-7 breast cancer cells.
  • ER iDHS1 and ER iDHS2 are DHS that may be induced by estrogen stimulation of MCF-7 breast cancer cells.
  • HBG1up and HBG2up are hemoglobin genes expressed in K562 erythroleukemia cells.
  • HS1, HS2, HS3, HS4, and HS5 are hypersensitive sites in the beta-globin locus control region, and HS1 Lflank, HS2 3flank, HS3 4flank, HS4 5flank, HS5 Rflank are sequences in the intervening regions between HS1-HS5.
  • CCND1 SNP is an enhancer for the CCND1 gene.
  • CCND1 CTL is a control region adjacent to the CCND1 SNP.
  • the CCND1 promoter is the promoter region of the CCND1 gene.
  • Chromosome 18 dead1, Chromosome 18 dead2, and Chromosome 18 dead3 are non-hypersensitive regions of chromosome 18.
  • the CNOT promoter is the promoter (active region) of CNOT.
  • the TSEN promoter is the promoter (active region) of TSEN.
  • the KLK2 promoter is the promoter of KLK2.
  • the KLK3 promoter is the promoter of KLK3.
  • KLK eRNA is an enhancer for the KLK2 gene and/or the KLK3 gene, and which may also enhance RNA.
  • a probe set comprising at least nine different Q570 labeled probes selected from the group consisting of SEQ ID NO: 1 - SEQ ID NO: 39 may be used to detect the GREB1 promoter in chromosome 2.
  • a Q570 labeled probe set comprising probes with SEQ ID NO: 7 - SEQ ID NO: 35 may be used to detect the GREB1 promoter in chromosome 2.
  • a probe set comprising at least nine different Q670 labeled probes selected from the group consisting of SEQ ID NO: 40 - SEQ ID NO: 72 may be used to detect the ER iDHS 1 in chromosome 2.
  • a probe set comprising at least nine different Q670 labeled probes selected from the group consisting of SEQ ID NO: 73 - SEQ ID NO: 104 may be used to detect the ER iDHS 2 in chromosome 2.
  • a probe set comprising at least nine different Q570 labeled probes selected from the group consisting of SEQ ID NO: 105 - SEQ ID NO: 134 may be used to detect the HBG1up in chromosome 11.
  • a probe set comprising at least nine different Q570 labeled probes selected from the group consisting of SEQ ID NO: 135 - SEQ ID NO: 164 may be used to detect the HBG2up in chromosome 11.
  • a probe set comprising at least nine different Q570/670 labeled probes selected from the group consisting of SEQ ID NO: 165 - SEQ ID NO: 194 may be used to detect HS1 in chromosome 11.
  • a probe set comprising at least nine different Q570/670 labeled probes selected from the group consisting of SEQ ID NO: 195 - SEQ ID NO: 224 may be used to detect HS2 in chromosome 11.
  • a probe set comprising at least nine different Q570/670 labeled probes selected from the group consisting of SEQ ID NO: 225 - SEQ ID NO: 254 may be used to detect HS3 in chromosome 11.
  • a probe set comprising at least nine different Q670 labeled probes selected from the group consisting of SEQ ID NO: 255 - SEQ ID NO: 298 may be used to detect HS4 in chromosome 11.
  • a probe set comprising at least nine different Q570/670 labeled probes selected from the group consisting of SEQ ID NO: 299 - SEQ ID NO: 340 may be used to detect HS5 in chromosome 11.
  • a probe set comprising at least nine different Q670 labeled probes selected from the group consisting of SEQ ID NO: 341 - SEQ ID NO: 370 may be used to detect HS1 Lflank in chromosome 11.
  • a probe set comprising at least nine different Q570 labeled probes selected from the group consisting of SEQ ID NO: 371 - SEQ ID NO: 400 may be used to detect HS1 2flank in chromosome 11.
  • a probe set comprising at least nine different Q670 labeled probes selected from the group consisting of SEQ ID NO: 401 - SEQ ID NO: 430 may be used to detect HS2 3flank in chromosome 11.
  • a probe set comprising at least nine different Q570 labeled probes selected from the group consisting of SEQ ID NO: 431 - SEQ ID NO: 460 may be used to detect HS3 4flank in chromosome 11.
  • a probe set comprising at least nine different Q670 labeled probes selected from the group consisting of SEQ ID NO: 461 - SEQ ID NO: 484 may be used to detect HS4 5flank in chromosome 11.
  • a probe set comprising at least nine different Q570 labeled probes selected from the group consisting of SEQ ID NO: 485 - SEQ ID NO: 514 may be used to detect HS5 Rflank in chromosome 11.
  • a probe set comprising at least nine different Q570 labeled probes selected from the group consisting of SEQ ID NO: 515 - SEQ ID NO: 544 may be used to detect CCND1 SNP in chromosome 11.
  • a probe set comprising at least nine different Q670 labeled probes selected from the group consisting of SEQ ID NO: 545, SEQ ID NO: 539 - SEQ ID NO: 544, or SEQ ID NO: 546 - SEQ ID NO: 564 may be used to detect CCND1 CTL in chromosome 11.
  • a probe set comprising at least nine different Q670 labeled probes selected from the group consisting of SEQ ID NO: 559 - SEQ ID NO: 592 may be used to detect the CCND1 promoter in chromosome 11.
  • a probe set comprising at least nine different Q670 labeled probes selected from the group consisting of SEQ ID NO: 593 - SEQ ID NO: 622 may be used to detect Chromosome 18 dead1 in chromosome 18.
  • a probe set comprising at least nine different Q670 labeled probes selected from the group consisting of SEQ ID NO: 623 - SEQ ID NO: 652 may be used to detect Chromosome 18 dead2 in chromosome 18.
  • a probe set comprising at least nine different Q670 labeled probes selected from the group consisting of SEQ ID NO: 653 - SEQ ID NO: 682 may be used to detect Chromosome 18 dead3 in chromosome 18.
  • a probe set comprising at least nine different Q670 labeled probes selected from the group consisting of SEQ ID NO: 683 - SEQ ID NO: 712 may be used to detect the CNOT3 promoter in chromosome 19.
  • a probe set comprising at least nine different Q670 labeled probes selected from the group consisting of SEQ ID NO: 713 - SEQ ID NO: 742 may be used to detect the TSEN34 promoter in chromosome 19.
  • a probe set comprising at least nine different Q670 labeled probes selected from the group consisting of SEQ ID NO: 743 - SEQ ID NO: 772 may be used to detect CNOT3 inter1 in chromosome 19.
  • a probe set comprising at least nine different Q670 labeled probes selected from the group consisting of SEQ ID NO: 773 - SEQ ID NO: 802 may be used to detect CNOT3 inter2 in chromosome 19.
  • a probe set comprising at least nine different Q670 labeled probes selected from the group consisting of SEQ ID NO: 803 - SEQ ID NO: 832 may be used to detect CNOT3 inter3 in chromosome 19.
  • a probe set comprising at least nine different Q570 labeled probes selected from the group consisting of SEQ ID NO: 833 - SEQ ID NO: 862 may be used to detect the KLK2 promoter in chromosome 19.
  • a probe set comprising at least nine different Q670 labeled probes selected from the group consisting of SEQ ID NO:
  • SEQ ID NO: 929 may be used to detect KLK eRNA in chromosome 19.
  • a probe set comprising at least nine different probes labeled with a detection agent selected from the group consisting of SEQ ID NO: 930 - SEQ ID NO: 1061 may be used to detect an HIV nucleic acid sequence.
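Each probe set above follows the same pattern: at least nine different probes drawn from a contiguous range of SEQ ID NOs. A minimal check of that pattern might look like the following sketch. The function name and the representation of probes as integer SEQ ID numbers are assumptions made for illustration.

```python
def is_valid_probe_set(selected_ids, first_id, last_id, minimum=9):
    """Return True if the selection contains at least `minimum` distinct
    SEQ ID numbers, all within the inclusive range [first_id, last_id]."""
    distinct = set(selected_ids)
    in_range = all(first_id <= seq_id <= last_id for seq_id in distinct)
    return in_range and len(distinct) >= minimum


# Example: nine distinct probes from the SEQ ID NO: 1 - 39 range.
print(is_valid_probe_set(range(1, 10), first_id=1, last_id=39))
```

Duplicated entries count once, so a selection of nine entries with one repeat would not satisfy the "nine different probes" condition.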
  • a detecting agent may comprise a detectable moiety.
  • a detectable moiety may be a small molecule (such as a dye) or a macromolecule.
  • a macromolecule may include polypeptides (such as proteins and/or protein fragments), nucleic acids, carbohydrates, lipids, macrocycles, polyphenols, and/or endogenous macromolecule complexes.
  • a detectable moiety may be a small molecule.
  • a detectable moiety may be a macromolecule.
  • a detectable moiety may include a moiety that is detectable by a colorimetric method or a fluorescent method.
  • a colorimetric method may be an assay which utilizes reagents that undergo a measurable color change in the presence of an analyte (such as an enzyme, an antibody, a compound, a hormone).
  • Exemplary colorimetric methods may include enzyme-mediated detection methods such as tyramide signal amplification (TSA), which utilizes horseradish peroxidase (HRP) to generate a signal when digested by tyramide substrate, and 3,3’,5,5’-tetramethylbenzidine (TMB), which generates a blue color upon oxidation to 3,3’,5,5’-tetramethylbenzidine diamine in the presence of a peroxidase enzyme such as HRP.
  • a detectable moiety described herein may include a moiety that is detectable by a colorimetric method.
  • a detectable moiety may also include a moiety that is detectable by a fluorescent method. Sometimes, the detectable moiety may be a fluorescent moiety.
  • a fluorescent moiety may be a small molecule (such as a dye) or a fluorescently labeled macromolecule.
  • a fluorescently labeled macromolecule may include a fluorescently labeled polypeptide (such as a labeled protein and/or a protein fragment), a fluorescently labeled nucleic acid molecule, a fluorescently labeled carbohydrate, a fluorescently labeled lipid, a fluorescently labeled macrocycle, a fluorescently labeled polyphenol, and/or a fluorescently labeled endogenous macromolecule complex (such as a primary antibody-secondary antibody complex).
  • a fluorescent small molecule may comprise rhodamine, rhodol, fluorescein, thiofluorescein, aminofluorescein, carboxyfluorescein, chlorofluorescein, methylfluorescein, sulfofluorescein, aminorhodol, carboxyrhodol, chlororhodol, methylrhodol, sulforhodol; aminorhodamine, carboxyrhodamine, chlororhodamine, methylrhodamine, sulforhodamine, thiorhodamine, cyanine, indocarbocyanine, oxacarbocyanine, thiacarbocyanine, merocyanine, cyanine 2, cyanine 3, cyanine 3.5, cyanine 5, cyanine 5.5, cyanine 7, oxadiazole derivatives, pyridyloxazole, nitrobenzox
  • a fluorescent moiety may comprise Cy3, Cy5, Cy5.5, Cy7, Q570, Alexa488, Alexa555, Alexa594, Alexa647, Alexa680, Alexa 750, Alexa 790, TexasRed, CF610, Propidium iodide, Quasar 570 (Q570), Quasar 670 (Q670), IRDye700, IRDye800,
  • a fluorescent moiety may comprise a quantum dot (QD).
  • Quantum dots may be a nanoscale semiconducting photoluminescent material, for example, as described in Alivisatos A.P., “Semiconductor clusters, nanocrystals, and quantum dots,” Science 271(5251): 933-937 (1996).
  • Exemplary QDs may include, but are not limited to, CdS quantum dots, CdSe quantum dots, CdSe/CdS core/shell quantum dots, CdSe/ZnS core/shell quantum dots, CdTe quantum dots, PbS quantum dots, and/or PbSe quantum dots.
  • CdSe/ZnS may mean that a ZnS shell is coated on a CdSe core surface (a“core-shell” quantum dot).
  • the shell materials of core-shell QDs may have a higher bandgap and passivate the core QDs surfaces, resulting in higher quantum yield and higher stability and wider applications than core QDs.
  • QDs may absorb a wide spectrum of light, and may be physically tuned with emission bandwidths in various wavelengths. See, e.g., Badolato, et al., Science 308:1158-61 (2005).
  • the emission bandwidth may be in the visible spectrum (from about 350 to about 750 nm), the ultraviolet-visible spectrum (from about 100 nm to about 750 nm), or in the near-infrared spectrum (from about 750 nm to about 2500 nm).
  • QDs that emit energy in the visible range may include, but are not limited to, CdS, CdSe, CdTe, ZnSe, ZnTe, GaP, and GaAs.
  • QDs that emit energy in the blue to near-ultraviolet range include, but are not limited to, ZnS and GaN.
  • QDs that emit energy in the near-infrared range include, but are not limited to, InP, InAs, InSb, PbS, and PbSe.
  • the radius of a QD may be modulated to manipulate the emission bandwidth. For example, a radius of between about 5 and about 6 nm QD may emit wavelengths resulting in emission colors such as orange or red. A radius of between about 2 and about 3 nm may emit wavelengths resulting in emission colors such as blue or green.
  • a QD may further form a QD microstructure, which encompasses one or more layers of QD.
  • each quantum dot containing layer may comprise a single type of quantum dot of a specific emission color.
  • each layer may be made of any material suitable for use that (a) allows excitation light to reach the quantum dot and allows fluorescence generated from the quantum dot to pass through the layer(s) for detection and (b) may be combined with a quantum dot to form a layer.
  • materials that may be used to form layers containing quantum dots include, but are not limited to, inorganic, organic, or polymeric material, each with or without biodegradable properties, and combinations thereof.
  • the layers may comprise silica-based compounds or polymers.
  • Exemplary silica-based layers may include, but are not limited to, those comprising tetramethoxysilane or tetraethylorthosilicate.
  • Exemplary polymer layers may include, but are not limited to, those comprising polystyrene, poly (methyl methacrylate),
  • the quantum dot further may comprise a spacer layer which serves as a barrier to prevent interactions between different QD layers, and may be made of any material suitable for use that (a) allows excitation light to reach the quantum dots in the quantum dot containing layer(s) below it and allows fluorescence generated from those quantum dots to pass through it and (b) may segregate the quantum dots in one layer from those in other layers. Examples of materials that may be used to form spacer layers are the same as for the quantum dot containing layers.
  • the materials used for the quantum dot containing and spacer layers may be the same or different. The same material may be used in the quantum dot containing layers and the spacer layers.
  • the quantum dot containing layers and the spacer layers within a given QD molecule may be any thickness and may be varied. For example, thicker QD-containing layers may allow for the loading of increased QDs in the shell, resulting in greater fluorescence intensity for that layer than for a thinner layer containing the same concentration of QDs. Thus, varying layer thickness may facilitate preparing QD-containing layer of various intensities, thereby generating spectrally distinct QD bar codes. In various instances, the QD-containing layers may be between 5 nm and 500 nm. Those of skill in the art will understand that other methods for varying intensity also exist, for example, modifying concentrations of the same QD in one microstructure with a first unique barcode compared to a second QD
  • the spacer layers may be greater than 10 nm, up to approximately 5 μm thick; the spacer layers may be greater than 10 nm, up to approximately 500 nm thick; the spacer layers may be greater than 10 nm, up to approximately 100 nm thick.
  • the quantum dot-containing and spacer layers may be arranged in any order.
  • a“spacer layer” may comprise a single layer, or may comprise two or more such spacer layers.
  • the QD microstructure may comprise any number of quantum dot containing layers suitable for use with the microstructure.
  • a microstructure described herein may comprise 2 or more quantum dot-containing layers and an appropriate number of spacer layers based on the number of quantum dot-containing layers.
  • the number of quantum dot containing layers in a given microstructure may range from 1 to“m,” where“m” is the number of quantum dots that may be used.
  • a defined intensity level may refer to a known amount of quantum dots in each quantum dot containing layer, resulting in a known amount of fluorescent intensity generated from the QD containing layer upon appropriate stimulation. Since each QD containing layer has a defined intensity level, each microstructure may possess a defined ratio of fluorescence intensities generated from the various QD-containing layers upon stimulation. This defined ratio is referred to herein as a barcode. Thus, each type of microstructure with the same QD layers possesses a similar barcode that may be distinguished from microstructures with different QD layers.
  • each quantum dot containing layer may comprise a single type of quantum dot of a specific emission color and the layer is produced to possess a defined intensity level, based on the concentration of the QD in the layer.
  • with “n” intensity levels of QDs in different microstructures and “m” different quantum dots, the number of different unique barcodes (and thus the number of different unique microstructure populations that may be produced) is approximated by $n^m - 1$ unique codes.
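As a worked illustration of the barcode count, the sketch below evaluates n^m - 1 for a given number of intensity levels n and quantum dot types m. The function name is hypothetical, and the example values are chosen only for illustration. The subtraction of one is taken directly from the expression in the text; a plausible reading, stated here as an assumption, is that it excludes a single combination that cannot serve as a code.

```python
def unique_barcodes(n_levels: int, m_colors: int) -> int:
    """Approximate number of unique barcodes for n intensity levels
    and m spectrally distinct quantum dot types: n**m - 1."""
    if n_levels < 1 or m_colors < 1:
        raise ValueError("n_levels and m_colors must both be at least 1")
    return n_levels ** m_colors - 1


# Example: 6 intensity levels with 3 quantum dot colors.
print(unique_barcodes(6, 3))  # 6**3 - 1 = 215
```

Because the count grows exponentially in m, adding one more quantum dot color multiplies the available codes by roughly n, whereas adding one more intensity level has a smaller effect when m is small.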
  • a set of QD-labeled probes may further generate a spectrally distinct barcode.
  • each probe within the set of QD-labeled probes may comprise a QD with a distinct excitation wavelength and the combination of the set may generate a distinct barcode.
  • a set of spectrally distinct QD-labeled probes may be utilized to detect a regulatory element. As such, when detecting two or more regulatory elements, each regulatory element may be spectrally barcoded.
  • a quantum dot provided herein may include QDot 525, QDot 545, QDot 565, QDot 585, QDot 605, or QDot 655.
  • a probe described herein may comprise a quantum dot.
  • a quantum dot may comprise a quantum dot as described in Han et al., “Quantum-dot-tagged microbeads for multiplexed optical coding of biomolecules,” Nat. Biotechnol.
  • a QD may further comprise a functional group or attachment moiety.
  • an example of a QD that has a functional group or attachment moiety is a QD with a carboxylic acid terminated surface, such as those commercially available through, for example, Quantum Dot, Inc., Hayward, CA.
  • the probe may include a conjugating moiety.
  • the conjugating moiety may be attached at the 5’ terminus, the 3’ terminus, or at an internal site.
  • the conjugating moiety may be a nucleotide analog (such as bromodeoxyuridine).
  • the conjugating moiety may be a conjugating functional group.
  • the conjugating functional group may be an azido group or an alkyne group.
  • the probe may further be derivatized through a chemical reaction such as click chemistry.
  • the click chemistry may be a copper(I)-catalyzed [3+2]-Huisgen 1,3-dipolar cycloaddition of alkynes and azides leading to 1,2,3-triazoles.
  • the click chemistry may be a copper free variant of the above reaction.
  • the conjugating moiety may comprise a hapten group.
  • a hapten group may include digoxigenin, 2,4-dinitrophenyl, biotin, or avidin, or may be selected from azoles, nitroaryl compounds, benzofurazans, triterpenes, ureas, thioureas, rotenones, oxazoles, thiazoles, coumarins, cyclolignans, heterobiaryl compounds, azoaryl compounds or benzodiazepines.
  • a hapten group may include biotin.
  • the probe comprising the conjugating moiety may further be linked to a second probe (such as a nucleic acid probe or a polypeptide probe), a fluorescent moiety (such as a dye such as a quantum dot), a target nucleic acid, or a conjugating partner such as a polymer (such as PEG), a macromolecule (such as a carbohydrate, a lipid, a polypeptide), and the like.
  • the method may comprise an operation of providing one or more probes capable of binding to a target nucleic acid sequence, as described herein.
  • the method may comprise an operation of binding the one or more probes to the target nucleic acid sequence, as described herein.
  • the method may comprise an operation of detecting a signal associated with binding of the one or more probes to the target nucleic acid sequence, as described herein.
  • the target nucleic acid sequence may be detected in an intact cell.
  • the target nucleic acid sequence may be detected in a fixed cell.
  • the target nucleic acid sequence may be detected in a lysate or chromatin spread.
  • a probe may be used to detect a nucleic acid sequence in a sample.
  • a probe comprising a probe sequence capable of binding a nucleic acid sequence (such as a target nucleic acid sequence) and a detectable label (such as a detectable agent) may be used to detect the nucleic acid sequence.
  • a method for detecting a nucleic acid sequence may comprise contacting a nucleic acid sequence with a probe comprising a probe sequence configured to bind at least a portion of the nucleic acid sequence and detecting the probe (such as detecting the detectable label of the probe).
  • the detection of a nucleic acid sequence may comprise binding the probe to the nucleic acid sequence.
  • the detection of a nucleic acid sequence may comprise binding the probe sequence, such as the sequence of an oligonucleotide probe, to a target nucleic acid sequence.
  • the detection of a nucleic acid sequence may comprise hybridizing the probe sequence (such as the nucleic acid binding region) of a nucleic acid probe to a target nucleic acid sequence.
  • the nucleic acid sequence may be a virus nucleic acid sequence.
  • the nucleic acid sequence may be an agricultural viral nucleic acid sequence.
  • the nucleic acid sequence may be a lentivirus nucleic acid sequence, an adenovirus nucleic acid sequence, an adeno-associated virus nucleic acid sequence, or a retrovirus nucleic acid sequence.
  • a nucleic acid sequence may be contacted with a plurality of probes.
  • a nucleic acid sequence may be contacted with a number of probes ranging from about 1 to about 108 probes, or from about 2 to about 50 million probes.
  • the probes of the plurality of probes may be the same.
  • a plurality of probes may have sequences such that the probes are tiled across the nucleic acid sequence. Each probe can bind to a target nucleic acid sequence along the nucleic acid sequence.
  • the probes of a plurality may be different.
  • a first probe of the plurality of probes may be different than a second probe of the plurality of probes.
  • the plurality of probes may bind to the nucleic acid sequence with from 0 to 10 nucleotides separating each probe.
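The tiling arrangement described above, with 0 to 10 nucleotides separating adjacent probes, can be sketched as follows. The function name, the fixed probe length, and the uniform gap are assumptions made for illustration; actual probe designs may vary probe length and spacing along the target.

```python
def tile_probes(target_length: int, probe_length: int, gap: int = 0):
    """Return (start, end) coordinates (0-based, end-exclusive) for probes
    tiled end to end across a target, separated by `gap` nucleotides.

    The gap is restricted to the 0-10 nucleotide spacing described above.
    """
    if probe_length <= 0:
        raise ValueError("probe_length must be positive")
    if not 0 <= gap <= 10:
        raise ValueError("gap must be between 0 and 10 nucleotides")
    positions = []
    start = 0
    # Place probes until the next one would run past the end of the target.
    while start + probe_length <= target_length:
        positions.append((start, start + probe_length))
        start += probe_length + gap
    return positions


# Example: a 100-nt target, 20-nt probes, 5-nt gaps.
print(tile_probes(100, 20, gap=5))
# [(0, 20), (25, 45), (50, 70), (75, 95)]
```

With a hypothetical 40-nt probe and a 2-nt gap, a 2,500-nt target would accommodate 59 probes under this scheme.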
  • a nucleic acid sequence may be washed after it has been contacted with a probe. Washing a nucleic acid sequence after it has been contacted with a probe may reduce background signal for detection of the detectable label of the probe.
  • a nucleic acid sequence (such as a target nucleic acid sequence) can be contacted by a plurality of probes.
  • a nucleic acid sequence can be contacted with a plurality of types of probes. That is, a method of detection of a nucleic acid sequence (such as a target nucleic acid sequence) may comprise contacting the target nucleic acid sequence with a plurality of sets of probes (such as a plurality of types of probes).
  • a first probe set (such as a first type of probe) may be different from a second probe set (such as a second type of probe) in that the first probe type comprises a first probe sequence which is different than the probe sequence of the second probe type.
  • the probe sequence of a first type of probe may be the same as the probe sequence of a second type of probe.
  • a first probe set may comprise a first detectable label and a first probe sequence and a second probe set may comprise a second detectable label and a second probe sequence, wherein the first and second probe sequences are the same and the first and second detectable labels are different.
  • the first and second probe sequences may be different and the first and second detectable labels of a first and second probe set may be the same.
  • the first and second probe sequences of a first and second probe set may be different and the first and second detectable labels of a first and second probe set may be different.
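The combinations above (probe sequences the same or different, detectable labels the same or different) can be enumerated with a small data structure. The `ProbeSet` type, its field names, and the example sequences are hypothetical and serve only to make the four cases explicit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProbeSet:
    """Hypothetical description of one type of probe."""
    probe_sequence: str
    label: str


def compare_probe_sets(first: ProbeSet, second: ProbeSet) -> str:
    """Describe how two probe sets differ in probe sequence and detectable label."""
    seq = "same" if first.probe_sequence == second.probe_sequence else "different"
    lab = "same" if first.label == second.label else "different"
    seq_noun = "sequence" if seq == "same" else "sequences"
    lab_noun = "label" if lab == "same" else "labels"
    return f"{seq} {seq_noun}, {lab} {lab_noun}"


# Example: identical probe sequence carrying two different labels.
print(compare_probe_sets(ProbeSet("ACGTACGT", "Q570"), ProbeSet("ACGTACGT", "Q670")))
# same sequence, different labels
```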
  • a method of detecting a nucleic acid sequence may comprise contacting a nucleic acid sequence with 1 to 20 types of probes.
  • a first probe sequence may be configured to specifically recognize (such as to bind to or to hybridize with) a first nucleic acid sequence (such as a first target nucleic acid sequence).
  • a second probe sequence may be configured to specifically recognize (such as to bind to or to hybridize with) a second nucleic acid sequence (such as a second target nucleic acid sequence).
  • a detectable label may be detected with a detector.
  • a detector may detect the signal intensity of the detectable label.
  • a detector may spatially distinguish between two detectable labels.
  • a detector may also distinguish between a first and second detectable label based on the spectral pattern produced by the first and second detectable labels, wherein the first and second detectable label do not produce an identical spectral intensity pattern. For example, a detector may distinguish between a first and second detectable signal, wherein the
  • a detector may resolve (such as by spatially distinguishing or spectrally distinguishing) a first and second detectable label that are less than 1 kb apart to less than 100 kb apart on a chromosome.
  • the detectable label of the probe may be detected optically.
  • a detectable label of a probe may be detected by light microscopy, fluorescence microscopy, or chromatography.
  • Detection of the detectable label of a probe may comprise stimulating the probe or a portion thereof (such as the detectable label) with a source of radiation (such as a light source, such as a laser).
  • Detection of the detectable label of a probe may also comprise an enzymatic reaction.
  • Detection of the target nucleic acid sequence may be within a period of not more than 12 hours to not more than 48 hours.
  • a method for assessing a phenotype of an intact genetically modified cell may comprise: a) providing the intact genetically modified cell comprising a target nucleic acid sequence less than 2.5 kilobases in length; b) contacting the intact genetically modified cell with a first plurality of probes, wherein each probe comprises a first detectable label and a probe sequence that binds to a portion of the target nucleic acid sequence; c) detecting a presence of the first detectable label in the intact cell, wherein the presence of the first detectable label indicates the presence of the target nucleic acid sequence; d) determining a phenotype of the intact genetically modified cell; and e) correlating the phenotype of the intact genetically modified cell with the presence of the target nucleic acid sequence.
  • the method may further comprise determining a number or location of genetic modifications in the intact genetically modified cell.
  • the method may further comprise f) selecting a first intact genetically modified cell comprising a phenotype of interest; g) determining a set of conditions used for a genetic modification of the first intact genetically modified cell; and h) preparing a second genetically modified cell using the set of conditions for genetic modification.
  • the intact genetically modified cell may be a eukaryotic cell that was genetically modified.
  • the intact genetically modified cell may be a bacterial cell that was genetically modified.
  • the intact genetically modified cell may be a mammalian cell that was genetically modified.
  • the intact genetically modified cell may be any cell as described herein that was genetically modified.
  • the phenotype may be a product expressed as a result of the genetic modification of the cell.
  • the phenotype may be an increased level or decreased level of the product expressed as a result of the genetic modification of the cell.
  • the phenotype may be an increased quality of the product expressed as a result of the genetic modification of the cell.
  • the expressed product may be a protein, such as an enzyme.
  • the expressed product may be a transgene protein, RNA, or a secondary product of the genetic modification. For example, if an enzyme is produced as a result of the genetic modification of the cell, a secondary product of the genetic modification is a product of the enzyme.
  • Determining the number of target nucleic acid sequences in a cell may be useful in determining the phenotype of the cell.
  • Cells with a specific number of target nucleic acid sequences may be tested for increased cellular activity, decreased cellular activity, or toxicity.
  • Increased cellular activity may be increased expression of a protein or a cellular product.
  • Decreased cellular activity may be decreased expression of a protein or a cellular product.
  • Toxicity may be a result of cellular activity that may be too high or too low, resulting in cell death.
  • contacting a sample of virally transduced cells with a probe configured to bind to a particular target viral nucleic acid sequence and then determining the number of viral integrants may be an expedient means of determining whether virus has successfully integrated in the cells of the sample in a way in which a desired therapeutic effect may result if given to a patient as a therapy.
  • Determining the presence, absence, identity, spatial position or sequence position of a target nucleic acid sequence in a sample may be useful in determining a condition of a patient.
  • contacting a sample of cells with a probe configured to bind to a particular target nucleic acid sequence and then determining the number of target nucleic acid sequences in the cell may be an expedient means of determining whether the number of target nucleic acid sequences may be affecting the cell phenotype or function.
  • contacting a patient sample with a probe configured to bind to a particular nucleic acid sequence may be an expedient means of determining whether the patient has the nucleic acid sequence.
  • contacting a sample of virally transduced cells with a probe configured to bind to a particular target viral nucleic acid sequence may be an expedient means of determining whether virus has successfully integrated in the cells of the sample.
  • contacting a patient sample with a plurality of types of probes, each configured to bind to a different nucleic acid sequence may be an expedient means of screening patients for various genetic or acquired conditions, such as inherited mutations.
  • a method of detecting or determining the presence of a nucleic acid sequence may comprise determining the number of probes associated with the nucleic acid sequence.
  • a method of detecting or determining the presence of a nucleic acid sequence may comprise determining the number of probes hybridized to the nucleic acid sequence.
  • a viral nucleic acid sequence comprises the target nucleic acid sequence.
  • the number of viral nucleic acid sequences may be quantified using the methods described herein. Quantification of the number of viral nucleic acid sequences in a sample (such as a cell comprising viral integrations) may be useful in determining the multiplicity of infection. This quantification may also be useful for methods of enriching heterogeneous populations of transduced cells to a more homogenous cell population or to a cell population comprising a greater percentage of cells comprising a specific number or a specific range of viral integrations. Quantification of target nucleic acid sequences in a sample using the methods, compositions, and systems described herein may be useful in determining the number of repeated sequences in a nucleic acid of a sample.
  • this method can be used for quantifying populations of cells transduced to express chimeric antigen receptors (CARs) in order to determine the average number of viral insertions per cell or the distribution of viral insertions per cell within the cell populations.
  • a Nano-FISH probe or a Nano-FISH probe set of this disclosure can be used to verify the number of viral insertions in T cells that have been engineered to express CARs, such as BCMA, CD19, CD22, WT1, L1CAM, MUC16, ROR1, or LeY.
  • the Nano-FISH probe or Nano-FISH probe sets of the present disclosure can be used as a quality control step to verify that engineered CAR T cells have truly been transduced with a vector encoding a given CAR, prior to administering the CAR T cells to a subject in need thereof.
  • this method can be used for quantifying populations of CD34+ hematopoietic stem cells (HSCs) transduced to express a gene of interest for the purpose of gene therapy, in order to determine the average number of viral insertions per cell or the distribution of viral insertions per cell within the cell populations.
  • a Nano-FISH probe or a Nano-FISH probe set of this disclosure can be used to verify the number of viral insertions in CD34+ cells that have been engineered with any vector, such as a lentivirus vector or an adeno-associated virus vector, to express any gene of interest.
  • the Nano-FISH probe or Nano-FISH probe sets of the present disclosure can be used as a quality control step to verify that engineered CD34+ cells have truly been transduced with a vector encoding a given gene, prior to administering the engineered CD34+ cells to a subject in need thereof.
  • a CD34+ cell from a human donor is transduced with the lentivirus vector encoding for any gene.
  • a subset of the engineered CD34+ cells can be subjected to viral Nano-FISH validation, wherein the CD34+ cells are hybridized to a Nano-FISH probe or Nano-FISH probe set of the present disclosure and imaged to detect and quantify spots in the cell nuclei corresponding to viral insertions.
  • the engineered CD34+ cells can, thus, be verified for successful transduction of any gene.
  • the engineered CD34+ cells can, thus, be characterized for the average number of insertions per cell and/or the distribution of viral insertions per cell.
  • Viral Nano-FISH can provide these valuable metrics characterizing the heterogeneity and quality of the engineered CD34+ cells prior to administration to a subject in need thereof.
  • the above described methods can be used to validate CD34+ cells engineered for use in gene therapies for any of the following: thalassemia, sickle cell disease, muscular dystrophy, or an immune disorder.
  • the quantification of a target nucleic acid sequence may allow for the precise tuning of per-cell viral integrant number among a pool of cells transduced with a virus, such as a retrovirus.
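The two metrics named above (average insertions per cell and the distribution of insertions per cell) can be illustrated with a minimal sketch. It assumes the imaging step has already produced one integer spot count per nucleus; the function name `summarize_insertions` and the example counts are hypothetical and are not part of the disclosure.

```python
from collections import Counter


def summarize_insertions(spots_per_cell):
    """Return (mean insertions per cell, {insertion count: fraction of cells}).

    spots_per_cell: list of non-negative integers, one per imaged nucleus
    (hypothetical input; assumed to come from Nano-FISH spot counting).
    """
    if not spots_per_cell:
        raise ValueError("at least one cell is required")
    n_cells = len(spots_per_cell)
    mean = sum(spots_per_cell) / n_cells
    counts = Counter(spots_per_cell)
    # Fraction of cells observed with each insertion count, sorted by count.
    distribution = {k: counts[k] / n_cells for k in sorted(counts)}
    return mean, distribution


# Illustrative (made-up) spot counts for eight cells.
example_counts = [0, 1, 1, 2, 1, 0, 3, 1]
mean, dist = summarize_insertions(example_counts)
```

The spot count is only a proxy for the true integrant number, because a spot may be missed or two nearby spots may merge during imaging.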
  • Viral transduction of cells may be heterogeneous, producing cells with no viral integrant, a single copy of a viral integrant, or two or more copies of a viral integrant.
  • Using Nano-FISH, a pool of cells with a consistent number of viral integrants may be produced, wherein cells comprising an undesirable number of viral integrants (e.g., too many or no viral integrants) may be reduced or eliminated.
  • Viral integrants may be detected using the methods as described herein for Nano-FISH, also referred to herein as “viral Nano-FISH.” This may use microscopic imaging of fixed cells, and thus the imaged cells may not themselves be collected for subsequent use.
  • pairing the Nano-FISH with a statistical approach may allow for (i) inferring the distribution of viral integrants in subpools of cells expanding in culture, and (ii) combining subpools to create a refined pool of cells with a uniform viral integrant number.
  • the pool of cells with the uniform number of viral integrants may be a therapeutic used to treat a disease.
  • this method may be used for enriching populations of cells transduced to express chimeric antigen receptors (CARs) in order to deliver a cell population with a uniform number of CAR integrations to a patient as a cancer therapy.
  • the enrichment process may comprise the following steps: a) quantify the number of viral integrants in a sample from a source pool of cells; b) subdivide the remaining cells of the source pool into K subpools, each with approximately N cells (the value of N may be chosen to ensure a high likelihood of subpools having zero or a greatly reduced fraction of cells with more than one viral integrant); c) allow each subpool to undergo multiple cell divisions to create cell clones with identical numbers of viral integrants per cell; d) perform Nano-FISH on a representative sample from each subpool to assess the number of viral integrants in each cell; e) based on the assessment of step d) estimate the distribution of viral integrants for each subpool and eliminate the subpools with an unfavorable distribution of viral integrants; and f) combine the remaining subpools to create a single enriched pool comprising cells with a more homogeneous number of viral integrants.
  • the number of cell divisions and fraction of cells drawn for Nano-FISH analysis may be selected to ensure a high likelihood of detecting the presence of a multiple integration event given the random set of cells drawn.
  • any subpool may be eliminated if the proportion of cells with more than one viral integrant exceeds a specified threshold (which may be 0). Subpools may also be eliminated if the proportion of cells with no viral integrant is above a specified threshold. This secondary selection criterion may increase the relative abundance of the single viral integrant phenotype.
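Steps e) and f) of the enrichment process, together with the two elimination criteria just described, can be sketched as a simple filter. This is an illustrative sketch only: the function name `select_subpools`, the dictionary layout, and the default thresholds are assumptions, and the per-cell counts are assumed to come from the Nano-FISH sample drawn from each subpool.

```python
def select_subpools(subpool_counts, max_multi_fraction=0.0, max_zero_fraction=1.0):
    """Return the names of subpools that pass both elimination criteria.

    subpool_counts: {subpool name: list of integrant counts for sampled cells}
    max_multi_fraction: a subpool is eliminated if the fraction of sampled
        cells with more than one integrant exceeds this value (may be 0).
    max_zero_fraction: a subpool is eliminated if the fraction of sampled
        cells with no integrant exceeds this value.
    """
    kept = []
    for name, counts in subpool_counts.items():
        n = len(counts)
        if n == 0:
            # No cells were sampled, so the subpool cannot be assessed.
            continue
        multi_fraction = sum(c > 1 for c in counts) / n
        zero_fraction = sum(c == 0 for c in counts) / n
        if multi_fraction <= max_multi_fraction and zero_fraction <= max_zero_fraction:
            kept.append(name)
    return kept


# Made-up example: "B" has a multi-integrant cell, "C" is mostly untransduced.
subpools = {"A": [1, 1, 1, 0], "B": [1, 2, 1, 1], "C": [0, 0, 0, 1]}
enriched = select_subpools(subpools, max_multi_fraction=0.0, max_zero_fraction=0.5)
```

The subpools returned by the filter would then be combined into the enriched pool; skipping unsampled subpools rather than keeping them is a conservative choice made for this sketch.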
  • the above method for enrichment may allow numerous parameters to be specified in order to achieve a given goal. These parameters may include the number of cells per subpool, the number of subpools, the number of cell divisions (i.e., time in culture), and fraction of cells withdrawn for Nano-FISH.
  • the optimal protocol may depend on the underlying rate of multiple viral insertions and the probability of detecting a spot with Nano-FISH.
  • the approach may depend on the tolerance for allowing cells with multiple or no viral integrants into the enriched pool.
  • subpools may be enriched so that no cells comprise multiple integrants. To achieve this, for example, a statistical model may be used.
  • the optimal value of N may be 1/p.
  • the target number of cell division cycles D and fraction of cells F to be withdrawn for Nano-FISH may need to be determined. For this determination, all cells may undergo the same number of cell divisions, resulting in 2^D copies of each.
  • the probability of withdrawing k of the cells with 2 integrants in a fraction F of all cells in the subpool may be given by P(k
  • the likelihood of a Nano-FISH spot being detected may be S, then the overall probability of detection may be given by
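Because the expressions for P(k) and for the overall detection probability are cut off in this text, the sketch below does not reproduce them. It shows one plausible way to compute such a probability under explicitly stated assumptions: a subpool is founded by N cells, exactly one of which carries 2 integrants; every cell divides D times; a fraction F of the subpool is drawn without replacement (hypergeometric sampling); and a sampled cell is recognized as a double integrant only if both of its spots are detected, each independently with probability S. The function name and all of these modeling choices are hypothetical.

```python
from math import comb


def prob_detect_multi(n_founders, divisions, fraction, spot_sensitivity):
    """Probability that at least one double-integrant cell is recognized.

    Assumptions (not taken from the source): one founder cell with two
    integrants, hypergeometric sampling of the expanded subpool, and
    independent detection of each spot with probability spot_sensitivity.
    """
    clones = 2 ** divisions              # copies of the double-integrant cell
    total = n_founders * clones          # subpool size after expansion
    sample = round(fraction * total)     # number of cells withdrawn
    per_cell = spot_sensitivity ** 2     # both spots must be seen
    miss = 0.0
    for k in range(0, min(clones, sample) + 1):
        # Hypergeometric probability of drawing exactly k double-integrant cells.
        p_k = comb(clones, k) * comb(total - clones, sample - k) / comb(total, sample)
        # All k drawn cells must fail to show two spots for the event to be missed.
        miss += p_k * (1.0 - per_cell) ** k
    return 1.0 - miss
```

Under these assumptions the probability rises with D, F, and S, which is consistent with the statement that the number of divisions and the sampled fraction may be chosen to make detection of a multiple integration event likely.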
  • Determining the presence, absence, identity, spatial position or sequence position of a target nucleic acid sequence in a sample may be useful in determining a condition of a patient. For example, contacting a patient sample with a probe configured to bind to a particular nucleic acid sequence may be an expedient means of determining whether the patient has the nucleic acid sequence. Similarly, contacting a patient sample with a plurality of types of probes, each configured to bind to a different nucleic acid sequence, may be an expedient means of screening patients for various genetic or acquired conditions, such as inherited mutations.
  • the method may comprise an operation of providing one or more probes capable of binding to a target nucleic acid sequence, as described herein.
  • the method may comprise an operation of binding the one or more probes to the target nucleic acid sequence, as described herein.
  • the method may comprise an operation of imaging a signal associated with binding of the one or more probes to the target nucleic acid sequence, as described herein.
  • the spatial position of the nucleic acid sequence may be determined relative to features of the sample (such as features of a cell), structures of the sample (such as structures or organelles of the cell), or other nucleic acids by using the same or a different imaging modality to detect the reference features, structures, or nucleic acids. For instance, the spatial position of a nucleic acid sequence in a cell may be determined relative to the nucleus of the cell by using a plurality of antibodies with a detectable label to counter-label structures of the cell, such as the cell membrane. A cell line expressing a detectable label (such as a fusion protein with a structural protein expressed by the cell) may be used to determine the spatial position of a nucleic acid sequence in a cell. If the target nucleic acid sequence comprises a viral nucleic acid sequence, the spatial location of the viral nucleic acid sequence may be determined by the methods as described herein.
  • Data collected from detection of all or a portion of the detectable labels in a sample may be used to form one or more two-dimensional images or a three-dimensional rendering or to make calculations determining or estimating the spatial position of the target nucleic acid sequence.
  • a first probe comprising a first detectable label and a first probe sequence configured to bind to a nucleic acid sequence may be used as a reference position for a second probe comprising a second detectable label and a second probe sequence configured to bind to a second nucleic acid sequence (such as a second target nucleic acid sequence).
  • a first probe specific to a first target nucleic acid sequence of a nucleic acid with a known or anchored position on the nucleic acid may be used as a reference to determine the spatial position of a second target nucleic acid sequence bound by a second probe prior to or during imaging.
  • the method may comprise an operation of providing a first set of one or more probes capable of binding to one or more reference nucleic acid sequences with known positions in the genome, as described herein.
  • the method may comprise an operation of binding the first set of one or more probes to the one or more reference nucleic acid sequences, as described herein.
  • the method may comprise an operation of providing a second set of one or more probes capable of binding to a target nucleic acid sequence, as described herein.
  • the method may comprise an operation of binding the second set of one or more probes to the target nucleic acid sequence, as described herein.
  • the method may comprise an operation of detecting a signal associated with binding of the first set of one or more probes to the one or more reference nucleic acid sequences and of the second set of one or more probes to the target nucleic acid sequence, as described herein.
  • the method may comprise an operation of comparing the signals associated with binding of the first set of one or more probes to the reference nucleic acid sequences to the signal associated with binding of the second set of one or more probes to the target nucleic acid sequence.
  • a method of detecting or determining the presence of a nucleic acid sequence may comprise determining the sequence position of a nucleic acid sequence (such as a target nucleic acid sequence). For example, a probe with a probe sequence configured to recognize a first target sequence with a known position in the sequence of a nucleic acid may be used as reference for calculations or estimations of the sequence position of a second target nucleic acid sequence on the nucleic acid.
  • a first probe having a probe sequence configured to recognize a first target sequence with a first known position in the sequence of a nucleic acid and a second probe having a probe sequence configured to recognize a second target nucleic acid sequence with a second known position in the sequence of the nucleic acid may be used as reference points for a third probe configured to recognize a third target nucleic acid sequence with an unknown position in the nucleic acid.
  • the relative sequence position of the third target nucleic acid sequence may be determined or estimated by comparing it to the positions of the first and second target nucleic acid sequences, as indicated by the signals from the first and second probes.
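One simple way to compare a third signal against two reference signals is to project its imaged coordinates onto the line joining the two reference spots. The sketch below is a geometric heuristic only, offered under the strong assumption that spatial order along that line reflects sequence order; chromatin folding can violate this, and the function name and coordinate format are hypothetical.

```python
def relative_position(ref1_xyz, ref2_xyz, target_xyz):
    """Project the target spot onto the axis from reference 1 to reference 2.

    Returns a unitless fraction: 0.0 at reference 1, 1.0 at reference 2,
    values below 0 or above 1 for spots lying outside the two references.
    """
    axis = [b - a for a, b in zip(ref1_xyz, ref2_xyz)]
    offset = [t - a for a, t in zip(ref1_xyz, target_xyz)]
    axis_len_sq = sum(c * c for c in axis)
    if axis_len_sq == 0:
        raise ValueError("reference spots must be at distinct positions")
    # Scalar projection of the offset onto the axis, normalized by axis length.
    return sum(o * c for o, c in zip(offset, axis)) / axis_len_sq


# Made-up coordinates (arbitrary units): the target lies midway along the axis.
fraction = relative_position((0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (1.0, 1.0, 0.0))
```

A fraction between 0 and 1 would suggest, under the stated assumption, that the third target lies between the two reference sequences; averaging over many cells would be needed before drawing any conclusion.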
  • the method may comprise an operation of providing one or more probes capable of binding to a target nucleic acid sequence in a reference sample and a target nucleic acid sequence in a sample under test, as described herein.
  • the method may comprise an operation of binding the one or more probes to the target nucleic acid sequence in the reference sample and the target nucleic acid sequence in the sample under test, as described herein.
  • the method may comprise an operation of detecting a signal associated with binding of the set of one or more probes to the target nucleic acid sequence in the reference sample and the target nucleic acid sequence in the sample being tested, as described herein.
  • the method may comprise an operation of comparing the signal associated with binding of the one or more probes to the target nucleic acid sequence in the reference sample to the signal associated with binding of the one or more probes to the target nucleic acid sequence in the sample under test, as described herein.
  • the detection of a target nucleic acid sequence in a cell may be correlated with a target protein expression in the same cell.
  • the method may comprise providing one or more probes capable of binding to a target nucleic acid sequence in a sample and a target nucleic acid sequence in a sample being tested, as described herein, and further comprise providing one or more detectable labels to detect the target protein expression.
  • the presence, absence, or quantity of the detected target nucleic acid sequence may be correlated to the presence, absence, or quantity of the target protein expression. This information may be used to further investigate the relationship between the target nucleic acid sequence and the target protein, and/or how different treatments may perturb this correlation.
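The correlation between detected nucleic acid quantity and protein expression can be illustrated with a Pearson correlation coefficient computed over cells. This is a minimal sketch, assuming each cell yields one spot count and one protein signal intensity; the function name `pearson` and the choice of Pearson correlation (rather than any other measure) are assumptions, not the disclosed method.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Made-up per-cell data: spot counts and protein label intensities.
spot_counts = [0, 1, 2, 3]
protein_signal = [1.0, 3.0, 5.0, 7.0]
r = pearson(spot_counts, protein_signal)
```

Comparing the coefficient between treated and untreated samples would be one way to examine how a treatment perturbs the relationship.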
  • a viral nucleic acid sequence may be introduced into a cell by a viral vector, such as a virus particle, which may be called a virus or a virion.
  • a virus particle may also be introduced to a cell by a bacteriophage.
  • a virus particle may introduce a viral nucleic acid sequence into a cell through a series of steps that may include attachment (such as binding) of the virus particle to the cell membrane of the cell, internalization (such as penetration) of the viral particle into the cell (such as via formation of a vesicle around the virus particle), breakdown of the vesicle containing the virus particle (such as through uncoating, which may comprise breakdown of portions of the virus such as the viral coat), expression of the viral nucleic acid sequence or a portion thereof, processing and/or maturation of the viral nucleic acid sequence’s expression product, incorporation of the viral nucleic acid sequence or its expression product into a DNA sequence of the host cell, and/or replication of the viral nucleic acid sequence or a portion thereof.
  • a viral nucleic acid sequence may be targeted to the nucleus of the cell after internalization.
  • Introduction of a viral nucleic acid sequence into a cell by a virus particle may lead to permanent integration of the viral nucleic acid sequence into a DNA sequence of the cell.
  • a viral nucleic acid sequence introduced into a cell by a retrovirus, such as a lentivirus or adeno-associated virus, may be integrated directly into the DNA sequence of a cell.
  • Introduction of a viral nucleic acid sequence into a cell by a virus particle may not lead to integration into a DNA sequence of the cell.
  • a viral particle may be a double-stranded DNA (dsDNA) virus, a single-stranded DNA (ssDNA) virus, a double-stranded RNA (dsRNA) virus, a sense single-stranded RNA (+ssRNA) virus, or an antisense single-stranded RNA (-ssRNA) virus.
  • Some viral particles may introduce a reverse transcriptase, integrase, and/or protease (such as a reverse transcriptase encoded by a pol gene sequence, which may be a portion of the viral nucleic acid sequence) into the infected cell.
  • virus particles that introduce reverse transcriptase into an infected cell include single-stranded reverse transcriptase RNA (ssRNA-RT) viruses and double-stranded DNA reverse transcriptase (dsDNA-RT) viruses.
  • ssRNA-RT viruses include metaviridae, pseudoviridae, and retroviridae.
  • dsDNA-RT viruses include hepadnaviridae (e.g., Hepatitis B virus) and caulimoviridae.
  • Additional examples of viruses include lentiviruses, adenoviruses, adeno-associated viruses, and retroviruses.
  • a viral nucleic acid sequence may be introduced into a cell by a non-viral vector, such as a plasmid.
  • a plasmid may be a DNA polynucleotide encoding one or more genes.
  • a plasmid may comprise a viral nucleic acid sequence.
  • a viral nucleic acid sequence of a plasmid may encode a non-coding RNA (such as a transfer RNA, a ribosomal RNA, a microRNA, an siRNA, a snRNA, a shRNA, an exRNA, a piwi RNA, a snoRNA, a scaRNA, or a long non-coding RNA) or a coding RNA (such as a messenger RNA).
  • a coding RNA may be modified (such as by splicing, poly-adenylation, or addition of a 5’ cap) or translated into a polypeptide sequence (such as a protein) after being transcribed from a DNA nucleic acid sequence of a plasmid.
  • a sample described herein may be a fresh sample or a fixed sample.
  • the sample may be a fresh sample.
  • the sample may be a fixed sample.
  • the sample may be a live sample.
  • the sample may be subjected to a denaturing condition.
  • the sample may be cryopreserved.
  • the sample may be a cell sample.
  • the cell sample may be obtained from the cells or tissue of an animal.
  • the animal cell may comprise a cell from an invertebrate, fish, amphibian, reptile, or mammal.
  • the mammalian cell may be obtained from a primate, ape, equine, bovine, porcine, canine, feline, or rodent.
  • the mammal may be a primate, ape, dog, cat, rabbit, ferret, or the like.
  • the rodent may be a mouse, rat, hamster, gerbil, chinchilla, or guinea pig.
  • the bird cell may be from a canary, parakeet, or parrot.
  • the reptile cell may be from a turtle, lizard, or snake.
  • the fish cell may be from a tropical fish.
  • the fish cell may be from a zebrafish (such as Danio rerio).
  • the amphibian cell may be from a frog.
  • An invertebrate cell may be from an insect, arthropod, marine invertebrate, or worm.
  • the worm cell may be from a nematode (such as Caenorhabditis elegans).
  • the arthropod cell may be from a tarantula or hermit crab.
  • the cell sample may be obtained from a mammalian cell.
  • the mammalian cell may be an epithelial cell, connective tissue cell, hormone secreting cell, a nerve cell, a skeletal muscle cell, a blood cell, an immune system cell, or a stem cell.
  • a cell may be a fresh cell, live cell, fixed cell, intact cell, or cell lysate.
  • Cell samples can be any primary cell, such as a hematopoietic stem cell (HSC) or a naive or stimulated T cell (e.g., a CD4+ T cell).
  • Cell samples may be cells derived from a cell line, such as an immortalized cell line.
  • Exemplary cell lines include, but are not limited to, 293A cell line, 293FT cell line, 293F cell line, 293 H cell line, HEK 293 cell line, CHO DG44 cell line, CHO-S cell line, CHO-K1 cell line, Expi293F™ cell line, Flp-In™ T-REx™ 293 cell line, Flp-In™-293 cell line, Flp-In™-3T3 cell line, Flp-In™-BHK cell line, Flp-In™-CHO cell line, Flp-In™-CV-1 cell line, Flp-In™-Jurkat cell line, FreeStyle™ 293-F cell line, FreeStyle™ CHO-S cell line, GripTite™ 293 MSR cell line, GS-CHO cell line, HepaRG™ cell line, T-REx™ Jurkat cell line, Per.C6 cell line, T-REx™-293 cell line, T-REx™-CHO cell line, T-REx™-HeLa cell line, NC-HIMT cell line, PC12 cell line, A549 cells, and K562 cells.
  • the cell sample may be obtained from cells of a primate.
  • the primate may be a human or a non-human primate.
  • the cell sample may be obtained from a human.
  • the cell sample may comprise cells obtained from blood, urine, stool, saliva, lymph fluid, cerebrospinal fluid, synovial fluid, cystic fluid, ascites, pleural effusion, amniotic fluid, chorionic villus sample, vaginal fluid, interstitial fluid, buccal swab sample, sputum, bronchial lavage, Pap smear sample, or ocular fluid.
  • the cell sample may comprise cells obtained from a blood sample, an aspirate sample, or a smear sample.
  • the cell sample may be a circulating tumor cell sample.
  • a circulating tumor cell sample may comprise lymphoma cells, fetal cells, apoptotic cells, epithelial cells, endothelial cells, stem cells, progenitor cells, mesenchymal cells, osteoblast cells, osteocytes, hematopoietic stem cells (HSC) (e.g., a CD34+ HSC), foam cells, adipose cells, transcervical cells, circulating cardiocytes, circulating fibrocytes, circulating cancer stem cells, circulating myocytes, circulating cells from a kidney, circulating cells from a gastrointestinal tract, circulating cells from a lung, circulating cells from reproductive organs, circulating cells from a central nervous system, circulating hepatic cells, circulating cells from a spleen, circulating cells from a thymus, circulating cells from a thyroid, circulating cells from an endocrine gland, circulating cells from a parathyroid, circulating cells from a pituitary, circulating cells from an adrenal gland, circulating cells from islets of Langerhans, circulating cells from a pancreas, circulating cells from a hypothalamus, circulating cells from prostate tissues, circulating cells from breast tissues, circulating retinal cells, circulating ophthalmic cells, circulating auditory cells, circulating epidermal cells, circulating cells from the urinary tract, or combinations thereof.
  • the cell can be a T cell.
  • the T cell can be an engineered T cell transduced to express a chimeric antigen receptor (CAR) or engineered T cell receptor (TCR).
  • the CAR or TCR T cell can be engineered to bind to BCMA, CD19, CD22, WT1, L1CAM, MUC16, ROR1, or LeY.
  • a cell sample may be a peripheral blood mononuclear cell sample.
  • a cell sample may comprise cancerous cells.
  • the cancerous cells may form a cancer which may be a solid tumor or a hematologic malignancy.
  • the cancerous cell sample may comprise cells obtained from a solid tumor.
  • the solid tumor may include a sarcoma or a carcinoma.
  • Exemplary sarcoma cell samples may include, but are not limited to, cell samples obtained from alveolar rhabdomyosarcoma, alveolar soft part sarcoma, ameloblastoma, angiosarcoma, chondrosarcoma, chordoma, clear cell sarcoma of soft tissue, dedifferentiated liposarcoma, desmoid, desmoplastic small round cell tumor, embryonal rhabdomyosarcoma, epithelioid fibrosarcoma, epithelioid hemangioendothelioma, epithelioid sarcoma, esthesioneuroblastoma, Ewing sarcoma, extrarenal rhabdoid tumor, extraskeletal myxoid chondrosarcoma, extraskeletal osteosarcoma, fibrosarcoma, giant cell tumor, hemangiopericytoma, infantile fibrosarcoma, inflammatory myofibroblastic tumor, Kaposi sarcoma, leiomyosarcoma of bone, liposarcoma, liposarcoma of bone, malignant fibrous histiocytoma (MFH), malignant fibrous histiocytoma (MFH) of bone, malignant mesenchymoma, malignant peripheral nerve sheath tumor, mesenchymal chondrosarcoma, myxofibrosarcoma, myxoid liposarcoma, myxoinflammatory fibroblastic sarcoma, neoplasms with perivascular epithelioid cell differentiation, osteosarcoma, parosteal osteosarcoma, neoplasm with perivascular epithelioid cell differentiation, periosteal osteosarcoma, pleomorphic liposarcoma, pleomorphic rhabdomyosarcoma, PNET/extraskeletal Ewing tumor, rhabdomyosarcoma, round cell liposarcoma, small cell osteosarcoma, solitary fibrous tumor, synovial sarcoma, or telangiectatic osteosarcoma.
  • Exemplary carcinoma cell samples may include, but are not limited to, cell samples obtained from an anal cancer, appendix cancer, bile duct cancer (i.e., cholangiocarcinoma), bladder cancer, brain tumor, breast cancer, cervical cancer, colon cancer, cancer of unknown primary (CUP), esophageal cancer, eye cancer, fallopian tube cancer, gastroenterological cancer, kidney cancer, liver cancer, lung cancer, medulloblastoma, melanoma, oral cancer, ovarian cancer, pancreatic cancer, parathyroid disease, penile cancer, pituitary tumor, prostate cancer, rectal cancer, skin cancer, stomach cancer, testicular cancer, throat cancer, thyroid cancer, uterine cancer, vaginal cancer, or vulvar cancer.
  • the cancerous cell sample may comprise cells obtained from a hematologic malignancy.
  • The hematologic malignancy may comprise a leukemia, a lymphoma, a myeloma, a non-Hodgkin’s lymphoma, or a Hodgkin’s lymphoma.
  • the hematologic malignancy may be a T-cell based hematologic malignancy.
  • the hematologic malignancy may be a B-cell based hematologic malignancy.
  • Exemplary B-cell based hematologic malignancies may include, but are not limited to, chronic lymphocytic leukemia (CLL), small lymphocytic lymphoma (SLL), high risk CLL, a non-CLL/SLL lymphoma, prolymphocytic leukemia (PLL), follicular lymphoma (FL), diffuse large B-cell lymphoma (DLBCL), mantle cell lymphoma (MCL), Waldenstrom’s macroglobulinemia, multiple myeloma, extranodal marginal zone B cell lymphoma, nodal marginal zone B cell lymphoma, Burkitt’s lymphoma, non-Burkitt high grade B cell lymphoma, primary mediastinal B-cell lymphoma (PMBL), immunoblastic large cell lymphoma, precursor B-lymphoblastic lymphoma, B cell prolymphocytic leukemia, lymphoplasmacytic lymphoma, sple
  • angioimmunoblastic lymphoma, cutaneous T-cell lymphoma, adult T-cell leukemia/lymphoma (ATLL), blastic NK-cell lymphoma, enteropathy-type T-cell lymphoma, hematosplenic gamma-delta T-cell lymphoma, lymphoblastic lymphoma, nasal NK/T-cell lymphomas, or treatment-related T-cell lymphomas.
  • a cell sample described herein may comprise a tumor cell line sample.
  • Exemplary tumor cell line samples may include, but are not limited to, cell samples from tumor cell lines such as 600MPE, AU565, BT-20, BT-474, BT-483, BT-549, Evsa-T, Hs578T, MCF-7, MDA-MB-231, SkBr3, T-47D, HeLa, DU145, PC3, LNCaP, A549, H1299, NCI-H460, A2780, SKOV-3/Luc, Neuro2a, RKO, RKO-AS45-1, HT-29, SW1417, SW948, DLD-1, SW480, Capan-1, MC/9, B72.3, B25.2, B6.2, B38.1, DMS 153, SU.86.86, SNU-182, SNU-423, SNU-449, SNU-475, SNU-387, Hs 817.T, LMH, LMH/2A, S
  • a cell sample may comprise cells obtained from a biopsy sample, necropsy sample, or autopsy sample.
  • the cell samples may be obtained from an individual by any suitable means of obtaining the sample using well-known and routine clinical methods.
  • Procedures for obtaining tissue samples from an individual are well known. For example, procedures for drawing and processing tissue sample such as from a needle aspiration biopsy are well-known and may be employed to obtain a sample for use in the methods provided.
  • a thin hollow needle is inserted into a mass such as a tumor mass for sampling of cells that, after being stained, will be examined under a microscope.
  • a cell may be a live cell.
  • a cell may be a eukaryotic cell.
  • a cell may be a yeast cell.
  • a cell may be a plant cell.
  • a cell may be obtained from an agricultural plant.
  • the present disclosure provides methods of high-throughput assaying of target nucleic acid cells in a multi-well format.
  • the present disclosure provides methods for depositing cells in at least 24 wells, hybridizing oligonucleotide Nano-FISH probes with cells after denaturation, covering cells in each well with a glass coverslip, and imaging the cells with the microscopy techniques disclosed herein.
  • PLL-coated 24-well glass-bottom plates can be used to hold 24 samples, wherein each sample contains a cell population.
  • the cell population in each well can be the same or the cell population in each well can be different.
  • at least 24 unique samples can be processed at the same time.
  • Cells can be deposited into the 24-well plate, treated with fixative solution (e.g., 4% formaldehyde in 1X PBS or 3 parts methanol and 1 part glacial acetic acid), washed, and hybridized to oligonucleotide Nano-FISH probes.
  • the 24-well plate can then be washed and cells can be mounted with glass coverslips containing an anti-fade solution (e.g., Prolong Gold) prior to imaging.
  • from 1 to 10 plates can be processed simultaneously.
  • Optical Detection of Surrogate Protein Markers (e.g., p53BP1) and Nucleic Acid Sequences
  • Disclosed herein are methods of detecting a protein, such as a surrogate protein marker (e.g., p53BP1) of a cellular response induced by a cellular perturbation (e.g., genome editing), and methods of detecting a nucleic acid sequence.
  • the detection may encompass identification of the nucleic acid sequence, determining the presence or absence of the nucleic acid sequence, and/or determining the activity of the nucleic acid sequence.
  • a method of detecting a nucleic acid sequence may include contacting a cell sample with a detection agent, binding the detection agent to the nucleic acid sequence, and analyzing a detection profile from the detection agent to determine the presence, absence, or activity of the nucleic acid sequence.
  • the method may involve utilizing one or more intrinsic properties associated with a detection agent to aid in detection of the nucleic acid sequence.
  • the intrinsic properties may encompass the size of the detection agent, the intensity of the signal, and the location of the detection agent.
  • the size of the detection agent, which may include the length of the probe and/or the size of the detectable moiety (such as the size of a fluorescent dye molecule), may modulate the specificity of interaction with a regulatory element.
  • the intensity of the signal from the detection agent may correlate to the sensitivity of detection.
  • a detection agent with a molar extinction coefficient of about 0.5-5 x 10^6 M^-1 cm^-1 may have a higher intensity signal relative to a detection agent with a molar extinction coefficient outside of the 0.5-5 x 10^6 M^-1 cm^-1 range and may have lower attenuation due to scattering and absorption.
  • a detection agent with a longer excited state lifetime and a large Stokes shift may further improve the sensitivity of detection.
  • the location of the detection agent may, for example, provide the activity state of a nucleic acid sequence.
  • a combination of intrinsic properties of the detection agent may be used to detect a regulatory element of interest.
  • a detection agent may comprise a detectable moiety that is capable of generating a light, and a probe portion that is capable of hybridizing to a target site on a nucleic acid sequence.
  • a detection agent may include a DNA probe portion, an RNA probe portion, a polypeptide probe portion, or a combination thereof.
  • a DNA or RNA probe portion may be between about 10 and about 100 nucleotides in length.
  • a DNA or RNA probe portion may be a TALEN probe, ZFN probe, or a CRISPR probe.
  • a DNA or RNA probe portion may be a padlock probe.
  • a polypeptide probe may comprise a DNA-binding protein, an RNA-binding protein, a protein that is involved in or detects the transcription/translation process, a protein that may detect an open or relaxed portion of a chromatin, or a protein interacting partner of a product of a regulatory element (such as an antibody or binding fragment thereof).
  • a detection agent may comprise a DNA or RNA probe portion which may be between about 10 and about 100 nucleotides in length.
  • a set of detection agents may be used to detect a nucleic acid sequence.
  • the set of detection agents used for detection of a nucleic acid sequence may comprise about 2 to about 20, or more, detection agents.
  • a detection agent may comprise a polypeptide probe selected from a DNA-binding protein, an RNA-binding protein, a protein that is involved in or detects the transcription/translation process, a protein that may detect an open or relaxed portion of a chromatin, or a protein interacting partner of a product of a regulatory element (such as an antibody or binding fragment thereof).
  • a detectable moiety that is capable of generating a light may be directly conjugated or bound to a probe portion.
  • a detectable moiety may be indirectly conjugated or bound to a probe portion by a conjugating moiety.
  • a detectable moiety may be a small molecule (such as a dye) which may be directly conjugated or bound to a probe portion.
  • a detectable moiety may be a fluorescently labeled protein or molecule which may be attached to a conjugating moiety (such as a hapten group, an azido group, an alkyne group) of a probe.
  • a profile or a detection profile or signature may include the signal intensity, signal location, and/or size of the signal of the detection agent.
  • the profile or the detection profile may comprise about 100 image frames to about 50,000 frames, or more image frames.
  • Analysis of the profile or the detection profile may determine the activity of the regulatory element. The degree of activation may also be determined from the analysis of the profile or detection profile. Analysis of the profile or the detection profile may further determine the optical isolation and localization of the detection agents, which may correlate to the localization of the nucleic acid sequence.
  • the method may comprise an operation of providing one or more probes capable of binding to a target nucleic acid sequence, as described herein.
  • the method may comprise an operation of binding the one or more probes to the target nucleic acid sequence, as described herein.
  • the method may comprise an operation of photobleaching the one or more probes at one or more wavelengths, as described herein.
  • the method may comprise an operation of detecting a profile of optical emissions associated with the photobleaching, as described herein.
  • the method may comprise an operation of analyzing the detection profile to determine the localization of the target nucleic acid sequence, as described herein.
  • the localization of a nucleic acid sequence may include contacting a nucleic acid sequence with a first set of detection agents, photobleaching the first set of detection agents for a first time point at a first wavelength to generate a second set of detection agents capable of generating a light at a second wavelength, detecting at least one burst generated by the second set of detection agents to generate a detection profile of the second set of detection agents, and analyzing the detection profile to determine the localization of the nucleic acid sequence.
  • a detection agent may comprise a detectable moiety that is capable of generating a light, and a probe portion that is capable of hybridizing to a target site on a nucleic acid sequence.
  • Each detection agent within the first set of detection agents may have the same or a different detectable moiety.
  • Each detection agent within the first set of detection agents may have the same detectable moiety.
  • a detectable moiety may comprise a small molecule (such as a fluorescent dye).
  • a detectable moiety may comprise a fluorescently labeled polypeptide, a fluorescently labeled nucleic acid probe, and/or a fluorescently labeled polypeptide complex.
  • a second set of detection agents may be generated from the first set of detection agents, in which the second set may include detection agents that are capable of generating a burst of light detectable at a second wavelength.
  • bleaching of the set of detection agents may cause about 50% or more of the detection agents within the set to enter an “OFF-state”.
  • An “OFF-state” may be a dark state in which the detectable moiety crosses from the singlet excited electronic state (ON-state) to the triplet electronic state (OFF-state), in which detection of light (such as fluorescence) may be low (for instance, less than 10%, less than 5%, less than 1%, or less than 0.5% of light may be detected).
  • the remainder of the detection agents that have not entered into the OFF-state may generate bursts of light, or may cycle between a singlet excited electronic state (or ON-state) and a singlet ground electronic state.
  • bleaching of the set of detection agents may result in about 40% or fewer of the detection agents within the set generating bursts of light.
  • the bursts of light may be detected stochastically, at a single-burst level in which each burst of light correlates to a single detection agent.
  • a single wavelength may be used for photobleaching a set of detection agents. At least two wavelengths may be used for photobleaching a set of detection agents.
  • a wavelength of 491 nm may be used.
  • a wavelength of 405 nm may be used in combination with the wavelength of 491 nm.
  • the two wavelengths may be applied simultaneously to photobleach a set of detection agents.
  • the two wavelengths may be applied sequentially to photobleach a set of detection agents.
  • the time for photobleaching a set of detection agents may be from about 10 seconds to about 4 hours, or more.
  • the concentration of the detection agents may be from about 5 nM to about 1 mM.
  • the bursts of light from the set of detection agents may generate a detection profile.
  • the detection profile may comprise about 100 image frames to about 50,000 frames, or more image frames.
  • the detection profile may also include the signal intensity, signal location, or size of the signal. Analysis of the detection profile may determine the optical isolation and localization of the detection agents, which may correlate to the localization of the nucleic acid sequence.
  • the detection profile may comprise a chromatic aberration correction.
  • the detection profile may comprise less than 5% or 0% chromatic aberration.
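  • The burst detection and localization steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the function name `localize_bursts`, the fixed `min_peak` detection rule, the window `radius`, and the assumption of at most one burst per frame are all hypothetical simplifications.

```python
import numpy as np

def localize_bursts(frames, min_peak=50.0, radius=2):
    """Find one burst per frame and return (centroids, mean position).

    frames: array of shape (n_frames, height, width).
    Assumption: sparse blinking, so each frame holds at most one burst.
    """
    frames = np.asarray(frames, dtype=float)
    centroids = []
    for frame in frames:
        # Remove a flat background estimate and discard negative residue.
        signal = np.clip(frame - np.median(frame), 0.0, None)
        if signal.max() < min_peak:
            continue  # dark frame: no detection agent emitting
        py, px = np.unravel_index(np.argmax(signal), signal.shape)
        y0, y1 = max(py - radius, 0), min(py + radius + 1, signal.shape[0])
        x0, x1 = max(px - radius, 0), min(px + radius + 1, signal.shape[1])
        window = signal[y0:y1, x0:x1]
        ys, xs = np.mgrid[y0:y1, x0:x1]
        total = window.sum()
        # Intensity-weighted centroid of the burst inside the window.
        centroids.append(((ys * window).sum() / total,
                          (xs * window).sum() / total))
    centroids = np.array(centroids).reshape(-1, 2)
    estimate = centroids.mean(axis=0) if len(centroids) else None
    return centroids, estimate
```

  • Averaging the per-frame centroids is one simple way to turn many image frames into a single position estimate; a production pipeline would more likely fit a point-spread-function model to each burst.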
  • More than one nucleic acid sequence may be detected at the same time. Sometimes, at least 2 to at least 20 or more nucleic acid sequences may be detected at the same time. Each of the nucleic acid sequences may be detected by a set of detection agents. The detectable moiety between the different sets of detection agents may be the same. For example, two different sets of detection agents may be used to detect two different nucleic acid sequences and the detectable moieties from the two sets of detection agents may be the same. As such, at least 2 to at least 20 or more nucleic acid sequences may be detected at the same time at the same wavelength. The detectable moiety between the different sets of detection agents may also be different.
  • two different sets of detection agents may be used to detect two different nucleic acid sequences and the detectable moiety from one set of detection agents may be detected at a different wavelength from the detectable moiety of the second set of detection agents.
  • at least 2 to at least 20, or more nucleic acid sequences may be detected at the same time in which each of the nucleic acid sequences may be detected at a different wavelength.
  • the nucleic acid sequence may comprise DNA, RNA, polypeptides, or a combination thereof.
  • the activity of a target nucleic acid sequence may be measured utilizing the methods described herein.
  • the methods may include detection of a nucleic acid sequence and one or more products of the nucleic acid sequence.
  • One or more products of the nucleic acid sequence may also include intermediate products or elements.
  • the method may comprise contacting a cell sample with a first set and a second set of detection agents, in which the first set of detection agents interacts with a target nucleic acid sequence within the cell and the second set of detection agents interacts with at least one product of the target nucleic acid sequence, and analyzing a detection profile from the first set and the second set of detection agents, in which the presence or the absence of the at least one product indicates the activity of the target nucleic acid sequence.
  • a detection agent may comprise a detectable moiety that is capable of generating a light, and a probe portion that is capable of hybridizing to a target site on a nucleic acid sequence.
  • Each detection agent within the first set of detection agents may have the same or a different detectable moiety.
  • Each detection agent within the first set of detection agents may have the same detectable moiety.
  • a detectable moiety may comprise a small molecule (such as a fluorescent dye).
  • a detectable moiety may comprise a fluorescently labeled polypeptide, a fluorescently labeled nucleic acid probe, and/or a fluorescently labeled polypeptide complex.
  • the method may also allow photobleaching of the first set and the second set of detection agents, thereby generating a subset of detection agents capable of generating a burst of light.
  • a detection profile may be generated from the detection of a set of light bursts, in which the presence or the absence of the at least one product may indicate the activity of the target nucleic acid sequence.
  • the nucleic acid sequence may comprise DNA, RNA, polypeptides, or a combination thereof.
  • the nucleic acid sequence may be DNA.
  • the nucleic acid sequence may be RNA.
  • the nucleic acid sequence may be an enhancer RNA (eRNA).
  • the presence of an eRNA may correlate with target gene transcription that is downstream of eRNA.
  • the nucleic acid sequence may be a DNase I hypersensitive site (DHS).
  • the DHS may be an activated DHS.
  • the pattern of the DHS on a chromatin may correlate to the activity of the chromatin.
  • the nucleic acid sequence may be a polypeptide, such as a transcription factor, a DNA- or RNA-binding protein or binding fragment thereof, or a polypeptide that is involved in chemical modification.
  • the nucleic acid sequence may be chromatin.
  • the imaging and image analysis techniques disclosed below can be used to analyze protein markers (e.g., p53BP1) of cellular perturbation and/or Nano-FISH.
  • a microscopy method may be an air or an oil immersion microscopy method used in a conventional microscope, a holographic or tomographic imaging microscope, or an imaging flow cytometer instrument.
  • imaging flow cytometers such as the ImageStream (EMD Millipore), conventional microscopes, or commercial high-content imagers (such as the Operetta (Perkin Elmer), IN Cell (GE), etc.) deploying wide-field and/or confocal imaging modes may achieve sub-cellular resolution to detect signals of interest.
  • DAPI (4',6-diamidino-2-phenylindole) stain may be used to identify cell nuclei and another stain may be used to identify cells containing a nuclease protein.
  • a microscopy method may utilize a super-resolution microscopy, which allows images to be taken with a higher resolution than the diffraction limit.
  • a super-resolution microscopy method may utilize a deterministic super-resolution microscopy method, which utilizes a fluorophore’s nonlinear response to excitation to enhance resolution.
  • Exemplary deterministic super-resolution methods may include stimulated emission depletion (STED), ground state depletion (GSD), reversible saturable optical linear fluorescence transitions (RESOLFT), and/or saturated structured illumination microscopy (SSIM).
  • a super-resolution microscopy method may also include a stochastic super-resolution microscopy method, which utilizes a complex temporal behavior of a fluorophore, to enhance resolution.
  • Exemplary stochastic super-resolution methods may include super-resolution optical fluctuation imaging (SOFI) and all single-molecule localization methods (SMLM), such as spectral precision determination microscopy (SPDM), SPDMphymod, photo-activated localization microscopy (PALM), fluorescence photo-activated localization microscopy (FPALM), selective plane illumination microscopy (SPIM), stochastic optical reconstruction microscopy (STORM), and dSTORM.
  • a microscopy method may be a single-molecule localization method (SMLM).
  • a microscopy method may be a spectral precision determination microscopy (SPDM) method.
  • an SPDM method may rely on stochastic bursts or blinking of fluorophores and subsequent temporal integration of signals to achieve lateral resolution at, for example, between about 10 nm and about 100 nm.
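  • As a rough guide to why temporal integration of bursts can reach tens of nanometers, a commonly used first-order approximation puts the localization precision at the point-spread-function width divided by the square root of the number of detected photons. The disclosure does not state this formula; the sketch below is an illustrative assumption that ignores pixelation and background, and the function name and example values are hypothetical.

```python
import math

def localization_precision_nm(psf_sigma_nm, photons):
    """First-order estimate: precision ~ psf_sigma / sqrt(photons).

    Assumption: background noise and finite pixel size are ignored.
    """
    if photons <= 0:
        raise ValueError("photons must be positive")
    return psf_sigma_nm / math.sqrt(photons)

# Hypothetical example: a 120 nm PSF standard deviation and 100 photons.
example_precision = localization_precision_nm(120.0, 100)
```

  • Under these assumed values the estimate falls near the lower end of the 10 nm to 100 nm range quoted above; collecting fewer photons per burst moves it toward the upper end.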
  • a microscopy method may be a spatially modulated illumination (SMI) method.
  • a SMI method may utilize phased lasers and interference patterns to illuminate specimens and increase resolution by measuring the signal in fringes of the resulting Moire patterns.
  • a microscopy method may be a synthetic aperture optics (SAO) method.
  • a SAO method may utilize a low magnification, low numerical aperture (NA) lens to achieve large field of view (FOV) and depth of field, without sacrificing spatial resolution.
  • an SAO method may comprise illuminating the detection agent-labeled target (such as a target protein agglomeration or nucleic acid sequence) with a predetermined number (N) of selective excitation patterns, where the number (N) of selective excitation patterns is determined based upon the detection agent’s physical characteristics corresponding to spatial frequency content (such as the size, shape, and/or spacing of the detection agents on the imaging target) from the illuminated target, optically imaging the illuminated target at a resolution insufficient to resolve the objects on the target, and processing optical images of the illuminated target using information on the selective excitation patterns to obtain a final image of the illuminated target at a resolution sufficient to resolve the objects on the target.
  • the number (N) of selective excitation patterns may correspond to the number of k-space sampling points in a k-space sampling space in a frequency domain, with the extent of the k-space sampling space being substantially proportional to an inverse of a minimum distance (Δx) between the objects that is to be resolved by SAO, and with the inverse of the k-space sampling interval between the k-space sampling points being less than a width (w) of a detected area captured by a pixel of a system for said optical imaging.
  • the number (N) may include a function of various parameters of the imaging system (such as a magnification of the objective lens, numerical aperture of the objective lens, wavelength of the light emitted from the imaging target, and/or effective pixel size of the pixel sensitive area of the image detector, etc.).
  • a SAO method may analyze a set of detection agent profiles from at least 100, at least 200, at least 250, at least 500, at least 1000, or more cells imaged simultaneously within one field of view utilizing an imaging instrument.
  • the one field of view may be a single wide field of view (FOV) allowing image capture of at least 50, at least 100, at least 200, at least 250, at least 500, at least 1000, or more cells.
  • the single wide field of view may be about 0.70 mm by about 0.70 mm field of view.
  • the SAO imaging instrument may enable a resolution of about 0.25 µm with a 20X/0.45NA lens.
  • the SAO imaging instrument may enable a depth of field of about 2.72 µm with a 20X/0.45NA lens.
  • the imaging instrument may enable a working distance of about 7 mm with a 20X/0.45NA lens.
  • the imaging instrument may enable a z-stack of 1 with a 20X/0.45NA lens.
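  • To put the quoted 20X/0.45NA figures in context, the sketch below evaluates two textbook estimates for a conventional (non-SAO) system: the Abbe lateral limit λ/(2·NA) and the wave-optical depth-of-field term n·λ/NA². The 550 nm emission wavelength and imaging in air (n = 1) are assumptions chosen for illustration; the disclosure does not state a wavelength or how its figures were derived.

```python
def abbe_lateral_limit_um(wavelength_um, numerical_aperture):
    """Conventional diffraction-limited lateral resolution, lambda / (2 NA)."""
    return wavelength_um / (2.0 * numerical_aperture)

def depth_of_field_um(wavelength_um, numerical_aperture, refractive_index=1.0):
    """Wave-optical depth-of-field term, n * lambda / NA^2."""
    return refractive_index * wavelength_um / numerical_aperture ** 2

# Assumed values: 0.55 um emission, NA 0.45, air immersion.
conventional_limit = abbe_lateral_limit_um(0.55, 0.45)
conventional_dof = depth_of_field_um(0.55, 0.45)
```

  • Under these assumptions the depth-of-field term comes out close to the 2.72 µm quoted above, while the conventional lateral limit is roughly 0.6 µm, which is coarser than the approximately 0.25 µm resolution attributed to the SAO instrument with the same lens.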
  • the SAO method may further integrate and interpolate 3-dimensional images from 2-dimensional images.
  • the SAO method may enable the image acquisition of cell images at high spatial resolution and FOV.
  • the SAO method may provide a FOV that is at least about 1.5x, at least about 2x, at least about 3x, at least about 4x, at least about 5x, at least about 6x, at least about 7x, at least about 8x, at least about 9x, at least about 10x, at least about 15x, at least about 20x, or more as compared to a FOV provided by a method of microscope imaging using a 40x or 60x objective.
  • the SAO imaging instrument may be, for example, an SAO instrument as described in U.S. Patent Publication No. 2011/0228073 (Lee et al.).
  • the SAO imaging instrument may be, for example, a StellarVision™ imaging platform supplied by Optical Biosystems, Inc. (Santa Clara, CA).
  • Fluorescence images may be processed by a method for analysis of, e.g., cell nuclei, target protein agglomerations (e.g., p53BP1), diffused localization of target proteins, and/or FISH signals.
  • the method may comprise obtaining a fluorescence image of one or more probes bound to one or more target proteins or nucleic acid sequences, as described herein.
  • the method may comprise deconvolving the image one or more times, as described herein.
  • the method may comprise generating a region of interest (ROI) from the deconvolved image, as described herein.
  • the method may comprise analyzing the ROI to determine the locations of all target proteins or nucleic acid sequences, as described herein.
  • Images obtained using the systems and methods described herein may be subjected to an image analysis method.
  • the images may be obtained using the epifluorescence imaging systems and methods described herein.
  • the image may be obtained using the super-resolution imaging systems and methods described herein.
  • the image analysis method may allow a quantitative morphometric analysis to be conducted on regions of interest (ROIs) within the images.
  • the image analysis method may be implemented using Matlab, Script, Python, Java, Perl, Visual Studio, C, or ImageJ.
  • the image analysis method may be adapted from methods for processing fluorescence
  • the image analysis method may be fully automated and/or tunable by the user.
  • the image analysis method may be configurable to identify p53BP1 foci regardless of the shapes of the foci.
  • the image analysis method may be configurable to process two-dimensional and/or three-dimensional images.
  • the image analysis method may allow high throughput of estimation of cell count and boundaries in cell populations, which may be obtained with a speed-up of at least about 2 times, at least about 5 times, at least about 10 times, at least about 15 times, at least about 20 times, at least about 25 times, at least about 30 times, at least about 35 times, at least about 40 times, at least about 45 times, at least about 50 times, at least about 100 times, or more, as compared to manual identification and counting of cell populations.
  • the image analysis method may comprise a deconvolution of the image.
  • the deconvolution process may improve the contrast and resolution of cell images for further analysis.
  • the image analysis method may comprise an iterative deconvolution of the image.
  • the image analysis method may comprise 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 iterations of deconvolving the image.
  • the image analysis method may comprise more than 1, more than 2, more than 3, more than 4, more than 5, more than 6, more than 7, more than 8, more than 9, or more than 10 iterations of deconvolving the image.
  • the deconvolution procedure may remove or reduce out-of-focus blur or other sources of noise in the epifluorescence images or super-resolution images, thereby enhancing the signal-to-noise ratio (SNR) within ROIs.
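  • The iterative deconvolution described above can be sketched with the Richardson-Lucy algorithm. The disclosure does not name a specific deconvolution algorithm, so Richardson-Lucy is swapped in here as a common choice; the function names, the circular (FFT-based) blur model, and the flat initial estimate are assumptions made for illustration.

```python
import numpy as np

def psf_to_otf(psf, shape):
    """Zero-pad a small PSF to `shape`, centre it on the origin, return its FFT."""
    psf = np.asarray(psf, dtype=float)
    padded = np.zeros(shape, dtype=float)
    ph, pw = psf.shape
    padded[:ph, :pw] = psf / psf.sum()  # normalise so total flux is preserved
    padded = np.roll(padded, shift=(-(ph // 2), -(pw // 2)), axis=(0, 1))
    return np.fft.fft2(padded)

def blur(image, otf):
    """Circular convolution of `image` with the PSF represented by `otf`."""
    return np.real(np.fft.ifft2(np.fft.fft2(image) * otf))

def richardson_lucy(image, psf, iterations=10, eps=1e-12):
    """Iteratively sharpen a 2-D image given an estimate of the PSF."""
    image = np.clip(np.asarray(image, dtype=float), 0.0, None)
    otf = psf_to_otf(psf, image.shape)
    otf_mirror = np.conj(otf)  # FFT of the mirrored PSF
    estimate = np.full(image.shape, image.mean())
    for _ in range(iterations):
        # Predict the blurred image from the current estimate, then apply
        # the multiplicative correction that pulls it toward the data.
        predicted = np.maximum(blur(estimate, otf), eps)
        estimate = estimate * blur(image / predicted, otf_mirror)
    return estimate
```

  • The `iterations` argument corresponds to the number of deconvolution passes mentioned above. More iterations sharpen further but can also amplify noise, so the count is usually tuned per imaging setup.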
  • the image analysis method may further comprise an identification of the ROIs (e.g., candidate cells).
  • the ROIs may be identified using an automated detection method.
  • the ROIs may be identified by processing the raw or deconvolved or reconstructed or pre-processed images by applying a segmentation algorithm. This may allow the rapid delineation of ROIs within the epifluorescence or super-resolution images, thereby allowing scalability of processing images.
  • the segmentation of ROIs may comprise planarization of three-dimensional images (e.g., generated by z-stacking to obtain three-dimensional cell volumes) by utilizing a maximum intensity projection image to generate a two-dimensional ROI mask.
  • the two-dimensional ROI mask may act as a template for an initial three-dimensional mask.
  • the initial three-dimensional mask may be generated by projecting the two-dimensional ROI mask into a third spatial dimension.
  • the projection may be a weighted projection.
  • the initial three-dimensional mask may be further refined to obtain a refined three-dimensional ROI mask. Refinement of the initial three-dimensional mask may be achieved utilizing adaptive thresholding and/or region growing methods.
  • Refinement of the initial three-dimensional mask may be achieved by iteratively applying adaptive thresholding and/or region growing methods.
  • the iterative procedure may result in a final three-dimensional ROI mask.
  • the final three-dimensional ROI mask may comprise information regarding the locations of all fluorescently-labeled proteins or FISH-labeled nucleic acid sequences within each cell in a sample.
  • the segmentation may detect ROIs using two-dimensional or three-dimensional computer vision methods such as edge detection and morphology.
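  • The projection-based segmentation described above (maximum intensity projection, two-dimensional mask, projection into the third dimension, then refinement) can be sketched as follows. The function name, the two fraction parameters, and the specific refinement rule (a per-pixel threshold relative to the projected maximum) are hypothetical stand-ins for the adaptive thresholding step; region growing and iteration are omitted for brevity.

```python
import numpy as np

def segment_stack(stack, mask_fraction=0.5, refine_fraction=0.5):
    """Return (2-D ROI mask, refined 3-D ROI mask) for a z-stack.

    stack: array of shape (z, y, x).
    """
    stack = np.asarray(stack, dtype=float)
    mip = stack.max(axis=0)                       # maximum intensity projection
    mask2d = mip > mask_fraction * mip.max()      # two-dimensional ROI mask
    # Initial 3-D mask: copy the 2-D template into every z-plane.
    initial3d = np.broadcast_to(mask2d, stack.shape)
    # Refinement: keep voxels that exceed a threshold that adapts to the
    # brightest value found along z at the same (y, x) position.
    local_threshold = refine_fraction * mip
    refined3d = initial3d & (stack > local_threshold)
    return mask2d, refined3d
```

  • Because the two-dimensional mask is computed once and reused for every plane, the expensive detection step scales with the image area rather than the full volume.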
  • the ROIs may include cell nuclei, protein (e.g., p53BP1) foci, FISH foci, nuclease localization, or a combination thereof within each cell in a cell population within a field of view (FOV).
  • the image analysis method may further comprise feature extraction/computation from the segmented ROIs (e.g., detected candidate cells). Such sets of features may be selected to enable high performance (e.g., accuracy, throughput, sensitivity, specificity, etc.) of identifying/counting ROIs. Morphological features/parameters may be extracted from the segmented ROIs, such as count, spatial location, size (area/volume), shape
  • image parameters may also be extracted from the segmented ROIs, such as quantitative measures of image texture that may be pixel-based or region-based over a tunable length scale (e.g., nuclear diameter, nuclear area, nuclear volume, perimeter, surface area, DNA content, DNA texture measures).
  • extracted features may include number of protein marker foci, size of protein marker foci, shape of protein marker foci, amount of protein marker per cell, spatial location and localization pattern of protein marker foci.
  • for ROIs that include nuclease localization, extracted features may include number of nucleases per cell, amount of nuclease per cell, nuclease localization or texture, number of cell engineering tool foci, size of cell engineering tool foci, shape of cell engineering tool foci, amount of cell engineering tool foci per cell, spatial location and localization pattern of cell engineering tool foci.
  • additional features may be extracted, such as number, size, shape, amount, spatial location and localization pattern of Nano-FISH foci.
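A minimal sketch of feature computation from segmented ROIs, assuming the segmentation is available as a two-dimensional label image (0 for background, a positive integer per focus). Only count, area, and centroid are computed; the function name `roi_features` and the output layout are hypothetical.

```python
from collections import defaultdict


def roi_features(labels):
    """Compute area (pixel count) and centroid (row, col) for each labeled ROI."""
    pixels = defaultdict(list)
    for r, row in enumerate(labels):
        for c, label in enumerate(row):
            if label:
                pixels[label].append((r, c))
    features = {}
    for label, coords in pixels.items():
        area = len(coords)
        centroid_row = sum(r for r, _ in coords) / area
        centroid_col = sum(c for _, c in coords) / area
        features[label] = {"area": area, "centroid": (centroid_row, centroid_col)}
    return features
```

The number of foci is then `len(roi_features(labels))`; per-cell counts would additionally require assigning each focus to a nucleus mask.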
  • further informatics and analysis may be performed based on the image analysis results. For example, specificity analysis may be performed by analyzing locations of co-localization between Nano-FISH-labeled genomic loci and p53BP1.
  • cell images with co-localization between Nano-FISH-labeled genomic loci and p53BP1 may indicate samples with high potency and specificity of nuclease activity (e.g., with minimal off-target effects), while cell images without co-localization between immunoNanoFISH and p53BP1 may indicate samples with issues such as decreased potency of nuclease activity, decreased specificity of nuclease activity (e.g., with some off-target effects), or that an editing event was not detected by the assay.
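The co-localization analysis can be sketched as a distance test between focus centroids. This is a minimal sketch assuming that each protein focus and each Nano-FISH spot is reduced to a centroid coordinate and that "co-localized" means lying within a caller-chosen distance; both the distance criterion and the name `split_foci_by_colocalization` are assumptions, not details taken from the disclosure.

```python
import math


def split_foci_by_colocalization(protein_foci, fish_spots, max_distance):
    """Split protein foci into those near a FISH spot and those that are not.

    Coordinates are tuples of equal length (2D or 3D), in the same units
    as `max_distance`.
    """
    co_localized, not_co_localized = [], []
    for focus in protein_foci:
        near_target = any(
            math.dist(focus, spot) <= max_distance for spot in fish_spots
        )
        (co_localized if near_target else not_co_localized).append(focus)
    return co_localized, not_co_localized
```

Under this reading, foci in the second list are candidates for off-target response, subject to the detection limits of the assay.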
  • the image analysis method may analyze acquired image data comprising a cell population to generate an output of estimating a count and/or boundaries (e.g., segmented ROIs) of the cell population.
  • the image analysis method may apply a prediction algorithm (e.g., a predictive analytics algorithm) to the acquired data to generate output of estimating a count and/or boundaries (e.g., segmented ROIs) of the cell population.
  • the prediction algorithm may comprise an artificial intelligence based predictor, such as a machine learning based predictor, configured to process the acquired image data comprising a cell population to generate the output of estimating a count and/or boundaries (e.g., segmented ROIs) of the cell population.
  • the machine learning predictor may be trained using datasets from one or more sets of images of known cell populations as inputs and known counts and/or boundaries (e.g., segmented ROIs) of the cell populations as outputs to the machine learning predictor.
  • the machine learning predictor may comprise one or more machine learning algorithms.
  • machine learning algorithms may include a support vector machine (SVM), a naive Bayes classification, a random forest, a neural network, deep learning, or other supervised learning algorithm or unsupervised learning algorithm for classification and regression.
  • the machine learning predictor may be trained using one or more training datasets corresponding to image data comprising cell populations.
  • Training datasets may be generated from, for example, one or more sets of image data having common characteristics (features) and outcomes (labels). Training datasets may comprise a set of features and labels corresponding to the features. Features may comprise characteristics such as, for example, certain ranges or categories of cell measurements, such as morphological features/parameters (count, size, diameter, area, volume, perimeter length, circularity, irregularity, eccentricity, etc.), other image parameters
  • DNA content, DNA texture measures, characteristics of p53BP1 foci (e.g., number, size, shape, etc.), amount of p53BP1 protein per cell, spatial location and
  • Labels may comprise outcomes such as, for example, estimated or actual counts and boundaries of cells in a cell population or nuclease specificity or its activity.
  • Training sets may be selected by random sampling of a set of data corresponding to one or more sets of image data.
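A minimal sketch of selecting a training set by random sampling, assuming the image-derived records are held in a list and that a fixed fraction is drawn for training. The 80% default, the seed, and the name `split_training_set` are illustrative assumptions.

```python
import random


def split_training_set(records, train_fraction=0.8, seed=0):
    """Randomly partition records into a training set and a held-out set."""
    rng = random.Random(seed)  # seeded so the split is reproducible
    indices = list(range(len(records)))
    rng.shuffle(indices)
    n_train = int(round(train_fraction * len(records)))
    train = [records[i] for i in indices[:n_train]]
    held_out = [records[i] for i in indices[n_train:]]
    return train, held_out
```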
  • the machine learning predictor may be trained until certain predetermined conditions for accuracy or performance are satisfied, such as having minimum desired values corresponding to cell identification accuracy measures.
  • the cell identification accuracy measure may correspond to estimated or actual counts and boundaries (e.g., segmented ROIs) of cells in a cell population.
  • Examples of cell identification accuracy measures may include sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), accuracy, and area under the curve (AUC) of a Receiver Operating Characteristic (ROC) curve corresponding to the accuracy of generating estimated or actual counts and boundaries (e.g., segmented ROIs) of cells in a cell population.
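The listed accuracy measures follow directly from confusion-matrix counts. The sketch below uses the standard definitions; the function name `accuracy_measures` and the dictionary keys are hypothetical, and how true and false positives are scored for cell boundaries (as opposed to cell counts) is left to the caller.

```python
def accuracy_measures(tp, fp, tn, fn):
    """Standard accuracy measures from true/false positive/negative counts."""

    def ratio(numerator, denominator):
        # Undefined ratios (empty denominators) are reported as NaN.
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),  # true positive rate
        "specificity": ratio(tn, tn + fp),  # true negative rate
        "ppv": ratio(tp, tp + fp),          # positive predictive value
        "npv": ratio(tn, tn + fn),          # negative predictive value
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
    }
```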
  • such a predetermined condition may be that the sensitivity of identifying a cell of interest comprises a value of, for example, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99%.
  • such a predetermined condition may be that the specificity of identifying a cell of interest comprises a value of, for example, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99%.
  • such a predetermined condition may be that the positive predictive value (PPV) of identifying a cell of interest comprises a value of, for example, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99%.
  • such a predetermined condition may be that the negative predictive value (NPV) of identifying a cell of interest comprises a value of, for example, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99%.
  • such a predetermined condition may be that the area under the curve (AUC) of a Receiver Operating Characteristic (ROC) curve of identifying a cell of interest comprises a value of at least about 0.50, at least about 0.55, at least about 0.60, at least about 0.65, at least about 0.70, at least about 0.75, at least about 0.80, at least about 0.85, at least about 0.90, at least about 0.95, at least about 0.96, at least about 0.97, at least about 0.98, or at least about 0.99.
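A minimal sketch of checking a predetermined AUC condition, assuming the predictor emits a score per candidate and that ground-truth labels separate the scores into positives and negatives. The AUC is computed with the pairwise (Mann-Whitney) formulation, which equals the area under the ROC curve; `roc_auc` and `meets_minimums` are hypothetical names, and the minimum values are supplied by the caller.

```python
def roc_auc(positive_scores, negative_scores):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    Ties count as half a correct ranking.
    """
    if not positive_scores or not negative_scores:
        raise ValueError("need at least one positive and one negative score")
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


def meets_minimums(measures, minimums):
    """True when every named measure reaches its required minimum value."""
    return all(measures[name] >= floor for name, floor in minimums.items())
```

Training could then continue until `meets_minimums` returns true for the chosen set of measures.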
  • image analysis can also be carried out as shown in FIG. 1, which illustrates an assay workflow for cellular imaging of phospho-53BP1 (p53BP1) foci.
  • the image analysis method may be implemented in an automated manner, such as using the digital processing devices described herein.
  • % nuclease specificity for a nuclease can be computed from the per-cell p53BP1 foci count data.
  • the data distributions for the nuclease-treated and the corresponding untreated reference (background) cell samples are computed.
  • PD p53bpl assay
  • Fp proliferating cell fraction
  • a theoretical on-target distribution is calculated for the on-target activity of the nuclease.
  • the distribution of the nuclease-treated sample is normalized by the distribution of the control sample and the theoretical on-target distribution using a process of non-negative least squares deconvolution.
  • nuclease specificity is the % fraction of background-normalized cells containing p53BP1 foci from 0 to PT.
  • Fp and PD are set to 0 and 1
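One possible reading of the calculation above is sketched here, under loudly stated assumptions: the treated foci-count distribution is modeled as the background (untreated) distribution convolved with an unknown non-negative distribution of nuclease-induced foci, that unknown is recovered by non-negative least squares using projected gradient descent, and PT is supplied by the caller as the largest foci count attributable to on-target activity. The theoretical on-target distribution and the Fp and PD parameters are not modeled, and all function names are hypothetical.

```python
def normalize(counts):
    """Scale a histogram so that it sums to 1."""
    total = float(sum(counts))
    return [c / total for c in counts]


def convolve_truncated(background, x):
    """Convolve x with background, keeping only the first len(x) bins."""
    n = len(x)
    out = [0.0] * n
    for i in range(n):
        for j in range(i + 1):
            if i - j < len(background):
                out[i] += background[i - j] * x[j]
    return out


def nnls_deconvolve(treated, background, iterations=2000):
    """Solve min ||B x - t||^2 subject to x >= 0 by projected gradient descent.

    B is the truncated convolution matrix of the normalized background.
    A unit step is safe because the background sums to 1 and is
    non-negative, which bounds the operator norm of B by 1.
    """
    t = normalize(treated)
    b = normalize(background)
    n = len(t)
    x = [0.0] * n
    for _ in range(iterations):
        residual = [p - q for p, q in zip(convolve_truncated(b, x), t)]
        gradient = [
            sum(b[i - j] * residual[i] for i in range(j, n) if i - j < len(b))
            for j in range(n)
        ]
        x = [max(0.0, xj - gj) for xj, gj in zip(x, gradient)]  # project onto x >= 0
    return x


def percent_specificity(x, p_t):
    """Percent of background-normalized cells with 0..p_t foci (inclusive)."""
    return 100.0 * sum(x[: p_t + 1]) / sum(x)
```

For example, `percent_specificity(nnls_deconvolve(treated_hist, control_hist), p_t=2)` would report the share of background-normalized cells with at most two foci, where `treated_hist` and `control_hist` are per-cell foci-count histograms over the same bins.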
  • Baseline level or threshold level above which a DNA binding domain of a gene editing tool (e.g., a nuclease) is deemed to be non-specific can be calculated empirically by carrying out the imaging assays described herein.
  • Such baseline or threshold level may be application-specific and can be determined by the requirements of an application, either as a set threshold on the magnitude of change in protein load in response to treatment (relative to background protein load in reference untreated cells) beyond which a cell engineering tool is deemed non-specific, or as a relative ranking of cell engineering tools in a screening application when one or several best performing tools are picked.
  • protein indicative of cellular response is stained and imaged in fixed cells, total protein load is calculated by measuring intensity of protein staining within a cell. Change in total protein load is used as a measure of cell response to treatment.
  • protein indicative of cellular response is stained and imaged in fixed cells, and protein accumulation at distinct locations within the cell is detected and enumerated. Change in the number of protein foci is used as a measure of cell response to treatment. In some instances, this change can be expressed as a specificity score.
  • protein indicative of cellular response is stained with immunofluorescence and target DNA loci are stained with nanoFISH and imaged in fixed cells. Protein accumulation at distinct locations and co-localization with nanoFISH spots within the cell are detected and enumerated. Change in the number of protein foci not co-localized with target nanoFISH spots is used as a measure of off-target cell response to treatment.
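The three readouts above share one pattern: a per-cell measurement (total protein intensity, foci count, or count of foci not co-localized with target spots) is compared between treated and untreated cells. A minimal sketch follows, assuming the change is summarized as a difference of means and that the threshold is chosen by the application; the summary statistic and the names `response_change`, `is_non_specific`, and `rank_tools` are assumptions.

```python
from statistics import mean


def response_change(treated_values, untreated_values):
    """Change in the mean per-cell measurement relative to untreated cells."""
    return mean(treated_values) - mean(untreated_values)


def is_non_specific(treated_values, untreated_values, threshold):
    """Flag a tool whose response change exceeds the application threshold."""
    return response_change(treated_values, untreated_values) > threshold


def rank_tools(changes_by_tool):
    """Order tool names from smallest to largest response change."""
    return sorted(changes_by_tool, key=changes_by_tool.get)
```

In a screening setting, `rank_tools` supports the relative-ranking use, while `is_non_specific` supports the fixed-threshold use.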
  • the systems, apparatus, and methods described herein may include a digital processing device, or use of the same.
  • the digital processing device may include one or more hardware central processing units (CPU) that carry out the device’s functions.
  • the digital processing device may further comprise an operating system configured to perform executable instructions.
  • the digital processing device is optionally connected to a computer network, is optionally connected to the Internet such that it accesses the World Wide Web, or is optionally connected to a cloud computing infrastructure.
  • the digital processing device is optionally connected to an intranet.
  • the digital processing device is optionally connected to a data storage device.
  • suitable digital processing devices may include, by way of non-limiting examples, server computers, desktop computers, laptop computers, notebook computers, sub-notebook computers, netbook computers, netpad computers, set-top computers, media streaming devices, handheld computers, Internet appliances, mobile smartphones, tablet computers, personal digital assistants, video game consoles, and vehicles.
  • smartphones are suitable for use in the system described herein.
  • Suitable tablet computers may include those with booklet, slate, and convertible configurations, known to those of skill in the art.
  • the digital processing device may include an operating system configured to perform executable instructions.
  • the operating system may be, for example, software, including programs and data, which may manage the device’s hardware and provide services for execution of applications.
  • suitable server operating systems may include, by way of non-limiting examples, FreeBSD, OpenBSD, NetBSD®, Linux, Apple® Mac OS X Server®, Oracle® Solaris®, Windows Server®, and Novell® NetWare®.
  • suitable personal computer operating systems include, by way of non-limiting examples, Microsoft® Windows®, Apple® Mac OS X®, UNIX®, and UNIX-like operating systems such as GNU/Linux®. In some cases, the operating system is provided by cloud computing.
  • suitable mobile smart phone operating systems include, by way of non-limiting examples, Nokia® Symbian® OS, Apple® iOS®, Research In Motion® BlackBerry OS®, Google® Android®, Microsoft® Windows Phone® OS, Microsoft® Windows Mobile® OS, Linux®, and Palm® WebOS®.
  • suitable media streaming device operating systems include, by way of non-limiting examples, Apple TV®, Roku®, Boxee®, Google TV®, Google Chromecast®, Amazon Fire®, and Samsung®
  • Suitable video game console operating systems include, by way of non-limiting examples, Sony® PS3®, Sony® PS4®, Microsoft® Xbox 360®, Microsoft Xbox One, Nintendo® Wii®, Nintendo® Wii U®, and Ouya®.
  • the device may include a storage and/or memory device.
  • the storage and/or memory device may be one or more physical apparatuses used to store data or programs on a temporary or permanent basis.
  • the device is volatile memory and requires power to maintain stored information.
  • the device is non-volatile memory and retains stored information when the digital processing device is not powered.
  • the non-volatile memory comprises flash memory.
  • the volatile memory may comprise dynamic random-access memory (DRAM).
  • the non-volatile memory may comprise ferroelectric random access memory (FRAM).
  • the non-volatile memory may comprise phase-change random access memory (PRAM).
  • the device may be a storage device including, by way of non-limiting examples, CD-ROMs, DVDs, flash memory devices, magnetic disk drives, magnetic tape drives, optical disk drives, and cloud computing based storage.
  • the storage and/or memory device may also be a combination of devices such as those disclosed herein.
  • the digital processing device may include a display to send visual information to a user.
  • the display may be a cathode ray tube (CRT).
  • the display may be a liquid crystal display (LCD).
  • the display may be a thin film transistor liquid crystal display (TFT-LCD).
  • the display may further be an organic light emitting diode (OLED) display.
  • the OLED display may be a passive-matrix OLED (PMOLED) or active-matrix OLED (AMOLED) display.
  • the display may be a plasma display.
  • the display may be a video projector.
  • the display may be a combination of devices such as those disclosed herein.
  • the digital processing device may also include an input device to receive information from a user.
  • the input device may be a keyboard.
  • the input device may be a pointing device including, by way of non-limiting examples, a mouse, trackball, track pad, joystick, game controller, or stylus.
  • the input device may be a touch screen or a multi-touch screen.
  • the input device may be a microphone to capture voice or other sound input.
  • the input device may be a video camera or other sensor to capture motion or visual input.
  • the input device may be a Kinect™, Leap Motion™, or the like.
  • the input device may be a combination of devices such as those disclosed herein.
  • Non-transitory computer readable storage medium
  • the systems, apparatus, and methods disclosed herein may include one or more non-transitory computer readable storage media encoded with a program including instructions executable by the operating system of an optionally networked digital processing device.
  • a computer readable storage medium is a tangible component of a digital processing device.
  • a computer readable storage medium is optionally removable from a digital processing device.
  • a computer readable storage medium may include, by way of non-limiting examples, CD-ROMs, DVDs, flash memory devices, solid state memory, magnetic disk drives, magnetic tape drives, optical disk drives, cloud computing systems and services, and the like.
  • the program and instructions are permanently, substantially permanently, semi-permanently, or non-transitorily encoded on the media.
  • the systems, apparatus, and methods disclosed herein may include at least one computer program, or use of the same.
  • a computer program includes a sequence of instructions, executable in the digital processing device’s CPU, written to perform a specified task.
  • computer readable instructions are implemented as program modules, such as functions, objects, Application Programming Interfaces (APIs), data structures, and the like, that perform particular tasks or implement particular abstract data types.
  • a computer program in certain embodiments, is written in various versions of various languages.
  • a computer program may comprise one sequence of instructions.
  • a computer program may comprise a plurality of sequences of instructions.
  • a computer program is provided from one location.
  • a computer program is provided from a plurality of locations.
  • a computer program includes one or more software modules.
  • a computer program may include, in part or in whole, one or more web applications, one or more mobile applications, one or more standalone applications, one or more web browser plug-ins, extensions, add-ins, or add-ons, or combinations thereof.
  • a computer program may include a web application.
  • a web application in various aspects, utilizes one or more software frameworks and one or more database systems.
  • a web application is created upon a software framework such as Microsoft ®
  • a web application utilizes one or more database systems including, by way of non-limiting examples, relational, non-relational, object oriented, associative, and XML database systems.
  • suitable relational database systems may include, by way of non-limiting examples, Microsoft® SQL Server, MySQL™, and Oracle®.
  • Those of skill in the art will also recognize that a web application, in various instances, is written in one or more versions of one or more languages.
  • a web application may be written in one or more markup languages, presentation definition languages, client-side scripting languages, server-side coding languages, database query languages, or combinations thereof.
  • a web application may be written to some extent in a markup language such as Hypertext Markup Language (HTML), Extensible Hypertext Markup Language (XHTML), or eXtensible Markup Language (XML).
  • a web application is written to some extent in a presentation definition language such as Cascading Style Sheets (CSS).
  • a web application may be written to some extent in a client-side scripting language such as Asynchronous JavaScript and XML (AJAX), Flash® ActionScript, JavaScript, or Silverlight®.
  • a web application may be written to some extent in a server-side coding language such as Active Server Pages (ASP), ColdFusion®, Perl, Java™, JavaServer Pages (JSP), Hypertext Preprocessor (PHP), Python™, Ruby, Tcl, Smalltalk, WebDNA®, or Groovy.
  • a web application may be written to some extent in a database query language such as Structured Query Language (SQL).
  • a web application may integrate enterprise server products such as IBM® Lotus Domino®.
  • a web application includes a media player element.
  • a media player element utilizes one or more of many suitable multimedia technologies including, by way of non-limiting examples, Adobe® Flash®, HTML 5, Apple® QuickTime®, Microsoft® Silverlight®, Java™, and Unity®.
  • a computer program may include a mobile application provided to a mobile digital processing device.
  • the mobile application is provided to a mobile digital processing device at the time it is manufactured.
  • the mobile application is provided to a mobile digital processing device via the computer network described herein.
  • a mobile application is created by techniques known to those of skill in the art using hardware, languages, and development environments known to the art. Those of skill in the art will recognize that mobile applications are written in several languages. Suitable programming languages include, by way of non-limiting examples, C, C++, C#, Objective-C, Java™, JavaScript, Pascal, Object Pascal, Python™, Ruby, VB.NET, WML, and XHTML/HTML with or without CSS, or combinations thereof.
  • Suitable mobile application development environments are available from several sources. Commercially available development environments include, by way of non-limiting examples, AirplaySDK, alcheMo, Appcelerator®, Celsius, Bedrock, Flash Lite, .NET Compact Framework, Rhomobile, and WorkLight Mobile Platform. Other development environments are available without cost including, by way of non-limiting examples, Lazarus, MobiFlex, MoSync, and Phonegap. Also, mobile device manufacturers distribute software developer kits including, by way of non-limiting examples, iPhone and iPad (iOS) SDK, Android™ SDK, BlackBerry® SDK, BREW SDK, Palm® OS SDK, Symbian SDK, webOS SDK, and Windows® Mobile SDK.
  • a computer program may include a standalone application, which is a program that is run as an independent computer process, not an add-on to an existing process, e.g., not a plug-in.
  • standalone applications are often compiled.
  • a compiler is a computer program(s) that transforms source code written in a programming language into binary object code such as assembly language or machine code. Suitable compiled programming languages include, by way of non-limiting examples, C, C++, Objective-C, COBOL, Delphi, Eiffel, Java™, Lisp, Python™, Visual Basic, and VB.NET, or combinations thereof. Compilation is often performed, at least in part, to create an executable program.
  • a computer program may include one or more executable compiled applications.
  • the computer program may include a web browser plug-in.
  • a plug-in is one or more software components that add specific functionality to a larger software application. Makers of software applications support plug-ins to enable third-party developers to create abilities which extend an application, to support easily adding new features, and to reduce the size of an application. When supported, plug-ins enable customizing the functionality of a software application. For example, plug-ins are commonly used in web browsers to play video, generate interactivity, scan for viruses, and display particular file types. Those of skill in the art will be familiar with several web browser plug-ins including, Adobe® Flash® Player, Microsoft® Silverlight®, and Apple® QuickTime®.
  • the toolbar comprises one or more web browser extensions, add-ins, or add-ons. In some embodiments, the toolbar comprises one or more explorer bars, tool bands, or desk bands.
  • Web browsers may be software applications, designed for use with network-connected digital processing devices, for retrieving, presenting, and traversing information resources on the World Wide Web. Suitable web browsers include, by way of non-limiting examples, Microsoft® Internet Explorer®, Mozilla® Firefox®, Google® Chrome, Apple® Safari®, Opera Software® Opera®, and KDE Konqueror.
  • the web browser is a mobile web browser.
  • Mobile web browsers (also called microbrowsers, mini-browsers, and wireless browsers) are designed for use on mobile digital processing devices including, by way of non-limiting examples, handheld computers, tablet computers, netbook computers, subnotebook computers, smartphones, music players, personal digital assistants (PDAs), and handheld video game systems.
  • Suitable mobile web browsers include, by way of non-limiting examples, Google® Android® browser, RIM BlackBerry® Browser, Apple® Safari®, Palm® Blazer, Palm® WebOS® Browser, Mozilla® Firefox® for mobile, Microsoft® Internet Explorer® Mobile, Amazon® Kindle® Basic Web, Nokia® Browser, Opera Software® Opera® Mobile, and Sony® PSP™ browser.
  • the systems and methods disclosed herein may include software, server, and/or database modules, or use of the same.
  • software modules may be created by techniques known to those of skill in the art using machines, software, and languages known to the art.
  • the software modules disclosed herein may be implemented in a multitude of ways.
  • a software module may comprise a file, a section of code, a programming object, a programming structure, or combinations thereof.
  • a software module may comprise a plurality of files, a plurality of sections of code, a plurality of programming objects, a plurality of programming structures, or combinations thereof.
  • the one or more software modules comprise, by way of non-limiting examples, a web application, a mobile application, and a standalone application.
  • software modules are in one computer program or application. In other instances, software modules are in more than one computer program or application. In some cases, software modules are hosted on one machine. In other cases, software modules are hosted on more than one machine. Sometimes, software modules may be hosted on cloud computing platforms. Other times, software modules may be hosted on one or more machines in one location. In additional cases, software modules are hosted on one or more machines in more than one location.
  • B. Databases
  • the methods, apparatus, and systems disclosed herein may include one or more databases, or use of the same.
  • suitable databases may include, by way of non-limiting examples, relational databases, non-relational databases, object oriented databases, object databases, entity-relationship model databases, associative databases, and XML databases.
  • a database may be internet-based.
  • a database may be web-based.
  • a database may be cloud computing-based.
  • a database may be based on one or more local computer storage devices.
  • Methods and systems described herein may further be performed as a service.
  • a service provider may obtain a sample that a customer wishes to analyze.
  • the service provider may then encode the sample to be analyzed by any of the methods described herein, perform the analysis, and provide a report to the customer.
  • the customer may also perform the analysis and provide the results to the service provider for decoding.
  • the service provider then provides the decoded results to the customer.
  • the customer may receive encoded analysis of the samples from the provider and decode the results by interacting with software installed locally (at the customer’s location) or remotely (e.g., on a server reachable through a network).
  • the software may generate a report and transmit the report to the customer.
  • Exemplary customers include clinical laboratories, hospitals, industrial manufacturers and the like.
  • a customer or party may be any suitable customer or party with a need or desire to use the methods provided herein.
  • the methods provided herein may be processed on a server (e.g., a computer server).
  • the server may include a central processing unit (CPU, also “processor”) which may be a single core processor, a multi core processor, or a plurality of processors for parallel processing.
  • a processor used as part of a control assembly may be a microprocessor.
  • the server may also include memory (e.g., random access memory, read-only memory, flash memory); electronic storage unit (e.g., hard disk); communications interface (e.g., network adaptor) for communicating with one or more other systems; and peripheral devices which include cache, other memory, data storage, and/or electronic display adaptors.
  • the memory, storage unit, interface, and peripheral devices may be in communication with the processor through a communications bus (solid lines), such as a motherboard.
  • the storage unit may be a data storage unit for storing data.
  • the server may be operatively coupled to a computer network (“network”) with the aid of the communications interface.
  • a processor with the aid of additional hardware may also be operatively coupled to a network.
  • the network may be the Internet, an intranet and/or an extranet, an intranet and/or extranet that is in
  • the network may implement a peer-to-peer network, which may enable devices coupled to the server to behave as a client or a server.
  • the server may be capable of transmitting and receiving computer-readable instructions (e.g., device/system operation protocols or parameters) or data (e.g., sensor measurements, raw data obtained from detecting metabolites, analysis of raw data obtained from detecting metabolites, interpretation of raw data obtained from detecting metabolites, etc.) via electronic signals transported through the network.
  • a network may be used, for example, to transmit or receive data across an international border.
  • the server may be in communication with one or more output devices such as a display or printer, and/or with one or more input devices such as, for example, a keyboard, mouse, or joystick.
  • the display may be a touch screen display, in which case it functions as both a display device and an input device. Different and/or additional input devices may be present, such as an enunciator, a speaker, or a microphone.
  • the server may use any one of a variety of operating systems, such as for example, any one of several versions of Windows®, or of MacOS®, or of Unix®, or of Linux®.
  • the storage unit may store files or data associated with the operation of a device, systems or methods described herein.
  • the server may communicate with one or more remote computer systems through the network.
  • the one or more remote computer systems may include, for example, personal computers, laptops, tablets, telephones, smartphones, or personal digital assistants.
  • a control assembly may include a single server. In other situations, the system may include multiple servers in communication with one another through an intranet, extranet and/or the Internet.
  • the server may be adapted to store device operation parameters, protocols, methods described herein, and other information of potential relevance. Such information may be stored on the storage unit or the server and such data is transmitted through a network.
  • Kits
  • A composition described herein may be supplied in the form of a kit.
  • a composition may be materials and software for image analysis of a protein marker (e.g., p53BP1) of a cellular response induced by a cellular perturbation.
  • Materials can include a detectable agent that binds to the protein (e.g., a primary antibody-fluorophore conjugate or a primary antibody against the protein and a secondary antibody-fluorophore conjugate).
  • Materials can further include a detectable agent that binds to a cell engineering tool (e.g., genome editing complex, gene regulator) to be tested (e.g., a primary antibody-fluorophore conjugate or a primary antibody against the protein and a secondary antibody-fluorophore conjugate).
  • a composition can be an oligonucleotide Nano-FISH probe set designed for a target nucleic acid sequence.
  • the kits of the present disclosure may further comprise instructions regarding the method of using the detectable agents to detect protein (e.g., p53BP1) load, cell engineering tool, or probe set to detect the target nucleic acid sequence.
  • the components of the kit may be in dry or liquid form. If they are in dry form, the kit may include a solution to solubilize the dried material.
  • the kit may also include transfer factor in liquid or dry form. In some embodiments, if the transfer factor is in dry form, the kit includes a solution to solubilize the transfer factor.
  • the kit may also include containers for mixing and preparing the components.
  • the kits as described herein also may include a means for containing compositions of the present disclosure in close confinement for commercial sale and distribution.
  • Ranges and amounts may be expressed as “about” a particular value or range. About also includes the exact amount. Hence “about 5 µL” means “about 5 µL” and also “5 µL.” Generally, the term “about” includes an amount that would be expected to be within experimental error.
  • the section headings used herein are for organizational purposes only and are not to be construed as limiting the subject matter described.
  • FIG. 1 shows a brief summary of the assay workflow, including the steps of nuclease transfection in cells, immunolabeling, imaging, processing raw images by deconvolution, enhancement, or reconstruction and segmentation, feature computation (e.g., count, amount, size, location), and informatics and analysis (determining nuclease load and/or specificity, cytotoxicity, and/or heterogeneity) from the extracted/computed features.
  • A nuclease (e.g., TALENs or Cas9) was delivered to cells by electroporation. Cells were incubated for a period of time, such as 24 hours, necessary for nuclease activity and cell response to nuclease-induced DNA double-stranded breaks.
  • the cells were sampled for evaluation of nuclease specificity.
  • Cells were fixed onto glass slides, coverslips, or glass-bottom well-plates, stained with fluorescently labeled antibodies against p53BP1 and the nuclease protein, and imaged with a fluorescence microscope (e.g., Nikon).
  • Raw fluorescence microscopy images were deconvolved (e.g., by processing the raw images with a deconvolution algorithm), regions of interest such as cell nuclei, p53BP1 foci, and nuclease localization were algorithmically delineated (e.g., by processing the deconvolved images with a segmentation algorithm), and morphological features/parameters (such as count, size, diameter, area, volume, perimeter length, circularity, irregularity, eccentricity, etc.) and other image parameters (such as contrast, correlation, entropy, energy, and homogeneity/uniformity) were computed for each cell (e.g., by applying one or more feature extraction algorithms to the segmented images).
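The segmentation and feature-computation steps described above can be illustrated with a minimal pure-Python sketch. It assumes a small 2D intensity grid, a fixed intensity threshold, and 4-connected component labeling; the function names `label_foci` and `focus_areas`, the threshold, and the toy image are hypothetical, and the disclosure does not state which segmentation algorithm was actually used.

```python
from collections import deque


def label_foci(image, threshold):
    """Label 4-connected regions of pixels brighter than `threshold`.

    Returns a grid of integer labels (0 = background) and the focus count.
    """
    rows, cols = len(image), len(image[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if image[r][c] <= threshold or labels[r][c]:
                continue
            # New unlabeled bright pixel: flood-fill its connected region.
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and image[ny][nx] > threshold
                            and not labels[ny][nx]):
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count


def focus_areas(labels, count):
    """Area in pixels of each labeled focus, indexed by label - 1."""
    areas = [0] * count
    for row in labels:
        for value in row:
            if value:
                areas[value - 1] += 1
    return areas


# Hypothetical toy nucleus containing two bright foci.
nucleus = [
    [0, 0, 0, 0, 0, 0],
    [0, 9, 9, 0, 0, 0],
    [0, 9, 0, 0, 8, 0],
    [0, 0, 0, 0, 8, 0],
    [0, 0, 0, 0, 0, 0],
]
labels, n_foci = label_foci(nucleus, threshold=5)
print(n_foci, focus_areas(labels, n_foci))
```

In practice the same per-cell count and size features would be computed on deconvolved 3D images, typically with an image-analysis library rather than hand-written labeling.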
  • FIG. 17 shows an assay workflow for microscopy on a Stellar-Vision microscope. Images were captured on the Stellar-Vision microscope and reconstructed, the reconstructed images were segmented for regions of interest such as cell nuclei, p53BP1 foci, and nuclease localization, and features (such as count, size, diameter, area, volume, perimeter length, circularity, irregularity, eccentricity, etc.) were computed. The measured per-cell feature information was statistically analyzed to produce quantitative specificity metrics for the tested nuclease(s).
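Two of the morphological features named above, circularity and eccentricity, can be sketched as follows. The disclosure does not define these features, so the formulas here are assumptions based on common image-analysis conventions: circularity as 4πA/P² (equal to 1 for a perfect circle) and eccentricity as that of the best-fit ellipse computed from its major and minor axis lengths.

```python
import math


def circularity(area, perimeter):
    """Common shape descriptor 4*pi*A / P**2; equals 1.0 for a circle."""
    if perimeter <= 0:
        raise ValueError("perimeter must be positive")
    return 4.0 * math.pi * area / perimeter ** 2


def eccentricity(major_axis, minor_axis):
    """Eccentricity of an ellipse; 0.0 for a circle, approaching 1.0 when elongated."""
    if major_axis <= 0 or minor_axis < 0 or minor_axis > major_axis:
        raise ValueError("require 0 <= minor_axis <= major_axis and major_axis > 0")
    return math.sqrt(1.0 - (minor_axis / major_axis) ** 2)


# Hypothetical focus: a circle of radius 2 (arbitrary units).
radius = 2.0
print(circularity(math.pi * radius ** 2, 2.0 * math.pi * radius))
print(eccentricity(5.0, 3.0))
```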
  • FIG. 2 shows further details on image analysis including the steps of obtaining a fluorescence microscopy image, image deconvolution, delineation/segmentation of cell nuclei, p53BP1 foci, and nuclease protein, morphological data estimation, and
  • ROIs regions of interest
  • FOV field of view
  • the estimated morphological parameters and other image parameters of the cells were analyzed using informatics methods to obtain statistical inferences on the activity and specificity of the delivered nuclease relative to control cell samples.
  • This example illustrates transfection of cells with nucleases.
  • a BTX ECM830 device with a 2 mm gap cuvette was used.
  • TALEN mRNAs were prepared using a mMessageMachine T7 Ultra Kit (#AM1345, Ambion).
  • 0.2 x 10^6 cells were washed twice with PBS and centrifuged. Cell pellets were resuspended in 100
  • K562 cells or A549 cells were transferred to 2 mL of pre-warmed IMDM/10% FBS/1% PS (for K562 cells) or 2 mL of pre-warmed F-12K/10% FBS/1% PS (for A549 cells) and CD34 cells were transferred to 600 µL
  • Human CD4+ T lymphocytes were isolated from peripheral blood mononuclear cells (PBMCs) of non-mobilized healthy donors by negative selection.
  • Human CD4+ T lymphocyte culture medium was prepared with X-VIVO 15 (Lonza, Basel, Switzerland) supplemented with 10% FBS, 2 mM L-glutamine, 1% penicillin/streptomycin, and 20 ng/mL IL2 (PeproTech, Rocky Hill, NJ, USA).
  • Cell washing media was prepared with 10% FBS in PBS. Cells were cultured by pre-warming the culture media and washing media to 37°C. Cell tubes were filled with 30 mL washing media and cells were counted.
  • T cells were activated with Anti-CD3/CD28-Dynabeads (Life Technologies, Cat# 11132D).
  • Dynabeads washing buffer was prepared containing PBS with 0.1% BSA and 2 mM EDTA, pH 7.4. Anti-CD3/CD28-Dynabeads were resuspended and transferred to a tube. An equal volume of Dynabeads washing buffer was added, the tube was placed on a magnet for 1 min, and the supernatant was discarded. Washed Dynabeads were resuspended in culture media.
  • Washed Dynabeads were added to the CD4+ T cell culture suspension at a bead to cell ratio of 1:1 and the cells were mixed with a pipette. Plates were incubated in a humidified 37°C, 5% CO2 incubator for 24 hours to activate T cells. Activated T cells were mixed and placed on the magnet for 5 min and supernatants containing cells were collected. This step was repeated 2-3 times to obtain activated T cells (without Dynabeads) for further experimentation. For transfection of T cells, post-transfection cell maintenance medium was prepared containing X-VIVO 15 (Lonza, Basel, Switzerland) supplemented with 10% FBS,
  • Electroporation settings included a mode of LV, a set voltage of 250 V, a set pulse length of 5 ms, a set number of pulses of 1, a BTX Disposable Cuvette (2 mm gap) electrode type, and a desired field strength of 3000 V/cm.
  • Cell culture plates were prepared with post-transfection cell maintenance medium by filling the appropriate number of wells with 800 µL each. Plates were pre-incubated/equilibrated in a humidified 37°C, 5% CO2 incubator. 1-2 µg of TALEN mRNA was aliquoted in a separate tube.
  • BTXpress high performance electroporation solution (BTX, Holliston, MA, USA) was brought to room temperature.
  • Activated CD4+ T cells were collected and counted to determine cell density. The total cells needed (0.2-0.5 x 10^6 cells per sample) were centrifuged at 300 x g for 8 minutes at room temperature and washed twice with PBS. For transfection, CD4+ T cells were resuspended in BTXpress high performance electroporation solution (Harvard Apparatus, Holliston, MA, USA) to a final density of 2-5 x 10^6 cells/mL. 100 µL of cells was mixed with the aliquoted mRNA. The cell-mRNA mixture was added to a well of an MOS Multi-Well Electroporation Plate, sealed, and placed into the HT Electroporation System.
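The cell-density arithmetic in the resuspension step above can be checked with a short sketch. It shows that a 100 µL aliquot taken from a suspension at 2-5 x 10^6 cells/mL contains 0.2-0.5 x 10^6 cells, which matches the per-sample cell number stated above. The function names and the example total of 1 x 10^6 cells are hypothetical.

```python
def resuspension_volume_ml(total_cells, target_density_per_ml):
    """Volume of electroporation solution needed to reach the target density."""
    if target_density_per_ml <= 0:
        raise ValueError("target density must be positive")
    return total_cells / target_density_per_ml


def cells_per_sample(density_per_ml, sample_volume_ul):
    """Number of cells contained in an aliquot of `sample_volume_ul` microliters."""
    return density_per_ml * sample_volume_ul / 1000.0


# Density range from the protocol; 100 µL aliquot per electroporation sample.
for density in (2e6, 5e6):
    print(density, cells_per_sample(density, 100))

# Hypothetical example: 1e6 pelleted cells resuspended to 2e6 cells/mL.
print(resuspension_volume_ml(1e6, 2e6))
```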
  • T cells were electroporated in a BTX ECM830 Square Wave electroporator using a single pulse of 250 V for 5 ms. Electroporated CD4+ T cells were placed in an Axygen Deep 96-well plate or 12/24-well Falcon Polystyrene Microplates with pre-warmed cell maintenance medium. Cells were “cold shocked” in a humidified 30°C, 5% CO2 incubator for 16-24 hours, then incubated in a humidified 37°C, 5% CO2 incubator until analysis. Gene expression or down regulation was detectable as early as 4-8 hours post electroporation. For imaging, cells were collected 24 hours after transfection. For genomic DNA isolation, cells were incubated for around 48-72 hours. For RNA collection, cells were incubated up to 4-5 days.
  • This example illustrates p53BP1 immunofluorescence analysis using the compositions and methods of the present disclosure.
  • Cell preparation. Cells were prepared for immunofluorescence staining and image analysis on a coverslip and in 24 well plates. For preparation of cells on coverslips, cells were seeded onto a poly-L-lysine coated #1.5 glass coverslip (12 mm round or 18 mm square). First, coverslips were placed into a well of a 6-well tissue culture plate. Cells were pre-washed with PBS, resuspended to ~2,000,000 cells/mL in PBS, and 50-100 µL cells were spotted onto the center of each coverslip. Cells were allowed to settle for 10-15 minutes at room temperature. Next, cells were fixed in 2 mL/well of fresh fixative (4%
  • Blocking buffer was prepared to contain 2% BSA (from 10% BSA/PBS), 0.05% Tween-20, and 1x PBS.
  • Cells were blocked with 1.5 mL/well blocking buffer (in a 6-well plate) for 30 minutes at room temperature.
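The blocking buffer above is made by diluting a 10% BSA/PBS stock to 2% BSA. A minimal sketch of that calculation, using the standard relation C1·V1 = C2·V2, is given below. The batch size (six wells at 1.5 mL/well) and the function name `stock_volume_ml` are illustrative assumptions, not values taken from the protocol.

```python
def stock_volume_ml(final_percent, final_volume_ml, stock_percent):
    """Volume of stock needed so the final mix has `final_percent` (C1*V1 = C2*V2)."""
    if not 0 < final_percent <= stock_percent:
        raise ValueError("final concentration must be positive and not exceed the stock")
    return final_percent * final_volume_ml / stock_percent


# Hypothetical batch: one 6-well plate at 1.5 mL/well of blocking buffer.
batch_ml = 6 * 1.5
bsa_stock_ml = stock_volume_ml(2.0, batch_ml, 10.0)
print(batch_ml, bsa_stock_ml, batch_ml - bsa_stock_ml)
```

The remaining volume is made up with PBS and Tween-20; the Tween-20 stock concentration is not given in the text, so it is left out of the sketch.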
  • Primary antibody incubation was carried out as follows. Primary antibodies were diluted in blocking buffer at the following ratios: 1:500 for anti-p53BP1 (tagging for p53BP1, which accumulates at the site of double strand breaks) and 1:2000 for anti-FLAG (tagging for FLAG label on a nuclease).
  • A humidified chamber was prepared and a sheet of Parafilm was placed inside with 100 µL spots of the primary antibody solution. Coverslips were removed from the 6-well plate, inverted onto the primary antibody spots inside the humidified chamber, and incubated for 2 hours at room temperature.
  • Coverslips were returned into the original 6-well plate with blocking buffer and cells were washed with 2 mL/well of 1x PBS three times for 5 minutes per wash. Samples were protected from light for subsequent steps performed with the secondary antibody labeled with a fluorophore. Secondary antibody incubation was carried out as follows. The secondary antibodies (donkey-anti-rabbit-Cy3 and donkey-anti-mouse-AF647) were diluted in blocking buffer at 1:500. A new sheet of Parafilm was placed inside the humidified chamber with 100 µL spots of the secondary antibody solution. Coverslips were removed from the 6-well plate and inverted onto secondary antibody spots. Coverslips were incubated for 1.5 hours at room temperature.
  • Coverslips were returned into the original 6-well plate and washed three times with 3 mL/well of 1x PBS for 5 minutes per wash. Finally, cells were stained with DAPI for visualization of the nucleus. Cells were incubated in 1.5 mL/well of 1x PBS with 100 ng/mL of DAPI for 10 minutes at room temperature. Cells were washed once with 1x PBS.
  • Cell Preparation. Cells were seeded onto PLL coated glass bottom 24 well plates as follows. Cells were pre-washed with PBS and resuspended to ~2,000,000 cells/mL in PBS. 20-50 µL of cells were spotted onto the center of each well and allowed to settle for 10-15 minutes at room temperature. Cells were fixed in 0.5 mL/well of fresh fixative (4% formaldehyde in 1x PBS) as follows. 500 µL was added to each well, plates were shaken to dislodge poorly attached cells, and incubated for 10 minutes at room temperature.
  • Secondary antibody incubation was carried out as follows. Secondary antibody diluted in blocking buffer at a ratio of 1:500 was added at 300 µL/well. Cells were incubated for 1.5 hours at room temperature, then washed three times with 0.5 mL/well of 1x PBS for 5 minutes per wash. Cells were stained with DAPI for visualization of the nucleus by incubating cells in 0.3 mL/well of 1x PBS + 100 ng/mL DAPI for 10 minutes at room temperature. Cells were washed once with 1x PBS.
  • Mounting. A 10 µL drop of Prolong Gold was placed on 12 mm round glass coverslips, PBS was aspirated from wells, coverslips with Prolong Gold were inverted onto cells in a well, and Prolong Gold was allowed to cure for 24 hours at room
  • Cell Preparation. Cells were seeded onto coated glass bottom 96 well plates (e.g., PLL-coated plates, CC2 Nunc Micro-well plates) as follows. Cells were pre-washed with PBS and resuspended to ~2,000,000 cells/mL in PBS. 10 µL of cells were spotted onto the center of each well and allowed to settle for 10-15 minutes at room temperature. Cells were fixed in 0.1 mL/well of fresh fixative (4% formaldehyde in 1x PBS) as follows. 100 µL was added to each well, plates were shaken to dislodge poorly attached cells, and incubated for 10 minutes at room temperature.
  • Secondary antibody incubation was carried out as follows. Secondary antibody diluted in blocking buffer at a ratio of 1:500 was added at 75 µL/well. Cells were incubated for 1.5 hours at room temperature, then washed three times with 0.1 mL/well of 1x PBS for 5 minutes per wash. Cells were stained with DAPI for visualization of the nucleus by incubating cells in 0.1 mL/well of 1x PBS + 100 ng/mL DAPI for 10 minutes at room temperature. Cells were washed once with 1x PBS.
  • This example illustrates dose response assessment of nucleases in multiple cell types using p53BP1 analysis.
  • TALENs GA6, GA7, AAVS1
  • TALENs were tested for editing efficiency (quantification of the number of target sites with indels over the total number of target sites) and dose-dependent generation of double stranded breaks, as determined by imaging for and counting p53BP1 foci.
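The editing-efficiency definition above (target sites with indels over total target sites) can be written as a one-line ratio. The sketch below is illustrative only: the function name and the example counts are hypothetical, and the disclosure does not describe how target sites were sequenced or counted.

```python
def editing_efficiency(sites_with_indels, total_sites):
    """Fraction of assayed target sites that carry an insertion or deletion."""
    if total_sites <= 0:
        raise ValueError("total_sites must be positive")
    if not 0 <= sites_with_indels <= total_sites:
        raise ValueError("sites_with_indels must lie between 0 and total_sites")
    return sites_with_indels / total_sites


# Hypothetical counts: 455 of 500 assayed target sites carry indels.
print(f"{editing_efficiency(455, 500):.1%}")
```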
  • TALENs were transfected in cells as described in EXAMPLE 2 and p53BP1 was stained for and imaged as described in EXAMPLE 4 and EXAMPLE 1.
  • TABLE 4 below shows the nuclease designs including the left TALEN arm (bold), the right TALEN arm (italics), and the target sequence (underlined).
  • FIG. 3, FIG. 4, and FIG. 5 illustrate dose response assessments of GA7 TALENs in primary CD34+ hematopoietic stem cells, GA6 TALENs in immortalized K562 cells, and AAVS1 TALENs in immortalized K562 cells.
  • FIG. 3A shows the number of p53BP1 foci per cell for CD34+ primary cells treated with a blank transfection control, 0.5 µg GA7 per TALEN monomer, 1 µg GA7 per TALEN monomer, 2 µg GA7 per TALEN monomer, and 4 µg GA7 per TALEN monomer.
  • FIG. 3B shows the total p53BP1 content (fluorescence intensity) per nucleus normalized by the nuclear size versus total FLAG tag content per nucleus normalized by the nuclear size, indicative of a nuclease, for CD34+ primary cells treated with a blank transfection control, 0.5 µg GA7 per TALEN monomer, 1 µg GA7 per TALEN monomer, 2 µg GA7 per TALEN monomer, and 4 µg GA7 per TALEN monomer.
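The size-normalized content plotted in this type of panel can be sketched as the summed fluorescence intensity inside a nuclear mask divided by the mask area. This is an assumption about how the normalization could be computed; the function name, the use of pixel area as "nuclear size", and the toy arrays are hypothetical.

```python
def normalized_content(intensity, nucleus_mask):
    """Total intensity inside the nucleus divided by nuclear area (in pixels)."""
    total = 0.0
    area = 0
    for intensity_row, mask_row in zip(intensity, nucleus_mask):
        for value, inside in zip(intensity_row, mask_row):
            if inside:
                total += value
                area += 1
    if area == 0:
        raise ValueError("nucleus mask is empty")
    return total / area


# Hypothetical 2x2 channel image and nuclear mask (1 = inside nucleus).
channel = [[1.0, 2.0], [3.0, 4.0]]
mask = [[1, 1], [0, 1]]
print(normalized_content(channel, mask))
```

Applying the same function to the p53BP1 channel and to the FLAG channel of each nucleus gives one (x, y) point per cell for a scatter plot of DNA damage response against nuclease load.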
  • FIG. 4A shows the number of p53BP1 foci per cell for immortalized K562 cells treated with a blank transfection control, 0.5 µg GA6 per TALEN monomer, 1 µg GA6 per TALEN monomer, 2 µg GA6 per TALEN monomer, and 4 µg GA6 per TALEN monomer.
  • FIG. 4B shows the total p53BP1 content (fluorescence intensity) per nucleus normalized by the nuclear size versus total FLAG tag content per nucleus normalized by the nuclear size, indicative of a nuclease, for immortalized K562 cells treated with a blank transfection control, 0.5 µg GA6 per TALEN monomer, 1 µg GA6 per TALEN monomer, 2 µg GA6 per TALEN monomer, and 4 µg GA6 per TALEN monomer.
  • FIG. 5A shows the number of p53BP1 foci per cell for immortalized K562 cells treated with a blank transfection control, 0.5 µg AAVS1 per TALEN monomer, 1 µg AAVS1 per TALEN monomer, 2 µg AAVS1 per TALEN monomer, and 4 µg AAVS1 per TALEN monomer.
  • FIG. 5B shows the total p53BP1 content (fluorescence intensity) per nucleus normalized by the nuclear size versus total FLAG tag content per nucleus normalized by the nuclear size, indicative of a nuclease, for immortalized K562 cells treated with a blank transfection control, 0.5 µg AAVS1 per TALEN monomer, 1 µg AAVS1 per TALEN monomer, 2 µg AAVS1 per TALEN monomer, and 4 µg AAVS1 per TALEN monomer.
  • Nuclease specificity was assessed for each of the GA7-, GA6-, and AAVS1-targeting TALENs by evaluating the impact of nuclease dose on off-target cutting activity.
  • A high number of p53BP1 foci, indicative of double stranded breaks, that increases in a dose-dependent manner indicates a nuclease with low specificity.
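One way to summarize dose dependence is the least-squares slope of mean foci per cell against mRNA dose: a steep slope suggests a less specific nuclease, and a flat slope a more specific one. The sketch below is an assumption about how such a summary could be computed, not the analysis used in the disclosure. The doses match the series used in the figures (0, 0.5, 1, 2, and 4 µg per monomer), but the foci values are made up for illustration.

```python
def dose_response_slope(doses, mean_foci):
    """Ordinary least-squares slope of mean foci per cell versus dose."""
    if len(doses) != len(mean_foci) or len(doses) < 2:
        raise ValueError("need two or more paired dose and response values")
    n = len(doses)
    mean_dose = sum(doses) / n
    mean_response = sum(mean_foci) / n
    sxx = sum((x - mean_dose) ** 2 for x in doses)
    if sxx == 0:
        raise ValueError("doses must not all be identical")
    sxy = sum((x - mean_dose) * (y - mean_response)
              for x, y in zip(doses, mean_foci))
    return sxy / sxx


doses_ug = [0.0, 0.5, 1.0, 2.0, 4.0]
# Hypothetical mean foci per cell for two nucleases (not measured data).
flat_response = [1.0, 1.1, 1.2, 1.4, 1.8]
steep_response = [1.0, 2.0, 3.0, 5.0, 9.0]
print(dose_response_slope(doses_ug, flat_response))
print(dose_response_slope(doses_ug, steep_response))
```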
  • As shown in FIG. 3, CD34+ primary progenitor cells treated with a GA7-targeting TALEN exhibited only minimal increases in the DNA damage response, as indicated by the number of p53BP1 foci, as the delivered dose of the TALEN was increased.
  • the less specific GA6 FIG. 3
  • This example illustrates a time course assessment of nuclease activity using the p53BP1 analysis of the present disclosure.
  • Nuclease specificity was used to study the cellular response to nuclease activity at various times after treatment of immortalized K562 cells.
  • K562 cells were transfected with mRNA encoding TALENs targeting the AAVS1 DNA locus.
  • Cells were transfected as described in EXAMPLE 2 and p53BP1 was stained for and imaged as described in EXAMPLE 4 and EXAMPLE 1. Cells were sampled and imaged at 6 hours, 12 hours, 24 hours, 48 hours, and 72 hours post-transfection.
  • FIG. 6 shows a graph of the number of p53BP1 foci per K562 cell at 6 hours, 12 hours, 24 hours, 48 hours, and 72 hours as compared to a control at each time point.
  • The editing efficiency was determined to be 91% at 48 hours. Peak activity was observed for the AAVS1-targeting TALENs at 24 hours, and activity persisted beyond the 72 hour post-transfection time point. Additionally, an initial increase in the DNA damage response triggered by electroporation was detected in control cells.
  • AAVS1-targeting TALENs transfected in CD4+ T cells ceased all activity by 48 hours post-transfection, as shown in FIG. 16.
  • FIG. 16 shows a graph of the number of p53BP1 foci per CD4+ T cell at 24 hours and 48 hours post-transfection with AAVS1-targeting TALENs as compared to blank transfection controls at each time point.
  • This example illustrates the utility of p53BP1 analysis of the present disclosure for pan-cell type assessment of AAVS1-targeting TALEN specificity.
  • TALENs targeting the AAVS1 region were transfected in adherent immortalized A549 cells, suspension immortalized K562 cells, and primary cell samples isolated from blood including CD34+ progenitor cells and CD4+ T cells.
  • Non-T cells were transfected as described in EXAMPLE 2.
  • T cells were transfected as described in EXAMPLE 3.
  • p53BP1 was stained for and imaged as described in EXAMPLE 4 and EXAMPLE 1.
  • All cells were transfected with 2 mRNAs encoding the respective TALEN monomers (one targeting a top strand of the target DNA genomic locus and the second targeting a bottom strand of the target DNA genomic locus). Cells were sampled for evaluation of p53BPl foci 24 hours post-transfection.
  • FIG. 7 shows the results of control transfection and AAVS1-targeting TALEN transfection in various cell types.
  • FIG. 7A shows the number of p53BP1 foci in adherent immortalized A549 cells transfected with a control and with an AAVS1-targeting TALEN 24 hours post-transfection.
  • FIG. 7B shows the number of p53BP1 foci in suspension
  • FIG. 7C shows the number of p53BP1 foci in primary CD34+ progenitor cells transfected with a control and with an AAVS1-targeting TALEN 24 hours post-transfection.
  • FIG. 7D shows the number of p53BP1 foci in primary CD4+ T cells transfected with a control and with an AAVS1-targeting TALEN 24 hours post-transfection.
  • FIG. 7E shows representative images of cells treated with AAVS1 TALENs versus untreated controls. Cells were stained for p53BP1 with an antibody and are visualized in green.
  • TALENs were stained with a FLAG tag and are visualized in red. Nuclei were stained with DAPI and are visualized in grey. The scale bar indicates a size of 5 µm.
  • TABLE 6 shows the gene editing efficiency of AAVS1-targeting TALENs in A549 cells, K562 cells, CD34+ cells, and CD4+ T cells.
  • This example illustrates the utility of p53BP1 analysis for pan-nuclease type assessment of genome editing specificity.
  • TALENs and Cas9 nucleases targeting the AAVS1 genomic locus were transfected in K562 cells.
  • K562 cells were transfected with Cas9 protein along with AAVS1-targeting guide RNAs and incubated at 37°C for 24 hours prior to sampling.
  • K562 cells were transfected with 2 mRNAs encoding the respective TALEN monomers (one targeting a top strand of the target DNA genomic locus and the second targeting a bottom strand of the target DNA genomic locus) and incubated at 30°C for 24 hours prior to sampling.
  • Cells were transfected as described in EXAMPLE 2 and p53BP1 was stained for and imaged as described in EXAMPLE 4 and EXAMPLE 1.
  • FIG. 8 illustrates assessment of nuclease specificity in K562 cells for TALENs and Cas9 nucleases targeting the AAVS1 genomic locus.
  • FIG. 8A illustrates the number of p53BP1 foci per cell for K562 cells transfected with Cas9 protein along with AAVS1 guide RNAs as compared to a blank transfection control.
  • FIG. 8B illustrates the number of p53BP1 foci per cell for K562 cells transfected with AAVS1-targeting TALENs as compared to a blank transfection control.
  • TABLE 7 below shows the editing efficiency of AAVS1-targeting Cas9 and AAVS1-targeting TALENs.
  • This example illustrates the utility of p53BP1 analysis for assessing nuclease activity in diverse cell types targeting various genomic loci.
  • Nuclease specificity as determined by p53BP1 analysis can be used to screen multiple nucleases in diverse cell types.
  • The performance of TALENs targeting GA6, AAVS1, and GA7 in CD34+ progenitor cells and the performance of TALENs targeting TP150, AAVS1, and TP171 in stimulated CD4+ T cells were evaluated.
  • Non-T cells were transfected as described in EXAMPLE 2.
  • T cells were transfected as described in EXAMPLE 3.
  • p53BP1 was stained for and imaged as described in EXAMPLE 4 and EXAMPLE 1.
  • FIG. 9 shows the DNA damage response, as measured by p53BP1 foci quantification, in CD34+ cells and T cells with TALENs targeting various genomic loci.
  • FIG. 9A shows the number of p53BP1 foci per cell in primary CD34+ progenitor cells after transfection with GA6-targeting TALENs, AAVS1-targeting TALENs, GA7-targeting TALENs, GA6-EK-targeting TALENs, and GA7-targeting TALENs. Controls include blank transfection controls.
  • Controls include non-electroporated naive T cells, non-electroporated stimulated T cells, and untreated blank transfection control stimulated T cells.
  • TABLE 8 shows the editing efficiency of several TALENs targeting different genomic loci after transfection of primary CD34+ progenitor cells.
  • This example illustrates the use of p53BP1 analysis for improving nuclease design. Specificity was assessed using the p53BP1 tools and methods of analysis of the present disclosure to evaluate different designs of nucleases targeting the same genomic locus. Non-T cells were transfected as described in EXAMPLE 2 and p53BP1 was stained for and imaged as described in EXAMPLE 4 and EXAMPLE 1.
  • K562 cells were transfected with GA6-targeting TALENs having homodimeric FokI nuclease domains (GA6) or GA6-targeting TALENs with the obligate heterodimeric ELD/KKR FokI nuclease domains (GA6 EK).
  • ELD FokI has a sequence of
  • FIG. 12 shows the number of p53BP1 foci per cell in K562 cells transfected with GA6 or GA6 EK TALENs.
  • TABLE 11 below shows the genome editing efficiency of GA6 and GA6 EK.
  • The p53BP1 tools and methods of analysis of the present disclosure were used to evaluate the contribution of individual components of a nuclease.
  • The specificity of individual monomers of the GA6 TALEN (GA6 L (left TALEN) and GA6 R (right TALEN)) was measured in K562 cells and compared to GA6 homodimers (GA6 LR (left and right TALENs)) and a blank transfection control.
  • Cells were transfected with mRNA encoding either GA6 L, GA6 R, or both GA6 L + GA6 R (GA6 LR) and incubated at 30°C for 24 hours prior to sampling.
  • Nuclease performance was optimized by varying the length of the DNA binding domain in a homodimeric FokI GA6-targeting TALEN.
  • The GA6 L monomer appeared responsible for the lack of specificity and high number of p53BP1 foci per cell, as shown in FIG. 11.
  • The DNA binding domain was extended from 14 repeat units (GA6 L14) to 17 repeat units (GA6 L17) and 19 repeat units (GA6 L19).
  • FIG. 10 shows the number of p53BP1 foci per cell in K562 cells transfected with GA6 L14, GA6 L17, and GA6 L19.
  • TABLE 12 below shows the nuclease designs including the left TALEN arm (bold), the right TALEN arm (italics), and the target sequence (underlined).
  • This example illustrates multiplexed p53BP1, FLAG, and Nano-FISH staining and analysis and the use of p53BP1 analysis and Nano-FISH to dissect on-target and off-target activity of nucleases for genome editing.
  • Nuclease specificity was assessed in a site-specific manner at the genomic locus of interest by imaging and analyzing nuclease (tagged with FLAG) induced double strand breaks (indicated by staining for p53BP1) at a particular genomic locus of interest, which is visualized by oligonucleotide Nano-FISH probe sets.
  • Cell Preparation. Cells were prepared for co-staining by seeding onto poly-L-lysine coated #1.5 glass coverslips (12 mm round or 18 mm square). Coverslips were placed into each well of a 6-well tissue culture plate, and cells were prewashed with PBS and resuspended to ~2,000,000 cells/mL in PBS. Cells were spotted (50-100 µL) onto the center of each coverslip and allowed to settle for 10-15 minutes at room temperature. Cells were fixed in 2 mL/well of fresh fixative (4% formaldehyde in 1x PBS) and incubated for 10 minutes at room temperature. Cells were washed twice with 3 mL/well of 1x PBS, each over 5 minutes.
  • Cells were permeabilized in 2 mL/well of 0.5% Triton X-100, 1x PBS for 15 minutes at room temperature, washed twice with 3 mL/well of 1x PBS for 5 minutes each, incubated with 1.5 mL/well of 0.1 M HCl for 4 minutes at room temperature, and washed twice with 3 mL/well of 2x SSC over 5 minutes. Cells were incubated in 1.5 mL/well of 2x SSC + 25 µg/mL RNase A for 30 minutes at 37°C, then washed twice with 3 mL/well of 2x SSC for 5 minutes each. Finally, cells were pre-equilibrated with 1.5 mL/well of 50% formamide, 2x SSC [pH 7] for at least 30 minutes at room temperature prior to denaturation.
  • Denaturation/Hybridization. Denaturation solution (70% formamide, 2x SSC) was added at 3 mL/well in a new 6-well plate and the well-plate was heated for at least 30 minutes on a hotplate set to 78°C. Denaturation was carried out as follows. Coverslips were transferred into the well plate with preheated denaturation solution and incubated for 4.5 minutes at 78°C, then immediately transferred onto hybridization solution. All subsequent steps were carried out so that samples were protected from light.
  • Hybridization solution with oligonucleotide Nano-FISH probes was prepared as follows.
  • Oligonucleotide Nano-FISH probes at a concentration of 10 µM were diluted in hybridization buffer at a ratio of 1:40, such that the final concentration was 250 nM.
  • Oligonucleotide Nano-FISH probes were synthesized to include the Quasar-670 dye, which was imaged in the Cy5 channel.
  • A humidified chamber was set up by placing a sheet of Parafilm onto a wet paper towel inside a dark plastic container. On a sheet of Parafilm, hybridization solution was spotted at a volume of 80 µL. Hybridization was carried out by removing coverslips from the denaturation solution, inverting onto hybridization solution spots inside the humidified chamber, and incubating overnight at 37°C.
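The probe dilution described above (10 µM stock diluted 1:40 to a final 250 nM) implies that a 1:N ratio means one part stock in N parts total volume. The sketch below checks that arithmetic and computes how much probe stock goes into one 80 µL hybridization spot. The function names are hypothetical, and applying the same 1:N convention to the antibody dilutions elsewhere in this example would be an assumption.

```python
def final_concentration_nm(stock_nm, dilution_factor):
    """Final concentration after a 1:N dilution (one part stock in N parts total)."""
    if dilution_factor < 1:
        raise ValueError("dilution factor must be at least 1")
    return stock_nm / dilution_factor


def stock_volume_ul(final_volume_ul, dilution_factor):
    """Volume of stock needed to prepare `final_volume_ul` at a 1:N dilution."""
    if dilution_factor < 1:
        raise ValueError("dilution factor must be at least 1")
    return final_volume_ul / dilution_factor


probe_stock_nm = 10_000  # 10 µM expressed in nM
print(final_concentration_nm(probe_stock_nm, 40))  # nM in the hybridization solution
print(stock_volume_ul(80, 40))                     # µL of probe stock per 80 µL spot
```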
  • Post-hybridization washes. Coverslips were transferred from the humidified chamber into a new 6-well plate filled with 3 mL/well of 2x SSC and the plate was gently rocked to mix the remaining hybridization solution with SSC. SSC was aspirated and cells were washed with 3 mL/well of 2x SSC three times, each for 10 minutes, at room
  • Blocking buffer was prepared containing 2% BSA (from 10% BSA/PBS), 0.05% Tween-20, 1x PBS. Cells were blocked with 1.5 mL/well of blocking buffer in a 6-well plate for 30 minutes at room temperature. Primary antibody incubation was carried out by first diluting the primary antibody in blocking buffer at the following ratios: 1:500 for anti-p53BP1, 1:2000 for anti-FLAG. A humidified chamber was prepared and, on a sheet of Parafilm inside the humidified chamber, 100 µL spots of primary antibody solution were placed. Coverslips were removed from the 6-well plate, inverted onto primary antibody spots, and incubated for 2 hours at room
  • Secondary antibody incubation was carried out by first diluting secondary antibodies (donkey-anti-rabbit-AF488 and donkey-anti-mouse-AF594) in blocking buffer at a ratio of 1:500. On a new sheet of Parafilm inside the humidified chamber, secondary antibody solution was spotted at a volume of 100 µL. Coverslips were removed from the 6-well plate, inverted onto the secondary antibody spots, and incubated for 1.5 hours at room temperature. Coverslips were returned into the original 6-well plate and cells were washed three times with 3 mL/well of 1x PBS for 5 minutes each.
  • Cells were stained with DAPI to visualize the nucleus by incubating cells in 1.5 mL/well of 1x PBS + 100 ng/mL DAPI for 10 minutes at room temperature and cells were washed once with 1x PBS.
  • The Nano-FISH imaging methods and p53BP1 imaging disclosed herein allow for in situ visualization of on-target versus off-target nuclease cutting activity.
  • Fluorophore-conjugated oligonucleotide Nano-FISH probes were designed to hybridize to a target DNA genomic locus of interest.
  • K562 cells were transfected with AAVS1-targeting TALEN for 24 hours as described in EXAMPLE 2.
  • A fluorescently labeled Nano-FISH oligonucleotide probe was allowed to hybridize to the AAVS1 genomic locus in K562 cells and cells were additionally stained for p53BP1, as described above.
  • FIG. 13 shows fluorescence microscopy images of control cells and AAVS1-targeting TALEN treated cells.
  • A DAPI stain (gray) was used to visualize nuclei, p53BP1 is shown in green, and the AAVS1 oligonucleotide Nano-FISH probe was visualized in red.
  • Imaging showed that in cells transfected with AAVS1-targeting TALEN, spots indicative of double stranded breaks (indicated by p53BP1 foci) co-localized with AAVS1 oligonucleotide Nano-FISH probe spots. These results showed that the AAVS1-targeting TALEN exhibited nuclease specificity, as confirmed by co-localization of DNA repair signals at the genomic locus of interest.
  • FIG. 14 shows histograms of the proportion of pairwise distances between AAVS1 Nano-FISH spots and p53BP1 foci.
  • FIG. 14A shows histograms of control and AAVS1 TALEN treated cells at pairwise distances of 0.1 to 0.5.
  • FIG. 14B shows histograms of control and AAVS1 TALEN treated cells at pairwise distances of 0 to 0.025.
  • FIG. 14C shows histograms of control and AAVS1 TALEN treated cells at pairwise distances of 0 to ~0.08.
  • Histograms showed significantly higher co-localization between AAVS1 loci and sites of DNA repair in TALEN-treated cells relative to untreated control cells.
  • The combination of Nano-FISH and p53BP1 foci visualization enables the measurement of off-target activity (the number of p53BP1 foci not co-localized with their target genomic loci).
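The co-localization analysis described above can be sketched as a pairwise-distance computation between p53BP1 focus centroids and Nano-FISH spot centroids, followed by a split of foci into on-target (near a probe spot) and off-target (not near any probe spot). The distance cutoff, the coordinate units, the function names, and the toy coordinates are all hypothetical; the disclosure does not state the cutoff used.

```python
import math


def pairwise_distances(foci, loci):
    """Matrix of Euclidean distances, one row per focus and one column per locus."""
    return [[math.dist(focus, locus) for locus in loci] for focus in foci]


def split_on_off_target(foci, loci, cutoff):
    """Classify each focus by the distance to its nearest probe spot."""
    on_target, off_target = [], []
    for focus in foci:
        nearest = min((math.dist(focus, locus) for locus in loci),
                      default=math.inf)
        if nearest <= cutoff:
            on_target.append(focus)
        else:
            off_target.append(focus)
    return on_target, off_target


# Hypothetical 3D centroids (arbitrary units) for one nucleus.
p53bp1_foci = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0), (5.0, 5.0, 5.0)]
aavs1_spots = [(0.0, 0.0, 0.01), (1.0, 1.0, 1.02)]
on, off = split_on_off_target(p53bp1_foci, aavs1_spots, cutoff=0.05)
print(len(on), len(off))
```

Pooling the rows of `pairwise_distances` over many cells gives the values that would be binned into distance histograms of the kind shown in FIG. 14.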
  • This example illustrates the use of p53BP1 analysis for diverse micro imaging platforms and small cell samples. Nuclease specificity has also been determined using the compositions and methods described herein on several types of imaging platforms and in smaller sample sizes. Samples were imaged using a Nikon microscope or the Stellar-Vision microscope, as described in EXAMPLE 1.
  • FIG. 15 shows evaluation of nuclease specificity by counting p53BP1 foci in cells transfected with AAVS1-targeting TALENs.
  • FIG. 15A illustrates the number of p53BP1 foci on the x-axis versus the proportion of cells with p53BP1 foci on the y-axis in cells transfected with AAVS1-targeting TALENs and imaged, in 3D, on a Nikon widefield fluorescence microscope with a 60x magnification lens using oil immersion contact techniques.
  • “Ref” samples indicate control cells that were not transfected with TALENs.
  • Biological replicates are shown for control and transfected cells (indicated by set x). The number of cells analyzed in each sample is indicated by “n.”
  • FIG. 15B illustrates the number of p53BP1 foci on the x-axis versus the proportion of cells with p53BP1 foci on the y-axis in cells transfected with AAVS1-targeting TALENs and imaged, in 3D, on a Nikon widefield fluorescence microscope with a 40x magnification lens using non-contact techniques.
  • “Ref” samples indicate control cells that were not transfected with TALENs.
  • Biological replicates are shown for control and transfected cells. The number of cells analyzed in each sample is indicated by “n.”
  • FIG. 15C illustrates the number of p53BP1 foci on the x-axis versus the proportion of cells with p53BP1 foci on the y-axis in cells transfected with AAVS1-targeting TALENs and imaged on a Stellar-Vision (SV) fluorescence microscope using non-contact techniques.
  • “Ref” samples indicate control cells that were not transfected with TALENs. Biological replicates are shown for control and transfected cells. The number of cells analyzed in each sample is indicated by “n.”
  • TABLE 14 below shows p-values from several statistical tests, including a t-test, Kolmogorov-Smirnov (KS) test, and Wilcoxon-smith (WS) test, comparing p53BP1 spots in transfected cells and control cells.
  • TABLE 15 below shows p-values from a t-test comparing p53BP1 spots in transfected cells and control cells for different sample sizes. The results below show a high degree of statistical significance even when analyzing a small number of cells across all imaging modalities. These results demonstrated the utility of using p53BP1 analysis for clinically relevant applications that involve the use of small sample sizes to screen nucleases for lead candidates.
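  • As an illustration of how per-cell foci counts from control and transfected samples can be compared, the sketch below computes the two-sample Kolmogorov-Smirnov statistic (the largest gap between the two empirical cumulative distributions) in plain Python. The counts are invented for illustration, the function name is hypothetical, and no p-value is computed; a statistics package would normally be used for that step.

```python
def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: maximum gap between the empirical CDFs."""
    values = sorted(set(sample_a) | set(sample_b))
    n_a, n_b = len(sample_a), len(sample_b)
    d_max = 0.0
    for v in values:
        # Fraction of each sample at or below the current value.
        cdf_a = sum(1 for x in sample_a if x <= v) / n_a
        cdf_b = sum(1 for x in sample_b if x <= v) / n_b
        d_max = max(d_max, abs(cdf_a - cdf_b))
    return d_max

# Hypothetical p53BP1 foci counts per cell (small samples, 8 cells each).
control = [0, 0, 1, 0, 1, 2, 0, 1]
treated = [2, 3, 4, 2, 5, 3, 6, 4]

d = ks_statistic(control, treated)
```

  • A statistic near 1 means the two distributions barely overlap, and a statistic of 0 means they are identical. Whether a given value is significant depends on the sample sizes.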
  • This example illustrates screening of nucleases for a nuclease with high specificity using the compositions and methods disclosed herein for staining, imaging, and analyzing a protein (e.g., p53BP1) that accumulates at the site of a double strand break.
  • Nucleases of various types (e.g., TALENs, Cas9) are screened for nuclease specificity in immortalized cells (e.g., K562, A549) and primary cells (e.g., CD34+ progenitor cells, naive or stimulated T cells). Nucleases are transfected in immortalized or primary cells, as described in
  • Cells are stained for p53BP1 using the methods as set forth in EXAMPLE 4. Imaging, image analysis, and informatics are carried out using the methods set forth in EXAMPLE 1.
  • p53BP1 foci are automatically counted and plotted against a parameter of interest for each nuclease (dose of nuclease, RVD length, etc.).
  • Nuclease specificity is assessed for each nuclease tested by quantifying the total p53BP1 load (e.g., number of protein foci or total protein content within the nucleus).
  • A high p53BP1 load indicates nucleases with relatively poor specificity.
  • A lower p53BP1 load indicates nucleases with better specificity.
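  • The ranking step of such a screen can be sketched as follows, assuming the p53BP1 load of each nuclease is summarized as the mean number of foci per cell. The nuclease names and counts are placeholders, and a real screen would also compare each candidate against untransfected controls.

```python
from statistics import mean

def rank_by_p53bp1_load(foci_counts_by_nuclease):
    """Return nuclease names ordered from lowest to highest mean foci per cell."""
    loads = {name: mean(counts) for name, counts in foci_counts_by_nuclease.items()}
    return sorted(loads, key=loads.get)

# Hypothetical per-cell foci counts for three candidate nucleases.
counts = {
    "nuclease_A": [1, 2, 1, 0],
    "nuclease_B": [5, 7, 6, 4],
    "nuclease_C": [2, 3, 2, 3],
}

ranking = rank_by_p53bp1_load(counts)
```

  • Candidates at the front of the ranking carry the lowest load and, under the reasoning above, are the more specific ones.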
  • A genome editing complex comprising a nuclease (e.g., TALENs, zinc finger nucleases (ZFNs), or CRISPR/Cas9) targeting a therapeutic gene of interest for genome editing is transfected in immortalized or primary cells as set forth in EXAMPLE 2 or EXAMPLE 3.
  • The nuclease induces double stranded breaks.
  • Cells are stained and analyzed as described in EXAMPLE 10 with an oligonucleotide Nano-FISH probe set for the particular genomic locus of the therapeutic gene of interest and for p53BP1, indicative of double strand breaks induced by the nuclease.
  • Cells are imaged and analyzed as described in EXAMPLE 1. Co-localization of oligonucleotide Nano-FISH probes and all double strand breaks is observed, indicating a nuclease with high specificity and no off-target activity.
  • This example illustrates screening of repressors for a repressor with high specificity using the compositions and methods disclosed herein for staining, imaging, and analyzing a protein (e.g., KAP1, H3K9me3 or HP1) that accumulates at the site of repression (e.g., by KRAB).
  • Repressors of various types (e.g., KRAB, Sin3a, LSD1, SUV39H1, G9A (EHMT2), DNMT1, DNMT3A-DNMT3L, DNMT3B, KOX, TGF-beta-inducible early gene (TIEG), v-erbA, SID, MBD2, MBD3, Rb, or MeCP2) are screened for specificity in immortalized cells (e.g., K562, A549) and primary cells (e.g., CD34+ progenitor cells, naive or stimulated T cells).
  • Repressors coupled to a binding domain are transfected in immortalized or primary cells, as described in EXAMPLE 2 or EXAMPLE 3.
  • Cells are stained for a protein (e.g., KAP1) using the methods as set forth in EXAMPLE 4 with antibodies specific to the protein. Imaging, image analysis, and informatics are carried out using the methods set forth in EXAMPLE 1.
  • Protein (e.g., KAP1) foci are automatically counted and plotted against a parameter of interest for each repressor (e.g., dose of repressor, RVD length, etc.).
  • Repressor specificity is assessed for each repressor tested by counting protein (e.g., KAP1) foci.
  • A high number of protein (e.g., KAP1) foci indicates repressors with relatively low specificity.
  • A lower number of protein (e.g., KAP1) foci indicates repressors with better specificity.
  • Site-specific detection of proteins such as H3K9me3 or HP1 can be confirmed by combination imaging with Nano-FISH, as described in EXAMPLE 10.
  • a genome editing complex e.g., TALEN, ZFN, CRISPR/Cas9, megaTAL, meganuclease
  • EXAMPLE 2 or EXAMPLE 3. Cells are stained for p53BP1 as described in EXAMPLE 4 with a first detectable agent and subsequently administered an oligonucleotide Nano-FISH probe set with a second detectable agent for the target genomic locus and a different oligonucleotide Nano-FISH probe set with a third detectable agent for an off-target genomic locus.
  • Foci of p53BP1 are visualized by signal from the first detectable agent, indicating a double strand break and gene editing with the genome editing complex.
  • Foci of the first oligonucleotide Nano-FISH probe set are visualized by signal from the second detectable agent, indicating the target genomic locus.
  • Foci of the second oligonucleotide Nano-FISH probe set are visualized by signal from the third detectable agent, indicating the off-target genomic locus.
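  • One way to read the three channels together is to assign each p53BP1 focus to whichever probe set lies closest, provided it is within a distance threshold. The sketch below does this under the assumption that all three channels have been reduced to centroid coordinates in one coordinate frame. The labels, threshold, and coordinates are hypothetical.

```python
import math

def classify_foci(foci, target_spots, off_target_spots, threshold):
    """Label each p53BP1 focus by the nearest probe set within `threshold`."""
    labels = []
    for focus in foci:
        # `default=math.inf` covers nuclei where a probe set gave no spots.
        d_target = min((math.dist(focus, s) for s in target_spots), default=math.inf)
        d_off = min((math.dist(focus, s) for s in off_target_spots), default=math.inf)
        if min(d_target, d_off) > threshold:
            labels.append("unassigned")   # break at neither probed locus
        elif d_target <= d_off:
            labels.append("target")       # break at the intended locus
        else:
            labels.append("off_target")   # break at the probed off-target locus
    return labels

# Hypothetical centroids (arbitrary units) from the three detectable agents.
foci = [(0.0, 0.0, 0.0), (5.0, 5.0, 1.0), (9.0, 1.0, 2.0)]
target_spots = [(0.05, 0.0, 0.0)]
off_target_spots = [(5.0, 5.05, 1.0)]

labels = classify_foci(foci, target_spots, off_target_spots, threshold=0.1)
```

  • Foci labeled "unassigned" would correspond to breaks at loci that neither probe set covers.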
  • This example illustrates determining specificity of genome editing with a transthyretin (TTR)-targeting nuclease.
  • A genome editing complex (e.g., TALEN, ZFN, CRISPR/Cas9, megaTAL, meganuclease)
  • The nuclease induces double stranded breaks.
  • Cells are stained and analyzed as described in EXAMPLE 11 with an oligonucleotide Nano-FISH probe set for TTR and for p53BP1, indicative of double strand breaks induced by the nuclease.
  • Cells are imaged and analyzed as described in EXAMPLE 1.
  • Co-localization of signal from oligonucleotide Nano-FISH probes and p53BP1 is quantified to determine the specificity of the nuclease for TTR and any off-target activity of the nuclease.
  • A nuclease with high specificity for TTR and low to no off-target activity is administered to a subject in need thereof. The subject has transthyretin amyloidosis (ATTR).
  • This example illustrates determining specificity of genome editing with a CCR5-targeting nuclease.
  • A genome editing complex (e.g., TALEN, ZFN, CRISPR/Cas9, megaTAL, meganuclease)
  • The nuclease induces double stranded breaks.
  • Cells are stained and analyzed as described in EXAMPLE 11 with an oligonucleotide Nano-FISH probe set for CCR5 and for p53BP1, indicative of double strand breaks induced by the nuclease. Cells are imaged and analyzed as described in EXAMPLE 1.
  • Co-localization of signal from oligonucleotide Nano-FISH probes and p53BP1 is quantified to determine the specificity of the nuclease for CCR5 and any off-target activity of the nuclease.
  • A nuclease with high specificity for CCR5 and low to no off-target activity is administered to a subject in need thereof. The subject has HIV.
  • This example illustrates determining specificity of genome editing with a glucocorticoid receptor (NR3C1)-targeting nuclease.
  • A genome editing complex (e.g., TALEN, ZFN, CRISPR/Cas9, megaTAL, meganuclease) targeting NR3C1 is transfected in immortalized or primary cells as set forth in EXAMPLE 2 or EXAMPLE 3.
  • The nuclease induces double stranded breaks.
  • Cells are stained and analyzed as described in EXAMPLE 11 with an oligonucleotide Nano-FISH probe set for NR3C1 and for p53BP1, indicative of double strand breaks induced by the nuclease. Cells are imaged and analyzed as described in EXAMPLE 1.
  • Co-localization of signal from oligonucleotide Nano-FISH probes and p53BP1 is quantified to determine the specificity of the nuclease for NR3C1 and any off-target activity of the nuclease.
  • A nuclease with high specificity for NR3C1 and low to no off-target activity is administered to a subject in need thereof. The subject has glioblastoma multiforme.
  • This example illustrates determining specificity of genome editing with a TRA-targeting nuclease and/or a CD52-targeting nuclease.
  • A genome editing complex (e.g., TALEN, ZFN, CRISPR/Cas9, megaTAL, meganuclease)
  • The nuclease induces double stranded breaks.
  • Cells are stained and analyzed as described in EXAMPLE 1 with an oligonucleotide Nano-FISH probe set for TRA and/or CD52 and for p53BP1, indicative of double strand breaks induced by the nuclease.
  • Cells are imaged and analyzed as described in EXAMPLE 1.
  • Co-localization of signal from oligonucleotide Nano-FISH probes and p53BP1 is quantified to determine the specificity of the nuclease for TRA and/or CD52 and any off-target activity of the nuclease.
  • A nuclease with high specificity for TRA and/or CD52 and low to no off-target activity is administered to cells ex vivo to generate a universal T cell therapy, to be administered to a subject in need thereof.
  • The subject has a cancer, such as acute lymphoblastic leukemia or acute myeloid leukemia.
  • This example illustrates determining specificity of genome editing with a nuclease targeting the erythroid specific enhancer of BCL11A.
  • A genome editing complex (e.g., TALEN, ZFN, CRISPR/Cas9, megaTAL, meganuclease) targeting the erythroid specific enhancer of BCL11A is transfected in immortalized or primary cells as set forth in
  • The nuclease induces double stranded breaks.
  • Cells are stained and analyzed as described in EXAMPLE 11 with an oligonucleotide Nano-FISH probe set for the erythroid specific enhancer of BCL11A and for p53BP1, indicative of double strand breaks induced by the nuclease.
  • Cells are imaged and analyzed as described in EXAMPLE 1.
  • Co-localization of signal from oligonucleotide Nano-FISH probes and p53BP1 is quantified to determine the specificity of the nuclease for the erythroid specific enhancer of BCL11A and any off-target activity of the nuclease.
  • A nuclease with high specificity for the erythroid specific enhancer of BCL11A and low to no off-target activity is used to engineer hematopoietic stem cells ex vivo, to be administered to a subject in need thereof.
  • The subject has beta-thalassemia or sickle cell disease.
  • This example illustrates determining specificity of genome editing with a nuclease disclosed herein to insert alpha-L-iduronidase (IDUA).
  • A genome editing complex (e.g., TALEN, ZFN, CRISPR/Cas9, megaTAL, meganuclease)
  • The nuclease induces double stranded breaks to insert a functional IDUA gene.
  • Cells are stained and analyzed as described in EXAMPLE 11 with an oligonucleotide Nano-FISH probe set for IDUA and for p53BP1, indicative of double strand breaks induced by the nuclease.
  • Cells are imaged and analyzed as described in EXAMPLE 1.
  • Co-localization of signal from oligonucleotide Nano-FISH probes and p53BP1 is quantified to determine the specificity of the nuclease and any off-target activity of the nuclease.
  • A nuclease with high specificity and low to no off-target activity is administered to a subject in need thereof. The subject has MPS I.
  • This example illustrates determining specificity of genome editing with a nuclease disclosed herein to insert iduronate-2-sulfatase (IDS).
  • A genome editing complex (e.g., TALEN, ZFN, CRISPR/Cas9, megaTAL, meganuclease)
  • The nuclease induces double stranded breaks to insert a functional IDS gene.
  • Cells are stained and analyzed as described in EXAMPLE 11 with an oligonucleotide Nano-FISH probe set for IDS and for p53BP1, indicative of double strand breaks induced by the nuclease.
  • Cells are imaged and analyzed as described in EXAMPLE 1.
  • Co-localization of signal from oligonucleotide Nano-FISH probes and p53BP1 is quantified to determine the specificity of the nuclease and any off-target activity of the nuclease.
  • A nuclease with high specificity and low to no off-target activity is administered to a subject in need thereof. The subject has MPS II.
  • This example illustrates determining specificity of genome editing with a nuclease to insert Factor IX.
  • A genome editing complex (e.g., TALEN, ZFN, CRISPR/Cas9, megaTAL, meganuclease) targeting a desired genomic locus for insertion of an ectopic nucleic acid encoding Factor 9 is transfected in immortalized or primary cells as set forth in
  • The nuclease induces double stranded breaks to insert a functional Factor 9 gene.
  • Cells are stained and analyzed as described in EXAMPLE 11 with an oligonucleotide Nano-FISH probe set for Factor 9 and for p53BP1, indicative of double strand breaks induced by the nuclease.
  • Cells are imaged and analyzed as described in EXAMPLE 1.
  • Co-localization of signal from oligonucleotide Nano-FISH probes and p53BP1 is quantified to determine the specificity of the nuclease and any off-target activity of the nuclease.
  • A nuclease with high specificity and low to no off-target activity is administered to a subject in need thereof. The subject has Hemophilia B.
  • This example illustrates determining specificity of genome editing with a PDCD1-targeting nuclease, a TRA-targeting nuclease, and/or a TRB-targeting nuclease.
  • A genome editing complex (e.g., TALEN, ZFN, CRISPR/Cas9, megaTAL, meganuclease) targeting PDCD1, TRA, and/or TRB is transfected in immortalized or primary cells as set forth in
  • The nuclease induces double stranded breaks.
  • Cells are stained and analyzed as described in EXAMPLE 11 with an oligonucleotide Nano-FISH probe set for PDCD1, TRA, and/or TRB and for p53BP1, indicative of double strand breaks induced by the nuclease.
  • Cells are imaged and analyzed as described in EXAMPLE 1.
  • Co-localization of signal from oligonucleotide Nano-FISH probes and p53BP1 is quantified to determine the specificity of the nuclease for PDCD1, TRA, and/or TRB and any off-target activity of the nuclease.
  • A nuclease with high specificity for PDCD1, TRA, and/or TRB and low to no off-target activity is used to engineer CAR T cells ex vivo, to be administered to a subject in need thereof.
  • The subject has cancer, such as multiple myeloma, melanoma, or sarcoma.
  • This example illustrates determining specificity of genome editing with a TRA-targeting nuclease, a TRB-targeting nuclease, and/or a CS-1-targeting nuclease.
  • A genome editing complex (e.g., TALEN, ZFN, CRISPR/Cas9, megaTAL, meganuclease) targeting TRA, TRB, and/or CS-1 is transfected in immortalized or primary cells as set forth in
  • The nuclease induces double stranded breaks.
  • Cells are stained and analyzed as described in EXAMPLE 11 with an oligonucleotide Nano-FISH probe set for TRA, TRB, and/or CS-1 and for p53BP1, indicative of double strand breaks induced by the nuclease.
  • Cells are imaged and analyzed as described in EXAMPLE 1.
  • Co-localization of signal from oligonucleotide Nano-FISH probes and p53BP1 is quantified to determine the specificity of the nuclease for TRA, TRB, and/or CS-1 and any off-target activity of the nuclease.
  • A nuclease with high specificity for TRA, TRB, and/or CS-1 and low to no off-target activity is used to engineer CAR T cells ex vivo, to be administered to a subject in need thereof.
  • The subject has cancer, such as multiple myeloma.
  • This example illustrates determining specificity of genome editing with a TRA-targeting nuclease and/or a TRB-targeting nuclease.
  • A genome editing complex (e.g., TALEN, ZFN, CRISPR/Cas9, megaTAL, meganuclease) targeting TRA and/or TRB is transfected in immortalized or primary cells as set forth in EXAMPLE 2 or EXAMPLE 3.
  • The nuclease induces double stranded breaks.
  • Cells are stained and analyzed as described in EXAMPLE 11 with an oligonucleotide Nano-FISH probe set for TRA and/or TRB and for p53BP1, indicative of double strand breaks induced by the nuclease. Cells are imaged and analyzed as described in EXAMPLE 1.
  • Co-localization of signal from oligonucleotide Nano-FISH probes and p53BP1 is quantified to determine the specificity of the nuclease for TRA and/or TRB and any off-target activity of the nuclease.
  • A nuclease with high specificity for TRA and/or TRB and low to no off-target activity is used to engineer CAR T cells ex vivo, to be administered to a subject in need thereof.
  • The subject has cancer, such as acute lymphoblastic leukemia.
  • This example illustrates determining specificity of genome editing with a CEP290-targeting nuclease.
  • A genome editing complex (e.g., TALEN, ZFN, CRISPR/Cas9, megaTAL, meganuclease)
  • The nuclease induces double stranded breaks.
  • Cells are stained and analyzed as described in EXAMPLE 11 with an oligonucleotide Nano-FISH probe set for CEP290 and for p53BP1, indicative of double strand breaks induced by the nuclease. Cells are imaged and analyzed as described in EXAMPLE 1.
  • Co-localization of signal from oligonucleotide Nano-FISH probes and p53BP1 is quantified to determine the specificity of the nuclease for CEP290 and any off-target activity of the nuclease.
  • A nuclease with high specificity for CEP290 and low to no off-target activity is administered to a subject in need thereof. The subject has Leber congenital amaurosis (LCA10).
  • This example illustrates determining specificity of genome editing with a TRA-targeting nuclease, a TRB-targeting nuclease, and/or a B2M-targeting nuclease.
  • A genome editing complex (e.g., TALEN, ZFN, CRISPR/Cas9, megaTAL, meganuclease) targeting TRA, TRB, and/or B2M is transfected in immortalized or primary cells as set forth in
  • The nuclease induces double stranded breaks.
  • Cells are stained and analyzed as described in EXAMPLE 11 with an oligonucleotide Nano-FISH probe set for TRA, TRB, and/or B2M and for p53BP1, indicative of double strand breaks induced by the nuclease.
  • Cells are imaged and analyzed as described in EXAMPLE 1.
  • Co-localization of signal from oligonucleotide Nano-FISH probes and p53BP1 is quantified to determine the specificity of the nuclease for TRA, TRB, and/or B2M and any off-target activity of the nuclease.
  • A nuclease with high specificity for TRA, TRB, and/or B2M and low to no off-target activity is used to engineer CAR T cells ex vivo, to be administered to a subject in need thereof.
  • The subject has cancer, such as CD19 malignancies or BCMA-related malignancies.
  • This example shows multiplexed p53BP1, FLAG, and Nano-FISH staining and analysis for fine structural analysis of specific genomic loci within the nucleus.
  • Fine structural analysis using Nano-FISH is carried out as follows. For example, probe pools are designed to target a 1.6 kb region of chromosome 19 and a 1.4 kb region of chromosome 18. Distinct spots are produced by Nano-FISH probes targeting specific loci on these chromosomes.
  • The relative radial distance (RRD), a normalized measure of the position of the detected spot with respect to the nuclear centroid
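  • The text does not state how the RRD is normalized. As a minimal sketch, the code below assumes the nucleus is approximated as a sphere and divides the spot-to-centroid distance by the nuclear radius, so that 0 is the centroid and 1 is the nuclear periphery. This normalization and the function name are assumptions made for illustration only.

```python
import math

def relative_radial_distance(spot, centroid, nuclear_radius):
    """Spot-to-centroid distance divided by an assumed spherical nuclear radius."""
    if nuclear_radius <= 0:
        raise ValueError("nuclear_radius must be positive")
    return math.dist(spot, centroid) / nuclear_radius

# Hypothetical spot 5 units from the centroid of a nucleus of radius 10.
rrd = relative_radial_distance((3.0, 4.0, 0.0), (0.0, 0.0, 0.0), 10.0)
```

  • For irregular nuclei, a normalization based on the segmented nuclear boundary along the centroid-to-spot direction would be a natural alternative.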
  • This example shows multiplexed p53BP1, FLAG, and Nano-FISH staining and analysis for examining the interaction of a gene enhancer with its target gene promoter.
  • The positioning of a known enhancer is examined.
  • Nano-FISH probes targeting the enhancer and promoter are designed and synthesized.
  • The normalized inter-spot distance (NID) between two genomic loci is compared.
  • The small size of genomic regions targeted by Nano-FISH permits fine scale localization of regulatory DNA regions and provides a granular view of their spatial localizations within nuclei.
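  • A comparison of NID values across nuclei might look like the sketch below. The normalization is not defined in the text, so the code assumes the enhancer-to-promoter distance is divided by the nuclear diameter. The per-nucleus records and the field names are hypothetical.

```python
import math
from statistics import median

def normalized_inter_spot_distance(spot_a, spot_b, nuclear_diameter):
    """Distance between two spots divided by an assumed nuclear diameter."""
    return math.dist(spot_a, spot_b) / nuclear_diameter

# Hypothetical enhancer and promoter spot centroids in three nuclei.
nuclei = [
    {"enhancer": (1.0, 1.0, 0.0), "promoter": (1.0, 2.0, 0.0), "diameter": 10.0},
    {"enhancer": (4.0, 4.0, 1.0), "promoter": (4.0, 4.0, 3.0), "diameter": 8.0},
    {"enhancer": (2.0, 0.0, 0.0), "promoter": (5.0, 4.0, 0.0), "diameter": 10.0},
]

nids = [
    normalized_inter_spot_distance(n["enhancer"], n["promoter"], n["diameter"])
    for n in nuclei
]
median_nid = median(nids)
```

  • Summarizing with the median keeps a few outlier nuclei from dominating the comparison between cell populations.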
  • This example shows multiplexed p53BP1, FLAG, and Nano-FISH staining and analysis with super-resolution microscopy to obtain very fine-scale genome localization.
  • Fine scale genome localization using Nano-FISH and super-resolution microscopy is carried out as follows.
  • A custom automated stimulated emission and depletion (STED) microscope is utilized to efficiently acquire multiple measurements of the physical distance between the HS2 and HS3 genomic loci, which are separated by 4.1 kb of linear genomic distance.
  • Examination of fine scale genome localization using Nano-FISH is extended to the multiplexed p53BP1, FLAG, and Nano-FISH staining and analysis disclosed herein to examine fine scale genome localization after editing cells with a genome editing complex (e.g., TALEN, ZFN, CRISPR/Cas9, megaTAL,
  • This example illustrates determining specificity of genome editing with a CBLB-targeting nuclease.
  • A genome editing complex (e.g., TALEN, ZFN, CRISPR/Cas9, megaTAL, meganuclease)
  • The nuclease induces double stranded breaks.
  • Cells are stained and analyzed as described in EXAMPLE 11 with an oligonucleotide Nano-FISH probe set for CBLB and for p53BP1, indicative of double strand breaks induced by the nuclease. Cells are imaged and analyzed as described in EXAMPLE 1.
  • Co-localization of signal from oligonucleotide Nano-FISH probes and p53BP1 is quantified to determine the specificity of the nuclease for CBLB and any off-target activity of the nuclease.
  • A nuclease with high specificity for CBLB and low to no off-target activity is used to engineer CAR T cells ex vivo, to be administered to a subject in need thereof. The subject has cancer.
  • This example illustrates determining specificity of genome editing with a TGFbR-targeting nuclease.
  • A genome editing complex (e.g., TALEN, ZFN, CRISPR/Cas9, megaTAL, meganuclease)
  • The nuclease induces double stranded breaks.
  • Cells are stained and analyzed as described in EXAMPLE 11 with an oligonucleotide Nano-FISH probe set for TGFBR and for p53BP1, indicative of double strand breaks induced by the nuclease. Cells are imaged and analyzed as described in EXAMPLE 1.
  • Co-localization of signal from oligonucleotide Nano-FISH probes and p53BP1 is quantified to determine the specificity of the nuclease for TGFBR and any off-target activity of the nuclease.
  • A nuclease with high specificity for TGFBR and low to no off-target activity is used to engineer CAR T cells ex vivo, to be administered to a subject in need thereof. The subject has multiple myeloma.
  • This example illustrates determining specificity of genome editing with a DMD-targeting nuclease.
  • A genome editing complex (e.g., TALEN, ZFN, CRISPR/Cas9, megaTAL, meganuclease)
  • The nuclease induces double stranded breaks.
  • Cells are stained and analyzed as described in EXAMPLE 11 with an oligonucleotide Nano-FISH probe set for DMD and for p53BP1, indicative of double strand breaks induced by the nuclease. Cells are imaged and analyzed as described in EXAMPLE 1.
  • Co-localization of signal from oligonucleotide Nano-FISH probes and p53BP1 is quantified to determine the specificity of the nuclease for DMD and any off-target activity of the nuclease.
  • A nuclease with high specificity for DMD and low to no off-target activity is administered to a subject in need thereof. The subject has Duchenne muscular dystrophy (DMD).
  • A genome editing complex (e.g., TALEN, ZFN, CRISPR/Cas9, megaTAL, meganuclease) targeting CFTR is transfected in immortalized or primary cells as set forth in EXAMPLE 2 or EXAMPLE 3.
  • The nuclease induces double stranded breaks.
  • Cells are stained and analyzed as described in EXAMPLE 11 with an oligonucleotide Nano-FISH probe set for CFTR and for p53BP1, indicative of double strand breaks induced by the nuclease.
  • Cells are imaged and analyzed as described in EXAMPLE 1.
  • Co-localization of signal from oligonucleotide Nano-FISH probes and p53BP1 is quantified to determine the specificity of the nuclease for CFTR and any off-target activity of the nuclease.
  • A nuclease with high specificity for CFTR and low to no off-target activity is administered to a subject in need thereof. The subject has cystic fibrosis.
  • This example illustrates determining specificity of genome editing with a serpinA1-targeting nuclease.
  • A genome editing complex (e.g., TALEN, ZFN, CRISPR/Cas9, megaTAL, meganuclease)
  • The nuclease induces double stranded breaks.
  • Cells are stained and analyzed as described in EXAMPLE 11 with an oligonucleotide Nano-FISH probe set for serpinA1 and for p53BP1, indicative of double strand breaks induced by the nuclease. Cells are imaged and analyzed as described in EXAMPLE 1.
  • Co-localization of signal from oligonucleotide Nano-FISH probes and p53BP1 is quantified to determine the specificity of the nuclease for serpinA1 and any off-target activity of the nuclease.
  • A nuclease with high specificity for serpinA1 and low to no off-target activity is administered to a subject in need thereof. The subject has alpha-1 antitrypsin deficiency (A1AT deficiency).
  • This example illustrates determining specificity of genome editing with an IL2Rg-targeting nuclease.
  • A genome editing complex (e.g., TALEN, ZFN, CRISPR/Cas9, megaTAL, meganuclease) targeting IL2Rg is transfected in immortalized or primary cells as set forth in EXAMPLE 2 or EXAMPLE 3.
  • The nuclease induces double stranded breaks.
  • Cells are stained and analyzed as described in EXAMPLE 11 with an oligonucleotide Nano-FISH probe set for IL2Rg and for p53BP1, indicative of double strand breaks induced by the nuclease.
  • Cells are imaged and analyzed as described in EXAMPLE 1.
  • Co-localization of signal from oligonucleotide Nano-FISH probes and p53BP1 is quantified to determine the specificity of the nuclease for IL2Rg and any off-target activity of the nuclease.
  • A nuclease with high specificity for IL2Rg and low to no off-target activity is administered to a subject in need thereof. The subject has X-linked severe combined immunodeficiency (X-SCID).
  • This example illustrates determining specificity of genome editing with a nuclease targeting HBV genomic DNA in infected cells.
  • A genome editing complex (e.g., TALEN, ZFN, CRISPR/Cas9, megaTAL, meganuclease)
  • The nuclease induces double stranded breaks.
  • Cells are stained and analyzed as described in EXAMPLE 11 with an oligonucleotide Nano-FISH probe set for HBV genomic DNA and for p53BP1, indicative of double strand breaks induced by the nuclease. Cells are imaged and analyzed as described in EXAMPLE 1.
  • Co-localization of signal from oligonucleotide Nano-FISH probes and p53BP1 is quantified to determine the specificity of the nuclease for HBV genomic DNA and any off-target activity of the nuclease.
  • A nuclease with high specificity for HBV genomic DNA and low to no off-target activity is administered to a subject in need thereof. The subject has Hepatitis B.
  • A modular software framework of image processing methods has been developed to quantify the amount and localization of proteins (such as p53BP1) on a per-cell basis in response to a perturbant such as a nuclease.
  • Morphometric data such as foci (spot) count, foci size, foci intensity, overall nuclear expression (load), spatial localization patterns of foci, etc.
  • A generalizable informatics framework of statistical methods to model and analyze the data distributions has also been developed.
  • The informatics framework ultimately yields a numerical estimate (in [0,1] or expressed as a percentage) for the specificity of the nuclease.
  • The framework is depicted in Fig. 18. This framework thus provides an objective route for high throughput screening of nucleases to identify lead nucleases against therapeutically useful genomic targets.
  • Per-cell spot counts for the p53BP1 protein in control and nuclease-treated cells can be modeled and analyzed using the informatics framework detailed in Fig. 18 to yield numerical estimates of the nuclease specificity.
  • The model incorporates parameters to reflect the sensitivity of the protein marker used and the ploidy of the target locus that is being edited.
  • The nuclease-treated cell distribution was normalized relative to the distribution of the control sample, and the fraction of cells with p53BP1 foci above the ploidy of the target genomic locus was computed as the promiscuity of the nuclease.
  • Nuclease specificity was estimated to be 1 minus the promiscuity value.
  • A method for calculation of nuclease specificity based on p53BP1 foci counts is depicted in Fig. 19.
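  • The calculation above can be sketched in a few lines. The text does not spell out the normalization against the control sample, so this sketch assumes the simplest form: the fraction of control cells above the ploidy is subtracted as background, and the result is clipped at zero. The marker-sensitivity parameter mentioned above is omitted. The function names and counts are hypothetical, and this is not the framework of Fig. 18 or Fig. 19.

```python
def fraction_above(counts, ploidy):
    """Fraction of cells whose foci count exceeds the target-locus ploidy."""
    return sum(1 for c in counts if c > ploidy) / len(counts)

def estimate_specificity(treated_counts, control_counts, ploidy):
    """Specificity = 1 - promiscuity, with control background subtracted (assumed)."""
    background = fraction_above(control_counts, ploidy)
    # Promiscuity: excess fraction of treated cells with more foci than the
    # number of target alleles could explain.
    promiscuity = max(0.0, fraction_above(treated_counts, ploidy) - background)
    return 1.0 - promiscuity

# Hypothetical per-cell p53BP1 foci counts; target locus assumed to have ploidy 3.
control = [0, 1, 0, 2, 1, 0, 4, 1, 0, 2]
treated = [2, 3, 5, 3, 6, 2, 3, 4, 3, 2]

specificity = estimate_specificity(treated, control, ploidy=3)
```

  • Here 3 of 10 treated cells and 1 of 10 control cells exceed the ploidy, giving a promiscuity of about 0.2 and a specificity of about 0.8.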
  • Guide-seq is a bulk-cell genomic sequencing-based assay that is generally considered the de facto method to derive the specificity of nucleases.
  • The imaging assay disclosed herein provides a complementary estimate of the nuclease specificity, but within a fraction of the time and expense of the guide-seq assay.
  • The specificity of the p53BP1 imaging assay was compared with guide-seq in K562 cells for 3 nucleases that are considered to have high on-target potency but differing specificities.
  • The p53BP1 imaging-based assay mirrors the specificity profiles provided by guide-seq, but within a fraction of the time and cost of the guide-seq assay. See Fig. 20.
  • The p53BP1 imaging assay was utilized to optimize the specificity of nucleases in primary cells by modifying their design.
  • CD34+ cells were treated with either TALENs featuring homodimeric FokI nuclease domains (GA6_14) or their variants that contained more repeat units (i.e., GA6_17 and GA6_19) in one of the monomers (the left monomer in this case) to enhance specific recognition of their target genomic locus.
  • The assay revealed a dramatic reduction in off-target activity by using longer GA6 L monomers while still providing a comparable on-target editing efficiency (58% for GA6_14, 54% for GA6_17, and 52% for GA6_19). See Fig. 21.
  • EXAMPLE 44
  • telomere imaging assay was utilized to optimize the specificity of nuclease action in primary cells.
  • CD34+ cells were treated with either TALENs featuring homodimeric FokI nuclease domains (GA6, GA7) or their variants that contained obligate heterodimeric
  • The p53BP1 imaging assay can be used to assess both on- and off-target activity on a per-cell basis.
  • K562 cells or CD34+ progenitor cells were treated with AAVS1 and GA6 TALENs that target distinct genomic regions. Untransfected and mock transfected cells were used as controls. An mRNA dose of 2 µg per monomer was used for the TALENs. 24 hours post transfection, all cells were subjected to p53BP1/FLAG immunofluorescence and Nano-FISH with a pool of 115 oligoprobes that were designed to target the 5 kb genomic region adjacent to the AAVS1 TALEN cut site. K562 cell experiments were conducted in duplicate.
  • Imaging-based specificity screen to identify lead nucleases for therapeutic genetic targets
  • The p53BP1 imaging assay was used to rapidly identify lead nucleases against therapeutically relevant genomic loci.
  • TALENs against the first constant exon of the TCR-alpha gene and the first exon of the PDCD1 gene were designed, and their on-target potency and specificity in primary CD3+ T cells were evaluated. Multiple TALENs provided comparable on-target potency, but TALEN #6 had the highest specificity. See Figs. 24A and 24B. Thus, the p53BP1 imaging assay identified TALEN #6 as the lead nuclease for these genes.
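The foci-count estimate described in the bullets above (promiscuity as the fraction of cells with more p53BP1 foci than the ploidy of the target locus, and specificity as 1 minus promiscuity) can be sketched as follows. This is a minimal illustration and not the procedure of Fig. 19: the function names are hypothetical, and the normalization to the control sample is assumed here to be a simple subtraction of the control fraction, which the text does not specify.

```python
from typing import Sequence


def fraction_above(foci_counts: Sequence[int], threshold: int) -> float:
    """Fraction of cells whose foci count is strictly greater than threshold."""
    if not foci_counts:
        raise ValueError("at least one cell is required")
    return sum(1 for c in foci_counts if c > threshold) / len(foci_counts)


def estimate_specificity(treated: Sequence[int],
                         control: Sequence[int],
                         ploidy: int) -> float:
    """Return 1 - promiscuity for a nuclease.

    ASSUMPTION: the control normalization is modeled as subtracting the
    fraction of control cells above ploidy from the treated fraction,
    clipped to the range [0, 1].
    """
    excess = fraction_above(treated, ploidy) - fraction_above(control, ploidy)
    promiscuity = min(1.0, max(0.0, excess))
    return 1.0 - promiscuity


# Per-cell p53BP1 foci counts (made-up numbers, for illustration only).
treated_cells = [0, 1, 2, 3, 4, 5, 2, 3, 6, 1]
control_cells = [0, 0, 1, 1, 2, 3, 0, 1, 2, 0]
specificity = estimate_specificity(treated_cells, control_cells, ploidy=2)
```

With these made-up counts, half of the treated cells and one tenth of the control cells exceed the ploidy, so the sketch yields a promiscuity of about 0.4 and a specificity of about 0.6.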

Landscapes

  • Health & Medical Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Engineering & Computer Science (AREA)
  • Organic Chemistry (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biotechnology (AREA)
  • General Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Microbiology (AREA)
  • Biochemistry (AREA)
  • Biophysics (AREA)
  • Physics & Mathematics (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Plant Pathology (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Analytical Chemistry (AREA)
  • Immunology (AREA)
  • Medicinal Chemistry (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
  • Micro-Organisms Or Cultivation Processes Thereof (AREA)

Abstract

The present disclosure provides methods and compositions for image based analysis and quantification of a protein load from protein (e.g., p53BP1) accumulation, induced by a cellular perturbation, such as administration of a genome editing tool comprising a DNA binding domain and a nuclease domain, a gene repressor, or a gene activator.

Description

METHODS FOR ASSESSING SPECIFICITY OF CELL ENGINEERING TOOLS
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application claims priority to United States Provisional Application Serial No. 62/659,664 filed April 18, 2018 and United States Provisional Application Serial No. 62/690,908 filed June 27, 2018, the disclosures of which are herein incorporated by reference in their entirety.
INTRODUCTION
[0002] Current tools to assess off-target activity of nucleases such as transcription activator-like effector nucleases (TALENs), Zinc Finger Nucleases (ZFNs), and Cas nucleases are predominantly bulk-cell based, and thus only provide population-averaged estimates. Furthermore, these techniques necessitate costly deep sequencing and complex computational strategies to obtain the required results. All current techniques preclude information about the cell-to-cell variability in (1) the extent of off-target nuclease activity, (2) nuclear localization of nuclease activity, (3) cell transfection efficiency, (4) levels of nuclease expression, and (5) nuclease-induced cytotoxicity. Thus, there is a need for a quantitative imaging-based assay to overcome these limitations, which could be applied to all nuclease classes in primary cells and immortalized cells.
SUMMARY
[0003] Methods to assess the specificity of cell engineering tools disclosed herein measure the differential response of a cell to a cellular perturbation by a cell engineering tool by quantifying the change in the load of a protein relevant to such a response, relative to the background load of the same protein in untreated reference cells, and, in some cases, normalized by the predicted magnitude of response to perturbation by a target-specific cell engineering tool. The degree of deviation of the change in protein load beyond that expected for a target-specific cell engineering tool is used as an indicator of additional off-target activity by the cell engineering tool, which might be undesirable. The cell engineering tool might be optimized to achieve an increased target-specific response using the analytical workflow disclosed herein.
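The comparison summarized in the preceding paragraph can be illustrated with a short sketch. It assumes, purely for illustration, that protein load is summarized as a per-cell mean and that the predicted on-target response is supplied as a single number; the function name and all values are hypothetical and are not taken from the disclosure.

```python
from statistics import mean
from typing import Sequence


def normalized_load_change(treated_loads: Sequence[float],
                           reference_loads: Sequence[float],
                           expected_on_target_change: float) -> float:
    """Change in mean protein load over background, divided by the change
    predicted for a purely target-specific tool.

    ASSUMPTION: a value near 1 is read as on-target-only activity and a
    value well above 1 as additional (possibly off-target) activity.
    """
    if expected_on_target_change <= 0:
        raise ValueError("expected_on_target_change must be positive")
    change = mean(treated_loads) - mean(reference_loads)
    return change / expected_on_target_change


# Made-up per-cell foci counts; an expected change of 2 might correspond to
# one focus per allele at a diploid target locus (an illustrative choice).
ratio = normalized_load_change([6, 8, 10], [1, 2, 3], expected_on_target_change=2.0)
```

In this toy case the treated cells carry 6 more foci on average than the reference cells, three times the assumed on-target expectation, which the sketch would flag as excess activity.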
[0004] In various aspects, the present disclosure provides a method of quantifying a protein load, the method comprising quantifying a protein that accumulates in a primary cell in response to a cellular perturbation on a per allele per cell basis.
[0005] In various aspects, the present disclosure provides a method of quantifying a protein load, the method comprising quantifying a protein that accumulates in a plurality of cells in response to a cellular perturbation in less than 24 hours on a per allele per cell basis.
[0006] In various aspects, the present disclosure provides a method of screening a plurality of cell engineering tools for specificity, the method comprising quantifying a protein load in an intact cell in less than 24 hours and determining the specificity of the cell engineering tool for a target genomic locus based on the protein load.
[0007] In various aspects, the present disclosure provides a method of producing a potent and specific cell engineering tool, the method comprising: a) administering a cell engineering tool to a cell; b) determining specificity, activity, or a combination thereof of the cell engineering tool for a target genomic locus by quantifying a protein load; c) quantifying potency of the cell engineering tool by measuring gene editing efficiency, activation of gene expression, or repression of gene expression; and d) adjusting a parameter of the cell engineering tool to increase specificity for the target genomic locus.
[0008] In some aspects, the protein accumulates in response to a cellular perturbation. In further aspects, the method further comprises quantifying the protein load on a per allele per cell basis. In some aspects, the intact cell comprises an intact primary cell. In some aspects, the cell comprises an intact primary cell. In further aspects, the cellular perturbation comprises administering a cell engineering tool.
[0009] In some aspects, the method further comprises determining specificity of the cell engineering tool for a target genomic locus. In some aspects, the method further comprises quantifying gene editing efficiency, activation of gene expression, or repression of gene expression. In some aspects, the plurality of cells comprises at least 5 cells, at least 10 cells, at least 20 cells, at least 50 cells, at least 100 cells, at least 200 cells, at least 500 cells, or at least 1000 cells.
[0010] In some aspects, the protein indicates a cellular response. In some aspects, the cellular response comprises a double strand break, activation of transcription, repression of transcription, or chromosome translocation.
[0011] In other aspects, the cell or intact cell comprises an immortalized cell. In some aspects, the cell engineering tool comprises a genome editing complex or a gene regulator. In some aspects, the gene regulator comprises a gene activator or a gene repressor. In some aspects, the protein comprises phosphorylated 53BP1 (p53BP1), γH2AX, 53BP1, H3K4me1, H3K4me2, H3K27ac, KAP1, H3K9me3, H3K27me3, or HP1. In further aspects, the protein comprises p53BP1.
[0012] In some aspects, the method further comprises staining the cell for the protein. In some aspects, the staining the cell for the protein comprises labeling with a primary antibody against the protein and a secondary antibody conjugated to a first fluorophore. In other aspects, the staining the cell for the protein comprises direct labeling with a primary antibody conjugated to a first fluorophore. In some aspects, the method further comprises imaging the cell for one or more protein foci comprising the first fluorophore. In some aspects, the method further comprises image analysis of the cell for the one or more protein foci comprising the first fluorophore.
[0013] In some aspects, the method further comprises quantifying the protein load from the one or more protein foci comprising the first fluorophore. In some aspects, the protein load comprises a number of protein foci, total protein content within the nucleus, spatial localization pattern, or any combination thereof. In further aspects, the cell engineering tool further comprises a polypeptide tag. In still further aspects, the polypeptide tag is a FLAG tag.
[0014] In some aspects, the method further comprises staining the cell for the cell engineering tool. In some aspects, the staining the cell for the cell engineering tool comprises staining with a primary antibody against the polypeptide tag and a secondary antibody conjugated to a second fluorophore. In other aspects, the staining the cell for the cell engineering tool comprises direct labeling with a primary antibody conjugated to a second fluorophore. In some aspects, the staining of the cell for the cell engineering tool comprises staining with a primary antibody against the nuclease and a secondary antibody conjugated to a second fluorophore. In other aspects, the staining the cell for the cell engineering tool comprises direct labeling with a primary antibody conjugated to a second fluorophore.
[0015] In some aspects, the method further comprises imaging the cell for one or more cell engineering tool foci comprising the second fluorophore. In some aspects, the method further comprises image analysis of the cell for the one or more cell engineering tool foci comprising the second fluorophore. In some aspects, the method further comprises quantifying cell engineering tool load from the one or more cell engineering tool foci comprising the second fluorophore. In some aspects, the cell engineering tool load comprises a number of cell engineering tool foci, total content of the cell engineering tool within the nucleus, spatial localization pattern, or any combination thereof.
[0016] In some aspects, the method further comprises hybridizing a probe set comprising a plurality of probes to the cell, wherein the probe set targets and binds to a target genomic locus. In some aspects, each probe of the plurality of probes comprises a third fluorophore. In some aspects, the probe set comprises an oligonucleotide probe set. In some aspects, the method further comprises imaging the cell for one or more Nano-FISH foci comprising the third fluorophore. In some aspects, the method further comprises image analysis of the cell for the one or more Nano-FISH foci comprising the third fluorophore. In some aspects, co-localization of signal from the first fluorophore and the third fluorophore indicates that the cellular perturbation occurs at the target genomic locus.
[0017] In some aspects, the method further comprises hybridizing a second probe set comprising a second plurality of probes to the cell, wherein the second probe set targets and binds to an off-target genomic locus. In some aspects, each probe of the second plurality of probes comprises a fourth fluorophore. In further aspects, the second probe set comprises a second oligonucleotide probe set. In further aspects, the method further comprises imaging the cell for one or more Nano-FISH foci comprising the fourth fluorophore. In some aspects, the method further comprises image analysis of the cell for the one or more Nano-FISH foci comprising the fourth fluorophore. In some aspects, co-localization of signal from the first fluorophore, the third fluorophore, and the fourth fluorophore indicates a chromosome translocation.
[0018] In some aspects, imaging the cell comprises acquiring images of the cell by a microscopy mode selected from the group consisting of: epifluorescence, widefield, confocal, selective plane illumination, tomography, holography, super-resolution, and synthetic aperture optics (SAO). In further aspects, the method further comprises processing the acquired images to identify regions of interest (ROIs) comprising cell nuclei, protein marker foci, sites of cell engineering tool localization, or a combination thereof.
[0019] In some aspects, the method further comprises processing the ROIs to extract a plurality of features selected from the group consisting of: count, spatial location, size (area/volume), shape (circularity/sphericity, eccentricity, irregularity (concavity/convexity)), diameter, perimeter/surface area, quantitative measures of image texture that are pixel-based or region-based over a tunable length scale, nuclear diameter, nuclear area, nuclear volume, perimeter, surface area, DNA content, DNA texture measures, number of protein marker foci, size of protein marker foci, shape of protein marker foci, amount of protein marker per cell, spatial location and localization pattern of protein marker foci, number of nuclease per cell, amount of nuclease per cell, nuclease localization or texture, number of cell engineering tool foci, size of cell engineering tool foci, shape of cell engineering tool foci, amount of cell engineering tool foci per cell, spatial location and localization pattern of cell engineering tool foci, number of Nano-FISH foci, size of Nano-FISH foci, shape of Nano-FISH foci, amount of Nano-FISH foci, spatial location of Nano-FISH foci, and localization pattern of Nano-FISH foci.
[0020] In some aspects, the method further comprises processing the extracted plurality of features to measure a degree of co-localization between the one or more Nano-FISH foci and the one or more protein marker foci, thereby determining specificity of the genome editing complex or the gene regulator. In some aspects, the method further comprises applying a machine learning predictor to the extracted plurality of features to evaluate performance of cell engineering tools by predicting a distinction capability of nucleases.
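One way to picture the co-localization measurement between Nano-FISH foci and protein marker foci is a nearest-neighbor distance test on focus centroids. The sketch below is an assumption-laden illustration rather than the disclosed analysis: the function name, the centroid inputs, and the distance cutoff are all hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def classify_foci(protein_foci: Sequence[Point],
                  fish_foci: Sequence[Point],
                  max_distance: float) -> Tuple[List[Point], List[Point]]:
    """Split protein foci into those lying within max_distance of any
    Nano-FISH focus (treated as on-target) and the rest (treated as
    off-target).

    ASSUMPTION: centroids are in the same 3D coordinate units as
    max_distance, and a fixed cutoff defines co-localization.
    """
    on_target: List[Point] = []
    off_target: List[Point] = []
    for focus in protein_foci:
        near = any(math.dist(focus, probe) <= max_distance for probe in fish_foci)
        (on_target if near else off_target).append(focus)
    return on_target, off_target


# Made-up centroids for a single nucleus.
protein = [(1.0, 1.0, 1.0), (5.0, 5.0, 5.0), (2.0, 2.0, 2.1)]
fish = [(1.0, 1.0, 1.2), (2.0, 2.0, 2.0)]
on_target, off_target = classify_foci(protein, fish, max_distance=0.3)
```

Here two protein foci sit within the cutoff of a Nano-FISH spot and one does not, so the sketch reports two on-target foci and one off-target focus for this nucleus.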
[0021] In some aspects, the genome editing complex comprises a DNA binding domain and a nuclease. In further aspects, the genome editing complex further comprises a linker. In some aspects, the gene activator comprises a DNA binding domain and an activation domain. In further aspects, the gene activator further comprises a linker. In some aspects, the gene repressor comprises a DNA binding domain and a repressor domain. In further aspects, the gene repressor further comprises a linker.
[0022] In some aspects, the DNA binding domain comprises a transcription activator-like effector (TALE) protein, a zinc finger protein (ZFP), or a single guide RNA (sgRNA). In further aspects, the genome editing complex is a TALEN, a ZFN, a CRISPR/Cas9, a megaTAL, or a meganuclease. In some aspects, the nuclease comprises FokI. In further aspects, FokI has at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, or at least 99% sequence identity to SEQ ID NO: 1062. In some aspects, the linker comprises the naturally occurring C-terminus of a TALE protein or any truncation thereof. In some aspects, the linker comprises 0-15 residues of glycine, methionine, aspartic acid, alanine, lysine, serine, leucine, threonine, tryptophan, or any combination thereof.
[0023] In some aspects, the activation domain comprises VP16, VP64, p65, p300 catalytic domain, TET1 catalytic domain, TDG, Ldb1 self-associated domain, SAM activator (VP64, p65, HSF1), or VPR (VP64, p65, Rta). In other aspects, the repressor domain comprises KRAB, Sin3a, LSD1, SUV39H1, G9A (EHMT2), DNMT1, DNMT3A-DNMT3L, DNMT3B, KOX, TGF-beta-inducible early gene (TIEG), v-erbA, SID, MBD2, MBD3, Rb, or MeCP2.
[0024] In some aspects, a parameter of the genome editing complex or the gene regulator is adjusted to improve specificity. In some aspects, the parameter is a sequence of the DNA binding domain or length of the DNA binding domain. In some aspects, the protein load is quantified in at least 50 to 100,000 cells. In some aspects, the protein load is quantified in no more than 1000, no more than 500, no more than 100, or no more than 50 cells. In some aspects, the cell comprises a hematopoietic stem cell (HSC), a T cell, or a chimeric antigen receptor T cell (CAR T cell). In other aspects, the cell is from a normal solid tissue or a tumorigenic solid tissue. In some aspects, the target genomic locus is within a PDCD1 gene, a CTLA4 gene, a LAG3 gene, a TET2 gene, a BTLA gene, a HAVCR2 gene, a CCR5 gene, a CXCR4 gene, a TRA gene, a TRB gene, a B2M gene, an albumin gene, a HBB gene, a HBA1 gene, a TTR gene, a NR3C1 gene, a CD52 gene, an erythroid specific enhancer of the BCL11A gene, a CBLB gene, a TGFBR1 gene, a SERPINA1 gene, a HBV genomic DNA in infected cells, a CEP290 gene, a DMD gene, a CFTR gene, an IL2RG gene, or a combination thereof. In some aspects, a chimeric antigen receptor (CAR), alpha-L-iduronidase (IDUA), iduronate-2-sulfatase (IDS), or Factor 9 (F9) is inserted upon cleavage of a region of the target nucleic acid sequence.
[0025] In certain aspects, a method for determining specificity of a protein engineering tool comprises contacting a live cell with a cell engineering tool comprising a DNA binding domain and a nuclease domain, a gene repressor, or a gene activator, wherein the live cell comprises genomic DNA comprising a target genomic locus for the DNA binding domain of the cell engineering tool; fixing the cell and contacting the fixed cell with a plurality of nucleic acid probes complementary to the target genomic locus and assaying for presence of a protein indicative of cellular response to the contacting; and assaying for colocalization of the probes and the protein, wherein detection of the colocalization indicates activity of the cell engineering tool at the target genomic locus and absence of the colocalization indicates activity of the cell engineering tool at an off-target site.
[0026] In certain aspects, assaying for colocalization comprises imaging the cell at 40X or higher magnification. In certain aspects, the fixing of the cell is performed within 24 hours or less of the contacting. The cell engineering tool may include a DNA binding domain and a nuclease domain. The nuclease domain induces a double strand break in the genomic DNA, and the protein indicative of cellular response to the contacting comprises a DNA repair protein. The DNA repair protein may be p53BP1, γH2AX, MRE-11, BRCA1, RAD-51, phospho-ATM, or MDC1.
[0027] The cell engineering tool may include a DNA binding domain and a gene repressor. The gene repressor may be KRAB, Sin3a, LSD1, SUV39H1, G9A (EHMT2), DNMT1, DNMT3A-DNMT3L, DNMT3B, KOX, TGF-beta-inducible early gene (TIEG), v-erbA, SID, MBD2, MBD3, Rb, or MeCP2.
[0028] The cell engineering tool may include a DNA binding domain and a gene activator. The gene activator may be VP16, VP64, p65, p300 catalytic domain, TET1 catalytic domain, TDG, Ldb1 self-associated domain, SAM activator (VP64, p65, HSF1), or VPR (VP64, p65, Rta).
[0029] The DNA binding domain may be a transcription activator-like effector (TALE) protein, a zinc finger protein (ZFP), or a single guide RNA (sgRNA).
[0030] The cell may be any cell of interest, including the cells as provided herein, e.g., primary cells. The cell may be a hematopoietic stem cell (HSC), a T cell, or a chimeric antigen receptor T cell (CAR T cell). The cell may be from a normal solid tissue or a tumorigenic solid tissue. The cell may be an immortalized cell.
[0031] The target genomic locus may be within a PDCD1 gene, a CTLA4 gene, a LAG3 gene, a TET2 gene, a BTLA gene, a HAVCR2 gene, a CCR5 gene, a CXCR4 gene, a TRA gene, a TRB gene, a B2M gene, an albumin gene, a HBB gene, a HBA1 gene, a TTR gene, a NR3C1 gene, a CD52 gene, an erythroid specific enhancer of the BCL11A gene, a CBLB gene, a TGFBR1 gene, a SERPINA1 gene, a HBV genomic DNA in infected cells, a CEP290 gene, a DMD gene, a CFTR gene, or an IL2RG gene, e.g., in the open reading frame, intron, promoter, regulatory elements, and the like of the gene.
[0032] The assaying for the colocalization comprises imaging the cell by a microscopy mode selected from epifluorescence, widefield, confocal, selective plane illumination, tomography, holography, super-resolution, and synthetic aperture optics (SAO).
[0033] The plurality of nucleic acid probes may be 30-60 bases in length and may include 20-200 probes having distinct sequences. The plurality of nucleic acid probes may bind to a 1 kilobase (kb) to 5 kb region comprising the target genomic locus.
[0034] In certain aspects, when the absence of colocalization is detected, the method further comprises adjusting a parameter of the genome editing tool to improve specificity. The parameter may be a sequence of the DNA binding domain or length of the DNA binding domain. The parameter may be an amount of the genome editing tool introduced into the cell.
[0035] Also provided is a method for measuring total activity of a cell engineering tool in a cell (for example, activity at the target genomic locus, as well as at off-target locations). The method may include contacting a live cell with a cell engineering tool comprising a DNA binding domain and a nuclease domain, a gene repressor, or a gene activator, wherein the live cell comprises genomic DNA comprising a target genomic locus for the DNA binding domain of the cell engineering tool; fixing the cell and assaying for presence of a measurable change in nuclear protein load of a protein indicative of cellular response to the contacting, wherein the measurement reflects the total activity of the cell engineering tool. In certain aspects, the method may further include contacting the fixed cell with a plurality of nucleic acid probes complementary to the target genomic locus; and assaying for colocalization of the probes and the protein indicative of cellular response, wherein detection of the colocalization indicates activity of the cell engineering tool at the target genomic locus and absence of the colocalization indicates activity of the cell engineering tool at an off-target site.
[0036] Assaying for the change in nuclear protein load comprises imaging the cell by a microscopy mode selected from the group consisting of: epifluorescence, widefield, confocal, selective plane illumination, tomography, holography, super-resolution, and synthetic aperture optics (SAO) and comparing to nuclear protein load in a reference cell not contacted with the cell engineering tool.
[0037] In certain aspects, when the measured change in protein load above an application-specific baseline level is detected, the method further comprises adjusting a parameter of the genome editing tool to improve specificity.
[0038] Details of the type of genome engineering tools that can be assessed, types of cells, probes, and imaging are provided herein.
BRIEF DESCRIPTION OF THE DRAWINGS
[0039] FIG. 1 shows a brief summary of the assay workflow including the steps of nuclease transfection in cells, immunolabeling, imaging, processing raw images by deconvolution, optional enhancement, deconvolution or reconstruction and segmentation, feature computation (e.g., count, amount, size, location of signal from immunolabel), and informatics and analysis (e.g., determining nuclease load and/or specificity, cytotoxicity, and/or heterogeneity).
[0040] FIG. 2 shows further details on image analysis including the steps of obtaining a microscopy image, deconvolution, delineation/segmentation of nuclei, p53BP1 foci, and nuclease protein, morphological data estimation, and informatics/analysis as described in FIG. 1.
[0041] FIGS. 3A and 3B illustrate dose response assessments of GA7 TALENs (XXX) in primary CD34+ hematopoietic stem cells.
[0042] FIG. 3A shows the number of p53BP1 foci per cell for CD34+ primary cells treated with a blank transfection control, 0.5 µg GA7 per TALEN monomer, 1 µg GA7 per TALEN monomer, 2 µg GA7 per TALEN monomer, and 4 µg GA7 per TALEN monomer.
[0043] FIG. 3B shows the total p53BP1 content (fluorescence intensity) per nucleus normalized by the nuclear size versus total FLAG tag content per nucleus normalized by the nuclear size indicative of a nuclease for CD34+ primary cells treated with a blank transfection control, 0.5 µg GA7 per TALEN monomer, 1 µg GA7 per TALEN monomer, 2 µg GA7 per TALEN monomer, and 4 µg GA7 per TALEN monomer.
[0044] FIGS. 4A and 4B illustrate dose response assessments of GA6 TALENs in immortalized K562 cells.
[0045] FIG. 4A shows the number of p53BP1 foci per cell for immortalized K562 cells treated with a blank transfection control, 0.5 µg GA6 per TALEN monomer, 1 µg GA6 per TALEN monomer, 2 µg GA6 per TALEN monomer, and 4 µg GA6 per TALEN monomer.
[0046] FIG. 4B shows the total p53BP1 content (fluorescence intensity) per nucleus normalized by the nuclear size versus total FLAG tag content per nucleus normalized by the nuclear size indicative of a nuclease for immortalized K562 cells treated with a blank transfection control, 0.5 µg GA6 per TALEN monomer, 1 µg GA6 per TALEN monomer, 2 µg GA6 per TALEN monomer, and 4 µg GA6 per TALEN monomer.
[0047] FIGS. 5A and 5B illustrate dose response assessments of AAVS1 TALENs in immortalized K562 cells.
[0048] FIG. 5A shows the number of p53BP1 foci per cell for immortalized K562 cells treated with a blank transfection control, 0.5 µg AAVS1 per TALEN monomer, 1 µg AAVS1 per TALEN monomer, 2 µg AAVS1 per TALEN monomer, and 4 µg AAVS1 per TALEN monomer.
[0049] FIG. 5B shows the total p53BP1 content (fluorescence intensity) per nucleus normalized by the nuclear size versus total FLAG tag content per nucleus normalized by the nuclear size indicative of a nuclease for immortalized K562 cells treated with a blank transfection control, 0.5 µg AAVS1 per TALEN monomer, 1 µg GA6, 2 µg AAVS1 per TALEN monomer, and 4 µg AAVS1 per TALEN monomer.
[0050] FIG. 6 shows a graph of the number of p53BP1 foci per K562 cell at 6 hours, 12 hours, 24 hours, 48 hours, and 72 hours post-transfection of AAVS1 as compared to a control at each time point.
[0051] FIGS. 7A-7E show the results of control transfection and AAVS1-targeting TALEN transfection in various cell types.
[0052] FIG. 7A shows the number of p53BP1 foci in adherent immortalized A549 cells transfected with a control and with an AAVS1-targeting TALEN 24 hours post-transfection.
[0053] FIG. 7B shows the number of p53BP1 foci in suspension immortalized K562 cells transfected with a control and with an AAVS1-targeting TALEN 24 hours post-transfection.
[0054] FIG. 7C shows the number of p53BP1 foci in primary CD34+ progenitor cells transfected with a control and with an AAVS1-targeting TALEN 24 hours post-transfection.
[0055] FIG. 7D shows the number of p53BP1 foci in primary CD4+ T cells transfected with a control and with an AAVS1-targeting TALEN 24 hours post-transfection.
[0056] FIG. 7E shows representative images of cells treated with AAVS1 TALENs versus untreated controls. Cells were stained for p53BP1 with an antibody and are visualized in green. TALENs were stained with a FLAG tag and are visualized in red. Nuclei were stained with DAPI and are visualized in grey. The scale bar indicates a size of 5 µm.
[0057] FIGS. 8A-8B illustrate assessment of nuclease specificity in K562 cells for TALENs and Cas9 nucleases targeting the AAVS1 genomic locus.
[0058] FIG. 8A illustrates the number of p53BP1 foci per cell for K562 cells transfected with Cas9 protein along with AAVS1 guide RNAs as compared to a blank transfection control.
[0059] FIG. 8B illustrates the number of p53BP1 foci per cell for K562 cells transfected with AAVS1-targeting TALENs as compared to a blank transfection control.
[0060] FIGS. 9A-9B show the DNA damage response, as measured by p53BP1 foci quantification, in CD34+ cells and T cells with TALENs targeting various genomic loci.
[0061] FIG. 9A shows the number of p53BP1 foci per cell in primary CD34+ progenitor cells after transfection with GA6-targeting TALENs, AAVS1-targeting TALENs, GA7-targeting TALENs, GA6-EK-targeting TALENs, and GA7-targeting TALENs. Controls include blank transfection controls.
[0062] FIG. 9B shows the number of p53BP1 foci per cell in primary stimulated CD4+ T cells after transfection with TP150-targeting TALENs, AAVS1-targeting TALENs, and TP171-targeting TALENs. Controls include non-electroporated naive T cells, non-electroporated stimulated T cells, and untreated blank transfection control stimulated T cells.
[0063] FIG. 10 shows the number of p53BP1 foci per cell in K562 cells transfected with GA6 L14, GA6 L17, and GA6 L19.
[0064] FIG. 11 shows the number of p53BP1 foci per cell in K562 cells transfected with GA6 L, GA6 R, GA6 LR versus untreated control cells.
[0065] FIG. 12 shows the number of p53BP1 foci per cell in K562 cells transfected with GA6 or GA6 EK TALENs.
[0066] FIG. 13 shows fluorescence microscopy images of control cells and AAVS1-targeting TALEN treated cells. A DAPI stain (gray) was used to visualize nuclei, p53BP1 is shown in green, and the AAVS1 oligonucleotide Nano-FISH probe was visualized in red. Imaging showed that in cells transfected with AAVS1-targeting TALEN, spots indicative of double stranded breaks (indicated by p53BP1 foci) co-localized with AAVS1 oligonucleotide Nano-FISH probe spots.
[0067] FIGS. 14A-14C show histograms of the proportion of pairwise distances between AAVS1 Nano-FISH spots and p53BP1 foci.
[0068] FIG. 14A shows histograms of control and AAVS1 TALEN treated cells at pairwise distances of 0.1 to 0.5.
[0069] FIG. 14B shows histograms of control and AAVS1 TALEN treated cells at pairwise distances of 0 to 0.025.
[0070] FIG. 14C shows histograms of control and AAVS1 TALEN treated cells at pairwise distances of 0 to ~0.08.
[0071] FIGS. 15A-15C show evaluation of nuclease specificity by counting p53BPl foci in cells transfected with AAVS 1 -targeting TALENs.
[0072] FIG. 15A illustrates the number of p53BPl foci on the x axis versus the proportion of cells with p53BPl foci on the y-axis in cells transfected with AAVS1 -targeting TALENs and, in 3D, imaged on a Nikon widefield fluorescence microscope with a 60x magnification lens using oil immersion contact techniques. “Ref’ samples indicate control cells that were not transfected with TALENs. Biological replicates are shown for control and transfected cells (indicated by set x). The number of cells analyzed in each sample is indicated by“n”
[0073] FIG. 15B illustrates the number of p53BPl foci on the x axis versus the proportion of cells with p53BPl foci on the y-axis in cells transfected with AAVS1 -targeting TALENs and imaged, in 3D, on a Nikon widefield fluorescence microscope with a 40x magnification lens using non-contact techniques. ‘Ref’ samples indicate control cells that were not transfected with TALENs. Biological replicates are shown for control and transfected cells. The number of cells analyzed in each sample is indicated by“n.”
[0074] FIG. 15C illustrates the number of p53BPl foci on the x axis versus the proportion of cells with p53BPl foci on the y-axis in cells transfected with AAVS1 -targeting TALENs and imaged on a Stellar- Vis ion (SV) fluorescence microscope using non-contact techniques.
“Ref’ samples indicate control cells that were not transfected with TALENs. Biological replicates are shown for control and transfected cells. The number of cells analyzed in each sample is indicated by“n”
[0075] FIG. 16 shows a graph of the number of p53BPl foci per CD4+ T cell at 24 hours and 48 hours post- transfection with AAS VI -targeting TALENs as compared to blank transfection controls at each time point.
[0076] FIG. 17 shows an assay workflow for microscopy on a Stellar-Vision microscope. Images are captured on the Stellar-Vision microscope and reconstructed; the reconstructed images are segmented for regions of interest such as cell nuclei, p53BP1 foci, and nuclease localization; and features are computed (such as count, size, diameter, area, volume, perimeter length, circularity, irregularity, eccentricity, etc.). The measured per-cell feature information is statistically analyzed to produce quantitative specificity metrics for the tested nuclease(s).
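By way of non-limiting illustration, the feature-computation step of such a workflow can be sketched as follows. The sketch assumes that a segmented region (e.g., a nucleus or a focus) is available as a closed polygon outline in pixel coordinates. The function name `shape_features`, the input format, and the circularity definition 4πA/P² are illustrative assumptions and are not taken from the disclosure.

```python
import math

def shape_features(outline):
    """Area, perimeter, and circularity of a closed polygon outline.

    `outline` is a list of (x, y) vertices in order; the last vertex is
    implicitly joined back to the first. (Hypothetical input format.)
    """
    n = len(outline)
    twice_area = 0.0
    perimeter = 0.0
    for i in range(n):
        x0, y0 = outline[i]
        x1, y1 = outline[(i + 1) % n]
        twice_area += x0 * y1 - x1 * y0          # shoelace formula term
        perimeter += math.hypot(x1 - x0, y1 - y0)
    area = abs(twice_area) / 2.0
    # Circularity is 1.0 for a perfect circle and smaller for irregular shapes.
    circularity = 4.0 * math.pi * area / perimeter ** 2 if perimeter else 0.0
    return {"area": area, "perimeter": perimeter, "circularity": circularity}

# Example: a 4 x 4 pixel square region.
square = [(0, 0), (4, 0), (4, 4), (0, 4)]
features = shape_features(square)
```

Other listed features (e.g., eccentricity or volume) would require additional computation on the segmented pixels or voxels and are omitted here for brevity.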
[0077] FIG. 18 depicts a method for estimating nuclease specificity based on p53BP1 foci characteristics.
[0078] FIG. 19 depicts a method for estimating nuclease specificity based on p53BP1 foci counts.
[0079] FIG. 20 shows a comparison of off-target activity estimated using GUIDE-seq vs. the p53BP1 imaging assay.
[0080] FIG. 21 illustrates use of the number of p53BP1 foci as a readout for improved nuclease specificity.
[0081] FIG. 22 illustrates use of the number of p53BP1 foci as a readout for improved nuclease specificity.
[0082] FIG. 23A illustrates the use of immunoNanoFISH and p53BP1 staining for per-allele per-cell on/off-target activity estimation in K562 cells.
[0083] FIG. 23B illustrates the use of immunoNanoFISH and p53BP1 staining for per-allele per-cell on/off-target activity estimation in CD34+ cells.
[0084] FIG. 24A illustrates the use of p53BP1 imaging for identifying nucleases suitable for targeting the TCR-alpha locus.
[0085] FIG. 24B illustrates the use of p53BP1 imaging for identifying nucleases suitable for targeting PDCD-1.
[0086] FIG. 25 illustrates the use of p53BP1 imaging for dose titration of a lead TALEN.
[0087] FIG. 26 illustrates the use of p53BP1 imaging for screening nucleases for specificity and potency.
[0088] FIG. 27 shows that double strand break (DSB) repair proteins serve as markers for evaluating nuclease specificity.
DETAILED DESCRIPTION
[0089] The present disclosure provides compositions and methods for image-based analysis of cells eliciting a cellular response comprising accumulation of a moiety, such as a domain or a protein, in response to a cellular perturbation. The methods disclosed herein can allow for quantification of a protein load in a cell, wherein the protein can accumulate as part of a cellular response to a cellular perturbation. In some embodiments, the cellular response can be accumulation of a protein at the site of a double strand break. Alternatively, the cellular response can be active or passive accumulation of a protein, which participates in activating or repressing translational machinery. In some embodiments, the cellular perturbation comprises administration of a cell engineering tool. Examples of cell engineering tools include a genome editing complex or a gene regulator (e.g., an epigenetic repressor or activator). The genome editing complex or gene regulator can be designed to edit or regulate a target genomic locus. Modification of the target genomic locus can have therapeutic value. For example, modification of the target genomic locus can include introduction of a gene encoding a functional protein, knocking out a gene encoding a protein, or repressing expression of a protein for, e.g., treatment of indications that would benefit from the modification of the target genomic locus, such as an indication that results from aberrant protein expression.
[0090] In some embodiments, the methods and compositions disclosed herein include an image-based assay for quantitation of foci within the nucleus of the cell. For example, the image-based assay can allow for visualization of fluorescent foci within the cell nucleus. The fluorescent foci may indicate accumulation of a protein. The protein can be labeled with any detectable agent disclosed herein. Upon accumulation within the nucleus, said detectable agent-labeled protein can be visualized as agglomerations or spots, also referred to as “foci.” The present disclosure also describes foci representing other detectable agents. For example, disclosed herein are foci of fluorescently labeled cell engineering tools (e.g., genome editing complex or gene regulator such as an epigenetic repressor or activator). Cell engineering tools (e.g., genome editing complex or gene regulator such as an epigenetic repressor or activator) can be labeled with a second fluorophore, different from the fluorophore conjugated to the protein. This can allow for simultaneous imaging and image analysis of the cell engineering tool (e.g., genome editing complex or gene regulator such as an epigenetic repressor or activator) and a protein, which accumulates during a cellular response. Also disclosed herein are foci of a fluorescently labeled genomic locus, wherein the genomic locus is visualized by labeled oligonucleotide Nano-FISH probe sets, which have a third fluorophore different from the first and second fluorophores. The genomic locus can be a target or off-target genomic locus. To visualize target and off-target genomic loci of interest, two separate Nano-FISH probe sets can be used, each with a different detectable agent.
[0091] The methods and compositions disclosed herein include an image-based assay for quantifying a protein that accumulates during a cellular response to a cellular perturbation caused by a cell engineering tool (e.g., genome editing complex or gene regulator such as an epigenetic repressor or activator), thereby serving as a marker of specificity and/or activity of the cell engineering tool. Specifically, the image-based methods can quantify a protein load, wherein the protein load is the number of protein foci or the total protein content per nucleus. The image-based methods described herein can also quantify a cell engineering tool load, wherein the cell engineering tool load can be the number of cell engineering tool foci or the total cell engineering tool content per nucleus.
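By way of non-limiting illustration, the two protein load measures described above (number of foci and total content per nucleus) can be sketched on a small intensity grid. The sketch assumes a 2D, background-subtracted image already cropped to one segmented nucleus, and it defines a focus as a 4-connected group of pixels at or above a threshold. The function name `protein_load`, the threshold value, and the pixel values are hypothetical.

```python
from collections import deque

def protein_load(nucleus, threshold):
    """Return (foci_count, total_intensity) for one nucleus.

    `nucleus` is a 2D list of background-subtracted pixel intensities
    restricted to one segmented nucleus (hypothetical input). A focus is a
    4-connected group of pixels at or above `threshold`.
    """
    rows, cols = len(nucleus), len(nucleus[0])
    seen = [[False] * cols for _ in range(rows)]
    foci = 0
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or nucleus[r][c] < threshold:
                continue
            foci += 1                      # new, unvisited bright region
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:                   # flood-fill the whole region
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and nucleus[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    total = sum(sum(row) for row in nucleus)
    return foci, total

# Made-up pixel values containing two separate bright regions.
nucleus = [
    [0, 9, 9, 0, 0],
    [0, 9, 0, 0, 0],
    [0, 0, 0, 8, 0],
    [1, 0, 0, 8, 0],
]
foci_count, total_intensity = protein_load(nucleus, threshold=5)
```

For 3D image stacks, the same idea would extend to 6-connected voxels; an established image analysis library would typically be used in practice.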
[0092] In some embodiments, a cellular perturbation comprising accumulation of a protein can be induced by a genome editing complex, which includes a DNA binding domain, a nuclease, and an optional linker. Genome editing complexes can also be referred to simply as “nucleases.” Specific genome editing complexes whose cellular activity can be monitored can include TALENs, megaTALs, meganucleases, Cas nucleases (e.g., CRISPR/Cas9 systems), and zinc finger nucleases (ZFNs).
[0093] In other embodiments, the cellular perturbation can be induced by a gene regulator, such as a gene repressor, which can include a DNA binding domain, a repressor domain, and, optionally, a linker. In certain embodiments, the image-based analysis of this disclosure allows for quantification of spots in a cell or a subcellular compartment, such as the nucleus, which are indicative of protein accumulation in response to a cellular perturbation.
[0094] In some embodiments, the image-based assay allows for quantification of spots representing protein accumulation within the nucleus on a per allele per cell basis. For example, when cells are edited with a genome editing complex (e.g., a TALEN, CRISPR/Cas9, ZFN, megaTALs, or meganucleases) to introduce a functional gene or to knock out a gene, nucleases (e.g., FokI or Cas9) induce a double strand break at the site of modification. Upon induction of the double strand break, a protein, such as a DNA repair protein, e.g., phosphorylated (Ser1778) 53BP1 (p53BP1) or γH2AX, can accumulate at the site of the double strand break, which is indicative of a DNA damage response. In some embodiments, p53BP1 serves as a surrogate marker of a double strand break.
[0095] The present disclosure provides methods for staining cells for p53BP1 with a detectable agent. The detectable agent can comprise a primary antibody and a secondary antibody conjugated to a fluorophore. In other embodiments, the detectable agent can comprise a direct primary antibody conjugated to a fluorophore. Thus, p53BP1 foci, including one or more p53BP1 protein moieties accumulating at the site of a double strand break, can be resolved and visualized in the nucleus of the cell. The number of p53BP1 foci can indicate the number of double strand breaks induced in a cell, and image analysis can thus serve to quantitatively resolve, spatially and temporally in each cell, the DNA damage process induced by a gene editing complex (e.g., a TALEN, CRISPR/Cas9, ZFN, megaTALs, or meganucleases). Staining and visualizing p53BP1 foci within the nucleus of a cell, using the staining and image analysis techniques disclosed herein, can serve as a powerful tool to probe the specificity of a genome editing complex (e.g., a TALEN, CRISPR/Cas9, ZFN, megaTALs, or meganucleases) on a per allele per cell basis.
[0096] The compositions and methods of the present disclosure can be a powerful tool for assessing the specificity and activity of cell engineering tools (e.g., genome editing complex or gene regulator such as an epigenetic repressor or activator). These methods can be used to screen at least 5, at least 10, at least 50, at least 100, at least 150, at least 200, at least 250, at least 300, at least 350, at least 400, at least 500, or at least 1000 cell engineering tools (e.g., genome editing complex or gene regulator such as an epigenetic repressor or activator). These methods can be used to screen 5-10, 10-50, 50-100, 150-200, 200-250, 250-300, 300-350, 350-400, 400-450, 450-500, or 500-1000 cell engineering tools (e.g., genome editing complex or gene regulator such as an epigenetic repressor or activator) for lead candidates that exhibit potency (e.g., high gene editing efficiency or heightened or dampened gene expression) and specificity (few off-target cellular responses, i.e., responses not at the target genomic locus). The methods of the present disclosure can also be used to produce a potent and specific cell engineering tool, by iteratively tuning a parameter of a cell engineering tool and testing for improved specificity.
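By way of non-limiting illustration, selecting lead candidates from a screen can be sketched as a filter on potency followed by a sort on specificity. The record fields (`editing_efficiency`, `excess_foci`), the potency floor of 0.5, and all values below are hypothetical and serve only to show the selection logic; the disclosure does not prescribe a particular ranking rule.

```python
def rank_candidates(candidates, min_editing=0.5):
    """Keep candidates meeting a potency floor, most specific first.

    Specificity is approximated here by fewer excess foci relative to
    control cells; ties are broken by higher editing efficiency.
    """
    potent = [c for c in candidates if c["editing_efficiency"] >= min_editing]
    return sorted(potent, key=lambda c: (c["excess_foci"], -c["editing_efficiency"]))

# Made-up screening results for four hypothetical nucleases.
candidates = [
    {"name": "T1", "editing_efficiency": 0.80, "excess_foci": 4.1},
    {"name": "T2", "editing_efficiency": 0.65, "excess_foci": 0.9},
    {"name": "T3", "editing_efficiency": 0.30, "excess_foci": 0.2},
    {"name": "T4", "editing_efficiency": 0.72, "excess_foci": 0.9},
]
leads = [c["name"] for c in rank_candidates(candidates)]
```

In an iterative tuning scheme, the top-ranked candidate could be modified in one parameter and re-screened, with the modification retained only if its ranking improves.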
[0097] The compositions and methods of the present disclosure can be used to evaluate cell engineering tools for activity and/or specificity in primary cells. In some embodiments, immortalized cells can also be used with the compositions and methods of the present disclosure. In further embodiments, the primary cells and immortalized cell lines can be intact. Thus, the image-based methods described herein allow probing of an allele in intact cells, such as a fixed cell, without requiring isolation of genomic DNA for sequencing.
Determining Specificity of Genome Editing Complexes
[0098] In some embodiments, the present disclosure provides compositions and methods for probing the specificity of a genome editing complex (e.g., a TALEN, CRISPR/Cas9, ZFN, megaTALs, or meganucleases) by imaging and analyzing p53BP1 foci. Genome editing complexes are a type of cell engineering tool and can be referred to herein as a “nuclease.” In other words, imaging and analyzing p53BP1 foci after administration of a genome editing complex (e.g., a TALEN, CRISPR/Cas9, ZFN, megaTALs, or meganucleases) can be used to quantify off-target DNA damage induced by the nuclease. Described below are several genome editing complexes (e.g., a TALEN, CRISPR/Cas9, and/or ZFN), which can be used to introduce a functional gene or knock out a gene, via nuclease-induced double strand breaks. Genome editing complexes can be administered to a cell by electroporation, lipofection, viral transduction, or another suitable delivery method. Further described below are the types of outcomes or readouts that can be analyzed using image-based analysis of p53BP1 or γH2AX foci. In particular, the methods can be used to quantify a protein (p53BP1) load, which can comprise the number of p53BP1 foci and/or total p53BP1 content within the nucleus.
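By way of non-limiting illustration, per-cell foci counts from nuclease-treated and reference (untransfected) cells can be summarized as a distribution of foci per cell and as a mean excess over the reference. The function names and the counts below are hypothetical. The mean excess includes breaks at the intended target, so attributing it to off-target activity would require a further assumption about the expected number of on-target breaks per cell.

```python
from collections import Counter

def foci_distribution(foci_per_cell):
    """Proportion of cells at each foci count, as {count: proportion}."""
    n = len(foci_per_cell)
    counts = Counter(foci_per_cell)
    return {k: counts[k] / n for k in sorted(counts)}

def mean_excess_foci(treated, reference):
    """Mean foci per treated cell minus mean foci per reference cell."""
    return sum(treated) / len(treated) - sum(reference) / len(reference)

# Made-up per-cell p53BP1 foci counts.
reference = [0, 0, 1, 0, 1, 0, 0, 2]
treated = [2, 3, 1, 4, 2, 2, 3, 3]

dist = foci_distribution(treated)
excess = mean_excess_foci(treated, reference)
```

The distribution corresponds to the kind of histogram described for FIGS. 15A-15C (number of foci versus proportion of cells).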
A. TALENs
[0099] A nuclease may comprise a Transcription Activator-Like Effector (TALE) sequence. A TALE may comprise a DNA-binding module which includes a variable number of repeat units or repeat modules having about 33-35 amino acid residues. Each repeat unit recognizes one nucleotide through two adjacent amino acids (such as the amino acids at positions 12 and 13 of the repeat). In general, the amino acid sequence of each repeat unit does not vary significantly outside of positions 12 and 13. The amino acids at positions 12 and 13 of a repeat may also be referred to as the repeat-variable diresidue (RVD).
[0100] A TALE probe described herein may comprise between about 1 and about 50 TALE repeat modules. A TALE probe described herein may comprise between about 5 and about 45, between about 8 and about 45, between about 10 and about 40, between about 12 and about 35, between about 15 and about 30, between about 20 and about 30, between about 8 and about 40, between about 8 and about 35, between about 8 and about 30, between about 10 and about 35, between about 10 and about 30, between about 10 and about 25, between about 10 and about 20, or between about 15 and about 25 TAL effector repeat modules.
[0101] A TALE probe described herein may comprise about 1, about 2, about 3, about 4, about 5, about 6, about 7, about 8, about 9, about 10, about 11, about 12, about 13, about 14, about 15, about 16, about 17, about 18, about 19, about 20, about 21, about 22, about 23, about 24, about 25, about 26, about 27, about 28, about 29, about 30, about 31, about 32, about 33, about 34, about 35, about 36, about 37, about 38, about 39, about 40, about 45, or about 50 TALE repeat modules. A TALE probe described herein may comprise about 5 TALE repeat modules. A TALE probe described herein may comprise about 10 TALE repeat modules. A TALE probe described herein may comprise about 11 TALE repeat modules. A TALE probe described herein may comprise about 12 TALE repeat modules. A TALE probe described herein may comprise about 13 TALE repeat modules. A TALE probe described herein may comprise about 14 TALE repeat modules. A TALE probe described herein may comprise about 15 TALE repeat modules. A TALE probe described herein may comprise about 16 TALE repeat modules. A TALE probe described herein may comprise about 17 TALE repeat modules. A TALE probe described herein may comprise about 18 TALE repeat modules. A TALE probe described herein may comprise about 19 TALE repeat modules. A TALE probe described herein may comprise about 20 TALE repeat modules. A TALE probe described herein may comprise about 21 TALE repeat modules. A TALE probe described herein may comprise about 22 TALE repeat modules. A TALE probe described herein may comprise about 23 TALE repeat modules. A TALE probe described herein may comprise about 24 TALE repeat modules. A TALE probe described herein may comprise about 25 TALE repeat modules. A TALE probe described herein may comprise about 26 TALE repeat modules. A TALE probe described herein may comprise about 27 TALE repeat modules. A TALE probe described herein may comprise about 28 TALE repeat modules. A TALE probe described herein may comprise about 29 TALE repeat modules. A TALE probe described herein may comprise about 30 TALE repeat modules. A TALE probe described herein may comprise about 35 TALE repeat modules. A TALE probe described herein may comprise about 40 TALE repeat modules. A TALE probe described herein may comprise about 45 TALE repeat modules. A TALE probe described herein may comprise about 50 TALE repeat modules.
[0102] A TAL effector repeat module may be a wild-type TALE DNA-binding module or a modified TALE DNA-binding repeat module enhanced for specific recognition of a nucleotide. A TALE probe described herein may comprise one or more wild-type TALE DNA-binding modules. A TALE probe described herein may comprise one or more modified TAL effector DNA-binding repeat modules enhanced for specific recognition of a nucleotide. A modified TALE DNA-binding repeat module may comprise 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more mutations that may enhance the repeat module for specific recognition of a nucleic acid sequence (e.g., a target sequence). In some cases, a modified TALE DNA-binding repeat module is modified at amino acid position 2, 3, 4, 11, 12, 13, 21, 23, 24, 25, 26, 27, 28, 30, 31, 32, 33, 34, or 35. In some cases, a modified TALE DNA-binding repeat module is modified at amino acid positions 12 or 13.
[0103] A TALE repeat module may be a repeat module-like domain or RVD-like domain. A RVD-like domain has a sequence different from that of a naturally occurring polynucleotidic repeat module comprising a RVD (RVD domain) but has a similar function and/or global structure. Non-limiting examples of RVD-like domains include protein domains selected from the Puf RNA binding protein or Ankyrin super-family.
[0104] A TALE repeat module may comprise a RVD of TABLE 1. A TALE probe described herein may comprise one or more RVDs selected from TABLE 1. Sometimes, a TALE probe described herein may comprise up to 1, up to 2, up to 3, up to 4, up to 5, up to 6, up to 7, up to 8, up to 9, up to 10, up to 11, up to 12, up to 13, up to 14, up to 15, up to 16, up to 17, up to 18, up to 19, up to 20, up to 21, up to 22, up to 23, up to 24, up to 25, up to 26, up to 27, up to 28, up to 29, up to 30, up to 31, up to 32, up to 33, up to 34, up to 35, up to 36, up to 37, up to 38, up to 39, up to 40, up to 45, up to 50, up to 60, up to 70, up to 80, up to 90, or up to 100 RVDs selected from TABLE 1.
TABLE 1
*Denotes a gap in the repeat sequence corresponding to a lack of an amino acid residue at the second position of the RVD.
[0105] A RVD may recognize or interact with one type of nucleotide (e.g., the RVD HD binds only to C). A RVD may recognize or interact with more than one type of nucleotide (e.g., the RVD binds to G and A). The efficiency of a RVD domain at recognizing a nucleotide is ranked as “strong,” “intermediate,” or “weak.” The ranking may be according to a ranking described in Streubel et al., “TAL effector RVD specificities and efficiencies,” Nature Biotechnology 30(7): 593-595 (2012). The ranking of RVD may be as illustrated in TABLE 2, based on the ranking provided in Streubel et al., Nature Biotechnology 30(7): 593-595 (2012).
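By way of non-limiting illustration, the one-RVD-per-nucleotide recognition described above can be sketched as a lookup from RVD to the nucleotide(s) it can recognize, followed by a scan of a DNA sequence for sites compatible with an ordered array of RVDs. The mapping below uses four commonly cited RVD preferences (HD for C, NI for A, NG for T, and NN for G or A) and is not a reproduction of TABLE 1 or TABLE 2; the function names and the example sequence are hypothetical, and binding efficiencies are ignored.

```python
# Commonly cited RVD-to-nucleotide preferences; not a reproduction of TABLE 1.
RVD_CODE = {
    "HD": {"C"},
    "NI": {"A"},
    "NG": {"T"},
    "NN": {"G", "A"},
}

def rvds_match(rvds, site):
    """True if each RVD, in order, can recognize the corresponding base."""
    if len(rvds) != len(site):
        return False
    return all(base in RVD_CODE[rvd] for rvd, base in zip(rvds, site.upper()))

def find_sites(rvds, dna):
    """0-based start positions in `dna` where the RVD array could bind."""
    k = len(rvds)
    return [i for i in range(len(dna) - k + 1) if rvds_match(rvds, dna[i:i + k])]

# A four-repeat array is far shorter than a practical TALE; it is used
# here only to keep the example readable.
rvds = ["HD", "NI", "NN", "NG"]
hits = find_sites(rvds, "TTCAGTACAATCGGT")
```

Because NN accepts either G or A in this mapping, the array matches both CAGT and CAAT, which illustrates how a less selective RVD can broaden the set of candidate binding sites.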
TABLE 2
*Denotes a gap in the repeat sequence corresponding to a lack of an amino acid residue at the second position of the RVD.
[0106] A TALE DNA-binding domain may further comprise a C-terminal truncated TALE DNA-binding repeat module, such as a shortened repeat unit, e.g., a half-repeat unit. A C-terminal truncated TALE DNA-binding repeat module may be between about 15 and about 34 residues in length. A C-terminal truncated TALE DNA-binding repeat module may be between about 15 and about 32, between about 18 and about 34, between about 18 and about 32, between about 24 and about 35, between about 28 and about 32, between about 25 and about 34, between about 25 and about 32, between about 25 and about 30, between about 28 and about 32, or between about 28 and about 30 residues in length. A C-terminal truncated TALE DNA-binding repeat module may be at least 18, at least 19, at least 20, at least 21, at least 22, at least 23, at least 24, at least 25, at least 26, at least 27, at least 28, at least 29, at least 30, at least 31, at least 32, at least 33, or up to 34 residues in length. A C-terminal truncated TALE DNA-binding repeat module may be up to 15 residues, up to 18 residues, up to 19 residues, up to 20 residues, up to 21 residues, up to 22 residues, up to 23 residues, up to 24 residues, up to 25 residues, up to 26 residues, up to 27 residues, up to 28 residues, up to 29 residues, up to 30 residues, up to 31 residues, up to 32 residues, up to 33 residues, or up to 34 residues in length. A C-terminal truncated TALE DNA-binding repeat module may include a RVD of TABLE 1.
[0107] A TALE DNA-binding domain may further comprise an N-terminal cap. An N-terminal cap may be a polypeptide sequence flanking the DNA-binding repeat module. An N-terminal cap may be any length and may be from about 0 to about 136 amino acid residues in length. An N-terminal cap may be about 5, about 10, about 15, about 20, about 25, about 30, about 35, about 40, about 45, about 50, about 60, about 70, about 80, about 90, about 100, about 110, about 120, or about 130 amino acid residues in length. An N-terminal cap may modulate structural stability of the DNA-binding repeat modules. An N-terminal cap may modulate nonspecific interactions. An N-terminal cap may decrease nonspecific interaction. An N-terminal cap may reduce off-target effect. As used here, off-target effect refers to the binding of a DNA binding protein (e.g., a TALE protein) to a sequence that is not the target sequence of interest. An N-terminal cap may further comprise a wild-type N-terminal cap sequence of a TALE protein or may comprise a modified N-terminal cap sequence of a TALE protein, such as a TALE protein from Xanthomonas.
[0108] A TALE DNA-binding domain may further comprise a C-terminal cap sequence. A C-terminal cap sequence may be a polypeptide portion flanking the C-terminal truncated TALE DNA-binding repeat module. A C-terminal cap may be any length and may be from about 0 to about 278 amino acid residues in length. A C-terminal cap may be about 5, about 10, about 15, about 20, about 25, about 30, about 35, about 40, about 45, about 50, about 60, about 80, about 100, about 150, about 200, or about 250 amino acid residues in length. A C-terminal cap may further comprise a wild-type C-terminal cap sequence of a TALE protein or may comprise a modified C-terminal cap sequence of a TALE protein, such as a TALE protein from Xanthomonas.
[0109] A nuclease domain may be linked to a TALE DNA-binding domain either directly or through a linker. A linker may be between about 1 and about 50 amino acid residues in length. A linker may be from about 5 to about 45, from about 5 to about 40, from about 5 to about 35, from about 5 to about 30, from about 5 to about 25, from about 5 to about 20, from about 5 to about 15, from about 10 to about 40, from about 10 to about 35, from about 10 to about 30, from about 10 to about 25, from about 10 to about 20, from about 12 to about 40, from about 12 to about 35, from about 12 to about 30, from about 12 to about 25, from about 12 to about 20, from about 14 to about 40, from about 14 to about 35, from about 14 to about 30, from about 14 to about 25, from about 14 to about 20, from about 14 to about 16, from about 15 to about 40, from about 15 to about 35, from about 15 to about 30, from about 15 to about 25, from about 15 to about 20, from about 15 to about 18, from about 18 to about 40, from about 18 to about 35, from about 18 to about 30, from about 18 to about 25, from about 18 to about 24, from about 20 to about 40, from about 20 to about 35, from about 20 to about 30, or from about 25 to about 30 amino acid residues in length.
[0110] A nuclease domain fused to a TALE can be an endonuclease or an exonuclease. An endonuclease can include restriction endonucleases and homing endonucleases. An endonuclease can also include S1 nuclease, mung bean nuclease, pancreatic DNase I, micrococcal nuclease, or yeast HO endonuclease. An exonuclease can include a 3’-5’ exonuclease or a 5’-3’ exonuclease. An exonuclease can also include a DNA exonuclease or an RNA exonuclease. Examples of exonucleases include exonucleases I, II, III, IV, V, and VIII; DNA polymerase I; RNA exonuclease 2; and the like. A nuclease domain fused to a TALE can be a restriction endonuclease (or restriction enzyme). In some instances, a restriction enzyme cleaves DNA at a site removed from the recognition site and has separate binding and cleavage domains. In some instances, such a restriction enzyme is a Type IIS restriction enzyme.
[0111] A nuclease domain fused to a TALE can be a Type IIS nuclease. A Type IIS nuclease can be FokI or BfiI. In some cases, a nuclease domain fused to a TALE is FokI. In other cases, a nuclease domain fused to a TALE is BfiI.
[0112] FokI can be a wild-type FokI or can comprise one or more mutations. In some cases, FokI can comprise 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more mutations. A mutation can enhance cleavage efficiency. A mutation can abolish cleavage activity. In some cases, a mutation can modulate homodimerization. For example, FokI can have a mutation at one or more of amino acid residue positions 446, 447, 479, 483, 484, 486, 487, 490, 491, 496, 498, 499, 500, 531, 534, 537, and 538 to modulate homodimerization.
[0113] In some instances, a FokI cleavage domain is, for example, as described in Kim et al., “Hybrid restriction enzymes: Zinc finger fusions to Fok I cleavage domain,” PNAS 93: 1156-1160 (1996), which is incorporated herein by reference in its entirety. In some cases, a FokI cleavage domain described herein is a FokI of (OT VK SET EEKK SET RHK LK YVPHFY1FJ IFIA RNSTODRH EMK VMEFFMK VYGYRG KHLGGSRKPDGAIYT V GSPID Y GVI VD TK A Y S GGYN LPI GQ ADEMQR Y VEEN Q TRN KHINPNEWWKVYPSSVTEFKFLFVSGHFKGNYKAQLTRLNHITNCNGAVLSVEELLI GGEMIK AGTLTLEE VRRKFNN GEINF, SEQ ID NO: 1062). In other instances, a FokI cleavage domain described herein is a FokI, for example, as described in U.S. Patent No. 8,586,526, which is incorporated herein by reference in its entirety.
[0114] A TALE probe can be designed to recognize each strand of a double-stranded segment of DNA by engineering the TALE to include a sequence of repeat-variable diresidue subunits that may comprise about 22, about 23, about 24, about 25, about 26, about 27, about 28, about 29, about 30, about 31, about 32, about 33, about 34, about 35, about 36, about 37, about 38, about 39, or about 40 amino acid repeats capable of associating with specific DNA sequences, such that the detectable label of the TALE probe is located at the target nucleic acid sequence.
[0115] Also described herein are megaTALs, in which a TALE DNA binding domain is fused to a monomeric meganuclease, also referred to as a “homing endonuclease,” capable of binding and cleaving a target genomic locus of interest. Image-based analysis methods and compositions described herein can be used to evaluate the specificity and/or activity of a megaTAL.
[0116] Image-based analysis methods and compositions described herein can be used to evaluate the specificity and/or activity of a meganuclease. Meganucleases can include intron endonucleases and intein endonucleases. A meganuclease can be a LAGLIDADG endonuclease and can include I-CreI or I-SceI.
B. CRISPR/Cas9
[0117] Similar to TALENs and ZFNs, clustered regularly interspaced short palindromic repeats (CRISPR)-associated Cas9 (CRISPR-Cas9) systems can also be engineered to target and edit a specific nucleic acid sequence. A CRISPR-dCas9 can comprise multiple components in a ribonucleoprotein complex, which can include the Cas9 protein that can interact with a single-guide RNA (sgRNA), an optional linker, and a repressor domain. The sgRNA can be made of a CRISPR RNA (crRNA) and a trans-activating crRNA (tracrRNA). The CRISPR-Cas9s described herein can be used to modulate transcription of a target gene to which the sgRNA binds. For example, the CRISPR-Cas9s of the present disclosure can be used to repress expression of a target gene.
[0118] The sgRNA can comprise at least 18, at least 19, at least 20, at least 21, at least 22, at least 23, at least 24, or at least 25 nucleotides that are complementary to a target sequence of interest. Thus, this portion of the sgRNA is analogous to the DNA binding domain described herein with respect to TALENs and ZFNs. This portion of the sgRNA (e.g., the about 20 nucleotides within the sgRNA that bind to a target) binds adjacent to a protospacer adjacent motif (PAM), which can comprise 2-6 nucleotides in the target sequence that is bound by Cas9.
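By way of non-limiting illustration, locating candidate target sequences adjacent to a PAM can be sketched as a scan of one DNA strand. The sketch assumes a 3-nucleotide “NGG” PAM located immediately 3’ of a 20-nucleotide protospacer, as is commonly used with Streptococcus pyogenes Cas9; this PAM choice, the function name `find_protospacers`, and the example sequence are assumptions made for illustration, and other Cas proteins use different PAMs.

```python
def find_protospacers(dna, spacer_len=20):
    """Return a list of (start, protospacer, pam) for each NGG PAM.

    Only the given strand is scanned. Assumes an "NGG" PAM immediately
    3' of the protospacer (an illustrative assumption).
    """
    dna = dna.upper()
    hits = []
    for pam_start in range(spacer_len, len(dna) - 2):
        pam = dna[pam_start:pam_start + 3]
        if pam[1:] == "GG":                      # N can be any base
            start = pam_start - spacer_len
            hits.append((start, dna[start:pam_start], pam))
    return hits

# Made-up 25-nt sequence with one AGG PAM after the first 20 nucleotides.
seq = "ACGTACGTACGTACGTACGTAGGTT"
sites = find_protospacers(seq)
```

A complete search would also scan the reverse complement strand and would typically score each candidate for similarity to other genomic sites.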
C. ZFNs
[0119] Similar to a TALEN, a zinc-finger nuclease (ZFN) is a restriction enzyme that can be engineered to target and edit specific nucleic acid sequences. A ZFN can comprise a zinc-finger DNA binding domain linked either directly or indirectly to a nuclease domain.
[0120] A zinc-finger DNA binding domain of a ZFN can comprise from about 1 to about 10 zinc finger motifs. A zinc-finger DNA binding domain can comprise from about 1 to about 9, from about 2 to about 8, from about 2 to about 6, or from about 2 to about 4 zinc finger motifs. In some cases, a zinc-finger DNA binding domain can comprise at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more zinc finger motifs. A zinc-finger DNA binding domain can comprise at least 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 zinc finger motifs. A zinc-finger DNA binding domain can comprise about 1 zinc finger motif. A zinc-finger DNA binding domain can comprise about 2 zinc finger motifs. A zinc-finger DNA binding domain can comprise about 3 zinc finger motifs. A zinc-finger DNA binding domain can comprise about 4 zinc finger motifs. A zinc-finger DNA binding domain can comprise about 5 zinc finger motifs. A zinc-finger DNA binding domain can comprise about 6 zinc finger motifs. A zinc-finger DNA binding domain can comprise about 7 zinc finger motifs. A zinc-finger DNA binding domain can comprise about 8 zinc finger motifs. A zinc-finger DNA binding domain can comprise about 9 zinc finger motifs. A zinc-finger DNA binding domain can comprise about 10 zinc finger motifs.
[0121] A zinc finger motif can be a wild-type zinc finger motif or a modified zinc finger motif enhanced for specific recognition of a set of nucleotides. A ZFN described herein can comprise one or more wild-type zinc finger motifs. A ZFN described herein can comprise one or more modified zinc finger motifs enhanced for specific recognition of a set of nucleotides. A modified zinc finger motif can comprise 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, or more mutations that can enhance the motif for specific recognition of a set of nucleotides. In some cases, one or more amino acid residues within the α-helix of a zinc finger motif are modified. In some cases, one or more amino acid residues at positions -1, +1, +2, +3, +4, +5, and/or +6 relative to the N-terminus of the α-helix of a zinc finger motif can be modified.
[0122] A nuclease domain linked to a zinc-finger DNA-binding domain can be an endonuclease or an exonuclease. An endonuclease can include restriction endonucleases and homing endonucleases. An endonuclease can also include S1 nuclease, mung bean nuclease, pancreatic DNase I, micrococcal nuclease, or yeast HO endonuclease. An exonuclease can include a 3’-5’ exonuclease or a 5’-3’ exonuclease. An exonuclease can also include a DNA exonuclease or an RNA exonuclease. Examples of exonucleases include exonucleases I, II, III, IV, V, and VIII; DNA polymerase I; RNA exonuclease 2; and the like.
[0123] A nuclease domain fused to a zinc-finger DNA-binding domain can be a restriction endonuclease (or restriction enzyme). In some instances, a restriction enzyme cleaves DNA at a site removed from the recognition site and has separate binding and cleavage domains. In some instances, such a restriction enzyme is a Type IIS restriction enzyme.
[0124] A nuclease domain fused to a zinc-finger DNA-binding domain can be a Type IIS nuclease. A Type IIS nuclease can be FokI or BfiI. In some cases, a nuclease domain fused to a zinc-finger DNA-binding domain is FokI. In other cases, a nuclease domain fused to a zinc-finger DNA-binding domain is BfiI.
[0125] A nuclease domain can be linked to a zinc-finger DNA-binding domain either directly or through a linker. A linker can be between about 1 and about 50 amino acid residues in length. A linker can be from about 5 to about 45, from about 5 to about 40, from about 5 to about 35, from about 5 to about 30, from about 5 to about 25, from about 5 to about 20, from about 5 to about 15, from about 10 to about 40, from about 10 to about 35, from about 10 to about 30, from about 10 to about 25, from about 10 to about 20, from about 12 to about 40, from about 12 to about 35, from about 12 to about 30, from about 12 to about 25, from about 12 to about 20, from about 14 to about 40, from about 14 to about 35, from about 14 to about 30, from about 14 to about 25, from about 14 to about 20, from about 14 to about 16, from about 15 to about 40, from about 15 to about 35, from about 15 to about 30, from about 15 to about 25, from about 15 to about 20, from about 15 to about 18, from about 18 to about 40, from about 18 to about 35, from about 18 to about 30, from about 18 to about 25, from about 18 to about 24, from about 20 to about 40, from about 20 to about 35, from about 20 to about 30, or from about 25 to about 30 amino acid residues in length.
[0126] A linker for linking a nuclease domain to a zinc-finger DNA-binding domain can be about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, or 50 amino acid residues in length.
D. Genome Editing Complex Readouts
[0127] In some embodiments, the present disclosure provides an image-based assay for quantification of protein (e.g., p53BP1 or γH2AX) load on a per cell basis after administration of any of the gene editing complexes disclosed herein (e.g., a TALEN, CRISPR/Cas9, ZFN, megaTALs, or meganucleases). Protein load can be determined, for example, by quantification of the number of p53BP1 foci or total p53BP1 content per nucleus. Types of analyses that can be performed include identification of DNA damage response proteins as surrogates for nuclease activity, development of a reliable quantitative imaging assay to visualize the protein (e.g., p53BP1 or γH2AX), quantification of nuclease activity in each cell at its target genomic locus and elsewhere (for example, by measurement of indels), quantification of cell transfection efficiency and levels of nuclease expression, quantification of cytotoxicity resulting from nuclease activity, screening of nucleases in a high-throughput (96-well) format, and screening of gene editing complexes with high precision using as few as 50 cells to as many as 1000 cells or more. Image-based analysis of p53BP1 for evaluating nuclease specificity can be performed across all nucleases (e.g., a TALEN, CRISPR/Cas9, ZFN, megaTALs, or meganucleases) and across all cell types including immortalized cells and primary cells.
[0128] In some embodiments, the genome editing complex can be tagged, for example with a FLAG tag. When further staining for p53BP1 foci, the image analysis methods of the present disclosure allow for co-quantification of genome editing complex amount by staining for the FLAG tag (e.g., antibody-based methods) and p53BP1 load (e.g., number of p53BP1 foci, total p53BP1 amount per nucleus), which serves as a measure of genome editing complex specificity. Additionally, genome editing complex-induced cytotoxicity can be measured by quantifying the fraction of apoptotic nuclei in transfected cells.
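By way of non-limiting illustration, the per-nucleus protein load described above (number of foci and total content per nucleus) can be sketched in Python. The function name `count_foci`, the fixed intensity `threshold`, and the use of 4-connected flood fill to group bright pixels into foci are assumptions made for this sketch only; they are not taken from the present disclosure, and a practical pipeline would likely use a dedicated image analysis library.

```python
from collections import deque


def count_foci(image, nucleus_mask, threshold):
    """Count bright connected spots (foci) inside one nucleus.

    image: 2D list of pixel intensities for the p53BP1 channel.
    nucleus_mask: 2D list of booleans, True where a pixel is in the nucleus.
    threshold: hypothetical intensity cutoff separating foci from background.
    Returns (number_of_foci, total_intensity_in_nucleus).
    """
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    foci = 0
    total = 0
    for r in range(rows):
        for c in range(cols):
            if not nucleus_mask[r][c]:
                continue  # ignore pixels outside the nucleus
            total += image[r][c]
            if image[r][c] < threshold or seen[r][c]:
                continue
            # Found an unvisited bright pixel: flood-fill the whole focus.
            foci += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and nucleus_mask[ny][nx]
                            and not seen[ny][nx]
                            and image[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return foci, total
```

Both outputs correspond to the two readouts named above: the first approximates the number of foci, and the second approximates total protein content per nucleus.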
[0129] Genome editing complex specificity can be measured by evaluating dose response in cells using the image-based assay of the present disclosure and analyzing for p53BP1 load. In certain embodiments, a genome editing complex with high specificity can induce a similar level of double strand breaks, as visualized by a similar p53BP1 load, regardless of the genome editing complex dose. In some embodiments, genome editing complex specificity can be measured over time, for example up to 3 hours post-transfection, up to 6 hours post-transfection, up to 12 hours post-transfection, up to 24 hours post-transfection, up to 48 hours post-transfection, up to 60 hours post-transfection, 0 to 6 hours post-transfection, 3 to 60 hours post-transfection, 6 to 12 hours post-transfection, 24 to 48 hours post-transfection, 6 to 24 hours post-transfection, 48 hours to 5 days post-transfection, 5 to 10 days post-transfection, 10 to 15 days post-transfection, 15 to 20 days post-transfection, 20 to 25 days post-transfection, 25 to 30 days post-transfection, or 6 hours to 30 days post-transfection.
[0130] In some embodiments, imaging p53BP1 foci for quantification of double strand breaks can be used to determine which component of a genome editing complex drives specificity versus off-target activity. For example, TALENs can be comprised of a left DNA binding domain coupled to FokI targeting a top DNA strand and a right DNA binding domain coupled to FokI targeting a bottom DNA strand. These can be referred to as a left TALEN monomer and a right TALEN monomer. Quantification of p53BP1 foci after administration of just one TALEN monomer can reveal which monomer leads to off-target enzymatic activity.
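As a minimal sketch of the dose-response criterion in paragraph [0129], the dependence of p53BP1 load on dose can be summarized by the least-squares slope of mean foci per nucleus versus dose. A slope near zero is consistent with a similar p53BP1 load regardless of dose. The function name `dose_response_slope`, the choice of a linear fit, and the dose units are assumptions of this illustration, not a method stated in the present disclosure.

```python
def dose_response_slope(doses, mean_foci):
    """Least-squares slope of mean foci per nucleus versus dose.

    doses: list of administered doses (units are hypothetical).
    mean_foci: list of mean p53BP1 foci per nucleus, one value per dose.
    """
    n = len(doses)
    if n < 2 or n != len(mean_foci):
        raise ValueError("need at least two matched dose/response points")
    mean_x = sum(doses) / n
    mean_y = sum(mean_foci) / n
    sxx = sum((x - mean_x) ** 2 for x in doses)
    if sxx == 0:
        raise ValueError("doses must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(doses, mean_foci))
    return sxy / sxx
```

Comparing slopes between candidate complexes, or between a left and a right TALEN monomer administered alone, is one possible way to rank them; any cutoff on the slope would need to be chosen empirically.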
[0131] In some embodiments, genome editing complexes can be iteratively improved upon by changing a parameter of the genome editing complex, testing for specificity by image analysis of p53BP1 load after administration in cells, and, optionally, further tuning the parameter of the genome editing complex and re-testing specificity. For example, as described herein, a TALEN can include a DNA binding domain comprising a number of repeat units. As the length of the DNA binding domain is increased, specificity for the target genomic locus can be increased. TALENs can be iteratively designed by increasing the number of repeats within the DNA binding domain, administering said TALEN to a cell, evaluating specificity by imaging for p53BP1 foci and quantifying p53BP1 load, and, if needed, further increasing the number of repeats within the DNA binding domain.
[0132] In some embodiments, visualization of DNA double strand breaks, induced by a genome editing complex, via staining for p53BP1 can be further combined with imaging of the target genomic locus of interest using oligonucleotide Nano-FISH probe sets and methods described further below. For example, cells can be transfected with a genome editing complex targeting a genomic locus of interest. The nuclease enzyme (e.g., FokI) of the genome editing complex can be tagged (e.g., via a FLAG tag) and cells can be denatured and labeled with oligonucleotide Nano-FISH probes for the same genomic locus of interest. DNA double strand breaks can be further imaged via staining for p53BP1 foci. Co-localization of signal from p53BP1 foci with signal from oligonucleotide Nano-FISH probe foci indicates nuclease activity at the target genomic locus of interest, thus indicating specificity. Signal from p53BP1 foci that are spatially separated from signal from oligonucleotide Nano-FISH probe foci can indicate off-target nuclease activity that may not be at the genomic locus of interest.
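The co-localization readout described above can be illustrated with a short Python sketch that sorts p53BP1 foci into on-target and off-target candidates according to their distance from the nearest Nano-FISH probe focus. The function name `classify_foci` and the distance cutoff `max_distance` are hypothetical; the present disclosure does not specify a numerical co-localization criterion.

```python
import math


def classify_foci(damage_foci, locus_foci, max_distance):
    """Split damage foci into on-target and off-target candidates.

    damage_foci: list of (x, y, z) centroids of p53BP1 foci.
    locus_foci: list of (x, y, z) centroids of Nano-FISH probe foci.
    max_distance: hypothetical co-localization cutoff, in the same units.
    Returns (on_target, off_target) lists of damage foci.
    """
    on_target, off_target = [], []
    for focus in damage_foci:
        # A damage focus is "on target" if any probe focus lies within the cutoff.
        near = any(math.dist(focus, locus) <= max_distance for locus in locus_foci)
        (on_target if near else off_target).append(focus)
    return on_target, off_target
```

In this sketch, the fraction of foci in the second list gives one possible per-cell summary of off-target activity.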
[0133] Image-based analysis of the specificity of genome editing complexes via visualization of p53BP1 can be done at high throughput. High throughput analysis can involve analysis of greater than 1000, greater than 10,000, or greater than 100,000 cells in less than 24 hours or less than 48 hours. In some embodiments, high throughput analysis can involve analysis of more than 1 unique sample, more than 5 unique samples, more than 10 unique samples, or more than 100 unique samples within 24 hours. In other embodiments, cell populations of less than 1000, less than 500, less than 100, or 50 or fewer cells can be analyzed.
[0134] In some embodiments, image-based analysis of p53BP1 content in a cell after administration of a gene editing complex can be combined with measurements of gene editing efficiency (e.g., measuring indels at the target site). Thus, the present disclosure allows assessment of genome editing complexes for potency and specificity, wherein potency is determined by measuring gene editing efficiency and specificity is measured via quantification of p53BP1 foci either alone or in combination with oligonucleotide Nano-FISH for the genomic locus of interest.
Gene Regulators
[0135] In some embodiments, the present disclosure provides compositions and methods for probing the specificity of a gene regulator (e.g., a TALE-TF, CRISPR/dCas9, and/or ZFP-TF) by imaging and analyzing for protein accumulation at a target genomic locus. Described below are several gene regulators (e.g., a TALE-TF, CRISPR/dCas9, and/or ZFP-TF), which can be used to activate expression of a target gene or repress expression of a target gene. In some cases, additional proteins are recruited to the target genomic locus and can serve as a marker for gene activation (e.g., H3K4me1, H3K4me2, H3K27ac) or gene repression (e.g., KAP1, H3K9me3, H3K27me3 or HP1). Further described below are the types of outcomes or readouts that can be analyzed using image-based analysis of gene repression.
A. Transcription Activator-like Effector - Transcription Factor (TALE-TF)
[0136] The present disclosure provides for a gene regulator or an engineered transcription factor, wherein the engineered transcription factor can be a transcription activator-like effector-transcription factor (TALE-TF). A TALE-TF can include multiple components including the transcription activator-like effector (TALE) protein, an optional linker, and a repressor domain. The TALE-TFs described herein can be used to modulate transcription of a target gene to which the TALE protein binds. For example, the TALE-TFs of the present disclosure can be used to repress expression of a target gene.
[0137] In some embodiments, the TAL effector can be any TAL effector described above. A TALE-TF of the present disclosure can further include a transcription repressor domain. The repressor domain can be a Krüppel-associated box (KRAB) protein, which induces transcriptional repression of polymerases (RNA pol I, II, and/or III) by binding to other corepressors. Alternatively, the repressor domain can be any one of KOX, TGF-beta-inducible early gene (TIEG), v-erbA, SID, MBD2, MBD3, DNMT1, DNMT3A-L, DNMT3B, Rb, or MeCP2.
[0138] In some embodiments, a TALE-TF of the present disclosure can further include a transcription activation domain. The activation domain can comprise VP16, VP64, p65, p300 catalytic domain, TET1 catalytic domain, TDG, Ldb1 self-associated domain, SAM activator (VP64, p65, HSF1), or VPR (VP64, p65, Rta).
[0139] In some embodiments, any one of the TALEs described herein can bind to a region of interest of any gene. For example, the TALEs described herein can bind upstream of the promoter region, upstream of the gene transcription start site, or downstream of the transcription start site. In certain embodiments, the TALE protein binding region is no farther than 50 base pairs downstream of the transcription start site. In some embodiments, the TALE protein is designed to bind in proximity to the transcription start site (TSS). In other embodiments, the TALE can be designed to bind in the 5’ UTR region.
B. Zinc Finger Protein - Transcription Factor (ZFP-TF)
[0140] The present disclosure provides for an engineered transcription factor, wherein the engineered transcription factor can be a zinc-finger protein-transcription factor (ZFP-TF). A ZFP-TF can include multiple components including the zinc finger protein (ZFP), an optional linker, and a repressor domain. The ZFP-TFs described herein can be used to modulate transcription of a target gene to which the ZFP binds. For example, the ZFP-TFs of the present disclosure can be used to repress expression of a target gene. The repressor domain can be a Krüppel-associated box (KRAB) protein, which induces transcriptional repression of polymerases (RNA pol I, II, and/or III) by binding to other corepressors. Alternatively, the repressor domain can be any one of Sin3a, LSD1, SUV39H1, G9A (EHMT2), DNMT1, DNMT3A-DNMT3L, DNMT3B, KOX, TGF-beta-inducible early gene (TIEG), v-erbA, SID, MBD2, MBD3, Rb, or MeCP2.
[0141] In some embodiments, a ZFP-TF of the present disclosure can further include a transcription activation domain. The activation domain can comprise VP16, VP64, p65, p300 catalytic domain, TET1 catalytic domain, TDG, Ldb1 self-associated domain, SAM activator (VP64, p65, HSF1), or VPR (VP64, p65, Rta).
[0142] The ZFP can also be referred to as a zinc finger DNA binding domain. The zinc-finger DNA binding domain can comprise a set of zinc finger motifs. Each zinc finger motif can be about 30 amino acids in length and can fold into a ββα structure in which the α-helix can be inserted into the major groove of the DNA double helix and can engage in sequence-specific interaction with the DNA site. In some cases, the sequence-specific recognition can span over 3 base pairs. In some cases, a single zinc finger motif can interact specifically with 1, 2 or 3 nucleotides.
C. CRISPR-dCas9 - Transcription Factor (CRISPR-dCas9-TF)
[0143] The present disclosure provides for an engineered transcription factor, wherein the engineered transcription factor can be a clustered regularly interspaced short palindromic repeats-associated deactivated Cas9 (CRISPR-dCas9). A CRISPR-dCas9 can comprise multiple components in a ribonucleoprotein complex, which can include the dCas9 protein that can interact with a single-guide RNA (sgRNA), an optional linker, and a repressor domain. The sgRNA can be made of a CRISPR RNA (crRNA) and a trans-activating crRNA (tracrRNA). The CRISPR-dCas9s described herein can be used to modulate transcription of a target gene to which the sgRNA binds. For example, the CRISPR-dCas9s of the present disclosure can be used to repress expression of a target gene.
[0144] The sgRNA can comprise at least 18, at least 19, at least 20, at least 21, at least 22, at least 23, at least 24, or at least 25 nucleotides that are complementary to a target sequence of interest. Thus, this portion of the sgRNA is analogous to the DNA binding domain described above with respect to ZFPs and TALEs. This portion of the sgRNA (e.g., the about 20 nucleotides within the sgRNA that bind to a target) binds adjacent to a protospacer adjacent motif (PAM), which can comprise 2-6 nucleotides in the target sequence that is bound by dCas9.
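As a non-limiting illustration of the relationship between the targeting portion of the sgRNA and the PAM, the following Python sketch scans one DNA strand for candidate protospacers immediately followed by a PAM. The sketch assumes the 3-nucleotide NGG PAM recognized by Streptococcus pyogenes Cas9 and a default protospacer length of 20 nucleotides; the PAM choice, the function name `find_protospacers`, and the restriction to the supplied strand are assumptions of this example rather than requirements of the present disclosure.

```python
def find_protospacers(sequence, spacer_len=20):
    """Return (start, protospacer, pam) for each NGG PAM on the given strand.

    sequence: DNA sequence written 5' to 3' (only this strand is scanned).
    spacer_len: length of the protospacer preceding the PAM (assumed 20).
    """
    seq = sequence.upper()
    hits = []
    # The protospacer occupies [start, start + spacer_len); the PAM follows it.
    for start in range(len(seq) - spacer_len - 2):
        pam = seq[start + spacer_len:start + spacer_len + 3]
        if pam[1:] == "GG":  # "N" matches any nucleotide
            hits.append((start, seq[start:start + spacer_len], pam))
    return hits
```

A fuller search would also scan the reverse complement and could accept other PAM sequences within the 2-6 nucleotide range noted above.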
[0145] The dCas9 can be generated from a wild-type Cas9 protein by mutating two residues. The CRISPR-dCas9 ribonucleoprotein complex can repress a target gene by steric hindrance. The CRISPR-dCas9 ribonucleoprotein complex can be further coupled to any repressor domain described herein (e.g., KRAB, Sin3a, LSD1, SUV39H1, G9A (EHMT2), DNMT1, DNMT3A-DNMT3L, DNMT3B, KOX, TGF-beta-inducible early gene (TIEG), v-erbA, SID, MBD2, MBD3, Rb, or MeCP2) to provide repression of a target gene.
[0146] In some embodiments, a CRISPR-dCas9 ribonucleoprotein complex can be further coupled to a transcription activation domain. The activation domain can comprise VP16, VP64, p65, p300 catalytic domain, TET1 catalytic domain, TDG, Ldb1 self-associated domain, SAM activator (VP64, p65, HSF1), or VPR (VP64, p65, Rta).
D. Epigenetic Regulation Readouts
[0147] In some embodiments, the present disclosure provides for imaging protein accumulation after administration of a gene regulator (e.g., TALE-TF, CRISPR-dCas9, or ZFP-TF). Types of analyses that can be performed include identification of protein for repression of translation machinery, development of a reliable quantitative imaging assay to visualize the chosen surrogate protein, quantification of gene repression activity in each cell at its target genomic locus and elsewhere, quantification of cell transfection efficiency and levels of gene regulator expression, and screening of gene regulators in a high-throughput (96-well) format. For example, a TALE-TF comprising a DNA binding domain, a KRAB repressor domain and, optionally, a linker can be transfected into a cell of interest. The cell can be an immortalized cell or a primary cell. Upon binding to the target genomic locus, the KRAB repressor domain is capable of recruiting other co-repressors (e.g., KAP1). Staining can be performed against recruited co-repressors (e.g., KAP1) for evaluating repressor activity. The staining can include a primary and secondary antibody-fluorophore conjugate or a primary antibody-fluorophore conjugate.
[0148] In another example, the TALE-TF can comprise a DNMT3A repressor domain. In another example, the TALE-TF can comprise any repressor domain or activation domain described herein. Staining can then be performed for proteins accumulating at the site of gene activation (e.g., H3K4me1, H3K4me2, H3K27ac) or gene repression (e.g., KAP1, H3K9me3, H3K27me3 or HP1) to evaluate specificity of the gene regulator. These image-based analyses of proteins indicative of gene regulator activity can be performed across all gene regulators (e.g., TALE-TF, CRISPR/dCas9, ZFP-TFs) and across all cell types, including immortalized cells and primary cells.
[0149] In some embodiments, the activation or repression domain can be tagged with a detectable agent, such as a fluorescent moiety. When further staining for proteins that accumulate in response to gene activation (e.g., H3K4me1, H3K4me2, H3K27ac) or gene repression (e.g., KAP1, H3K9me3, H3K27me3 or HP1), the image analysis methods of the present disclosure allow for co-quantification of gene regulator amount and a protein (e.g., H3K4me1, H3K4me2, H3K27ac proteins for activation or KAP1, H3K9me3, H3K27me3 or HP1 proteins for repression) load, which serves as a measure of gene regulator activity. As described above, protein load can include number of protein foci or total protein content per nucleus.
[0150] Additionally, cytotoxicity induced by administration of gene regulators (e.g., TALE-TF, CRISPR-dCas9, or ZFP-TF) can be measured by quantifying the fraction of apoptotic nuclei in transfected cells. Gene regulator specificity can be measured by evaluating dose response in cells using the image-based assay of the present disclosure and analyzing for foci comprising markers of gene activation (e.g., H3K4me1, H3K4me2, H3K27ac) or gene repression (e.g., KAP1, H3K9me3, H3K27me3 or HP1). In some embodiments, gene regulator specificity can be measured over time, for example 6 hours post-transfection, 12 hours post-transfection, 24 hours post-transfection, 48 hours post-transfection, 0-6 hours post-transfection, 6-12 hours post-transfection, 24-48 hours post-transfection, 48 hours to 5 days post-transfection, 5-10 days post-transfection, 10-15 days post-transfection, 15-20 days post-transfection, 20-25 days post-transfection, 25-30 days post-transfection, or 6 hours to 30 days post-transfection.
[0151] In some embodiments, visualization of gene regulator activity, via staining for a protein that accumulates in response to gene activation (e.g., H3K4me1, H3K4me2, H3K27ac) or gene repression (e.g., KAP1, H3K9me3, H3K27me3 or HP1), can be further combined with imaging of the target genomic locus of interest using oligonucleotide Nano-FISH probe sets and methods described further below. For example, cells can be transfected with a gene regulator (e.g., TALE-TF, ZFP-TF, CRISPR/dCas9) targeting a genomic locus of interest. Cells can be denatured and labeled with oligonucleotide Nano-FISH probes for the same genomic locus of interest. Recruited protein that accumulates in response to gene activation (e.g., H3K4me1, H3K4me2, H3K27ac) or gene repression (e.g., KAP1, H3K9me3, H3K27me3 or HP1) can be further imaged via staining. Co-localization of protein foci (e.g., H3K4me1, H3K4me2, H3K27ac for activators or KAP1, H3K9me3, H3K27me3 or HP1 for repressors) with signal from oligonucleotide Nano-FISH probes indicates activity of the gene regulator at the target genomic locus of interest. Signal from protein foci that are spatially separated from signal from oligonucleotide Nano-FISH probes indicates off-target gene regulator activity that may not be at the genomic locus of interest.
Translocation
[0152] In some embodiments, the present disclosure involves imaging of a translocation event, such as chromosome translocation. For example, chromosome translocation can involve the generation of double strand breaks in two non-homologous regions of DNA, which can result in joining of the two non-homologous regions (translocation).
[0153] A genome editing complex (e.g., TALEN, ZFN, CRISPR/Cas9, megaTAL, meganuclease) can be administered to an immortalized or primary cell. Cells can be stained for p53BP1 with a first detectable agent, subsequently or concurrently contacted with an oligonucleotide Nano-FISH probe set with a second detectable agent to hybridize to a target genomic locus, and contacted with a different oligonucleotide Nano-FISH probe set with a third detectable agent to hybridize to an off-target genomic locus. Samples are imaged and analyzed using the techniques disclosed herein. Foci of p53BP1 can be visualized by signal from the first detectable agent, indicating a double strand break and gene editing with the genome editing complex. Foci of the oligonucleotide Nano-FISH probe set hybridized to a target genomic locus can be visualized by signal from the second detectable agent, indicating the target genomic locus. Foci of the oligonucleotide Nano-FISH probe set hybridized to an off-target genomic locus can be visualized by signal from the third detectable agent, indicating the off-target genomic locus.
[0154] In the absence of a translocation event, co-localization of the signal from the first detectable agent and the second detectable agent can be observed, indicating co-localization of p53BP1 with the oligonucleotide Nano-FISH probe set for the target genomic locus. When chromosomal translocation occurs, co-localization of the signal from the first detectable agent, the second detectable agent, and the third detectable agent can be observed, indicating co-localization of p53BP1 with the oligonucleotide Nano-FISH probe set for the target genomic locus and the oligonucleotide Nano-FISH probe set for the off-target genomic locus.
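The three-channel logic described in the two preceding paragraphs can be sketched in Python as a decision rule applied to each p53BP1 focus. The function name `classify_break`, the label strings, and the cutoff `max_distance` are hypothetical and introduced only for illustration; the present disclosure does not define a numerical co-localization criterion.

```python
import math


def classify_break(damage, target_foci, off_target_foci, max_distance):
    """Label one p53BP1 focus by which probe sets co-localize with it.

    damage: centroid of a p53BP1 focus (first detectable agent).
    target_foci: centroids of the target-locus probe set (second agent).
    off_target_foci: centroids of the off-target probe set (third agent).
    max_distance: hypothetical co-localization cutoff, in the same units.
    """
    near_target = any(math.dist(damage, t) <= max_distance for t in target_foci)
    near_off = any(math.dist(damage, o) <= max_distance for o in off_target_foci)
    if near_target and near_off:
        # All three signals coincide: consistent with a translocation event.
        return "possible translocation"
    if near_target:
        return "on-target break"
    if near_off:
        return "off-target break"
    return "unassigned break"
```

The "off-target break" and "unassigned break" labels are additions of this sketch for completeness; the paragraphs above describe only the two-signal and three-signal co-localization cases.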
[0155] The term “hybridization” or “hybridizes” refers to a process in which a region of a nucleic acid strand anneals to and forms a stable duplex, either a homoduplex or a heteroduplex, under normal hybridization conditions with a complementary nucleic acid strand and does not form a stable duplex with unrelated (non-complementary) nucleic acid molecules under the same normal hybridization conditions. The formation of a duplex is accomplished by annealing two complementary nucleic acids under hybridization conditions. The hybridization condition can be made to be highly specific by adjustment of the conditions under which the hybridization reaction takes place, such that two nucleic acid strands will not form a stable duplex, e.g., a duplex that retains a region of double-strandedness under normal stringency conditions, unless the two nucleic acid strands contain a certain number of nucleotides in specific sequences which are substantially or completely complementary. “Normal hybridization or normal stringency conditions” are readily determined for any given hybridization reaction. See, for example, Ausubel et al., Current Protocols in Molecular Biology, John Wiley & Sons, Inc., New York, or Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press. As used herein, the term “hybridizing” or “hybridization” refers to any process by which a strand of nucleic acid binds with a complementary strand through base pairing.
Genes and Indications of Interest
[0156] In some embodiments, the image-based analysis of protein (e.g., p53BP1) of cellular perturbation (e.g., genome editing with a TALEN, CRISPR/Cas9, or ZFN) and/or Nano-FISH image analysis can be used to identify a lead genome editing complex for the purposes of genetic modification of a cell. In some embodiments, genome editing can be performed by fusing a nuclease of the present disclosure with a DNA binding domain for a particular genomic locus of interest. Genetic modification can involve introducing a functional gene for therapeutic purposes, knocking out a gene for therapeutic purposes, or engineering a cell ex vivo (e.g., HSCs or CAR T cells) to be administered back into a subject in need thereof. For example, the genome editing complex can have a target site within a gene such as PDCD1, CTLA4, LAG3, TET2, BTLA, HAVCR2, CCR5, CXCR4, TRA, TRB, B2M, albumin, HBB, HBA1, TTR, NR3C1, CD52, erythroid specific enhancer of the BCL11A gene, CBLB, TGFBR1, SERPINA1, HBV genomic DNA in infected cells, CEP290, DMD, CFTR, IL2RG, CS-1, or any combination thereof. A “gene,” for the purposes of the present disclosure, includes a DNA region encoding a gene product, as well as all DNA regions which regulate the production of the gene product, whether or not such regulatory sequences are adjacent to coding and/or transcribed sequences. Accordingly, a gene includes, but is not necessarily limited to, promoter sequences, terminators, translational regulatory sequences such as ribosome binding sites and internal ribosome entry sites, enhancers, silencers, insulators, boundary elements, replication origins, matrix attachment sites and locus control regions.
[0157] In some embodiments, a genome editing complex can cleave double stranded DNA at a target site in order to insert a chimeric antigen receptor (CAR), alpha-L iduronidase (IDUA), iduronate-2-sulfatase (IDS), or Factor 9 (F9). Cells, such as hematopoietic stem cells (HSCs) and T cells, can be engineered ex vivo with the genome editing complex. Alternatively, genome editing complexes can be directly administered to a subject in need thereof. Image-based analysis of protein (e.g., p53BP1) of said genome editing complexes can enable the development of highly specific genome editing complexes with less than 10 off-target double strand breaks, less than 5 off-target double strand breaks, less than 4 off-target double strand breaks, less than 3 off-target double strand breaks, less than 2 off-target double strand breaks, less than 1 off-target double strand break, or no off-target double strand breaks.
[0158] The subject receiving treatment can be suffering from a disease such as transthyretin amyloidosis (ATTR), HIV, glioblastoma multiforme, cancer, acute lymphoblastic leukemia, acute myeloid leukemia, beta-thalassemia, sickle cell disease, MPSI, MPSII, Hemophilia B, multiple myeloma, melanoma, sarcoma, Leber congenital amaurosis (LCA10), CD19 malignancies, BCMA-related malignancies, Duchenne muscular dystrophy (DMD), cystic fibrosis, alpha-1 antitrypsin deficiency, X-linked severe combined immunodeficiency (X-SCID), or Hepatitis B.
[0159] A Nano-FISH probe set, as described below, can be designed for any genomic locus of interest described herein (e.g., PDCD1, CTLA4, LAG3, TET2, BTLA, HAVCR2, CCR5, CXCR4, TRA, TRB, B2M, albumin, HBB, HBA1, TTR, NR3C1, CD52, erythroid specific enhancer of the BCL11A gene, CBLB, TGFBR1, SERPINA1, HBV genomic DNA in infected cells, CEP290, DMD, CFTR, IL2RG, CS-1, or any combination thereof) to be used in combination with image-based analysis of protein (e.g., p53BP1) of cellular perturbation.
Nano-FISH and viral Nano-FISH Techniques
[0160] Any of the above compositions and methods for image-based analysis of a surrogate marker (e.g., a protein such as p53BP1) for a cellular response induced by a cellular perturbation can be further combined with Nano-FISH. Oligonucleotide Nano-FISH probe sets can be used to visualize a target genomic locus of interest. Thus, the specificity of a genome editing complex (e.g., a TALEN, CRISPR/Cas9, ZFN), a gene regulator (e.g., a TALE-TF, ZFP-TF, CRISPR/dCas9), or a translocation event can be visualized by combination imaging with Nano-FISH. Compositions and methods for Nano-FISH are described in further detail below.
[0161] Described herein are methods of detecting a cellular regulatory element in situ utilizing a super-resolution microscopy technique to determine the presence, absence, and/or activity of a regulatory element. Also described herein are methods of detecting different types of regulatory elements simultaneously utilizing a heterogeneous set of detection agents, and translating the molecular information from the different types of regulatory elements to determine the activity state of a cell. The activity state of a cell may correlate to a localization, expression level, and/or interaction state of a regulatory element. One or more of the methods described herein may further interpolate 2-dimensional images to generate 3-dimensional maps which enable detection of localization, interaction states, and activity of one or more regulatory elements. Intrinsic properties such as size, intensity, and location of a detection agent may further enable detection of a regulatory element. Described herein are methods of determining the localization of a regulatory element and measuring the activity of a regulatory element. The methods provided herein may avoid the introduction of artifacts such as biological stressors and perturbations, or the destruction of cellular architecture.
[0162] One or more methods described herein may detect different types of regulatory elements, distinguish between different types of regulatory elements, and/or generate a map of a regulatory element (e.g., chromatin). For example, a regulatory element may be labeled by one or more different types of detection agents. The one or more different types of detection agents may include DNA detection agents, RNA detection agents, protein detection agents, or combinations thereof. The detection agent may comprise a probe portion, which may interact with (e.g., hybridize to) a target site within the regulatory element, and optionally comprise a detectable moiety. The detectable moiety may include a fluorophore, such as a fluorescent dye or a quantum dot. The detection agent may be an unlabeled probe which can be further conjugated to an additional labeled probe. Upon labeling, the regulatory element may be detected by a stochastic or deterministic super-resolution microscopy method. The stochastic super-resolution microscopy method may be a synthetic aperture optics (SAO) method. The SAO method may generate a detection profile, which can encompass fluorescent signal intensity, size, shape, or localization of the detection agent. Based on the detection profile, the activity state, the localization, expression level, and/or interaction state of the regulatory element may be determined. A map based on the detection profile of the regulatory element may also be generated, and may be correlated to cell type identification (e.g., cancerous cell identification). The regulatory element may be further analyzed in the presence of an exogenous agent or condition, such as a small molecule fragment or a drug, or under an environment such as a change in temperature, pH, nutrient, or a combination thereof. The perturbation of the activity state of the regulatory element in the presence of the exogenous agent or condition may be measured. A report may further be generated and provided to a user, such as a laboratory clinician or health care provider.
[0163] The systems and methods disclosed herein also relate to a novel nanoscale fluorescence in situ hybridization methodology (hereinafter referred to as “Nano-FISH”) to reliably label and detect localized small (less than 12 kb in size) DNA segments in cells. In some cases, Nano-FISH can utilize defined pools or sets of synthetic fluorescent dye-labeled oligonucleotides (probe pools or probe sets) to reliably detect small genomic regions in large numbers of adherent or suspension cells in situ. In some instances, Nano-FISH can be conducted utilizing conventional wide-field microscopic imaging. In other embodiments, Nano-FISH can be conducted using super-resolution imaging techniques.
[0164] In some cases, Nano-FISH can be coupled with an automated image informatics pipeline to enable high-throughput detection and 2D and/or 3D spatial localization of small genomic DNA elements in situ in hundreds of, thousands of, or more individual cells per experiment. In some instances, to facilitate rigorous statistical analyses of the resulting large image data sets, a scalable image analysis software suite can reliably identify and quantitatively annotate labeled loci on a single-cell basis.
[0165] In some cases, Nano-FISH can allow detection of the precise localization of specific regulatory genomic elements in 2D or 3D nuclear space, the identification of small-scale structural genomic variations (such as sequence gains or losses), the quantitation of spatial interactions between regulatory elements and their putative target gene(s), or the detection of genomic conformational changes that induce stimulus-dependent gene expression. In some instances, Nano-FISH can allow the visualization of the precise localization of a target nucleic acid sequence. The target nucleic acid sequence can be an endogenous nucleic acid sequence, a nucleic acid sequence derived from an exogenous source, or a combination thereof. An exogenous nucleic acid sequence can be introduced into a first cell and can be further detected in progeny of the first cell. An exogenous target nucleic acid sequence can be introduced to a cell through electroporation, lipofection, transfection, microinjection, viral transduction, or a gene gun. Non-limiting examples of vector systems that can be used to introduce a target nucleic acid sequence into a cell may include viral vector, episomal vector, naked RNA (recombinant or natural), naked DNA (recombinant or natural), bacterial artificial chromosome (BAC), and RNA/DNA hybrid systems used separately or in combination. Vector systems can be used without additional reagents meant to aid in the incorporation and/or expression of desired mutations. A non-limiting list of reagents meant to aid in the incorporation and/or expression of desired mutations can include Lipofectamine, FuGENE, FuGENE HD, calcium phosphate, HeLaMONSTER, and Xtreme Gene. An endogenous nucleic acid sequence can be a gene sequence or fragment thereof. An endogenous nucleic acid sequence can be a sequence in a chromosome. An endogenous nucleic acid sequence can be a nucleic acid sequence resulting from somatic chromosomal rearrangement, such as the nucleic acid sequence of a B cell receptor, T cell receptor, or fragment thereof. In some instances, Nano-FISH can allow the detection of the precise localization of exogenous nucleic acids inserted or integrated into a genome. In some embodiments, Nano-FISH can allow the detection of the precise localization of exogenous DNA inserted into a genome, as may be inserted by a genetic engineering technique or by viral infection or transduction. In some instances, Nano-FISH can allow the detection of an episomal nucleic acid sequence.
[0166] The systems and methods described herein can be useful in detecting or determining the presence, absence, identity, or quantity of a target nucleic acid sequence in a sample. In particular, the methods, compositions, and systems described herein can be used to efficiently detect, identify, and quantify a target nucleic acid sequence that is a short nucleic acid sequence. In some cases, a short nucleic acid sequence that can be detected or quantified using the disclosures of the present application may be from 15 nucleotides in length to about 12 kb in length. A short nucleic acid sequence can be less than 1 kb.
[0167] Methods for the detection, identification, and/or quantification of a short nucleic acid sequence of a sample can comprise contacting the short nucleic acid sequence with a probe comprising a detectable label and determining the presence, absence, or quantity of probes bound to the target nucleic acid sequence. Determination of the sequence position of the short nucleic acid sequence relative to other nucleotides or another short nucleic acid sequence (for instance, using a second probe capable of binding to a second target sequence of the nucleic acid) can be a step in the methods described herein. The methods described herein can also comprise determining the spatial position of the short nucleic acid sequence. For example, Nano-FISH can be used to measure the normalized inter-spot distance between a first short nucleic acid sequence encoding an enhancer or portion thereof and a second nucleic acid encoding a promoter of a gene or portion thereof, which can be used to study changes in genome conformation that may be associated with gene function.
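As a non-limiting illustration of the inter-spot measurement described in this paragraph, the following Python sketch computes the Euclidean distance between two spot centroids and divides it by a nuclear length scale. The disclosure does not specify the normalization; dividing by a nuclear radius is an assumption made only for illustration, and the function name, parameter names, and coordinates are hypothetical.

```python
import math

def normalized_inter_spot_distance(spot_a, spot_b, nuclear_radius):
    """Distance between two spot centroids divided by a nuclear radius.

    spot_a, spot_b: (x, y, z) centroids in micrometers (hypothetical inputs).
    nuclear_radius: normalization length in micrometers. This choice of
    normalization is an assumption; the text does not define it.
    """
    if nuclear_radius <= 0:
        raise ValueError("nuclear_radius must be positive")
    return math.dist(spot_a, spot_b) / nuclear_radius

# Illustrative coordinates only, not measured data.
enhancer_spot = (1.0, 2.0, 0.5)
promoter_spot = (1.3, 2.4, 0.5)
distance = normalized_inter_spot_distance(enhancer_spot, promoter_spot,
                                          nuclear_radius=5.0)
```

A dimensionless distance of this kind could, in principle, be compared across nuclei of different sizes, which is one plausible reason to normalize.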
[0168] The methods described herein can comprise comparing the presence, absence, spatial position, sequence position, or quantity of a short nucleic acid sequence of a sample to a reference value. A non-limiting example of quantifying detection of a short nucleic acid sequence in a cell can comprise quantifying the number of copies of a nucleic acid sequence that has been incorporated into a modified cell (for example, a cell modified by the introduction of a nucleic acid sequence into the cell by genetic editing), which can be used as quality control for modified cells produced by cell engineering strategies.
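As a non-limiting illustration of comparing a per-cell quantity to a reference value, the following Python sketch summarizes detected spot counts per cell against an expected copy number. The function name, the input list, and the reference value are hypothetical and are not taken from the disclosure.

```python
from collections import Counter

def summarize_copy_number(spots_per_cell, reference_copies):
    """Summarize per-cell spot counts against a reference copy number.

    spots_per_cell: detected spot counts, one entry per cell (hypothetical).
    reference_copies: expected copies per cell (illustrative reference value).
    """
    if not spots_per_cell:
        raise ValueError("spots_per_cell must not be empty")
    histogram = Counter(spots_per_cell)
    n_cells = len(spots_per_cell)
    above = sum(n for copies, n in histogram.items() if copies > reference_copies)
    return {
        "histogram": dict(histogram),
        "fraction_without_signal": histogram[0] / n_cells,
        "fraction_at_reference": histogram[reference_copies] / n_cells,
        "fraction_above_reference": above / n_cells,
    }

# Illustrative counts only, not experimental data.
summary = summarize_copy_number([0, 1, 1, 2, 1, 0, 3, 1], reference_copies=1)
```

The resulting fractions give a simple per-population view of how many cells carry zero, the expected number, or more than the expected number of detected copies.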
[0169] The degree of precision and accuracy in nucleic acid sequence detection, identification, and quantification made possible by the methods, compositions, and systems of the present disclosure can enable the detection of viral nucleic acid sequences, which commonly range from about 1 kb in length to about 10 kb in length.
[0170] Also described herein are methods, compositions, and systems useful in characterizing and/or quantifying the presence, absence, position, or identity of a target nucleic acid sequence in a cell or sample derived therefrom relative to a reference nucleic acid sequence in the same cell or sample or relative to a control cell or sample. For example, improvements to the efficiency of detection and to a detection threshold, as described herein, can allow for the detection and characterization of short nucleic acid sequences (for instance, non-repeating nucleic acid sequence insertions) during analysis or validation of cell samples or cell lines.
[0171] Additionally, described herein are methods, compositions, and systems for correlating protein expression with target nucleic acid sequence detection. For example, a target nucleic acid sequence can be associated with the expression of a target protein. Using Nano-FISH, the presence, absence, or quantity of the target nucleic acid sequence can be detected, and a detectable label may be used to detect expression of the target protein, which can therefore allow for correlation between the presence, absence, or quantity of the target nucleic acid sequence and the expression of the target protein.
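As a non-limiting illustration of correlating target nucleic acid detection with protein expression, the following Python sketch computes a Pearson correlation between per-cell spot counts and a per-cell protein signal. The choice of Pearson correlation is an assumption (the disclosure does not name a statistic), and the function name and input values are hypothetical.

```python
import math

def pearson_correlation(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)

# Illustrative per-cell values only, not experimental data.
spot_counts = [0, 1, 2, 3]                  # detected spots per cell
protein_signal = [10.0, 20.0, 30.0, 40.0]   # protein label intensity per cell
r = pearson_correlation(spot_counts, protein_signal)
```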
[0172] The Nano-FISH methods as described herein can be used as a diagnostic for the detection, identification, and/or quantification of a short nucleic acid sequence of a sample. For example, Nano-FISH can be used as a diagnostic for HIV by detecting HIV nucleic acid sequences in a sample. The Nano-FISH methods as described herein can be used with therapeutics by detecting, identifying, and/or quantifying a short nucleic acid sequence of a sample. For example, Nano-FISH can be used with therapeutics in which a short nucleic acid sequence is integrated into a cell’s DNA (e.g., chimeric antigen receptor T cell therapeutics) to detect, identify, and/or quantify the short nucleic acid sequence integration. This can be important for any type of viral-mediated (e.g., lentiviral-mediated) transgene integration because these integrations can be heterogeneous (i.e., some cells do not get infected, others are infected multiple times), and integrations occur randomly in the genome (i.e., in inactive sequences or in active genes). In contrast to Nano-FISH, existing methods to measure transgene integration and expression suffer from limitations including lacking single-cell resolution (qPCR), providing data about protein products without DNA information (flow cell sorting), or being laborious (single-cell cloning).
[0173] Additionally, Nano-FISH is a significantly improved and distinct tool from conventional FISH for numerous reasons related to control over design of the probe set, which enable the detection of short nucleic acid sequences at high throughput and at a high signal-to-noise ratio.
[0174] In some embodiments, Nano-FISH probe sets of the present disclosure can be comprised of one or more short oligonucleotide Nano-FISH probes designed against a target, allowing for complete control over probe size. For example, using the Nano-FISH methods described herein, one or more oligonucleotide Nano-FISH probes of exact size can be designed against a transfer plasmid backbone. The oligonucleotide Nano-FISH probes of the present disclosure can be from 30 to 60 nucleotides in length. In certain embodiments, the oligonucleotide Nano-FISH probes of the present disclosure can be 40 nucleotides in length. In contrast, conventional FISH techniques require the use of fosmids (varying in size from 40-50 kilobases), BACs (varying in size from 100-250 kilobases), or plasmids (varying in size from 5-10 kilobases), which are conventionally nick translated to incorporate hapten-labeled or fluorescently labeled dUTP (or other nucleotide). The result of nick translating fosmids, BACs, and/or plasmids to obtain conventional FISH probes is the generation of a highly heterogeneous pool of probes of varying sizes. Conventional FISH probes average around 500 nucleotides in length but exhibit a size distribution from 100 bases to anywhere around 1.5 kilobases, which is up to 50 times larger than an oligonucleotide Nano-FISH probe. Alternatively, conventional probes can be generated by means of PCR with the incorporation of labeled nucleotides during the reaction. Thus, in contrast to the oligonucleotide Nano-FISH probes of this disclosure, there is poor control over the resulting probe size of nick translated conventional FISH probes made from fosmids, BACs, or plasmids.
[0175] In some embodiments, the Nano-FISH probes of the present disclosure are precisely controlled to introduce an exact number of fluorescent dye molecules per probe. For example, in some embodiments, each oligonucleotide Nano-FISH probe of the present disclosure can have exactly one detectable agent at the 3’ end. The detectable agent can be any dye molecule, such as a Quasar Dye (e.g., Q570 and Q670). Oligonucleotide Nano-FISH probes of the present disclosure may be synthesized from the 3’ to 5’ end, and the fluorophore may be included on the first nucleotide at the 3’ end. In some embodiments, an oligonucleotide Nano-FISH probe of the present disclosure can have 2 fluorescent dye molecules. For example, a Nano-FISH oligonucleotide probe of the present disclosure with a size of 55 to 60 nucleotides can have 2 fluorescent dye molecules. In this case, the second dye molecule may be placed on an internal nucleotide or at the 5’ end. Additionally, since the oligonucleotide Nano-FISH probes of the present disclosure directly incorporate a fluorophore at the 3’ end of each probe, the present disclosure provides a probe set that can be directly labeled and, thus, offers direct labeling and detection of a target nucleotide sequence without any need for signal amplification.
[0176] In contrast, because conventional FISH probes can be nick translated to incorporate hapten-dUTPs or other labeled nucleotides for subsequent secondary detection by a fluorescent antibody/reagent, there is no control over the exact number of fluorescent dye molecules that are incorporated in a given probe. Thus, the resulting conventional FISH probes are a heterogeneous mixture with various degrees of fluorescent dye labeling. Moreover, while some conventional FISH probes can directly incorporate a fluorescent dye, most conventional FISH probes contain digoxigenin-labeled or biotin-labeled nucleotides, which are subsequently reacted with an antibody-fluorophore conjugate or a streptavidin-fluorophore conjugate. Thus, conventional FISH probes are indirectly labeled with a fluorophore. In contrast, the oligonucleotide Nano-FISH probes of the present disclosure are directly labeled with a fluorophore.
[0177] In some embodiments, the Nano-FISH probes of the present disclosure are designed to precisely target a desired strand of a target (e.g., the Watson strand, the Crick strand, or both strands). Moreover, the oligonucleotide Nano-FISH probes of the present disclosure can be designed to overlap by at least 5 base pairs. For example, a first oligonucleotide Nano-FISH probe can be designed to target the Watson strand of a target sequence and a second oligonucleotide Nano-FISH probe can be designed to target an adjacent region on the Crick strand of a target sequence. The first and second probe can overlap by at least 5 nucleotides, can be directly adjacent to each other, or can be spaced apart by at least several nucleotides. In some embodiments, the first and second probe can overlap by 5-20 nucleotides. Overlapping probes on the plus and minus strands can allow for the design and hybridization of larger probe sets to target smaller nucleic acid sequences.
[0178] Finally, the oligonucleotide Nano-FISH probes of the present disclosure are designed and selected according to certain criteria in order to precisely target and detect an exogenous sequence (e.g., a viral nucleic acid sequence), while minimizing off-target binding that would increase the background noise during imaging. For example, a target can be selected and the hg38 coordinates can be determined. Next, a tiling density can be selected from all on one strand, a fixed 2 base pair spacing between adjacent oligonucleotide Nano-FISH probes, or a spacing of 30 base pairs on each DNA strand with a 5 base pair overlap between the top and bottom strands at each end. In some embodiments, oligonucleotide Nano-FISH probes of the present disclosure are tiled across a target to avoid steric hindrance between molecules. Next, oligonucleotide Nano-FISH probe sequences are tiled across regions of interest, such as the human genome or the human genome with an artificial extra chromosome representing the target (e.g., the CAR transfer plasmid). In some embodiments, a program can be used to tile oligonucleotide Nano-FISH probes across the region of interest. As an example, a 40 base pair probe pool can be generated by tiling 40 base pair oligonucleotide probes at a predetermined spacing between oligonucleotides across a target sequence. The tiled 40 base pair probe pool can be designed to provide a minimum spacing of 2 base pairs between each consecutive oligonucleotide Nano-FISH probe. Each oligonucleotide Nano-FISH probe in the resulting probe pool can be compared to a 16-mer database of genomic sequences to identify partial matches of probes to genomic sequences that can result in off-target background staining, which would negatively affect the signal-to-noise ratio. An oligonucleotide Nano-FISH probe that comprises a total of 24 matches or less to the 16-mer database may be considered to be unique in the human genome and, thus, can be selected to move forward.
A probe with more than 300 matches to the 16-mer database of genomic sequences can be discarded from consideration as it generates too many non-target hits. The number of matches that an oligonucleotide Nano-FISH probe can have to the 16-mer database of genomic sequences may depend on the size of the probe. For example, a 30 base pair long oligonucleotide Nano-FISH probe that exhibits a total of 14 matches or less to the 16-mer database may be considered to be unique in the human genome and, thus, may be selected to move forward. A 50 base pair long oligonucleotide Nano-FISH probe that exhibits a total of 34 matches or less to the 16-mer database may be considered to be unique in the human genome and, thus, may be selected to move forward. A 60 base pair long oligonucleotide Nano-FISH probe that exhibits a total of 44 matches or less to the 16-mer database may be considered to be unique in the human genome and, thus, may be selected to move forward. Thus, an oligonucleotide Nano-FISH probe of the present disclosure from 30 to 60 base pairs in length may exhibit 14 to 44 matches or less to the 16-mer database and be considered unique in the human genome. Oligonucleotide Nano-FISH probes of the present disclosure have less than 300 matches to the 16-mer database of genomic sequences. Pools of at least 30 oligonucleotide Nano-FISH probes that satisfy all design criteria can be selected to carry forward. Additional selection criteria that can be applied when selecting the oligonucleotide Nano-FISH probes of the present disclosure include percent GC content. For example, oligonucleotide Nano-FISH probes can have a percent GC content of at least 25%. In some embodiments, oligonucleotide Nano-FISH probes of the present disclosure are selected for use if they have less than 5 hits, less than 4 hits, less than 3 hits, less than 2 hits, or less than 1 hit of at least a 50% contiguous homology elsewhere in the human genome (e.g., by a BLAT search of each oligo against the genome). A BLAT search of each oligo against the genome may result in larger stretches of homology. A probe that exhibits less than 50% (~20 bases) homology may be considered to be unique and, thus, may be selected to move forward.
When designing a probe set for enhanced resolution, the probe set can be designed to have a limited number of oligonucleotide Nano-FISH probes, such as 25-35 probes, that can be closely spaced. When designing a probe set for enhanced detection, the probe set can be designed to include 100-150 probes.
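As a non-limiting illustration of the tiling and filtering criteria described in this paragraph, the following Python sketch tiles fixed-length probes across a target with a 2 base spacing, counts 16-mer matches against a database, and applies a GC content filter. Several points are assumptions made only for illustration: the 16-mer database is represented as a hypothetical dictionary mapping each 16-mer to its number of genomic occurrences; the uniqueness threshold is computed as probe length minus 16, which reproduces the stated values of 14, 24, 34, and 44 for 30, 40, 50, and 60 base probes; and probes falling between that threshold and 300 matches are labeled "intermediate" because the text does not state how they are handled. All function names are hypothetical.

```python
K = 16  # k-mer size named in the text

def tile_probes(target, probe_len=40, spacing=2):
    """Tile probes of probe_len across target, leaving `spacing` bases between them."""
    step = probe_len + spacing
    return [target[i:i + probe_len]
            for i in range(0, len(target) - probe_len + 1, step)]

def count_kmer_matches(probe, kmer_counts):
    """Sum database occurrences over every 16-mer contained in the probe."""
    return sum(kmer_counts.get(probe[i:i + K], 0)
               for i in range(len(probe) - K + 1))

def classify_probe(probe, kmer_counts):
    """Label a probe by its total 16-mer matches (thresholds as assumed above)."""
    matches = count_kmer_matches(probe, kmer_counts)
    if matches <= len(probe) - K:   # 14/24/34/44 for 30/40/50/60 bases
        return "unique"
    if matches > 300:               # discard rule stated in the text
        return "discard"
    return "intermediate"           # handling not specified in the text

def gc_fraction(probe):
    """Fraction of G and C bases in the probe."""
    return sum(base in "GC" for base in probe) / len(probe)

def select_probes(target, kmer_counts, probe_len=40, spacing=2, min_gc=0.25):
    """Keep tiled probes that are 'unique' and meet the minimum GC fraction."""
    return [p for p in tile_probes(target, probe_len, spacing)
            if classify_probe(p, kmer_counts) == "unique"
            and gc_fraction(p) >= min_gc]

# Toy target and empty database, for illustration only.
toy_target = "ACGT" * 25
selected = select_probes(toy_target, kmer_counts={})
```

In practice the database would be built from a reference genome, and additional filters named in the text (such as a BLAT homology search) would be applied separately; neither is shown here.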
[0179] Additionally, oligonucleotide Nano-FISH probes of the present disclosure may be selected to not include a repetitive element. For example, a repetitive element may be short interspersed nuclear elements (SINE) including ALUs, long interspersed nuclear elements (LINE), long terminal repeat elements (LTR) including retroposons, DNA repeat elements, simple repeats (micro-satellites), low complexity repeats, satellite repeats, RNA repeats such as RNA, tRNA, rRNA, snRNA, scRNA, or srpRNA, or other repeats such as the class rolling circle (RC). Any one or more of the above design criteria may be used to select the oligonucleotide Nano-FISH probes that make up a probe set of the present disclosure. As described above, the process of comparing each oligonucleotide Nano-FISH probe against a 16-mer database of human genomic sequences may result in the selection of probes that do not comprise repetitive elements.
[0180] In contrast to the designed and selected oligonucleotide Nano-FISH probes of the present disclosure, conventional FISH probes that are nick translated are not filtered for low homology to human genomic sequences. As a result, conventional FISH techniques incorporate a step of blocking the FISH probes with a blocking agent such as Cot-1 DNA, salmon sperm DNA, yeast tRNA, or any combination thereof, which bind to any regions of the conventional FISH probes that are highly repetitive. The blocked conventional FISH probes are then incubated with cells. In contrast, the present oligonucleotide Nano-FISH probes can be directly incubated with cells for hybridization with a target sequence, without the need for a blocking agent.
[0181] In some embodiments, a probe set is referred to herein as a “probe pool” or a “plurality of probes.” For example, an oligonucleotide Nano-FISH probe set can comprise from 20 to 200 oligonucleotide probes. In some embodiments, the probe set can comprise 20-200 oligonucleotide Nano-FISH probes.
[0182] Overall, the above-described properties of the Nano-FISH probes of the present disclosure can lead to increased precision in detecting a target sequence, especially detection of small target sequences that are less than 5 kilobases, and lower background signals stemming from off-target probe-DNA interactions, as compared to conventional FISH probes. In other words, the Nano-FISH probes of the present disclosure can yield a better or higher signal-to-noise ratio than conventional FISH probes.
[0183] In some embodiments, 9 oligonucleotide Nano-FISH probes of the present disclosure may be used to visualize insertions of an exogenous nucleic acid sequence in the nucleus at a signal-to-noise ratio of about 1.2-1.5:1. In some embodiments, 15 oligonucleotide Nano-FISH probes of the present disclosure may be used to visualize insertions of an exogenous nucleic acid sequence in the nucleus at a signal-to-noise ratio of about 1.5:1. In some embodiments, 30 oligonucleotide Nano-FISH probes of the present disclosure may be used to visualize insertions of an exogenous nucleic acid sequence in the nucleus at a signal-to-noise ratio of about 4-8:1. In some embodiments, 60 oligonucleotide Nano-FISH probes of the present disclosure may be used to visualize insertions of an exogenous nucleic acid sequence in the nucleus at a signal-to-noise ratio of about 5-10:1. In some embodiments, 90 oligonucleotide Nano-FISH probes of the present disclosure may result in at least one detected allele (in a triploid cell background) in about 98% of cells. In some embodiments, 60 oligonucleotide Nano-FISH probes of the present disclosure may result in at least one detected allele (in a triploid cell background) in about 92% of cells. In some embodiments, 30 oligonucleotide Nano-FISH probes of the present disclosure may result in at least one detected allele (in a triploid cell background) in about 89% of cells. In some embodiments, 15 oligonucleotide Nano-FISH probes of the present disclosure may result in at least one detected allele (in a triploid cell background) in about 34% of cells.
[0184] In some embodiments, the target exogenous nucleic acid sequence does not need to be amplified prior to detection. Thus, the exogenous nucleic acid sequences of the present disclosure are non-amplified exogenous nucleic acid sequences. In some embodiments, the signal from the oligonucleotide Nano-FISH probes of the present disclosure does not need to be amplified prior to detection. Thus, the Nano-FISH methods of the present disclosure provide methods of non-signal amplified detection. In other words, the Nano-FISH methods of the present disclosure provide methods of direct, non-amplified signal detection.
[0185] The compositions and methods provided herein can also comprise a plurality of probe sets, wherein each probe set can contain any number of oligonucleotide Nano-FISH probes described above. Within a probe set, oligonucleotide Nano-FISH probes may all be labeled with the same fluorophore. Each probe set in the plurality of probe sets may be labeled with a different fluorophore. Each probe set in the plurality of probe sets may further comprise oligonucleotide Nano-FISH probes for the detection of unique target sequences (e.g., exogenous or viral nucleic acid sequences). Thus, a plurality of probe sets can be used to detect multiple target sequences simultaneously, with each target sequence being labeled with a unique fluorophore.
A. Types of Regulatory Elements
[0186] A regulatory element may be DNA, RNA, a polypeptide, or a combination thereof. A regulatory element may be DNA. A regulatory element may be RNA. A regulatory element may be a polypeptide. A regulatory element may be any combination of DNA, RNA, and/or polypeptide (e.g., protein-protein complexes, protein-DNA/RNA complexes, and the like).
[0187] A regulatory element may be DNA. A regulatory element may be a single-stranded DNA regulatory element, a double-stranded DNA regulatory element, or a combination thereof. The DNA regulatory element may be single-stranded. The DNA regulatory element may be double-stranded. The DNA regulatory element may encompass a DNA fragment. The DNA regulatory element may encompass a gene. The DNA regulatory element may encompass a chromosome. The DNA regulatory element may include endogenous DNA regulatory elements (e.g., endogenous genes). The DNA regulatory element may include artificial DNA regulatory elements (e.g., foreign genes introduced into a cell).
[0188] A regulatory element may be RNA. A regulatory element may be a single-stranded RNA regulatory element, a double-stranded RNA regulatory element, or a combination thereof. The RNA regulatory element may be single-stranded. The RNA regulatory element may be double-stranded. The RNA regulatory element may include endogenous RNA regulatory elements. The RNA regulatory element may include artificial RNA regulatory elements. The RNA regulatory element may include microRNA (miRNA), transfer RNA (tRNA), ribosomal RNA (rRNA), messenger RNA (mRNA), pre-mRNA, transfer-messenger RNA (tmRNA), heterogeneous nuclear RNA (hnRNA), short interfering RNA (siRNA), or short hairpin RNA (shRNA). The RNA regulatory element may be an RNA fragment. The RNA regulatory element may be an anti-sense RNA.
[0189] An RNA regulatory element may be an enhancer RNA (eRNA). An enhancer RNA may be a non-coding RNA molecule transcribed from an enhancer region of a DNA molecule, and may be from about 50 base pairs (bp) in length to about 3 kilobase pairs in length. An enhancer RNA may be a 1D eRNA or an eRNA that may be unidirectionally transcribed. An enhancer RNA may also be a 2D eRNA or an eRNA that may be bidirectionally transcribed. An eRNA may be polyadenylated. Alternatively, an eRNA may be non-polyadenylated.
[0190] A regulatory element may be a DNase I hypersensitive site (DHS). A DHS may be a region of chromatin unoccupied by transcription factors and which is sensitive to cleavage by the DNase I enzyme. The presence of DHS regions within a chromatin may demarcate transcription factor occupancy at a nucleotide resolution. The presence of DHS regions may further correlate with activation of cis-regulatory elements, such as an enhancer, promoter, silencer, insulator, or locus control region. DHS variation may be correlated to variation in gene expression in healthy or diseased cells (e.g., cancerous cells) and/or correlated to phenotypic traits.
[0191] A DHS pattern may encode memory of prior cell fate decisions and exposures. For example, upon differentiation, a DHS pattern of a progeny may encode transcription factor occupancy of its parent. Further, a DHS pattern of a cell may encode an environmentally- induced transcription factor occupancy from an earlier time point.
[0192] A DHS pattern may encode cellular maturity. An embryonic stem cell may encode a set of DHSs that may be transmitted combinatorially to a differentiated progeny, and this set of DHSs may be decreased with each cycle of differentiation. As such, the set of DHSs may be correlated with time, thereby allowing a DHS pattern to be correlated with cellular maturity.
[0193] A DHS pattern may also encode splicing patterns. Protein coding exons may be occupied by transcription factors, which may further be correlated with codon usage patterns and amino acid choice on evolutionary time scales and human fitness. Transcription factor occupancy may further modulate alternative splicing patterns, for example, by imposing sequence constraints at a splice junction. As such, a DHS pattern may encode transcription factor occupancy of one or more exons of interest and may provide additional information on alternative splicing patterns.
[0194] A DHS pattern may encode a cell type. For example, within each cell type, about 100,000 to about 250,000 DHSs may be detected. About 5% of the detected DHSs may be located within a transcription start site and the remaining DHSs may be detected at a distal site from the transcription start site. Each cell type may contain a distinct DHS pattern at the distal site and mapping the DHS pattern at the distal site may allow identification of a cell type. An overlap may further be present within two DHS patterns from two different cell types, for example, an overlap of a set of detected DHSs within the two DHS patterns. An overlap may be less than about 70 of the detected DHSs. The presence of an overlap may not affect the identification of a cell type.
[0195] A regulatory element may be a polypeptide. The polypeptide may be a protein or a polypeptide fragment. For example, a regulatory element may be a transcription factor, DNA-binding protein or functional fragment, RNA-binding protein or functional fragment, protein involved in chemical modification (e.g., involved in histone modification), or gene product. A regulatory element may be a transcription factor. A regulatory element may be a DNA or RNA-binding protein or functional fragment. A regulatory element may be a product of a gene transcript. A regulatory element may be a chromatin.
B. Methods of Detecting a Regulatory Element
[0196] Described herein is a method of detecting a regulatory element. The detection may encompass identification of the regulatory element, determining the presence or absence of the regulatory element, and/or determining the activity of the regulatory element. A method of detecting a regulatory element may include contacting a cell sample with a detection agent, binding the detection agent to the regulatory element, and analyzing a detection profile from the detection agent to determine the presence, absence, or activity of the regulatory element.
[0197] The method may involve utilizing one or more intrinsic properties associated with a detection agent to aid in detection of the regulatory element. The intrinsic properties may encompass the size of the detection agent, the intensity of the signal, and the location of the detection agent. The size of the detection agent, which may include the length of the probe and/or the size of the detectable moiety (e.g., the size of a fluorescent dye molecule), may modulate the specificity of interaction with a regulatory element. The intensity of the signal from the detection agent may correlate to the sensitivity of detection. For example, a detection agent with a molar extinction coefficient of about 0.5-5 x 10⁶ M⁻¹cm⁻¹ may have a higher intensity signal relative to a detection agent with a molar extinction coefficient outside of the 0.5-5 x 10⁶ M⁻¹cm⁻¹ range and may have lower attenuation due to scattering and absorption. Further, a detection agent with a longer excited state lifetime and a large Stokes shift (measured by the distance between the excitation and emission peaks) may further improve the sensitivity of detection. The location of the detection agent may, for example, provide the activity state of a regulatory element. A combination of intrinsic properties of the detection agent may be used to detect a regulatory element of interest.
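As a non-limiting illustration of two of the intrinsic properties discussed in this paragraph, the following Python sketch computes a Stokes shift as the difference between emission and excitation peak wavelengths and checks whether a molar extinction coefficient falls within the stated range of about 0.5-5 x 10⁶ M⁻¹cm⁻¹. The function names are hypothetical, and the wavelengths and coefficients in the example are illustrative values that do not describe any particular dye.

```python
def stokes_shift_nm(excitation_peak_nm, emission_peak_nm):
    """Stokes shift: distance between the emission and excitation peaks, in nm."""
    return emission_peak_nm - excitation_peak_nm

def in_extinction_range(epsilon, low=0.5e6, high=5e6):
    """True if a molar extinction coefficient (M^-1 cm^-1) lies in the stated range."""
    return low <= epsilon <= high

# Illustrative values only; not properties of a real detection agent.
shift = stokes_shift_nm(excitation_peak_nm=550, emission_peak_nm=570)
bright_enough = in_extinction_range(1.0e6)
```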
[0198] A detection agent may comprise a detectable moiety that is capable of generating light, and a probe portion that is capable of hybridizing to a target site on a regulatory element. As described herein, a detection agent may include a DNA probe portion, an RNA probe portion, a polypeptide probe portion, or a combination thereof. Sometimes, a DNA or RNA probe portion may be between 10 and about 100 nucleotides in length. Sometimes, a DNA or RNA probe portion may be 10 to 100, or more nucleotides in length. A DNA or RNA probe portion may be a TALEN probe, ZFN probe, or a CRISPR probe. A DNA or RNA probe portion may be a padlock probe. A polypeptide probe may comprise a DNA-binding protein, an RNA-binding protein, a protein involved in the transcription/translation process, a protein that detects the transcription/translation process, a protein that may detect an open or relaxed portion of a chromatin, or a protein interacting partner of a product of a regulatory element (e.g., an antibody or binding fragment thereof).
[0199] A detection agent may comprise a DNA or RNA probe portion which may be between about 10 and about 100 nucleotides in length. A detection agent may comprise a DNA or RNA probe portion which may be about 10 to 100, or more nucleotides in length.
[0200] A set of detection agents may be used to detect a regulatory element. The set of detection agents may comprise 2 to 20 or more detection agents. A detection agent may comprise a polypeptide probe selected from a DNA-binding protein, an RNA-binding protein, a protein involved in the transcription/translation process, a protein that detects the transcription/translation process, a protein that may detect an open or relaxed portion of a chromatin, or a protein interacting partner of a product of a regulatory element (e.g., an antibody or binding fragment thereof).
[0201] A detectable moiety that is capable of generating a light may be directly conjugated or bound to a probe portion. A detectable moiety may be indirectly conjugated or bound to a probe portion by a conjugating moiety. As described herein, a detectable moiety may be a small molecule (e.g., a dye) which may be directly conjugated or bound to a probe portion. A detectable moiety may be a fluorescently labeled protein or molecule which may be attached to a conjugating moiety (e.g., a hapten group, an azido group, an alkyne group) of a probe.
[0202] A profile or a detection profile or signature may include the signal intensity, signal location, or size of the signal of the detection agent. The profile or the detection profile may comprise about 100 image frames to 50,000 frames, or more frames. Analysis of the profile or the detection profile may determine the activity of the regulatory element. The degree of activation may also be determined from the analysis of the profile or detection profile.
Analysis of the profile or the detection profile may further determine the optical isolation and localization of the detection agents, which may correlate to the localization of the regulatory element.
[0203] In additional cases, a detection agent may comprise a polypeptide probe selected from a DNA-binding protein, an RNA-binding protein, a protein that is involved in or that detects the transcription/translation process, a protein that may detect an open or relaxed portion of a chromatin, or a protein interacting partner of a product of a regulatory element (e.g., an antibody or binding fragment thereof).
[0204] Sometimes, a detectable moiety that is capable of generating a light is directly conjugated or bound to a probe portion. Other times, a detectable moiety is indirectly conjugated or bound to a probe portion by a conjugating moiety. As described elsewhere herein, a detectable moiety may be a small molecule (e.g., a dye) which may be directly conjugated or bound to a probe portion. Alternatively, a detectable moiety may be a fluorescently labeled protein or molecule which may be attached to a conjugating moiety (e.g., a hapten group, an azido group, an alkyne group) of a probe.
[0205] In some instances, a profile or a detection profile or signature may include the signal intensity, signal location, or size of the signal of the detection agent. Sometimes, the profile or the detection profile may comprise about 100 image frames to about 50,000 image frames, or more. Analysis of the profile or the detection profile may determine the activity of the regulatory element. In some cases, the degree of activation may also be determined from the analysis of the profile or detection profile. In additional cases, analysis of the profile or the detection profile may further determine the optical isolation and localization of the detection agents, which may correlate to the localization of the regulatory element.
I. Detection of DNA and/or RNA Regulatory Elements
[0206] A regulatory element may be DNA. Described herein is a method of detecting a DNA regulatory element, which may include contacting a cell sample with a detection agent, binding the detection agent to the DNA regulatory element, and analyzing a profile from the detection agent to determine the presence, absence, or activity of the DNA regulatory element.
[0207] A regulatory element may be RNA. Described herein is a method of detecting a RNA regulatory element, which may include contacting a cell sample with a detection agent, binding the detection agent to the RNA regulatory element, and analyzing a profile from the detection agent to determine the presence, absence, or activity of the RNA regulatory element.
[0208] A regulatory element may be an enhancer RNA (eRNA). The presence of an eRNA may correlate to an activated regulatory element. For example, the production of an eRNA may correlate to the transcription of a target gene. As such, the detection of an eRNA element may indicate that a target gene downstream of the eRNA element may be activated.
[0209] Provided herein is a method of detecting an eRNA regulatory element, which may include contacting a cell sample with a detection agent, binding the detection agent to the eRNA regulatory element, and analyzing a profile from the detection agent to determine the presence, absence, or activity of the eRNA regulatory element. Described herein is an in situ method of detecting an activated regulatory DNA site, which may include incubating a sample with a set of detection agents (e.g., fluorescently-labeled probes), hybridizing the set of detection agents to at least one enhancer RNA (eRNA), and analyzing a profile (e.g., a fluorescent profile) from the set of detection agents to determine the presence of an eRNA, in which the presence of eRNA correlates to an activated regulatory DNA site.
II. Detection of a DNase I Hypersensitive Site, Generation of a DNase I Hypersensitive Site Map, and Determination of a Cell Type Based on a DNase I Hypersensitive Site Profile
[0210] A regulatory element may be a DNase I hypersensitive site (DHS). A DNase I hypersensitive site may be an inactivated DNase I hypersensitive site. A DNase I hypersensitive site may be an activated DNase I hypersensitive site. Described herein is a method of detecting a DHS, which may include contacting a cell sample with a detection agent, binding the detection agent to the DHS, and analyzing a profile from the detection agent to determine the presence, absence, or activity of the DHS.
[0211] The DHS may be an active DHS and may further contain a single stranded DNA region. The single stranded DNA region may be detected by S1 nuclease. A method of detecting a DHS may further be extended to detect the presence of a single stranded DNA region within a DHS. Such a method, for example, may comprise contacting a cell sample with a detection agent, binding the detection agent to a single stranded region of a DHS, and analyzing a profile from the detection agent to determine the presence or absence of the single stranded region within a DHS.
[0212] Also described herein is a method of determining the activity level of a regulatory element, which may include incubating a cell sample with a set of detection agents (e.g., fluorescently labeled probes), in which each detection agent hybridizes to a DHS, measuring a signature (e.g., a fluorescent signature) from the set of detection agents, and based on the signature, determining a DHS profile, and comparing the DHS profile with a control, in which a correlation with the control indicates the activity level of the regulatory element in the cell sample. The signature (e.g., the fluorescent signature) may further correlate to a signal intensity (or a peak height). A set of signal intensities may be compiled into a DHS profile and compared with a control to generate a second DHS profile which comprises a set of relative signal intensities (or relative peak heights). The set of relative signal intensities may correlate to the activity level of a regulatory element.
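The comparison of a DHS profile with a control to obtain relative signal intensities may be sketched as follows. This is an illustrative sketch only, not the disclosed implementation: the function name `relative_dhs_profile`, the site labels, and the intensity values are hypothetical, and the sketch assumes that each DHS has a single positive control intensity.

```python
from typing import Dict


def relative_dhs_profile(sample: Dict[str, float],
                         control: Dict[str, float]) -> Dict[str, float]:
    """Return per-site signal intensity relative to the control profile."""
    relative = {}
    for site, intensity in sample.items():
        reference = control.get(site)
        # Skip sites with no usable control value (assumption of this sketch).
        if reference is None or reference <= 0:
            continue
        relative[site] = intensity / reference
    return relative


# Hypothetical signal intensities (arbitrary units) for three DHSs.
sample_profile = {"DHS_A": 1200.0, "DHS_B": 300.0, "DHS_C": 50.0}
control_profile = {"DHS_A": 600.0, "DHS_B": 300.0, "DHS_C": 200.0}

relative_profile = relative_dhs_profile(sample_profile, control_profile)
```

In this sketch a relative intensity above 1 would suggest higher activity than the control at that site, and a value below 1 lower activity; any threshold for calling a site activated is left to the user.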
[0213] Also described herein is a method of generating a DHS map, which may provide information on cell-to-cell variation in gene expression, memory of early developmental fate decisions which establish lineage hierarchies, quantitation of embryonic stem cell DHS sites which decreases with cell passage, and presence of oncogenic elements.
[0214] The location of a set of DHS sites may be correlated to a cell type. For example, the location of about 1 to 60, or more DHS may be used to determine a cell type. The cell may be a normal cell or a cancerous cell. DHS variation may be used to determine the presence of cancerous cells in a sample. A method of determining a cell type (e.g., a cancerous cell) may include incubating a cell sample with a set of detection agents (e.g., fluorescently labeled probes), in which each detection agent hybridizes to a DHS, measuring a signature (e.g., a fluorescent signature) from the set of detection agents, and based on the signature, determining a DHS profile, and comparing the DHS profile with a control, in which a correlation with the control indicates the cell type of the sample.
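One way the correlation of a DHS profile with cell-type controls might be computed is sketched below. The disclosure does not specify a correlation measure; Pearson correlation is used here purely as an assumption, and the function names, cell-type labels, and intensity values are hypothetical.

```python
import math
from typing import Dict, List


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a flat profile carries no correlation information
    return cov / (sd_x * sd_y)


def best_matching_cell_type(sample: List[float],
                            controls: Dict[str, List[float]]) -> str:
    """Return the control cell type whose DHS profile correlates best."""
    return max(controls, key=lambda name: pearson(sample, controls[name]))


# Hypothetical intensities measured at the same four DHSs in every profile.
sample_profile = [10.0, 1.0, 8.0, 0.5]
control_profiles = {
    "epithelial": [9.0, 2.0, 7.0, 1.0],
    "muscle": [1.0, 9.0, 0.5, 8.0],
}
predicted = best_matching_cell_type(sample_profile, control_profiles)
```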
[0215] A DHS site may be visualized through a terminal deoxynucleotidyl transferase (TdT) dUTP Nick-End labeling (TUNEL) assay. A TUNEL assay may utilize a terminal deoxynucleotidyl transferase (TdT) which may catalyze the addition of a dUTP at the site of a nick or strand break. A fluorescent moiety may further be conjugated to dUTP. A TUNEL assay may be utilized for visualization of a plurality of DHSs present in a cell.
[0216] The sequence of a DHS site may be detected in situ, by utilizing an in situ sequencing methodology. For example, the two ends of a padlock probe may be hybridized to a target regulatory element sequence and the two ends may be further ligated together by a ligase (e.g., T4 ligase) when bound to the target sequence. An amplification (e.g., a rolling circle amplification or RCA) may be performed utilizing a polymerase (e.g., φ29 polymerase), which may result in a single stranded DNA comprising at least about 1 to at least about 10, or more tandem copies of the target sequence. The amplified product may be sequenced by ligation in situ using partition sequencing compatible primers and labeled probes (e.g., fluorescently labeled probes). For example, each target sequence within the amplified product may bind to a primer and probe set resulting in a bright spot detectable by, e.g., immunofluorescence microscopy. The labeled probe (e.g., the fluorescent label on the probe) may identify the nucleotide at the ligation site, thereby allowing the color detected to define the nucleotide at the respective ligation position. Sometimes, at least 1 to at least 20, or more rounds of ligation and detection may occur for detection of a DHS site.
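The idea that the color detected in each ligation round defines the nucleotide at that position may be illustrated with the sketch below. The color-to-nucleotide assignment and the function name `decode_spot` are hypothetical; the actual assignment would depend on the labeled probe set used.

```python
from typing import Dict, List

# Hypothetical assignment of detected colors to nucleotides.
COLOR_TO_BASE: Dict[str, str] = {
    "blue": "A",
    "green": "C",
    "yellow": "G",
    "red": "T",
}


def decode_spot(colors_by_round: List[str]) -> str:
    """Translate the color seen at one spot in each round into a sequence.

    Colors that are not in the assignment are reported as 'N' (unknown).
    """
    return "".join(COLOR_TO_BASE.get(color, "N") for color in colors_by_round)


# One amplified spot imaged over five rounds of ligation and detection.
read = decode_spot(["blue", "red", "green", "green", "yellow"])
```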
[0217] A control as used herein may refer to a DHS profile generated from a regulatory element whose activity level is known. A control may also refer to a DHS profile generated from an inactivated regulatory element. A control may further refer to a DHS profile generated from an activated or inactivated regulatory element from a specific cell type. For example, the cell type may be an epithelial cell, connective tissue cell, muscle cell, or nerve cell type. The cell may be a cell derived from heart, lung, kidney, stomach, intestines, liver, pancreas, brain, esophagus, and the like. The cell type may be a hormone-secreting cell, such as a pituitary cell, a gut and respiratory tract cell, thyroid gland cell, adrenal gland cell, Leydig cell of testes, Theca interna cell of ovarian follicle, Juxtaglomerular cell, Macula densa cell, Peripolar cell, or Mesangial cell type. The cell may be a blood cell or a blood progenitor cell. The cell may be an immune system cell, e.g., monocyte, dendritic cell, neutrophil granulocyte, eosinophil granulocyte, basophil granulocyte, hybridoma cell, mast cell, helper T cell, suppressor T cell, cytotoxic T cell, Natural Killer T cell, B cell, or natural killer cell.
III. Detection and Mapping of a Chromatin
[0218] A regulatory element may also be a chromatin. Provided herein is a method of detecting a chromatin, which may include contacting a cell sample with a detection agent, binding the detection agent to the chromatin, and analyzing a profile from the detection agent to determine the activity state of the chromatin. The activity level of a chromatin may be determined based on the presence or activity level of a nucleic acid of interest or the presence or absence of a chromatin associated protein. The activity level of a chromatin may be determined based on DHS locations. The one or more DHS locations on a chromatin may be used to map chromatin activity state. For example, one or more DHSs may be localized in a region and the surrounding chromatin may be decompacted and readily visualized relative to an inactive chromatin state when a DHS is not present. The one or more DHSs within a localized region may further form a localized DHS set and a plurality of localized DHS sets may further provide a global map or pattern of chromatin activity (e.g., an activity pattern).
[0219] Also included herein is a method of generating a chromatin map based on the pattern of DNase I hypersensitive sites, RNA regulatory elements (e.g., eRNA), chromatin associated proteins or gene products, or a combination thereof. The method of generating a chromatin map may be based on the pattern of DNase I hypersensitive sites. The method may comprise generating a 3-dimensional map from a detection profile (or a 2-dimensional detection profile). A chromatin map may provide information on the compaction of chromatin, the spatial structure, spacing of regulatory elements, and localization of the regulatory elements to globally map chromatin structure and accessibility.
[0220] A chromatin map for a cell type may also be generated, in which each cell type comprises a different chromatin pattern. Each cell type may be associated with at least one unique marker. The at least one unique marker (or fiduciary marker) may be a genomic sequence. The at least one unique marker (or fiduciary marker) may be a DHS. A cell type may comprise about 5, about 10, about 15, about 20, about 25, about 30, about 35, about 40, about 45, about 50, about 60, or more unique markers (or fiduciary markers). The cell type may be an epithelial cell, a connective tissue cell, a muscle cell, a nerve cell, a hormone-secreting cell, a blood cell, an immune system cell, or a stem cell type. The cell type may be a cancerous cell type.
[0221] A chromatin profile (e.g., based on DHSs) in the presence of an exogenous agent or condition may also be generated. The method may comprise incubating a cell sample with a set of fluorescently labeled probes specific to target sites (e.g., target DHSs) on a chromatin in the presence of an exogenous agent or condition; measuring a fluorescent signature of the set of fluorescently labeled probes; based on the fluorescent signature, generating a fluorescent profile of the chromatin; and comparing the fluorescent profile with a second fluorescent profile of a chromatin obtained from an equivalent sample incubated with an equivalent set of fluorescently labeled probes in the absence of the exogenous agent or condition, wherein a difference between the two fluorescent profiles indicates a change in the chromatin density (e.g., changes in the presence or activation of DHSs) induced by the exogenous agent or condition. The exogenous agent or condition may comprise a small molecule or a drug. The exogenous agent may be a small molecule, such as a steroid. The exogenous agent or condition may comprise an environmental factor, such as a change in pH, temperature, nutrient, or a combination thereof.
C. Methods of Determining the Localization of a Regulatory Element
[0222] Also described herein is a method for determining the localization of a regulatory element. The localization of a regulatory element may provide an activity state of the regulatory element. The localization of a regulatory element may also provide an interaction state with at least one additional regulatory element. For example, the localization of a first regulatory element with respect to a second regulatory element may provide spatial coordinate and distance information between the two regulatory elements, and may further provide information regarding whether the two regulatory elements may interact with each other. The activity state of a regulatory element may include, for example, a transcription or translation initiation event, a translocation event, or an interaction event with one or more additional regulatory elements. The regulatory element may comprise DNA, RNA, polypeptides, or a combination thereof. The regulatory element may be DNA. The regulatory element may be RNA. The regulatory element may be an enhancer RNA (eRNA). The regulatory element may be a DNase I hypersensitive site (DHS). The DHS may be an inactive DHS or an active DHS. The regulatory element may be a polypeptide. The regulatory element may be chromatin.
[0223] The localization of a regulatory element may include contacting a regulatory element with a first set of detection agents, photobleaching the first set of detection agents for a first time point at a first wavelength to generate a second set of detection agents capable of generating a light at a second wavelength, detecting at least one burst generated by the second set of detection agents to generate a detection profile of the second set of detection agents, and analyzing the detection profile to determine the localization of the regulatory element.
[0224] A detection agent may comprise a detectable moiety that is capable of generating a light, and a probe portion that is capable of hybridizing to a target site on a regulatory element. Each detection agent within the first set of detection agents may have the same or a different detectable moiety. Each detection agent within the first set of detection agents may have the same detectable moiety. A detectable moiety may comprise a small molecule (e.g., a fluorescent dye). A detectable moiety may comprise a fluorescently labeled polypeptide, a fluorescently labeled nucleic acid probe, and/or a fluorescently labeled polypeptide complex.
[0225] Upon photobleaching, a second set of detection agents may be generated from the first set of detection agents, in which the second set may include detection agents that are capable of generating a burst of light detectable at a second wavelength. For example, bleaching of the set of detection agents may lead to about 50%, about 60%, about 70%, about 80%, about 90%, or more detection agents within the set to enter into an “OFF-state.” An “OFF-state” may be a dark state in which the detectable moiety crosses from the singlet excited or ON-state to the triplet state or OFF-state in which detection of light (e.g., fluorescence) may be low (e.g., less than 10%, less than 5%, less than 1%, or less than 0.5% of the light may be detected). The remainder of the detection agents that have not entered into the OFF-state may generate bursts of light, or may cycle between a singlet excited state (or ON-state) and a singlet ground state. As such, bleaching of the set of detection agents may generate about 40%, about 30%, about 20%, about 10%, about 5%, or fewer detection agents within the set that may generate bursts of light. The bursts of light may be detected stochastically, at a single burst level in which each burst of light correlates to a single detection agent.
[0226] A single wavelength may be used for photobleaching a set of detection agents. At least two wavelengths may be used for photobleaching a set of detection agents. A wavelength at 491 nm may be used. A wavelength at 405 nm may be used in combination with the wavelength at 491 nm. The two wavelengths may be applied simultaneously to photobleach a set of detection agents. Alternatively, the two wavelengths may be applied sequentially to photobleach a set of detection agents.
[0227] The time for photobleaching a set of detection agents may be from about 10 seconds to about 4 hours, or more. The concentration of the detection agents may be from about 5 nM to about 1 mM. The bursts of light from the set of detection agents may generate a detection profile. The detection profile may comprise about 100 image frames to about 50,000 image frames, or more. The detection profile may also include the signal intensity, signal location, or size of the signal. Analysis of the detection profile may determine the optical isolation and localization of the detection agents, which may correlate to the localization of the regulatory element.
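How a localization might be derived from bursts recorded across many image frames is sketched below. This is an assumption-laden illustration rather than the disclosed analysis: it takes bursts that have already been detected and assigned to one target, and estimates the position as a photon-weighted centroid. The `Burst` layout, the `localize` function, and the `min_photons` threshold are hypothetical.

```python
from typing import List, Tuple

# (frame index, x position in nm, y position in nm, photon count)
Burst = Tuple[int, float, float, float]


def localize(bursts: List[Burst], min_photons: float = 0.0) -> Tuple[float, float]:
    """Estimate a position as the photon-weighted centroid of the bursts."""
    # Discard dim bursts, which are more likely to be background.
    kept = [b for b in bursts if b[3] >= min_photons]
    total = sum(b[3] for b in kept)
    if total == 0:
        raise ValueError("no bursts above the photon threshold")
    x = sum(b[1] * b[3] for b in kept) / total
    y = sum(b[2] * b[3] for b in kept) / total
    return x, y


# Hypothetical bursts from three frames of a detection profile.
bursts = [
    (3, 100.0, 200.0, 100.0),
    (57, 104.0, 196.0, 300.0),
    (412, 90.0, 210.0, 5.0),  # dim burst, removed by the threshold below
]
x_nm, y_nm = localize(bursts, min_photons=50.0)
```

Weighting by photon count lets brighter bursts, which are typically localized more precisely, contribute more to the estimate.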
[0228] The detection profile may comprise a chromatic aberration correction. The detection profile may comprise less than 5% chromatic aberration. The detection profile may comprise 0% chromatic aberration.
[0229] More than one regulatory element may be detected at the same time. At least 2 to 20, or more regulatory elements may be detected at the same time. Each of the regulatory elements may be detected by a set of detection agents. The detectable moiety among the different sets of detection agents may be the same. For example, two different sets of detection agents may be used to detect two different regulatory elements and the detectable moieties from the two sets of detection agents may be the same. As such, at least 2 to at least 20, or more regulatory elements may be detected at the same time at the same wavelength. Sometimes, the detectable moiety among the different sets of detection agents may also be different. For example, two different sets of detection agents may be used to detect two different regulatory elements and the detectable moiety from one set of detection agents may be detected at a different wavelength from the detectable moiety of the second set of detection agents. As such, at least 2 to 20, or more regulatory elements may be detected at the same time in which each of the regulatory elements may be detected at a different wavelength. The regulatory element may comprise DNA, RNA, polypeptides, or a combination thereof.
D. Methods of Measuring the Activity of a Regulatory Element
[0230] Also described herein is a method of measuring the activity of a target regulatory element. The method may include detection of a regulatory element and one or more products of the regulatory element. One or more products of the regulatory element may also include intermediate products or elements. The method may comprise contacting a cell sample with a first set and a second set of detection agents, in which the first set of detection agents interact with a target regulatory element within the cell and the second set of detection agents interact with at least one product of the target regulatory element, and analyzing a detection profile from the first set and the second set of detection agents, in which the presence or the absence of the at least one product indicates the activity of the target regulatory element.
[0231] As discussed herein, a detection agent may comprise a detectable moiety that is capable of generating a light, and a probe portion that is capable of hybridizing to a target site on a regulatory element. Each detection agent within the first set of detection agents may have the same or a different detectable moiety. Each detection agent within the first set of detection agents may have the same detectable moiety. A detectable moiety may comprise a small molecule (e.g., a fluorescent dye). A detectable moiety may comprise a fluorescently labeled polypeptide, a fluorescently labeled nucleic acid probe, and/or a fluorescently labeled polypeptide complex.
[0232] The method may also allow photobleaching of the first set and the second set of detection agents, thereby generating a subset of detection agents capable of generating a burst of light. A detection profile may be generated from the detection of a set of light bursts, in which the presence or the absence of the at least one product may indicate the activity of the target regulatory element.
[0233] The regulatory element may comprise DNA, RNA, polypeptides, or a combination thereof. The regulatory element may be DNA. The regulatory element may be RNA. The regulatory element may be an enhancer RNA (eRNA). The presence of an eRNA may correlate with target gene transcription that is downstream of eRNA. The regulatory element may be a DNase I hypersensitive site (DHS). The DHS may be an activated DHS. The pattern of the DHS on a chromatin may correlate to the activity of the chromatin. The regulatory element may be a polypeptide, e.g., a transcription factor, a DNA or RNA-binding protein or binding fragment thereof, or a polypeptide that is involved in chemical modification. The regulatory element may be chromatin.
E. Target Nucleic Acid Sequence
[0234] A target nucleic acid sequence may be a nucleic acid sequence of interest or may encode a DNA, RNA, or protein of interest or a portion thereof. A DNA, RNA, or protein of interest may be a DNA, RNA, or protein produced by a cell or contained within a cell. A target nucleic acid sequence may be incorporated into a structure of a cell. A target nucleic acid sequence may also be associated with a cell. For example, a target nucleic acid sequence may be in contact with the exterior of a cell. A target nucleic acid sequence may be unassociated with a structure of a cell. For example, a target nucleic acid sequence may be a circulating nucleic acid sequence. A target nucleic acid sequence or a portion thereof may be artificially constructed or modified. A target nucleic acid sequence may be a natural biological product. A target nucleic acid sequence may be a short nucleic acid sequence. A target nucleic acid sequence may be a nucleic acid sequence that is from a source that is exogenous to a cell. A target nucleic acid sequence may be an endogenous nucleic acid sequence. A target nucleic acid sequence may be a nucleic acid sequence that comprises a combination of an endogenous nucleic acid sequence and a nucleic acid sequence from a source that is exogenous to a cell. A target nucleic acid sequence may be a chromosomal nucleic acid sequence or fragment thereof. A target nucleic acid sequence may be an episomal nucleic acid sequence or fragment thereof. A target nucleic acid sequence may be a sequence resulting from somatic rearrangement or somatic hypermutation, such as a nucleic acid sequence from a T cell receptor, B cell receptor, or fragment thereof.
[0235] A nucleic acid of a cell or sample, which may comprise the target nucleic acid sequence, may comprise a deoxyribonucleic acid (DNA) or a ribonucleic acid (RNA), or a combination thereof. A nucleic acid may be a chromosome, an oligonucleotide, a plasmid, an artificial chromosome, or a fragment or portion thereof. A nucleic acid may comprise genomic DNA, episomal DNA, complementary DNA, mitochondrial DNA, recombinant DNA, cell-free DNA (cfDNA), messenger RNA (mRNA), pre-mRNA, microRNA (miRNA), transfer RNA (tRNA), transfer messenger RNA (tmRNA), ribosomal RNA (rRNA), heterogeneous nuclear RNA (hnRNA), short interfering RNA (siRNA), anti-sense RNA, or short hairpin RNA (shRNA). A nucleic acid may be single-stranded, double-stranded, or a combination thereof.
[0236] A target nucleic acid sequence may comprise a naturally occurring nucleic acid sequence, an artificially constructed nucleic acid sequence (such as an artificially synthesized nucleic acid sequence), or a modified nucleic acid sequence (such as a naturally occurring nucleic acid sequence that has been altered or modified through a natural or artificial process).
[0237] A naturally occurring nucleic acid sequence may comprise a nucleic acid sequence present in a cellular sample. A naturally occurring nucleic acid sequence may comprise a nucleic acid sequence present in an unfixed cell. A naturally occurring nucleic acid sequence may comprise a nucleic acid sequence derived from a cellular sample. A nucleic acid sequence may also be derived from a virus (such as a viral nucleic acid sequence from a lentivirus or adenovirus).
[0238] A naturally occurring nucleic acid sequence may comprise a nucleic acid sequence present in an acellular sample. A naturally occurring nucleic acid sequence may comprise a nucleic acid sequence derived from an acellular sample. For example, a nucleic acid sequence may be a cell-free DNA sequence present in a bodily fluid (such as a sample of cerebrospinal fluid). A nucleic acid may comprise a target nucleic acid sequence that is not endogenous to the source (exogenous) from which it was taken or in which it is analyzed. A nucleic acid may be an artificially synthesized oligonucleotide.
[0239] A nucleic acid sequence may comprise one or more modifications. A modification may be a post-translational modification of a nucleic acid sequence or an epigenetic modification of a nucleic acid sequence (e.g., modification to the methylation of a nucleic acid sequence). A modification may be a genetic modification. A genetic modification to a nucleic acid sequence may be an insertion, a deletion, or a substitution of a nucleic acid sequence. A nucleic acid sequence modification comprising an insertion may comprise transformation, transduction, or transfection of a sample. For example, a nucleic acid sequence modification comprising an insertion may result from infection or transduction of a cell with a virus and subsequent incorporation of a viral nucleic acid sequence into a nucleic acid sequence of the cell, such as the cell’s genomic DNA. The integrated viral nucleic acid sequence (viral integrant) or fragment thereof may be the target nucleic acid sequence. Modification of a nucleic acid sequence may be an artificial modification, resulting from, for instance, genetic engineering or intentional nucleic acid sequence modification during nucleic acid fabrication. A nucleic acid sequence may be the result of somatic rearrangement.
[0240] A modification to a nucleic acid sequence comprising an insertion, deletion or substitution may comprise a difference between the nucleic acid sequence and a reference sequence. A reference sequence may be a nucleic acid sequence in a database, an artificial nucleic acid, a viral nucleic acid sequence, a nucleic acid sequence of the same cell, a nucleic acid sequence of a cell from the same tissue, a nucleic acid sequence from a different tissue of the same subject, or a nucleic acid sequence from a subject of a different species.
[0241] A modification to a nucleic acid sequence may comprise a difference in 1 nucleotide (a single nucleotide polymorphism, SNP), or a difference in from 1 to 1,000 nucleotides. Modification to a nucleic acid sequence comprising a difference in a plurality of nucleotides may comprise differences in two or more adjacent nucleotides or nucleotide sequences relative to a reference nucleic acid sequence. Modifications to a nucleic acid sequence comprising a difference in a plurality of nucleotides may also comprise differences in two or more non-adjacent nucleotides or nucleotide sequences (such as two or more modifications to the nucleic acid sequence that are separated by at least one nucleotide) relative to a reference nucleic acid sequence.
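The distinction between adjacent and non-adjacent differences relative to a reference sequence may be illustrated with the sketch below. It is a simplified example that handles substitutions only and assumes the two sequences are already aligned and of equal length; insertions and deletions would require an alignment step that is not shown. The function name and the example sequences are hypothetical.

```python
from typing import List


def substitution_runs(sequence: str, reference: str) -> List[List[int]]:
    """Group mismatched positions (0-based) into runs of adjacent positions."""
    if len(sequence) != len(reference):
        raise ValueError("sequences must be aligned to the same length")
    mismatches = [i for i, (a, b) in enumerate(zip(sequence, reference)) if a != b]
    runs: List[List[int]] = []
    for position in mismatches:
        if runs and position == runs[-1][-1] + 1:
            runs[-1].append(position)  # extends the current adjacent run
        else:
            runs.append([position])    # starts a new, non-adjacent run
    return runs


reference = "ACGTACGTAC"
observed = "ACCTACGAGC"
runs = substitution_runs(observed, reference)
```

Here a run of length 1 corresponds to a single-nucleotide difference, a longer run to differences in adjacent nucleotides, and separate runs to non-adjacent differences.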
[0242] A target sequence may be assayed in situ or it may be isolated and/or purified from a cellular or acellular sample. For example, a target sequence comprising a nucleic acid may comprise a portion (a region) of genomic DNA located in situ in the nucleus of a fixed (intact) cell. A target sequence may comprise a nucleic acid sequence that is isolated from a sample (such as an aliquot of cerebrospinal fluid).
F. Detection Agents
[0243] Detection agents may be utilized to detect a nucleic acid sequence of interest. A detection agent may comprise a probe portion. The probe portion may include a probe, or a combination of probes. The probe portion may comprise a nucleic acid molecule, a polypeptide, or a combination thereof. The detection agents may further comprise a detectable moiety. The detectable moiety may comprise a fluorophore. A fluorophore may be a molecule that may absorb light at a first wavelength and transmit or emit light at a second wavelength. The fluorophore may be a small molecule (such as a dye) or a fluorescent polypeptide. The detectable moiety may be a fluorescent small molecule (such as a dye). The detectable moiety may not contain a fluorescent polypeptide. The detection agent may further comprise a conjugating moiety. The conjugating moiety may allow attachment of the detection agent to a nucleic acid sequence of interest. The detection agent may comprise a probe that is synthesized with direct dye incorporation at the 3’ end or 5’ end.
G. Probes
[0244] A detection agent may comprise a probe portion. A probe portion may comprise a probe or a combination of probes. A probe may be a nucleic acid probe, a polypeptide probe, or a combination thereof. A probe portion may be an unconjugated probe that does not contain a detectable moiety. A probe portion may be a conjugated probe which comprises a single probe with a detectable moiety, or two or more probes in which at least one probe may be an unconjugated probe bound to at least a second probe which comprises a detectable moiety.
[0245] A probe may be a nucleic acid probe. The nucleic acid probe may be a DNA probe, a RNA probe, or a combination thereof. The nucleic acid probe may be a DNA probe. The nucleic acid probe may be a RNA probe. The nucleic acid probe may be a double stranded nucleic acid probe, a single stranded nucleic acid probe, or may contain single-stranded and/or double stranded portions. The nucleic acid probe may further comprise overhangs on one or both termini, may further comprise blunt ends on one or both termini, or may further form a hairpin.
[0246] The nucleic acid probe may be at least 10 to about 100 nucleotides in length. TABLE 3 lists exemplary nucleotide sequences according to the present disclosure.
TABLE 3 - Exemplary Probe Nucleotide Sequences
[Table 3 is provided in the original publication as images imgf000060_0001 through imgf000086_0001.]
[0247] A nucleic acid probe may be a non-labeled probe, or a probe that does not contain a detectable moiety. A non-labeled probe may further interact with a labeled probe (e.g., a labeled nucleic acid probe). A non-labeled probe may hybridize with a labeled nucleic acid probe. A non-labeled probe may also interact with a labeled polypeptide probe. The labeled polypeptide probe may be a protein that recognizes a sequence within the non-labeled probe. A labeled probe may include a nucleic acid portion and a polypeptide tag portion and the polypeptide tag portion may further interact with a molecule comprising a detectable moiety. For example, a non-labeled probe may be a nucleic acid probe comprising a streptavidin which may interact with a biotinylated molecule comprising a detectable moiety.
[0248] A nucleic acid probe may comprise about 95%, about 96%, about 97%, about 98%, about 99%, or about 100% sequence specificity or sequence complementarity to a target site of a regulatory element. A nucleic acid probe may comprise about 95%, about 96%, about 97%, about 98%, about 99%, or about 100% sequence specificity or sequence complementarity to a target nucleic acid sequence. A nucleic acid probe may comprise about 95%, about 96%, about 97%, about 98%, about 99%, or about 100% sequence specificity or sequence complementarity to a target viral nucleic acid sequence. The hybridization may be performed under high-stringency hybridization conditions.
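As an illustration of the percent complementarity figures above, the following minimal sketch compares a probe against an equal-length target site. It assumes DNA sequences written 5' to 3' and simple position-by-position Watson-Crick pairing without gaps; the function names are hypothetical and are not taken from the disclosure.

```python
# Watson-Crick complements for DNA bases (assumption: probe and target are DNA).
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def percent_complementarity(probe: str, target_site: str) -> float:
    """Percentage of probe positions that base-pair with an equal-length target site.

    Both sequences are written 5'->3'. The probe is compared position by
    position against the reverse complement of the target site (no gaps).
    """
    if len(probe) != len(target_site):
        raise ValueError("probe and target site must have the same length")
    expected = reverse_complement(target_site)
    matches = sum(p == e for p, e in zip(probe.upper(), expected))
    return 100.0 * matches / len(probe)
```

Under these assumptions, a 20-nucleotide probe with a single mismatch scores 95%, the lower end of the range recited above.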
[0249] A nucleic acid probe may hybridize with a genomic sequence that is present in low or single copy numbers (e.g., genomic sequences that are not repetitive elements). As used herein, repetitive element refers to a DNA sequence that is present in many identical or similar copies in the genome. Repetitive elements are not intended to refer to a DNA sequence that is present on each copy of the same chromosome (e.g., a DNA sequence that is present only once, but is found on both copies of chromosome 11, would not be considered a repetitive element, and would be considered a sequence that is present in the genome as one copy). The genome may consist of three broad sequence components: single copy or at least very low copy number DNA (approximately 60% of the human genome); moderately repetitive elements (approximately 30% of the human genome); and highly repetitive elements (approximately 10% of the human genome). For a review, see Human Molecular Genetics, Chapter 7 (1999), John Wiley & Sons, Inc.
[0250] A nucleic acid probe may have reduced off-target interaction. For example, “off-target” or “off-target interaction” may refer to an instance in which a nucleic acid probe against a given target hybridizes or interacts with another target site (e.g., a different DNA sequence, RNA sequence, or a cellular protein or other moiety).
[0251] A nucleic acid probe may further be cross-linked to a target site of a regulatory element. For example, the nucleic acid probe may be cross-linked by a photo-crosslinking means such as UV or by a chemical cross-linking means such as by formaldehyde, or through a reactive group within the nucleic acid probe. A reactive group may include sulfhydryl-reactive linkers such as bismaleimidohexane (BMH), and the like.
[0252] A nucleic acid probe may include natural or unnatural nucleotide analogues or bases or a combination thereof. The unnatural nucleotide analogues or bases may comprise modifications at one or more of ribose moiety, phosphate moiety, nucleoside moiety, or a combination thereof. The unnatural nucleotide analogues or bases may comprise 2’-O-methyl, 2’-O-methoxyethyl (2’-O-MOE), 2’-O-aminopropyl, 2’-deoxy, 2’-deoxy-2’-fluoro, 2’-O-aminopropyl (2’-O-AP), 2’-O-dimethylaminoethyl (2’-O-DMAOE), 2’-O-dimethylaminopropyl (2’-O-DMAP), 2’-O-dimethylaminoethyloxyethyl (2’-O-DMAEOE), or 2’-O-N-methylacetamido (2’-O-NMA) modified, locked nucleic acid (LNA), ethylene nucleic acid (ENA), peptide nucleic acid (PNA), 1’,5’-anhydrohexitol nucleic acids (HNA), morpholino, methylphosphonate nucleotides, thiolphosphonate nucleotides, or 2’-fluoro N3’-P5’-phosphoramidites. The nucleic acid probes may further comprise one or more abasic sites. The abasic site may further be functionalized with a detectable moiety.
[0253] A nucleic acid probe may be a locked nucleic acid probe (such as a labeled locked nucleic acid probe), a labeled or unlabeled peptide nucleic acid (PNA) probe, a labeled or unlabeled oligonucleotide, an oligopaint, an ECHO probe, a molecular beacon probe, a padlock (or molecular inversion probe), a labeled or unlabeled toe-hold probe, a labeled TALE probe, a labeled ZFN probe, or a labeled CRISPR probe.
[0254] A nucleic acid probe may be a labeled or unlabeled locked nucleic acid probe or a labeled or unlabeled peptide nucleic acid probe. Locked nucleic acid probes and peptide nucleic acid probes are known to those of skill in the art and are described in Briones et al., Anal Bioanal Chem (2012) 402:3071-3089.
[0255] A nucleic acid probe may be a padlock (or molecular inversion probe). A padlock probe may be hybridized to a target regulatory element sequence in which the two ends may correspond to the target sequence. A padlock probe may be ligated together by a ligase (such as T4 ligase) when bound to the target sequence. An amplification (such as a rolling circle amplification or RCA) may be performed utilizing, for example, φ29 polymerase, which may result in a single-stranded DNA comprising multiple tandem copies of the target sequence.
[0256] A nucleic acid probe may be an oligopaint as described in U.S. Publication No. 2010/0304994; and in Beliveau, et al., “Versatile design and synthesis platform for visualizing genomes with oligopaint FISH probes,” PNAS 109(52): 21301-21306 (2012). Oligopaint may refer to detectably labeled polynucleotides that have sequences complementary to an oligonucleotide sequence (such as a portion of a DNA sequence, like a particular chromosome or sub-chromosomal region of a particular chromosome). Oligopaints may be generated from synthetic probes and arrays that are, optionally, computationally patterned (rather than using natural DNA sequences and/or chromosomes as a template).
[0257] A nucleic acid probe can be a labeled or unlabeled toe-hold probe. Toe-hold probes are known to those of skill in the art as described in Zhang et al., Optimizing the Specificity of Nucleic Acid Hybridization, Nature Chemistry 4: 208-214 (2012).
[0258] A nucleic acid probe may be a molecular beacon. Molecular beacons may be hairpin-shaped molecules with an internally quenched fluorophore whose fluorescence is restored when they bind to a target nucleic acid sequence. Molecular beacons are known to those of skill in the art as described in Guo et al., Anal. Bioanal. Chem. (2012) 402:3115-3125.
[0259] A nucleic acid probe may be an ECHO probe. ECHO probes may be sequence-specific, hybridization-sensitive, quencher-free fluorescent probes for RNA detection, which may be designed using the concept of fluorescence quenching caused by intramolecular excitonic interaction of fluorescent dyes. ECHO probes are known to those of skill in the art as described in Kubota et al., PLoS ONE, Vol. 5, Issue 9, e13003 (2010); Okamoto, Chem. Soc. Rev., 2011, 40, 5815-5828; or Wang et al., RNA (2012), 18:166-175.
[0260] A probe may be a clustered regularly interspaced short palindromic repeats (CRISPR) probe. The CRISPR system may use a Cas9 protein to recognize DNA sequences, in which the target specificity may be solely determined by a small guide (sg) RNA and a protospacer adjacent motif (PAM). Upon binding to target DNA, the Cas9-sgRNA complex may generate a DNA double-stranded break. For imaging applications, a Cas9 protein may be replaced with an endonuclease-deactivated Cas9 (dCas9) protein. For example, imaging a cell, such as by fluorescence in situ hybridization (FISH), may be achieved by synthesizing a dCas9 within the cell, synthesizing RNA within the cell to bind genomic DNA and to complex with the dCas9 forming a dCas9/RNA complex, labeling the dCas9/RNA complex, and imaging the labeled dCas9/RNA complex within the live cell bound to genomic DNA. The endonuclease-deactivated Cas9 may be synthesized in vivo by using an integrated construct, a transiently transfected construct, by injection into the cell of a syncytia of nuclei or via electroporation into cells and/or nuclei.
[0261] A probe may comprise an endonuclease-deactivated Cas9 (dCas9) protein as described in Chen et al., “Dynamic imaging of genomic loci in living human cells by an optimized CRISPR/Cas system,” Cell 155(7): 1479-1491 (2013); or Ma et al., “Multicolor CRISPR labeling of chromosomal loci in human cells,” PNAS 112(10): 3002-3007 (2015). The dCas9 protein may be further labeled with a detectable moiety.
[0262] The RNA of the Cas9/RNA complex may be synthesized in vivo by using an integrated construct, a transiently transfected construct, by injection into the cell of a syncytia of nuclei or via electroporation into cells and/or nuclei. The Cas9/RNA complex may be labeled by making a fusion protein that includes Cas9 and a reporter, by injection of RNA that has been attached to a reporter into the cell or by a syncytia of nuclei including RNA that has been attached to a reporter, by electroporation into cells or nuclei or by indirect labeling of the RNA by hybridization with a labeled secondary oligonucleotide. The label may be a conditional reporter, based on the binding of Cas9/RNA to the target nucleic acid. The label may be quenched and may then be activated upon the Cas9/RNA complex binding to the target nucleic acid. A probe may be a transcription activator-like effector nuclease (TALEN) probe or a zinc-finger nuclease (ZFN) probe.
[0263] A probe disclosed herein may be a polypeptide probe. A polypeptide probe may include a protein or a binding fragment thereof that interacts with a target site (such as a nucleic acid target site or a protein target) of interest. A polypeptide probe may comprise a DNA-binding protein, a RNA-binding protein, a protein that is involved in or detects the transcription/translation process, a protein that may detect an open or relaxed portion of a chromatin, or a protein interacting partner of a product of a regulatory element.
[0264] A polypeptide probe may be a DNA-binding protein. The DNA-binding protein may be a transcription factor that modulates the transcription process, a polymerase, or a histone. A DNA-binding protein may comprise a zinc finger domain, a helix-turn-helix domain, a leucine zipper domain (such as a basic leucine zipper domain), a high mobility group box (HMG-box) domain, and the like. The DNA-binding protein may interact with a nucleic acid region in a sequence specific manner. The DNA-binding protein may interact with a nucleic acid region in a sequence non-specific manner. The DNA-binding protein may interact with single-stranded DNA. The DNA-binding protein may interact with double-stranded DNA. The DNA-binding protein probe may further comprise a detectable moiety.
[0265] A polypeptide probe may be a RNA-binding protein. The RNA-binding protein may participate in forming ribonucleoprotein complexes. The RNA-binding protein may modulate post-transcriptional processes such as splicing, polyadenylation, mRNA stabilization, mRNA localization, or translation. A RNA-binding protein may comprise a RNA recognition motif (RRM), dsRNA binding domain, zinc finger domain, K-Homology domain (KH domain), and the like. The RNA-binding protein may interact with single-stranded RNA. The RNA-binding protein may interact with double-stranded RNA. The RNA-binding protein probe may further comprise a detectable moiety.
[0266] A polypeptide probe may be a protein that may detect an open or relaxed portion of a chromatin. The polypeptide probe may be a modified enzyme that lacks cleavage activity. The modified enzyme may be an enzyme that recognizes DNA or RNA (double-stranded or single-stranded). Examples of modified enzymes may be obtained from oxidoreductases, transferases, hydrolases, lyases, isomerases, or ligases. A modified enzyme may be an endonuclease (such as a deactivated restriction endonuclease such as the TALEN or CRISPR probes described herein).
[0267] A polypeptide probe may be an antibody or binding fragment thereof. The antibody or binding fragment thereof may be a protein interacting partner of a product of a regulatory element. The antibody or binding fragment thereof may comprise a humanized antibody or binding fragment thereof, murine antibody or binding fragment thereof, chimeric antibody or binding fragment thereof, monoclonal antibody or binding fragment thereof, monovalent Fab’, divalent Fab2, F(ab)'3 fragments, single-chain variable fragment (scFv), bis-scFv, (scFv)2, diabody, minibody, nanobody, triabody, tetrabody, disulfide stabilized Fv protein (dsFv), single-domain antibody (sdAb), Ig NAR, camelid antibody or binding fragment thereof, or a chemically modified derivative thereof. The antibody or binding fragment thereof may further comprise a detectable moiety.
[0268] Multiple probes may be used together in a probe set to detect a nucleic acid sequence using Nano-FISH. A probe set can also be referred to herein as a “probe pool.” The probe set may be designed for the detection of the target nucleic acid sequence. For example, the probe set may be optimized for probes based on GC content, 16mer base matches (for determining binding specificity of the probe), and their predicted melting temperature when hybridized. The 16mer base matches may have a total of 24 matches to the 16mer database. In some embodiments, probe sets with greater than 100 16-mer database matches may be discarded.
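The probe set screening described above can be sketched as follows. This is a minimal illustration rather than the design pipeline of the disclosure: the 16-mer database is modeled as a hypothetical dictionary mapping each 16-mer to its number of occurrences, the melting temperature uses the basic GC-count approximation as a stand-in for whatever prediction method was actually used, and the GC and melting-temperature windows are assumed values. Only the cutoff of 100 database matches per probe set comes from the text.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G or C bases in a probe sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def basic_tm(seq: str) -> float:
    """Rough melting temperature (deg C): 64.9 + 41 * (G + C - 16.4) / N.

    A stand-in estimate only; not the prediction method of the source.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    return 64.9 + 41.0 * (gc - 16.4) / len(seq)


def count_16mer_matches(seq: str, kmer_counts: dict, k: int = 16) -> int:
    """Sum database occurrences over every k-mer window of the probe."""
    seq = seq.upper()
    return sum(kmer_counts.get(seq[i:i + k], 0) for i in range(len(seq) - k + 1))


def probe_ok(seq: str, gc_window=(0.35, 0.65), tm_window=(45.0, 75.0)) -> bool:
    """Per-probe screen on GC content and estimated Tm (windows are assumptions)."""
    return (gc_window[0] <= gc_fraction(seq) <= gc_window[1]
            and tm_window[0] <= basic_tm(seq) <= tm_window[1])


def keep_probe_set(probes, kmer_counts: dict, max_matches: int = 100) -> bool:
    """Discard a probe set whose total 16-mer database matches exceed the cutoff."""
    total = sum(count_16mer_matches(p, kmer_counts) for p in probes)
    return total <= max_matches
```

A probe set totaling 24 database matches, as in the example above, would be kept under this cutoff.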
[0269] Exemplary probe nucleotide sequences are shown in TABLE 3 for probe sets for different target sequences. Some exemplary probe sequences may be directed to target sequences located in the GREB1 promoter of chromosome 2, ER iDHS1 of chromosome 2, ER iDHS2 of chromosome 2, HBG1up of chromosome 11, HBG2up of chromosome 11, HS1 of chromosome 11, HS2 of chromosome 11, HS3 of chromosome 11, HS4 of chromosome 11, HS5 of chromosome 11, HS1 Lflank of chromosome 11, HS1 2flank of chromosome 11, HS2 3flank of chromosome 11, HS3 4flank of chromosome 11, HS4 5flank of chromosome 11, HS5 Rflank of chromosome 11, CCND1 SNP of chromosome 11, CCND1 CTL of chromosome 11, the CCND1 promoter of chromosome 11, Chromosome 18 dead1 of chromosome 18, Chromosome 18 dead2 of chromosome 18, Chromosome 18 dead3 of chromosome 18, CNOT promoter of chromosome 19, CNOT inter1 of chromosome 19, CNOT inter2 of chromosome 19, CNOT inter3 of chromosome 19, TSEN promoter of chromosome 19, KLK2 promoter of chromosome 19, KLK3 promoter of chromosome 19, or KLK eRNA of chromosome 19. GREB1 is a gene that may be induced by estrogen stimulation of MCF-7 breast cancer cells. ER iDHS1 and ER iDHS2 are DHS that may be induced by estrogen stimulation of MCF-7 breast cancer cells. HBG1up and HBG2up are hemoglobin genes expressed in K562 erythroleukemia cells. HS1, HS2, HS3, HS4, and HS5 are hypersensitive sites in the beta-globin locus control region, and HS1 Lflank, HS2 3flank, HS3 4flank, HS4 5flank, and HS5 Rflank are sequences in the intervening regions between HS1-HS5. CCND1 SNP is an enhancer for the CCND1 gene, CCND1 CTL is a control region adjacent to the CCND1 SNP, and the CCND1 promoter is the promoter region of the CCND1 gene. Chromosome 18 dead1, Chromosome 18 dead2, and Chromosome 18 dead3 are non-hypersensitive regions of chromosome 18. The CNOT promoter is the promoter (active region) of CNOT. The TSEN promoter is the promoter (active region) of TSEN. The KLK2 promoter is the promoter of KLK2.
The KLK3 promoter is the promoter of KLK3. KLK eRNA is an enhancer for the KLK2 gene and/or the KLK3 gene, and may also enhance RNA. For example, a probe set comprising at least nine different Q570 labeled probes selected from the group consisting of SEQ ID NO: 1 - SEQ ID NO: 39 may be used to detect the GREB1 promoter in chromosome 2. A Q570 labeled probe set comprising probes with SEQ ID NO: 7 - SEQ ID NO: 35 may be used to detect the GREB1 promoter in chromosome 2. A probe set comprising at least nine different Q670 labeled probes selected from the group consisting of SEQ ID NO: 40 - SEQ ID NO: 72 may be used to detect the ER iDHS1 in chromosome 2. A probe set comprising at least nine different Q670 labeled probes selected from the group consisting of SEQ ID NO: 73 - SEQ ID NO: 104 may be used to detect the ER iDHS2 in chromosome 2. A probe set comprising at least nine different Q570 labeled probes selected from the group consisting of SEQ ID NO: 105 - SEQ ID NO: 134 may be used to detect the HBG1up in chromosome 11. A probe set comprising at least nine different Q570 labeled probes selected from the group consisting of SEQ ID NO: 135 - SEQ ID NO: 164 may be used to detect the HBG2up in chromosome 11. A probe set comprising at least nine different Q570/670 labeled probes selected from the group consisting of SEQ ID NO: 165 - SEQ ID NO: 194 may be used to detect HS1 in chromosome 11. A probe set comprising at least nine different Q570/670 labeled probes selected from the group consisting of SEQ ID NO: 195 - SEQ ID NO: 224 may be used to detect HS2 in chromosome 11. A probe set comprising at least nine different Q570/670 labeled probes selected from the group consisting of SEQ ID NO: 225 - SEQ ID NO: 254 may be used to detect HS3 in chromosome 11. A probe set comprising at least nine different Q670 labeled probes selected from the group consisting of SEQ ID NO: 255 - SEQ ID NO: 298 may be used to detect HS4 in chromosome 11.
A probe set comprising at least nine different Q570/670 labeled probes selected from the group consisting of SEQ ID NO: 299 - SEQ ID NO: 340 may be used to detect HS5 in chromosome 11. A probe set comprising at least nine different Q670 labeled probes selected from the group consisting of SEQ ID NO: 341 - SEQ ID NO: 370 may be used to detect HS1 Lflank in chromosome 11. A probe set comprising at least nine different Q570 labeled probes selected from the group consisting of SEQ ID NO: 371 - SEQ ID NO: 400 may be used to detect HS1 2flank in chromosome 11. A probe set comprising at least nine different Q670 labeled probes selected from the group consisting of SEQ ID NO: 401 - SEQ ID NO: 430 may be used to detect HS2 3flank in chromosome 11. A probe set comprising at least nine different Q570 labeled probes selected from the group consisting of SEQ ID NO: 431 - SEQ ID NO: 460 may be used to detect HS3 4flank in chromosome 11. A probe set comprising at least nine different Q670 labeled probes selected from the group consisting of SEQ ID NO: 461 - SEQ ID NO: 484 may be used to detect HS4 5flank in chromosome 11. A probe set comprising at least nine different Q570 labeled probes selected from the group consisting of SEQ ID NO: 485 - SEQ ID NO: 514 may be used to detect HS5 Rflank in chromosome 11.
A probe set comprising at least nine different Q570 labeled probes selected from the group consisting of SEQ ID NO: 515 - SEQ ID NO: 544 may be used to detect CCND1 SNP in chromosome 11. A probe set comprising at least nine different Q670 labeled probes selected from the group consisting of SEQ ID NO: 545, SEQ ID NO: 539 - SEQ ID NO: 544, or SEQ ID NO: 546 - SEQ ID NO: 564 may be used to detect CCND1 CTL in chromosome 11. A probe set comprising at least nine different Q670 labeled probes selected from the group consisting of SEQ ID NO: 559 - SEQ ID NO: 592 may be used to detect the CCND1 promoter in chromosome 11. A probe set comprising at least nine different Q670 labeled probes selected from the group consisting of SEQ ID NO: 593 - SEQ ID NO: 622 may be used to detect Chromosome 18 dead1 in chromosome 18. A probe set comprising at least nine different Q670 labeled probes selected from the group consisting of SEQ ID NO: 623 - SEQ ID NO: 652 may be used to detect Chromosome 18 dead2 in chromosome 18. A probe set comprising at least nine different Q670 labeled probes selected from the group consisting of SEQ ID NO: 653 - SEQ ID NO: 682 may be used to detect Chromosome 18 dead3 in chromosome 18. A probe set comprising at least nine different Q670 labeled probes selected from the group consisting of SEQ ID NO: 683 - SEQ ID NO: 712 may be used to detect the CNOT3 promoter in chromosome 19. A probe set comprising at least nine different Q670 labeled probes selected from the group consisting of SEQ ID NO: 713 - SEQ ID NO: 742 may be used to detect the TSEN34 promoter in chromosome 19. A probe set comprising at least nine different Q670 labeled probes selected from the group consisting of SEQ ID NO: 743 - SEQ ID NO: 772 may be used to detect CNOT3 inter1 in chromosome 19. A probe set comprising at least nine different Q670 labeled probes selected from the group consisting of SEQ ID NO: 773 - SEQ ID NO: 802 may be used to detect CNOT3 inter2 in chromosome 19.
A probe set comprising at least nine different Q670 labeled probes selected from the group consisting of SEQ ID NO: 803 - SEQ ID NO: 832 may be used to detect CNOT3 inter3 in chromosome 19. A probe set comprising at least nine different Q570 labeled probes selected from the group consisting of SEQ ID NO: 833 - SEQ ID NO: 862 may be used to detect the KLK2 promoter in chromosome 19. A probe set comprising at least nine different Q570 labeled probes selected from the group consisting of SEQ ID NO: 863 - SEQ ID NO: 892 may be used to detect the KLK3 promoter in chromosome 19. A probe set comprising at least nine different Q670 labeled probes selected from the group consisting of SEQ ID NO: 893 - SEQ ID NO: 929 may be used to detect KLK eRNA in chromosome 19. A probe set comprising at least nine different probes labeled with a detection agent selected from the group consisting of SEQ ID NO: 930 - SEQ ID NO: 1061 may be used to detect an HIV nucleic acid sequence.
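The probe set definitions recited above lend themselves to a simple lookup structure. The sketch below encodes three of the recited sets (the GREB1 promoter, HS2, and the KLK3 promoter, with the SEQ ID NO ranges and labels given in the text) and checks whether a chosen selection satisfies the "at least nine different probes" condition. The class, dictionary, and function names are hypothetical and shown only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProbeSet:
    chromosome: int   # chromosome on which the target is located
    label: str        # fluorescent label recited for the set
    first_seq_id: int # first SEQ ID NO of the recited range (inclusive)
    last_seq_id: int  # last SEQ ID NO of the recited range (inclusive)


# Three entries transcribed from the recited ranges; others are omitted here.
PROBE_SETS = {
    "GREB1 promoter": ProbeSet(2, "Q570", 1, 39),
    "HS2": ProbeSet(11, "Q570/670", 195, 224),
    "KLK3 promoter": ProbeSet(19, "Q570", 863, 892),
}


def is_valid_selection(target: str, seq_ids, minimum: int = 9) -> bool:
    """True if the selection holds at least `minimum` different SEQ ID NOs,
    all of which fall within the range recited for the target."""
    probe_set = PROBE_SETS[target]
    distinct = set(seq_ids)
    in_range = all(
        probe_set.first_seq_id <= s <= probe_set.last_seq_id for s in distinct
    )
    return in_range and len(distinct) >= minimum
```

For instance, selecting SEQ ID NO: 195 through SEQ ID NO: 203 for HS2 gives nine distinct in-range probes and passes the check, whereas eight probes, or nine copies of the same probe, do not.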
H. Detectable Moieties
[0270] A detection agent may comprise a detectable moiety. A detectable moiety may be a small molecule (such as a dye) or a macromolecule. A macromolecule may include polypeptides (such as proteins and/or protein fragments), nucleic acids, carbohydrates, lipids, macrocycles, polyphenols, and/or endogenous macromolecule complexes. A detectable moiety may be a small molecule. A detectable moiety may be a macromolecule.
[0271] A detectable moiety may include a moiety that is detectable by a colorimetric method or a fluorescent method. For example, a colorimetric method may be an assay which utilizes reagents that undergo a measurable color change in the presence of an analyte (such as an enzyme, an antibody, a compound, a hormone). An exemplary colorimetric method may include an enzyme-mediated detection method such as tyramide signal amplification (TSA), which utilizes horseradish peroxidase (HRP) to generate a signal upon reaction with a tyramide substrate, and 3,3’,5,5’-tetramethylbenzidine (TMB), which generates a blue color upon oxidation to 3,3’,5,5’-tetramethylbenzidine diimine in the presence of a peroxidase enzyme such as HRP. A detectable moiety described herein may include a moiety that is detectable by a colorimetric method.
[0272] A detectable moiety may also include a moiety that is detectable by a fluorescent method. Sometimes, the detectable moiety may be a fluorescent moiety. A fluorescent moiety may be a small molecule (such as a dye) or a fluorescently labeled macromolecule. A fluorescently labeled macromolecule may include a fluorescently labeled polypeptide (such as a labeled protein and/or a protein fragment), a fluorescently labeled nucleic acid molecule, a fluorescently labeled carbohydrate, a fluorescently labeled lipid, a fluorescently labeled macrocycle, a fluorescently labeled polyphenol, and/or a fluorescently labeled endogenous macromolecule complex (such as a primary antibody-secondary antibody complex).
[0273] A fluorescent small molecule may comprise rhodamine, rhodol, fluorescein, thiofluorescein, aminofluorescein, carboxyfluorescein, chlorofluorescein, methylfluorescein, sulfofluorescein, aminorhodol, carboxyrhodol, chlororhodol, methylrhodol, sulforhodol, aminorhodamine, carboxyrhodamine, chlororhodamine, methylrhodamine, sulforhodamine, thiorhodamine, cyanine, indocarbocyanine, oxacarbocyanine, thiacarbocyanine, merocyanine, cyanine 2, cyanine 3, cyanine 3.5, cyanine 5, cyanine 5.5, cyanine 7, oxadiazole derivatives, pyridyloxazole, nitrobenzoxadiazole, benzoxadiazole, pyrene derivatives, cascade blue, oxazine derivatives, Nile red, Nile blue, cresyl violet, oxazine 170, acridine derivatives, proflavin, acridine orange, acridine yellow, arylmethine derivatives, auramine, crystal violet, malachite green, tetrapyrrole derivatives, porphin, phthalocyanine, bilirubin, 1-dimethylaminonaphthyl-5-sulfonate, 1-anilino-8-naphthalene sulfonate, 2-p-toluidinyl-6-naphthalene sulfonate, 3-phenyl-7-isocyanatocoumarin, N-(p-(2-benzoxazolyl)phenyl)maleimide, stilbenes, pyrenes, 6-FAM (Fluorescein), 6-FAM (NHS Ester), 5(6)-FAM, 5-FAM, Fluorescein dT, 5-TAMRA-cadaverine, 2-aminoacridone, HEX, JOE (NHS Ester), MAX, TET, ROX, TAMRA, TAMRA™ (NHS Ester), TEX 615, ATTO™ 488, ATTO™ 532, ATTO™ 550, ATTO™ 565, ATTO™ Rho101, ATTO™ 590, ATTO™ 633, ATTO™ 647N, TYE™ 563, TYE™ 665, or TYE™ 705.
[0274] A fluorescent moiety may comprise Cy3, Cy5, Cy5.5, Cy7, Q570, Alexa488, Alexa555, Alexa594, Alexa647, Alexa680, Alexa750, Alexa790, TexasRed, CF610, Propidium iodide, Quasar 570 (Q570), Quasar 670 (Q670), IRDye700, IRDye800, Indocyanine green, Pacific Blue dye, Pacific Green dye, or Pacific Orange dye.
[0275] A fluorescent moiety may comprise a quantum dot (QD). Quantum dots may be a nanoscale semiconducting photoluminescent material, for example, as described in Alivisatos A.P., “Semiconductor clusters, nanocrystals, and quantum dots,” Science 271(5251): 933-937 (1996).
[0276] Exemplary QDs may include, but are not limited to, CdS quantum dots, CdSe quantum dots, CdSe/CdS core/shell quantum dots, CdSe/ZnS core/shell quantum dots, CdTe quantum dots, PbS quantum dots, and/or PbSe quantum dots. As used herein, CdSe/ZnS may mean that a ZnS shell is coated on a CdSe core surface (a “core-shell” quantum dot). The shell materials of core-shell QDs may have a higher bandgap and passivate the core QD surfaces, resulting in higher quantum yield, higher stability, and wider applications than core QDs.
[0277] QDs may absorb a wide spectrum of light, and may be physically tuned with emission bandwidths in various wavelengths. See, e.g., Badolato, et al., Science 308:1158-61 (2005). For example, the emission bandwidth may be in the visible spectrum (from about 350 to about 750 nm), the ultraviolet-visible spectrum (from about 100 nm to about 750 nm), or in the near-infrared spectrum (from about 750 nm to about 2500 nm). QDs that emit energy in the visible range may include, but are not limited to, CdS, CdSe, CdTe, ZnSe, ZnTe, GaP, and GaAs. QDs that emit energy in the blue to near-ultraviolet range include, but are not limited to, ZnS and GaN. QDs that emit energy in the near-infrared range include, but are not limited to, InP, InAs, InSb, PbS, and PbSe.
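The approximate spectral ranges recited above overlap, so a given emission wavelength may fall in more than one of them. The following sketch simply tabulates the three ranges and reports which ones contain a given wavelength. The boundaries are treated as inclusive hard limits for illustration, although the text gives them only as approximate; the names used are hypothetical.

```python
# Approximate spectral ranges from the text, in nanometers (inclusive bounds assumed).
SPECTRAL_BANDS = {
    "visible": (350, 750),
    "ultraviolet-visible": (100, 750),
    "near-infrared": (750, 2500),
}


def bands_for(wavelength_nm: float) -> list:
    """Return the names of all recited ranges that contain the wavelength."""
    return [
        name
        for name, (low, high) in SPECTRAL_BANDS.items()
        if low <= wavelength_nm <= high
    ]
```

Under these assumptions, an emission peak at 605 nm is reported as both visible and ultraviolet-visible, while one at 900 nm is reported as near-infrared only.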
[0278] The radius of a QD may be modulated to manipulate the emission bandwidth. For example, a QD with a radius of between about 5 and about 6 nm may emit wavelengths resulting in emission colors such as orange or red. A QD with a radius of between about 2 and about 3 nm may emit wavelengths resulting in emission colors such as blue or green.
[0279] A QD may further form a QD microstructure, which encompasses one or more layers of QD. For example, each quantum dot containing layer may comprise a single type of quantum dot of a specific emission color. For example, each layer may be made of any material suitable for use that (a) allows excitation light to reach the quantum dot and allows fluorescence generated from the quantum dot to pass through the layer(s) for detection and (b) may be combined with a quantum dot to form a layer. Examples of materials that may be used to form layers containing quantum dots include, but are not limited to, inorganic, organic, or polymeric material, each with or without biodegradable properties, and combinations thereof. The layers may comprise silica-based compounds or polymers. Exemplary silica-based layers may include, but are not limited to, those comprising tetramethoxysilane or tetraethylorthosilicate. Exemplary polymer layers may include, but are not limited to, those comprising polystyrene, poly(methyl methacrylate), polyhydroxyalkanoate, polylactide, or co-polymers thereof.
[0280] The quantum dot further may comprise a spacer layer which serves as a barrier to prevent interactions between different QD layers, and may be made of any material suitable for use that (a) allows excitation light to reach the quantum dots in the quantum dot containing layer(s) below it and allows fluorescence generated from those quantum dots to pass through it and (b) may segregate the quantum dots in one layer from those in other layers. Examples of materials that may be used to form spacer layers are the same as for the quantum dot containing layers.
[0281] The materials used for the quantum dot containing and spacer layers may be the same or different. The same material may be used in the quantum dot containing layers and the spacer layers.
[0282] The quantum dot containing layers and the spacer layers within a given QD molecule may be any thickness and may be varied. For example, thicker QD-containing layers may allow for the loading of increased QDs in the shell, resulting in greater fluorescence intensity for that layer than for a thinner layer containing the same concentration of QDs. Thus, varying layer thickness may facilitate preparing QD-containing layers of various intensities, thereby generating spectrally distinct QD barcodes. In various instances, the QD-containing layers may be between 5 nm and 500 nm. Those of skill in the art will understand that other methods for varying intensity also exist, for example, modifying concentrations of the same QD in one microstructure with a first unique barcode compared to a second QD microstructure with a different fluorescent barcode. The ability to vary the intensities for the same QD color allows for an increased number of distinct and distinguishable microstructures (e.g., spectrally distinct barcodes). The spacer layers may be greater than 10 nm, up to approximately 5 μm thick; the spacer layers may be greater than 10 nm, up to approximately 500 nm thick; the spacer layers may be greater than 10 nm, up to approximately 100 nm thick.
[0283] The quantum dot-containing and spacer layers may be arranged in any order. Examples include, but are not limited to, alternating QD-containing layers and spacer layers, or quantum dot containing layers separated by more than one spacer layer. Thus, a “spacer layer” may comprise a single layer, or may comprise two or more such spacer layers.
[0284] The QD microstructure may comprise any number of quantum dot containing layers suitable for use with the microstructure. For example, a microstructure described herein may comprise 2 or more quantum dot-containing layers and an appropriate number of spacer layers based on the number of quantum dot-containing layers. Further, the number of quantum dot containing layers in a given microstructure may range from 1 to “m,” where “m” is the number of quantum dots that may be used.
[0285] A defined intensity level may refer to a known amount of quantum dots in each quantum dot containing layer, resulting in a known amount of fluorescent intensity generated from the QD containing layer upon appropriate stimulation. Since each QD containing layer has a defined intensity level, each microstructure may possess a defined ratio of fluorescence intensities generated from the various QD-containing layers upon stimulation. This defined ratio is referred to herein as a barcode. Thus, each type of microstructure with the same QD layers possesses a similar barcode that may be distinguished from microstructures with different QD layers.
[0286] Thus, each quantum dot-containing layer may comprise a single type of quantum dot of a specific emission color and the layer is produced to possess a defined intensity level, based on the concentration of the QD in the layer. By varying the intensity levels of QDs (“n”) in different microstructures and using a variety of different quantum dots (“m”), the number of different unique barcodes (and thus the number of different unique microstructure populations that may be produced) is approximated by $n^m - 1$ unique codes. This may provide the ability to generate a large number of different populations of microstructures each with its own unique barcode.
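The barcode count described above can be illustrated with a minimal sketch. It assumes, as an interpretation not stated in the text, that the n intensity levels include a zero ("absent") level and that the single excluded code is the one in which every color is absent. The function names `barcode_count` and `enumerate_barcodes` are hypothetical.

```python
from itertools import product


def barcode_count(n_levels: int, m_colors: int) -> int:
    """Number of unique codes for n intensity levels and m QD colors: n^m - 1."""
    if n_levels < 1 or m_colors < 1:
        raise ValueError("n_levels and m_colors must be positive")
    return n_levels ** m_colors - 1


def enumerate_barcodes(n_levels: int, m_colors: int):
    """List every intensity tuple except the all-zero one.

    Assumption: level 0 means the color is absent, so the all-zero tuple
    carries no signal and is the one code subtracted in n^m - 1.
    """
    all_codes = product(range(n_levels), repeat=m_colors)
    return [code for code in all_codes if any(code)]


if __name__ == "__main__":
    print(barcode_count(10, 3))
    print(enumerate_barcodes(3, 2))
```

Under this reading, the enumeration and the closed-form count agree for any choice of n and m, which is a quick way to check a planned barcode panel.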
[0287] A set of QD-labeled probes may further generate a spectrally distinct barcode. For example, each probe within the set of QD-labeled probes may comprise a QD with a distinct excitation wavelength and the combination of the set may generate a distinct barcode. A set of spectrally distinct QD-labeled probes may be utilized to detect a regulatory element. As such, when detecting two or more regulatory elements, each regulatory element may be spectrally barcoded.
[0288] A quantum dot provided herein may include QDot 525, QDot 545, QDot 565, QDot 585, QDot 605, or QDot 655. A probe described herein may comprise a quantum dot. A quantum dot may comprise a quantum dot as described in Han et al., “Quantum-dot-tagged microbeads for multiplexed optical coding of biomolecules,” Nat. Biotechnol. 19:631-635 (2001); Gao X., “QD barcodes for biosensing and detection,” Conf Proc IEEE Eng Med Biol Soc 2009: 6372-6373 (2009); and Zrazhevskiy, et al., “Multicolor multicycle molecular profiling with quantum dots for single-cell analysis,” Nat Protoc 8:1852-1869 (2013).
[0289] A QD may further comprise a functional group or attachment moiety. One example of such a QD that has a functional group or attachment moiety is a QD with a carboxylic acid terminated surface, such as those commercially available through, for example, Quantum Dot, Inc., Hayward, CA.
I. Conjugating Moiety
[0290] The probe may include a conjugating moiety. The conjugating moiety may be attached at the 5’ terminus, the 3’ terminus, or at an internal site. The conjugating moiety may be a nucleotide analog (such as bromodeoxyuridine). The conjugating moiety may be a conjugating functional group. The conjugating functional group may be an azido group or an alkyne group. The probe may further be derivatized through a chemical reaction such as click chemistry. The click chemistry may be a copper(I)-catalyzed [3+2]-Huisgen 1,3-dipolar cycloaddition of alkynes and azides leading to 1,2,3-triazoles. The click chemistry may be a copper-free variant of the above reaction.
[0291] The conjugating moiety may comprise a hapten group. A hapten group may include digoxigenin, 2,4-dinitrophenyl, biotin, avidin, or may be selected from azoles, nitroaryl compounds, benzofurazans, triterpenes, ureas, thioureas, rotenones, oxazoles, thiazoles, coumarins, cyclolignans, heterobiaryl compounds, azoaryl compounds or benzodiazepines. A hapten group may include biotin.
[0292] The probe comprising the conjugating moiety may further be linked to a second probe (such as a nucleic acid probe or a polypeptide probe), a fluorescent moiety (such as a dye such as a quantum dot), a target nucleic acid, or a conjugating partner such as a polymer (such as PEG), a macromolecule (such as a carbohydrate, a lipid, a polypeptide), and the like.
J. Detection of a Target Nucleic Acid Sequence
[0293] The method may comprise an operation of providing one or more probes capable of binding to a target nucleic acid sequence, as described herein. The method may comprise an operation of binding the one or more probes to the target nucleic acid sequence, as described herein. The method may comprise an operation of detecting a signal associated with binding of the one or more probes to the target nucleic acid sequence, as described herein.
[0294] The target nucleic acid sequence may be detected in an intact cell. The target nucleic acid sequence may be detected in a fixed cell. The target nucleic acid sequence may be detected in a lysate or chromatin spread.
[0295] A probe may be used to detect a nucleic acid sequence in a sample. For example, a probe comprising a probe sequence capable of binding a nucleic acid sequence (such as a target nucleic acid sequence) and a detectable label (such as a detectable agent) may be used to detect the nucleic acid sequence. A method for detecting a nucleic acid sequence may comprise contacting a nucleic acid sequence with a probe comprising a probe sequence configured to bind at least a portion of the nucleic acid sequence and detecting the probe (such as detecting the detectable label of the probe). The detection of a nucleic acid sequence may comprise binding the probe to the nucleic acid sequence. For example, the detection of a nucleic acid sequence may comprise binding the probe sequence, such as the sequence of an oligonucleotide probe, to a target nucleic acid sequence. In some cases, the detection of a nucleic acid sequence may comprise hybridizing the probe sequence (such as the nucleic acid binding region) of a nucleic acid probe to a target nucleic acid sequence. The nucleic acid sequence may be a virus nucleic acid sequence. The nucleic acid sequence may be an agricultural viral nucleic acid sequence. The nucleic acid sequence may be a lentivirus nucleic acid sequence, an adenovirus nucleic acid sequence, an adeno-associated virus nucleic acid sequence, or a retrovirus nucleic acid sequence.
[0296] A nucleic acid sequence may be contacted with a plurality of probes. A nucleic acid sequence may be contacted with a number of probes ranging from about 1 to about 10^8 probes, or from about 2 to about 50 million probes. The probes of the plurality of probes may be the same. A plurality of probes may have sequences such that the probes are tiled across the nucleic acid sequence. Each probe can bind to a target nucleic acid sequence along the nucleic acid sequence. The probes of a plurality may be different. A first probe of the plurality of probes may be different than a second probe of the plurality of probes. The plurality of probes may bind to the nucleic acid sequence with from 0 to 10 nucleotides separating each probe.
[0297] A nucleic acid sequence may be washed after it has been contacted with a probe. Washing a nucleic acid sequence after it has been contacted with a probe may reduce background signal for detection of the detectable label of the probe.
[0298] A nucleic acid sequence (such as a target nucleic acid sequence) can be contacted by a plurality of probes. A nucleic acid sequence can be contacted with a plurality of types of probes. That is, a method of detection of a nucleic acid sequence (such as a target nucleic acid sequence) may comprise contacting the target nucleic acid sequence with a plurality of sets of probes (such as a plurality of types of probes). A first probe set (such as a first type of probe) may be different from a second probe set (such as a second type of probe) in that the first probe type comprises a first probe sequence which is different than the probe sequence of the second probe type. The probe sequence of a first type of probe may be the same as the probe sequence of a second type of probe. A first probe set may comprise a first detectable label and a first probe sequence and a second probe set may comprise a second detectable label and a second probe sequence, wherein the first and second probe sequences are the same and the first and second detectable labels are different. The first and second probe sequences may be different and the first and second detectable labels of a first and second probe set may be the same. The first and second probe sequences of a first and second probe set may be different and the first and second detectable labels of a first and second probe set may be different. A method of detecting a nucleic acid sequence may comprise contacting a nucleic acid sequence with 1 to 20 types of probes.
[0299] A first probe sequence may be configured to specifically recognize (such as to bind to or to hybridize with) a first nucleic acid sequence (such as a first target nucleic acid sequence). A second probe sequence may be configured to specifically recognize (such as to bind to or to hybridize with) a second nucleic acid sequence (such as a second target nucleic acid sequence).
[0300] A detectable label may be detected with a detector. A detector may detect the signal intensity of the detectable label. A detector may spatially distinguish between two detectable labels. A detector may also distinguish between a first and second detectable label based on the spectral pattern produced by the first and second detectable labels, wherein the first and second detectable label do not produce an identical spectral intensity pattern. For example, a detector may distinguish between a first and second detectable signal, wherein the wavelength of the signal produced by the first detectable label is not the same as the wavelength of the signal produced by the second detectable label. A detector may resolve (such as by spatially distinguishing or spectrally distinguishing) a first and second detectable label that are less than 1 kb apart to less than 100 kb apart on a chromosome. The detectable label of the probe may be detected optically. For example, a detectable label of a probe may be detected by light microscopy, fluorescence microscopy, or chromatography. Detection of the detectable label of a probe may comprise stimulating the probe or a portion thereof (such as the detectable label) with a source of radiation (such as a light source, such as a laser). Detection of the detectable label of a probe may also comprise an enzymatic reaction.
[0301] Detection of the target nucleic acid sequence may be within a period of not more than 12 hours to not more than 48 hours.
[0302] Determining the presence of a genetic modification in a cell using the Nano-FISH method described herein may be useful in assessing the phenotype of the cell resulting from the genetic modification. A method for assessing a phenotype of an intact genetically modified cell may comprise: a) providing the intact genetically modified cell comprising a target nucleic acid sequence less than 2.5 kilobases in length; b) contacting the intact genetically modified cell with a first plurality of probes, wherein each probe comprises a first detectable label and a probe sequence that binds to a portion of the target nucleic acid sequence; c) detecting a presence of the first detectable label in the intact cell, wherein the presence of the first detectable label indicates the presence of the target nucleic acid sequence; d) determining a phenotype of the intact genetically modified cell; and e) correlating the phenotype of the intact genetically modified cell with the presence of the target nucleic acid sequence. The method may further comprise determining a number or location of genetic modifications in the intact genetically modified cell. The method may further comprise f) selecting a first intact genetically modified cell comprising a phenotype of interest; g) determining a set of conditions used for a genetic modification of the first intact genetically modified cell; and h) preparing a second genetically modified cell using the set of conditions for genetic modification. The intact genetically modified cell may be a eukaryotic cell that was genetically modified. The intact genetically modified cell may be a bacterial cell that was genetically modified. The intact genetically modified cell may be a mammalian cell that was genetically modified. The intact genetically modified cell may be any cell as described herein that was genetically modified. The phenotype may be a product expressed as a result of the genetic modification of the cell. The phenotype may be an increased level or decreased level of the product expressed as a result of the genetic modification of the cell. The phenotype may be an increased quality of the product expressed as a result of the genetic modification of the cell. The expressed product may be protein, such as an enzyme. The expressed product may be a transgene protein, RNA, or a secondary product of the genetic modification. For example, if an enzyme is produced as a result of the genetic modification of the cell, a secondary product of the genetic modification is a product of the enzyme.
[0303] Determining the number of target nucleic acid sequences in a cell may be useful in determining the phenotype of the cell. Cells with a specific number of target nucleic acid sequences may be tested for increased cellular activity, decreased cellular activity, or toxicity. Increased cellular activity may be increased expression of a protein or a cellular product. Decreased cellular activity may be decreased expression of a protein or a cellular product. Toxicity may be a result of cellular activity that may be too high or too low, resulting in cell death. For example, contacting a sample of virally transduced cells with a probe configured to bind to a particular target viral nucleic acid sequence and then determining the number of viral integrants may be an expedient means of determining whether virus has successfully integrated in the cells of the sample in a way in which a desired therapeutic effect may result if given to a patient as a therapy.
[0304] Determining the presence, absence, identity, spatial position or sequence position of a target nucleic acid sequence in a sample may be useful in determining a condition of a patient. For example, contacting a sample of cells with a probe configured to bind to a particular target nucleic acid sequence and then determining the number of target nucleic acid sequences in the cell may be an expedient means of determining whether the number of target nucleic acid sequences may be affecting the cell phenotype or function. For example, contacting a patient sample with a probe configured to bind to a particular nucleic acid sequence may be an expedient means of determining whether the patient has the nucleic acid sequence. As another example, contacting a sample of virally transduced cells with a probe configured to bind to a particular target viral nucleic acid sequence may be an expedient means of determining whether virus has successfully integrated in the cells of the sample. Similarly, contacting a patient sample with a plurality of types of probes, each configured to bind to a different nucleic acid sequence, may be an expedient means of screening patients for various genetic or acquired conditions, such as inherited mutations.
K. Quantification of a Target Nucleic Acid Sequence in a Cell
[0305] A method of detecting or determining the presence of a nucleic acid sequence may comprise determining the number of probes associated with the nucleic acid sequence. A method of detecting or determining the presence of a nucleic acid sequence may comprise determining the number of probes hybridized to the nucleic acid sequence.
[0306] It may also be possible to determine the quantity of target nucleic acid sequences in this manner. If a viral nucleic acid sequence comprises the target nucleic acid sequence, the number of viral nucleic acid sequences may be quantified using the methods described herein. Quantification of the number of viral nucleic acid sequences in a sample (such as a cell comprising viral integrations) may be useful in determining the multiplicity of infection. This quantification may also be useful for methods of enriching heterogeneous populations of transduced cells to a more homogenous cell population or to a cell population comprising a greater percentage of cells comprising a specific number or a specific range of viral integrations. Quantification of target nucleic acid sequences in a sample using the methods, compositions, and systems described herein may be useful in determining the number of repeated sequences in a nucleic acid of a sample.
[0307] In some embodiments, this method can be used for quantifying populations of cells transduced to express chimeric antigen receptors (CARs) in order to determine the average number of viral insertions per cell or the distribution of viral insertions per cell within the cell populations.
[0308] For example, a Nano-FISH probe or a Nano-FISH probe set of this disclosure can be used to verify the number of viral insertions in T cells that have been engineered to express CARs, such as BCMA, CD19, CD22, WT1, L1CAM, MUC16, ROR1, or LeY. Thus, the Nano-FISH probe or Nano-FISH probe sets of the present disclosure can be used as a quality control step to verify that engineered CAR T cells have truly been transduced with a vector encoding for a given CAR, prior to administering the CAR T cells to a subject in need thereof.
[0309] In some embodiments, this method can be used for quantifying populations of CD34+ hematopoietic stem cells (HSCs) transduced to express a gene of interest for the purpose of gene therapy, in order to determine the average number of viral insertions per cell or the distribution of viral insertions per cell within the cell populations.
[0310] For example, a Nano-FISH probe or a Nano-FISH probe set of this disclosure can be used to verify the number of viral insertions in CD34+ cells that have been engineered with any vector, such as a lentivirus vector or an adeno-associated virus vector, to express any gene of interest. Thus, the Nano-FISH probe or Nano-FISH probe sets of the present disclosure can be used as a quality control step to verify that engineered CD34+ cells have truly been transduced with a vector encoding for a given gene, prior to administering the engineered CD34+ cells to a subject in need thereof. For example, in some embodiments a CD34+ cell from a human donor is transduced with the lentivirus vector encoding for any gene. A subset of the engineered CD34+ cells can be subject to viral Nano-FISH validation, wherein the CD34+ cells are hybridized to a Nano-FISH probe or Nano-FISH probe set of the present disclosure and imaged to detect and quantify spots in the cell nuclei corresponding to viral insertions. The engineered CD34+ cells can, thus, be verified for successful transduction of any gene. Furthermore, the engineered CD34+ cells can, thus, be characterized for the average number of insertions per cell and/or the distribution of viral insertions per cell. Viral Nano-FISH can provide these valuable metrics characterizing the heterogeneity and quality of the engineered CD34+ cells prior to administration to a subject in need thereof. The above described methods can be used to validate CD34+ cells engineered for use in gene therapies for any of the following: thalassemia, sickle cell disease, muscular dystrophy, or an immune disorder.
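The two metrics named above, the average number of insertions per cell and the distribution of insertions per cell, may be computed from per-nucleus spot counts along the following lines. This is a minimal sketch: the input format (one integer spot count per imaged cell) and the function name `summarize_integrants` are assumptions for illustration and are not part of the disclosure.

```python
from collections import Counter
from statistics import mean


def summarize_integrants(spots_per_cell):
    """Return (average spots per cell, fraction of cells at each spot count).

    spots_per_cell: list of non-negative integers, one per imaged nucleus
    (hypothetical input format).
    """
    if not spots_per_cell:
        raise ValueError("at least one imaged cell is required")
    counts = Counter(spots_per_cell)
    total = len(spots_per_cell)
    # Fraction of cells carrying 0, 1, 2, ... detected insertions.
    distribution = {k: counts[k] / total for k in sorted(counts)}
    return mean(spots_per_cell), distribution


if __name__ == "__main__":
    average, distribution = summarize_integrants([0, 1, 1, 2, 1, 0, 3])
    print(average, distribution)
```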
L. Enrichment and Optimization for the Number of Target Nucleic Acid Sequences in a Cell
[0311] The quantification of a target nucleic acid sequence, such as a viral nucleic acid sequence, may allow for the precise tuning of per-cell viral integrant number among a pool of cells transduced with a virus, such as a retrovirus.
[0312] Viral transduction of cells may be heterogeneous, producing cells with no viral integrant, a single copy of a viral integrant, or two or more copies of a viral integrant. Using Nano-FISH, a pool of cells with a consistent number of viral integrants may be produced, wherein cells comprising an undesirable number of viral integrants (e.g., too many or no viral integrants) may be reduced or eliminated. Viral integrants may be detected using the methods as described herein for Nano-FISH, also referred to herein as “viral Nano-FISH.” This may use microscopic imaging of fixed cells, and thus the imaged cells may not themselves be collected for subsequent use. However, pairing the Nano-FISH with a statistical approach may allow for (i) inferring the distribution of viral integrants in subpools of cells expanding in culture, and (ii) combining subpools to create a refined pool of cells with a uniform viral integrant number. The pool of cells with the uniform number of viral integrants may be a therapeutic used to treat a disease.
[0313] In some embodiments, this method may be used for enriching populations of cells transduced to express chimeric antigen receptors (CARs) in order to deliver a cell population with a uniform number of CAR integrations to a patient as a cancer therapy.
[0314] The enrichment process may comprise the following steps: a) quantify the number of viral integrants in a sample from a source pool of cells; b) subdivide the remaining cells of the source pool into K subpools, each with approximately N cells (the value of N may be chosen to ensure a high likelihood of subpools having zero or a greatly reduced fraction of cells with more than one viral integrant); c) allow each subpool to undergo multiple cell divisions to create cell clones with identical numbers of viral integrants per cell; d) perform Nano-FISH on a representative sample from each subpool to assess the number of viral integrants in each cell; e) based on the assessment of step d), estimate the distribution of viral integrants for each subpool and eliminate the subpools with an unfavorable distribution of viral integrants; and f) combine the remaining subpools to create a single enriched pool comprising cells with a more homogeneous number of viral integrants.
[0315] In some instances, the number of cell divisions and fraction of cells drawn for Nano-FISH analysis may be selected to ensure a high likelihood of detecting the presence of a multiple integration event given the random set of cells drawn. In some instances, any subpool may be eliminated if the proportion of cells with more than one viral integrant exceeds a specified threshold (which may be 0). Subpools may also be eliminated if the proportion of cells with no viral integrant is above a specified threshold. This secondary selection criterion may increase the relative abundance of the single viral integrant phenotype.
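The two elimination criteria described above may be sketched as a simple filter over the Nano-FISH spot counts sampled from a subpool. The function name `keep_subpool`, the input format, and the default threshold of 0.5 for cells with no integrant are hypothetical choices for illustration. Only the possibility of a zero threshold for multiple integrants comes from the text.

```python
def keep_subpool(spots_per_cell, max_multi_fraction=0.0, max_zero_fraction=0.5):
    """Decide whether a subpool passes both selection criteria.

    spots_per_cell: spot counts for the cells sampled from one subpool
    (hypothetical input format).
    max_multi_fraction: highest tolerated proportion of cells with more than
    one integrant (the text notes this threshold may be 0).
    max_zero_fraction: highest tolerated proportion of cells with no
    integrant (0.5 is an arbitrary illustrative default).
    """
    total = len(spots_per_cell)
    if total == 0:
        return False  # nothing sampled, so the subpool cannot be assessed
    multi_fraction = sum(1 for s in spots_per_cell if s > 1) / total
    zero_fraction = sum(1 for s in spots_per_cell if s == 0) / total
    # A subpool is eliminated when either proportion exceeds its threshold.
    return multi_fraction <= max_multi_fraction and zero_fraction <= max_zero_fraction


if __name__ == "__main__":
    subpools = {"A": [1, 1, 0, 1], "B": [1, 2, 1, 1], "C": [0, 0, 0, 1]}
    kept = [name for name, spots in subpools.items() if keep_subpool(spots)]
    print(kept)
```

Tightening `max_zero_fraction` corresponds to the secondary criterion, which trades the number of retained subpools for a higher share of single-integrant cells.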
[0316] The above method for enrichment may allow numerous parameters to be specified in order to achieve a given goal. These parameters may include the number of cells per subpool, the number of subpools, the number of cell divisions (i.e., time in culture), and fraction of cells withdrawn for Nano-FISH. In addition, the optimal protocol may depend on the underlying rate of multiple viral insertions and the probability of detecting a spot with Nano-FISH. Finally, the approach may depend on the tolerance for allowing cells with multiple or no viral integrants into the enriched pool.
[0317] In some cases, subpools may be enriched so that no cells comprise multiple integrants. To achieve this, for example, a statistical model may be used. For example, the probability of a given pool of N cells containing zero cells with multiple insertions is given by $(1 - p)^N$, where p is the underlying rate of multiple viral insertions per cell. If there are K subpools, then the total number of cells contained in subpools without any multiple insertions may be $M = KN(1 - p)^N$. Therefore, $K = M / [N(1 - p)^N]$ subpools may be needed to achieve a total of M progenitor cells without multiple integrations. The optimal value of N may be $1/p$.
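The statistical model above can be worked through numerically. The sketch below assumes the reading in which a pool of N cells is free of multiple insertions with probability (1 - p)^N, so that K = M / [N(1 - p)^N] subpools are needed for M clean progenitor cells, and N near 1/p maximizes the expected clean yield per subpool. The function names and the example values p = 0.01 and M = 1000 are illustrative assumptions.

```python
import math


def clean_subpool_probability(n_cells: int, p: float) -> float:
    """Probability that none of n_cells carries multiple insertions."""
    return (1.0 - p) ** n_cells


def subpools_needed(m_target: int, n_cells: int, p: float) -> int:
    """K = M / [N (1 - p)^N], rounded up to a whole number of subpools."""
    expected_clean_cells = n_cells * clean_subpool_probability(n_cells, p)
    return math.ceil(m_target / expected_clean_cells)


def optimal_subpool_size(p: float) -> int:
    """Subpool size N of about 1/p, which maximizes N (1 - p)^N for small p."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return max(1, round(1.0 / p))


if __name__ == "__main__":
    p = 0.01          # illustrative rate of multiple insertions per cell
    m_target = 1000   # illustrative number of clean progenitor cells wanted
    n_opt = optimal_subpool_size(p)
    print(n_opt, subpools_needed(m_target, n_opt, p))
```

The choice N = 1/p follows from setting the derivative of N(1 - p)^N with respect to N to zero, which gives N = -1/ln(1 - p), approximately 1/p when p is small.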
[0318] In addition to the parameters N and f, the target number of cell division cycles D and fraction of cells F to be withdrawn for Nano-FISH may need to be determined. For this determination, all cells may undergo the same number of cell divisions, resulting in $2^D$ copies of each. Thus, the probability of withdrawing k of the cells with 2 integrants in a fraction F of all cells in the subpool may be given by $P(k \mid N, D, F)$, a hypergeometric probability distribution with $2^D$ positive items in $N 2^D$ total items with $F N 2^D$ drawn from the total. If the likelihood of a Nano-FISH spot being detected is S, then the overall probability of detection may be given by
[Equation: Figure imgf000106_0001, image not reproduced in this text]
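Because the equation for the overall probability of detection is not reproduced here, the following sketch is only one plausible form and should not be read as the formula of the disclosure. It computes the hypergeometric term P(k | N, D, F) exactly as described (2^D positive cells among N·2^D total, with F·N·2^D drawn), and then assumes that a sampled two-integrant cell is recognized only when both of its spots are detected, each independently with probability S. The function names are hypothetical.

```python
from math import comb


def hypergeom_pmf(k: int, total: int, positives: int, drawn: int) -> float:
    """P(exactly k positives among `drawn` items taken without replacement)."""
    if k < 0 or k > positives or drawn - k < 0 or drawn - k > total - positives:
        return 0.0
    return comb(positives, k) * comb(total - positives, drawn - k) / comb(total, drawn)


def detection_probability(n_cells: int, divisions: int, fraction: float, s: float) -> float:
    """Assumed overall probability of detecting a multiple-integration clone.

    ASSUMPTION (not from the source): a sampled two-integrant cell is flagged
    with probability s**2, and detection means at least one flagged cell.
    """
    positives = 2 ** divisions            # descendants of one two-integrant cell
    total = n_cells * 2 ** divisions      # all cells in the expanded subpool
    drawn = round(fraction * total)       # cells withdrawn for Nano-FISH
    per_cell = s ** 2
    probability = 0.0
    for k in range(0, min(positives, drawn) + 1):
        missed_all = (1.0 - per_cell) ** k
        probability += hypergeom_pmf(k, total, positives, drawn) * (1.0 - missed_all)
    return probability


if __name__ == "__main__":
    print(detection_probability(n_cells=100, divisions=3, fraction=0.1, s=0.9))
```

Under these assumptions, increasing either the number of divisions D or the sampled fraction F raises the chance that at least one descendant of a multiple-integrant cell is drawn and recognized.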
[0319] Determining the presence, absence, identity, spatial position or sequence position of a target nucleic acid sequence in a sample may be useful in determining a condition of a patient. For example, contacting a patient sample with a probe configured to bind to a particular nucleic acid sequence may be an expedient means of determining whether the patient has the nucleic acid sequence. Similarly, contacting a patient sample with a plurality of types of probes, each configured to bind to a different nucleic acid sequence, may be an expedient means of screening patients for various genetic or acquired conditions, such as inherited mutations.
M. Determination of the Spatial Position of a Target Nucleic Acid Sequence
[0320] The method may comprise an operation of providing one or more probes capable of binding to a target nucleic acid sequence, as described herein. The method may comprise an operation of binding the one or more probes to the target nucleic acid sequence, as described herein. The method may comprise an operation of imaging a signal associated with binding of the one or more probes to the target nucleic acid sequence, as described herein.
[0321] A method of detecting or determining the presence of a nucleic acid sequence may comprise determining the spatial position of a nucleic acid sequence (such as a target nucleic acid sequence). Determining the spatial position of a nucleic acid sequence may comprise contacting a nucleic acid sequence with a probe, which may comprise a detectable label and a probe sequence configured to bind to the nucleic acid sequence, and detecting the detectable label of the probe.
[0322] The spatial position of the nucleic acid sequence may be determined relative to features of the sample (such as features of a cell), structures of the sample (such as structures or organelles of the cell), or other nucleic acids by using the same or a different imaging modality to detect the reference features, structures, or nucleic acids. For instance, the spatial position of a nucleic acid sequence in a cell relative to the nucleus of a cell may be determined by using a plurality of antibodies with a detectable label to counter-label structures of the cell, such as the cell membrane. A cell line expressing a detectable label (such as a fusion protein with a structural protein expressed by the cell) may be used to determine spatial position of a nucleic acid sequence in a cell. If the target nucleic acid sequence comprises a viral nucleic acid sequence, the spatial location of the viral nucleic acid sequence may be determined by the methods as described herein.
[0323] Data collected from detection of all or a portion of the detectable labels in a sample may be used to form one or more two-dimensional images or a three-dimensional rendering or to make calculations determining or estimating the spatial position of the target nucleic acid sequence.
[0324] A first probe comprising a first detectable label and a first probe sequence configured to bind to a nucleic acid sequence (such as a target nucleic acid sequence) may be used as a reference position for a second probe comprising a second detectable label and a second probe sequence configured to bind to a second nucleic acid sequence (such as a second target nucleic acid sequence). For example, a first probe specific to a first target nucleic acid sequence of a nucleic acid with a known or anchored position on the nucleic acid may be used as a reference to determine the spatial position of a second target nucleic acid sequence bound by a second probe prior to or during imaging.
N. Detection of the Sequence Position of a Target Nucleic Acid Sequence
[0325] The method may comprise an operation of providing a first set of one or more probes capable of binding to one or more reference nucleic acid sequences with known positions in the genome, as described herein. The method may comprise an operation of binding the first set of one or more probes to the one or more reference nucleic acid sequences, as described herein. The method may comprise an operation of providing a second set of one or more probes capable of binding to a target nucleic acid sequence, as described herein. The method may comprise an operation of binding the second set of one or more probes to the target nucleic acid sequence, as described herein. The method may comprise an operation of detecting a signal associated with binding of the first set of one or more probes to the one or more reference nucleic acid sequences and of the second set of one or more probes to the target nucleic acid sequence, as described herein. The method may comprise an operation of comparing the signals associated with binding of the first set of one or more probes to the reference nucleic acid sequences to the signal associated with binding of the second set of one or more probes to the target nucleic acid sequence.
[0326] A method of detecting or determining the presence of a nucleic acid sequence may comprise determining the sequence position of a nucleic acid sequence (such as a target nucleic acid sequence). For example, a probe with a probe sequence configured to recognize a first target sequence with a known position in the sequence of a nucleic acid may be used as reference for calculations or estimations of the sequence position of a second target nucleic acid sequence on the nucleic acid. For example, a first probe having a probe sequence configured to recognize a first target sequence with a first known position in the sequence of a nucleic acid and a second probe having a probe sequence configured to recognize a second target nucleic acid sequence with a second known position in the sequence of the nucleic acid may be used as reference points for a third probe configured to recognize a third target nucleic acid sequence with an unknown position in the nucleic acid. The relative sequence position of the third target nucleic acid sequence may be determined or estimated by comparing it to the positions of the first and second target nucleic acid sequences, as indicated by the signals from the first and second probes.
O. Detection of Target Nucleic Acid Sequences in a Sample Relative to a Control
[0327] The method may comprise an operation of providing one or more probes capable of binding to a target nucleic acid sequence in a reference sample and a target nucleic acid sequence in a sample under test, as described herein. The method may comprise an operation of binding the one or more probes to the target nucleic acid sequence in the reference sample and the target nucleic acid sequence in the sample under test, as described herein. The method may comprise an operation of detecting a signal associated with binding of the one or more probes to the target nucleic acid sequence in the reference sample and the target nucleic acid sequence in the sample under test, as described herein. The method may comprise an operation of comparing the signal associated with binding of the one or more probes to the target nucleic acid sequence in the reference sample to the signal associated with binding of the one or more probes to the target nucleic acid sequence in the sample under test, as described herein.
P. Correlation of the Detection of a Target Nucleic Acid Sequence in a Sample with a Target Protein Expression
[0328] The detection of a target nucleic acid sequence in a cell may be correlated with a target protein expression in the same cell. The method may comprise providing one or more probes capable of binding to a target nucleic acid sequence in a sample and a target nucleic acid sequence in a sample being tested, as described herein, and further comprise providing one or more detectable labels to detect the target protein expression. The presence, absence, or quantity of the detected target nucleic acid sequence may be correlated to the presence, absence, or quantity of the target protein expression. This information may be used to further investigate the relationship between the target nucleic acid sequence and the target protein, and/or how different treatments may perturb this correlation.
[0329] A viral nucleic acid sequence may be introduced into a cell by a viral vector, such as a virus particle, which may be called a virus or a virion. A virus particle may also be introduced to a cell by a bacteriophage. A virus particle may introduce a viral nucleic acid sequence into a cell through a series of steps that may include attachment (such as binding) of the virus particle to the cell membrane of the cell, internalization (such as penetration) of the virus particle into the cell (such as via formation of a vesicle around the virus particle), breakdown of the vesicle containing the virus particle (such as through uncoating, which may comprise breakdown of portions of the virus such as the viral coat), expression of the viral nucleic acid sequence or a portion thereof, processing and/or maturation of the viral nucleic acid sequence’s expression product, incorporation of the viral nucleic acid sequence or its expression product into a DNA sequence of the host cell, and/or replication of the viral nucleic acid sequence or a portion thereof. A viral nucleic acid sequence may be targeted to the nucleus of the cell after internalization.
[0330] Introduction of a viral nucleic acid sequence into a cell by a virus particle may lead to permanent integration of the viral nucleic acid sequence into a DNA sequence of the cell. For example, a viral nucleic acid sequence introduced into a cell by a retrovirus, such as a lentivirus or adeno-associated virus, may be integrated directly into the DNA sequence of a cell. Introduction of a viral nucleic acid sequence into a cell by a virus particle may not lead to integration into a DNA sequence of the cell.
[0331] A viral particle may be a double-stranded DNA (dsDNA) virus, a single-stranded DNA (ssDNA) virus, a double-stranded RNA (dsRNA) virus, a sense single-stranded RNA (+ssRNA) virus, or an antisense single-stranded RNA (-ssRNA) virus. Some viral particles may introduce a reverse transcriptase, integrase, and/or protease (such as a reverse transcriptase encoded by a pol gene sequence, which may be a portion of the viral nucleic acid sequence) into the infected cell. Examples of virus particles that introduce reverse transcriptase into an infected cell include single-stranded RNA reverse transcriptase (ssRNA-RT) viruses and double-stranded DNA reverse transcriptase (dsDNA-RT) viruses. Examples of ssRNA-RT viruses include metaviridae, pseudoviridae, and retroviridae. Examples of dsDNA-RT viruses include hepadnaviridae (e.g., Hepatitis B virus) and caulimoviridae. Additional examples of viruses include lentiviruses, adenoviruses, adeno-associated viruses, and retroviruses.
[0332] A viral nucleic acid sequence may be introduced into a cell by a non-viral vector, such as a plasmid. A plasmid may be a DNA polynucleotide encoding one or more genes. A plasmid may comprise a viral nucleic acid sequence. A viral nucleic acid sequence of a plasmid may encode a non-coding RNA (such as a transfer RNA, a ribosomal RNA, a microRNA, an siRNA, a snRNA, a shRNA, an exRNA, a piwi RNA, a snoRNA, a scaRNA, or a long non-coding RNA) or a coding RNA (such as a messenger RNA). A coding RNA may be modified (such as by splicing, poly-adenylation, or addition of a 5’ cap) or translated into a polypeptide sequence (such as a protein) after being transcribed from a DNA nucleic acid sequence of a plasmid.
Samples for Analysis of Protein (e.g., p53BP1) Accumulation in Response to a Cellular Perturbation and Nano-FISH Analysis
[0333] A sample described herein may be a fresh sample or a fixed sample. The sample may be a fresh sample. The sample may be a fixed sample. The sample may be a live sample. The sample may be subjected to a denaturing condition. The sample may be cryopreserved.
[0334] The sample may be a cell sample. The cell sample may be obtained from the cells or tissue of an animal. The animal cell may comprise a cell from an invertebrate, fish, amphibian, reptile, or mammal. The mammalian cell may be obtained from a primate, ape, equine, bovine, porcine, canine, feline, or rodent. The mammal may be a primate, ape, dog, cat, rabbit, ferret, or the like. The rodent may be a mouse, rat, hamster, gerbil, chinchilla, or guinea pig. The bird cell may be from a canary, parakeet, or parrot. The reptile cell may be from a turtle, lizard, or snake. The fish cell may be from a tropical fish. For example, the fish cell may be from a zebrafish (such as Danio rerio). The amphibian cell may be from a frog. An invertebrate cell may be from an insect, arthropod, marine invertebrate, or worm. The worm cell may be from a nematode (such as Caenorhabditis elegans). The arthropod cell may be from a tarantula or hermit crab.
[0335] The cell sample may be obtained from a mammalian cell. For example, the mammalian cell may be an epithelial cell, connective tissue cell, hormone secreting cell, a nerve cell, a skeletal muscle cell, a blood cell, an immune system cell, or a stem cell. A cell may be a fresh cell, live cell, fixed cell, intact cell, or cell lysate. Cell samples can be any primary cell, such as a hematopoietic stem cell (HSC) or naive or stimulated T cells (e.g., CD4+ T cells).
[0336] Cell samples may be cells derived from a cell line, such as an immortalized cell line. Exemplary cell lines include, but are not limited to, 293A cell line, 293FT cell line, 293F cell line, 293-H cell line, HEK 293 cell line, CHO DG44 cell line, CHO-S cell line, CHO-K1 cell line, Expi293F™ cell line, Flp-In™ T-REx™ 293 cell line, Flp-In™-293 cell line, Flp-In™-3T3 cell line, Flp-In™-BHK cell line, Flp-In™-CHO cell line, Flp-In™-CV-1 cell line, Flp-In™-Jurkat cell line, FreeStyle™ 293-F cell line, FreeStyle™ CHO-S cell line, GripTite™ 293 MSR cell line, GS-CHO cell line, HepaRG™ cell line, T-REx™ Jurkat cell line, Per.C6 cell line, T-REx™-293 cell line, T-REx™-CHO cell line, T-REx™-HeLa cell line, NC-HIMT cell line, PC12 cell line, A549 cells, and K562 cells.
[0337] The cell sample may be obtained from cells of a primate. The primate may be a human or a non-human primate. The cell sample may be obtained from a human. For example, the cell sample may comprise cells obtained from blood, urine, stool, saliva, lymph fluid, cerebrospinal fluid, synovial fluid, cystic fluid, ascites, pleural effusion, amniotic fluid, chorionic villus sample, vaginal fluid, interstitial fluid, buccal swab sample, sputum, bronchial lavage, Pap smear sample, or ocular fluid. The cell sample may comprise cells obtained from a blood sample, an aspirate sample, or a smear sample.
[0338] The cell sample may be a circulating tumor cell sample. A circulating tumor cell sample may comprise lymphoma cells, fetal cells, apoptotic cells, epithelial cells, endothelial cells, stem cells, progenitor cells, mesenchymal cells, osteoblast cells, osteocytes, hematopoietic stem cells (HSC) (e.g., a CD34+ HSC), foam cells, adipose cells, transcervical cells, circulating cardiocytes, circulating fibrocytes, circulating cancer stem cells, circulating myocytes, circulating cells from a kidney, circulating cells from a gastrointestinal tract, circulating cells from a lung, circulating cells from reproductive organs, circulating cells from a central nervous system, circulating hepatic cells, circulating cells from a spleen, circulating cells from a thymus, circulating cells from a thyroid, circulating cells from an endocrine gland, circulating cells from a parathyroid, circulating cells from a pituitary, circulating cells from an adrenal gland, circulating cells from islets of Langerhans, circulating cells from a pancreas, circulating cells from a hypothalamus, circulating cells from prostate tissues, circulating cells from breast tissues, circulating retinal cells, circulating ophthalmic cells, circulating auditory cells, circulating epidermal cells, circulating cells from the urinary tract, or combinations thereof.
[0339] The cell can be a T cell. For example, in some embodiments, the T cell can be an engineered T cell transduced to express a chimeric antigen receptor (CAR) or engineered T cell receptor (TCR). The CAR or TCR T cell can be engineered to bind to BCMA, CD19, CD22, WT1, L1CAM, MUC16, ROR1, or LeY.
[0340] A cell sample may be a peripheral blood mononuclear cell sample.
[0341] A cell sample may comprise cancerous cells. The cancerous cells may form a cancer which may be a solid tumor or a hematologic malignancy. The cancerous cell sample may comprise cells obtained from a solid tumor. The solid tumor may include a sarcoma or a carcinoma. Exemplary sarcoma cell samples may include, but are not limited to, cell samples obtained from alveolar rhabdomyosarcoma, alveolar soft part sarcoma, ameloblastoma, angiosarcoma, chondrosarcoma, chordoma, clear cell sarcoma of soft tissue, dedifferentiated liposarcoma, desmoid, desmoplastic small round cell tumor, embryonal rhabdomyosarcoma, epithelioid fibrosarcoma, epithelioid hemangioendothelioma, epithelioid sarcoma, esthesioneuroblastoma, Ewing sarcoma, extrarenal rhabdoid tumor, extraskeletal myxoid chondrosarcoma, extraskeletal osteosarcoma, fibrosarcoma, giant cell tumor, hemangiopericytoma, infantile fibrosarcoma, inflammatory myofibroblastic tumor, Kaposi sarcoma, leiomyosarcoma of bone, liposarcoma, liposarcoma of bone, malignant fibrous histiocytoma (MFH), malignant fibrous histiocytoma (MFH) of bone, malignant mesenchymoma, malignant peripheral nerve sheath tumor, mesenchymal chondrosarcoma, myxofibrosarcoma, myxoid liposarcoma, myxoinflammatory fibroblastic sarcoma, neoplasms with perivascular epithelioid cell differentiation, osteosarcoma, parosteal osteosarcoma, neoplasm with perivascular epithelioid cell differentiation, periosteal osteosarcoma, pleomorphic liposarcoma, pleomorphic rhabdomyosarcoma, PNET/extraskeletal Ewing tumor, rhabdomyosarcoma, round cell liposarcoma, small cell osteosarcoma, solitary fibrous tumor, synovial sarcoma, or telangiectatic osteosarcoma.
[0342] Exemplary carcinoma cell samples may include, but are not limited to, cell samples obtained from an anal cancer, appendix cancer, bile duct cancer (i.e., cholangiocarcinoma), bladder cancer, brain tumor, breast cancer, cervical cancer, colon cancer, cancer of Unknown Primary (CUP), esophageal cancer, eye cancer, fallopian tube cancer, gastroenterological cancer, kidney cancer, liver cancer, lung cancer, medulloblastoma, melanoma, oral cancer, ovarian cancer, pancreatic cancer, parathyroid disease, penile cancer, pituitary tumor, prostate cancer, rectal cancer, skin cancer, stomach cancer, testicular cancer, throat cancer, thyroid cancer, uterine cancer, vaginal cancer, or vulvar cancer.
[0343] The cancerous cell sample may comprise cells obtained from a hematologic malignancy. Hematologic malignancy may comprise a leukemia, a lymphoma, a myeloma, a non-Hodgkin’s lymphoma, or a Hodgkin’s lymphoma. The hematologic malignancy may be a T-cell based hematologic malignancy. The hematologic malignancy may be a B-cell based hematologic malignancy. Exemplary B-cell based hematologic malignancies may include, but are not limited to, chronic lymphocytic leukemia (CLL), small lymphocytic lymphoma (SLL), high risk CLL, a non-CLL/SLL lymphoma, prolymphocytic leukemia (PLL), follicular lymphoma (FL), diffuse large B-cell lymphoma (DLBCL), mantle cell lymphoma (MCL), Waldenstrom’s macroglobulinemia, multiple myeloma, extranodal marginal zone B cell lymphoma, nodal marginal zone B cell lymphoma, Burkitt’s lymphoma, non-Burkitt high grade B cell lymphoma, primary mediastinal B-cell lymphoma (PMBL), immunoblastic large cell lymphoma, precursor B-lymphoblastic lymphoma, B cell prolymphocytic leukemia, lymphoplasmacytic lymphoma, splenic marginal zone lymphoma, plasma cell myeloma, plasmacytoma, mediastinal (thymic) large B cell lymphoma, intravascular large B cell lymphoma, primary effusion lymphoma, or lymphomatoid granulomatosis. Exemplary T-cell based hematologic malignancies may include, but are not limited to, peripheral T-cell lymphoma not otherwise specified (PTCL-NOS), anaplastic large cell lymphoma, angioimmunoblastic lymphoma, cutaneous T-cell lymphoma, adult T-cell leukemia/lymphoma (ATLL), blastic NK-cell lymphoma, enteropathy-type T-cell lymphoma, hepatosplenic gamma-delta T-cell lymphoma, lymphoblastic lymphoma, nasal NK/T-cell lymphomas, or treatment-related T-cell lymphomas.
[0344] A cell sample described herein may comprise a tumor cell line sample. Exemplary tumor cell line samples may include, but are not limited to, cell samples from tumor cell lines such as 600MPE, AU565, BT-20, BT-474, BT-483, BT-549, Evsa-T, Hs578T, MCF-7, MDA-MB-231, SkBr3, T-47D, HeLa, DU145, PC3, LNCaP, A549, H1299, NCI-H460, A2780, SKOV-3/Luc, Neuro2a, RKO, RKO-AS45-1, HT-29, SW1417, SW948, DLD-1, SW480, Capan-1, MC/9, B72.3, B25.2, B6.2, B38.1, DMS 153, SU.86.86, SNU-182, SNU-423, SNU-449, SNU-475, SNU-387, Hs 817.T, LMH, LMH/2A, SNU-398, PLHC-1, HepG2/SF, OCI-Ly1, OCI-Ly2, OCI-Ly3, OCI-Ly4, OCI-Ly6, OCI-Ly7, OCI-Ly10, OCI-Ly18, OCI-Ly19, U2932, DB, HBL-1, RIVA, SUDHL2, TMD8, MEC1, MEC2, 8E5, CCRF-CEM, MOLT-3, TALL-104, AML-193, THP-1, BDCM, HL-60, Jurkat, RPMI 8226, MOLT-4, RS4, K-562, KASUMI-1, Daudi, GA-10, Raji, JeKo-1, NK-92, and Mino.
[0345] A cell sample may comprise cells obtained from a biopsy sample, necropsy sample, or autopsy sample.
[0346] The cell samples (such as a biopsy sample) may be obtained from an individual by any suitable means of obtaining the sample using well-known and routine clinical methods. Procedures for obtaining tissue samples from an individual are well known. For example, procedures for drawing and processing tissue sample such as from a needle aspiration biopsy are well-known and may be employed to obtain a sample for use in the methods provided. Typically, for collection of such a tissue sample, a thin hollow needle is inserted into a mass such as a tumor mass for sampling of cells that, after being stained, will be examined under a microscope.
[0347] A cell may be a live cell. A cell may be a eukaryotic cell. A cell may be a yeast cell. A cell may be a plant cell. A cell may be obtained from an agricultural plant.
High-throughput Assay for Analysis of Protein Markers of Cellular Perturbation and Nano-FISH
[0348] In some embodiments, the present disclosure provides methods of high-throughput assaying of target nucleic acid sequences in cells in a multi-well format. For example, the present disclosure provides methods for depositing cells in at least 24 wells, hybridizing oligonucleotide Nano-FISH probes with cells after denaturation, covering cells in each well with a glass coverslip, and imaging the cells with the microscopy techniques disclosed herein. As an example, PLL-coated 24-well glass-bottom plates can be used to hold 24 samples, wherein each sample contains a cell population. The cell population in each well can be the same or the cell population in each well can be different. Thus, at least 24 unique samples can be processed at the same time. Cells can be deposited into the 24-well plate, treated with fixative solution (e.g., 4% formaldehyde in 1X PBS or 3 parts methanol and 1 part glacial acetic acid), washed, and hybridized to oligonucleotide Nano-FISH probes. The 24-well plate can then be washed and cells can be mounted with glass coverslips containing an anti-fade solution (e.g., Prolong Gold) prior to imaging. In some embodiments, 1 to 10 plates can be simultaneously processed.
Optical Detection of Surrogate Protein Markers (e.g., p53BP1) and/or Nucleic Acid Sequences
[0349] Described herein are methods of detecting a protein, such as a surrogate protein marker (e.g., p53BP1) of a cellular response induced by a cellular perturbation (e.g., genome editing), and methods of detecting a nucleic acid sequence. The detection may encompass identification of the nucleic acid sequence, determining the presence or absence of the nucleic acid sequence, and/or determining the activity of the nucleic acid sequence. A method of detecting a nucleic acid sequence may include contacting a cell sample with a detection agent, binding the detection agent to the nucleic acid sequence, and analyzing a detection profile from the detection agent to determine the presence, absence, or activity of the nucleic acid sequence.
[0350] The method may involve utilizing one or more intrinsic properties associated with a detection agent to aid in detection of the nucleic acid sequence. The intrinsic properties may encompass the size of the detection agent, the intensity of the signal, and the location of the detection agent. The size of the detection agent, which may include the length of the probe and/or the size of the detectable moiety (such as the size of a fluorescent dye molecule), may modulate the specificity of interaction with a regulatory element. The intensity of the signal from the detection agent may correlate to the sensitivity of detection. For example, a detection agent with a molar extinction coefficient of about 0.5-5 × 10⁶ M⁻¹cm⁻¹ may have a higher intensity signal relative to a detection agent with a molar extinction coefficient outside of the 0.5-5 × 10⁶ M⁻¹cm⁻¹ range and may have lower attenuation due to scattering and absorption. Further, a detection agent with a longer excited state lifetime and a large Stokes shift (measured by the distance between the excitation and emission peaks) may further improve the sensitivity of detection. The location of the detection agent may, for example, provide the activity state of a nucleic acid sequence. A combination of intrinsic properties of the detection agent may be used to detect a regulatory element of interest.
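The two optical quantities named above can be made concrete with a short sketch. It computes the Stokes shift as the distance between excitation and emission peaks (in nanometers and in wavenumbers) and uses the Beer-Lambert relation A = ε·c·l to relate a molar extinction coefficient to absorbance. The function names and all numeric inputs are illustrative assumptions; they do not describe any particular detection agent of the disclosure.

```python
def stokes_shift_nm(excitation_peak_nm, emission_peak_nm):
    """Stokes shift as the distance between the two peaks, in nanometers."""
    return emission_peak_nm - excitation_peak_nm


def stokes_shift_wavenumber(excitation_peak_nm, emission_peak_nm):
    """Stokes shift in wavenumbers (cm^-1); 1e7 / nm converts nm to cm^-1."""
    return 1e7 / excitation_peak_nm - 1e7 / emission_peak_nm


def absorbance(extinction_coefficient, concentration_molar, path_length_cm=1.0):
    """Beer-Lambert absorbance for a molar extinction coefficient in M^-1 cm^-1."""
    return extinction_coefficient * concentration_molar * path_length_cm
```

For instance, a hypothetical agent with ε = 1 × 10⁶ M⁻¹cm⁻¹ at 100 nM over a 1 cm path gives an absorbance of about 0.1, ten times that of an agent with ε = 1 × 10⁵ M⁻¹cm⁻¹ under the same conditions.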
[0351] A detection agent may comprise a detectable moiety that is capable of generating a light, and a probe portion that is capable of hybridizing to a target site on a nucleic acid sequence. As described herein, a detection agent may include a DNA probe portion, an RNA probe portion, a polypeptide probe portion, or a combination thereof. A DNA or RNA probe portion may be between about 10 and about 100 nucleotides in length. A DNA or RNA probe portion may be a TALEN probe, ZFN probe, or a CRISPR probe. A DNA or RNA probe portion may be a padlock probe. A polypeptide probe may comprise a DNA-binding protein, an RNA-binding protein, a protein that is involved in or detects the transcription/translation process, a protein that may detect an open or relaxed portion of a chromatin, or a protein interacting partner of a product of a regulatory element (such as an antibody or binding fragment thereof). In some instances, a detection agent may comprise a DNA or RNA probe portion which may be between about 10 and about 100 nucleotides in length.
[0352] A set of detection agents may be used to detect a nucleic acid sequence. The set of detection agents may comprise about 2 to about 20, or more, detection agents used for detection of a nucleic acid sequence. A detection agent may comprise a polypeptide probe selected from a DNA-binding protein, an RNA-binding protein, a protein that is involved in or detects the transcription/translation process, a protein that may detect an open or relaxed portion of a chromatin, or a protein interacting partner of a product of a regulatory element (such as an antibody or binding fragment thereof).
[0353] A detectable moiety that is capable of generating a light may be directly conjugated or bound to a probe portion. A detectable moiety may be indirectly conjugated or bound to a probe portion by a conjugating moiety. As described herein, a detectable moiety may be a small molecule (such as a dye) which may be directly conjugated or bound to a probe portion. A detectable moiety may be a fluorescently labeled protein or molecule which may be attached to a conjugating moiety (such as a hapten group, an azido group, or an alkyne group) of a probe.
[0354] A profile or a detection profile or signature may include the signal intensity, signal location, and/or size of the signal of the detection agent. The profile or the detection profile may comprise about 100 image frames to about 50,000 frames, or more image frames. Analysis of the profile or the detection profile may determine the activity of the regulatory element. The degree of activation may also be determined from the analysis of the profile or detection profile. Analysis of the profile or the detection profile may further determine the optical isolation and localization of the detection agents, which may correlate to the localization of the nucleic acid sequence.
[0355] The method may comprise an operation of providing one or more probes capable of binding to a target nucleic acid sequence, as described herein. The method may comprise an operation of binding the one or more probes to the target nucleic acid sequence, as described herein. The method may comprise an operation of photobleaching the one or more probes at one or more wavelengths, as described herein. The method may comprise an operation of detecting a profile of optical emissions associated with the photobleaching, as described herein. The method may comprise an operation of analyzing the detection profile to determine the localization of the target nucleic acid sequence, as described herein.
[0356] The localization of a nucleic acid sequence may include contacting a nucleic acid sequence with a first set of detection agents, photobleaching the first set of detection agents for a first time point at a first wavelength to generate a second set of detection agents capable of generating a light at a second wavelength, detecting at least one burst generated by the second set of detection agents to generate a detection profile of the second set of detection agents, and analyzing the detection profile to determine the localization of the nucleic acid sequence.
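As an illustration of how a detected burst might be localized, the sketch below computes an intensity-weighted centroid of a small pixel patch, which yields a sub-pixel position estimate. The centroid estimator and the function name `weighted_centroid` are assumptions chosen for simplicity; the disclosure does not specify a localization algorithm, and practical pipelines often fit a point-spread-function model instead.

```python
def weighted_centroid(patch):
    """Intensity-weighted centroid (x, y) of a 2-D patch given as a list of rows.

    Coordinates are in pixel units, with x indexing columns and y indexing rows.
    Assumes background has already been subtracted so that values are >= 0.
    """
    total = sum(sum(row) for row in patch)
    if total <= 0:
        raise ValueError("patch has no signal")
    y = sum(r * sum(row) for r, row in enumerate(patch)) / total
    x = sum(c * value for row in patch for c, value in enumerate(row)) / total
    return x, y
```

For a patch whose signal is split evenly between two adjacent pixels, the estimate falls halfway between them, which is the sense in which the position is resolved below the pixel size.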
[0357] A detection agent may comprise a detectable moiety that is capable of generating a light, and a probe portion that is capable of hybridizing to a target site on a nucleic acid sequence. Each detection agent within the first set of detection agents may have the same or a different detectable moiety. Each detection agent within the first set of detection agents may have the same detectable moiety. A detectable moiety may comprise a small molecule (such as a fluorescent dye). A detectable moiety may comprise a fluorescently labeled polypeptide, a fluorescently labeled nucleic acid probe, and/or a fluorescently labeled polypeptide complex.
[0358] Upon photobleaching, a second set of detection agents may be generated from the first set of detection agents, in which the second set may include detection agents that are capable of generating a burst of light detectable at a second wavelength. For example, bleaching of the set of detection agents may lead to about 50% or more of the detection agents within the set entering an “OFF-state.” An “OFF-state” may be a dark state in which the detectable moiety crosses from the singlet excited electronic state (or ON-state) to the triplet electronic state (or OFF-state), in which detection of light (such as fluorescence) may be low (for instance, less than 10%, less than 5%, less than 1%, or less than 0.5% of light may be detected). The remainder of the detection agents that have not entered into the OFF-state may generate bursts of light, or cycle between a singlet excited electronic state (or ON-state) and a singlet ground electronic state. As such, bleaching of the set of detection agents may leave about 40% or fewer of the detection agents within the set capable of generating bursts of light. The bursts of light may be detected stochastically, at a single burst level in which each burst of light correlates to a single detection agent.
[0359] A single wavelength may be used for photobleaching a set of detection agents. At least two wavelengths may be used for photobleaching a set of detection agents. A wavelength at 491 nm may be used. A wavelength at 405 nm may be used in combination with the wavelength at 491 nm. The two wavelengths may be applied simultaneously to photobleach a set of detection agents. The two wavelengths may be applied sequentially to photobleach a set of detection agents. The time for photobleaching a set of detection agents may be from about 10 seconds to about 4 hours, or more. The concentration of the detection agents may be from about 5 nM to about 1 mM.
[0360] The bursts of light from the set of detection agents may generate a detection profile. The detection profile may comprise about 100 image frames to about 50,000 frames, or more image frames. The detection profile may also include the signal intensity, signal location, or size of the signal. Analysis of the detection profile may determine the optical isolation and localization of the detection agents, which may correlate to the localization of the nucleic acid sequence.
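A detection profile of this kind can be reduced to discrete burst events. The sketch below assumes a per-frame intensity trace for a single location and treats each run of consecutive frames above a threshold as one burst. The thresholding rule and the name `find_bursts` are illustrative assumptions rather than the method of the disclosure.

```python
def find_bursts(trace, threshold):
    """Return (first_frame, last_frame) index pairs for each burst in a trace.

    A burst is a maximal run of consecutive frames whose intensity exceeds
    the (hypothetical, user-chosen) threshold.
    """
    bursts = []
    start = None
    for frame, intensity in enumerate(trace):
        if intensity > threshold and start is None:
            start = frame                      # burst begins
        elif intensity <= threshold and start is not None:
            bursts.append((start, frame - 1))  # burst ended on previous frame
            start = None
    if start is not None:                      # trace ended during a burst
        bursts.append((start, len(trace) - 1))
    return bursts
```

Counting the returned pairs gives the number of bursts, and the frame indices can be used to select the images passed to a localization step.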
[0361] The detection profile may comprise a chromatic aberration correction. The detection profile may comprise less than 5% or 0% chromatic aberration.
[0362] More than one nucleic acid sequence may be detected at the same time. Sometimes, at least 2 to at least 20 or more nucleic acid sequences may be detected at the same time. Each of the nucleic acid sequences may be detected by a set of detection agents. The detectable moiety between the different sets of detection agents may be the same. For example, two different sets of detection agents may be used to detect two different nucleic acid sequences and the detectable moieties from the two sets of detection agents may be the same. As such, at least 2 to at least 20 or more nucleic acid sequences may be detected at the same time at the same wavelength. The detectable moiety between the different sets of detection agents may also be different. For example, two different sets of detection agents may be used to detect two different nucleic acid sequences and the detectable moiety from one set of detection agents may be detected at a different wavelength from the detectable moiety of the second set of detection agents. As such, at least 2 to at least 20 or more nucleic acid sequences may be detected at the same time, in which each of the nucleic acid sequences may be detected at a different wavelength. The nucleic acid sequence may comprise DNA, RNA, polypeptides, or a combination thereof.
[0363] The activity of a target nucleic acid sequence may be measured utilizing the methods described herein. The methods may include detection of a nucleic acid sequence and one or more products of the nucleic acid sequence. One or more products of the nucleic acid sequence may also include intermediate products or elements. The method may comprise contacting a cell sample with a first set and a second set of detection agents, in which the first set of detection agents interact with a target nucleic acid sequence within the cell and the second set of detection agents interact with at least one product of the target nucleic acid sequence, and analyzing a detection profile from the first set and the second set of detection agents, in which the presence or the absence of the at least one product indicates the activity of the target nucleic acid sequence.
[0364] As described herein, a detection agent may comprise a detectable moiety that is capable of generating a light, and a probe portion that is capable of hybridizing to a target site on a nucleic acid sequence. Each detection agent within the first set of detection agents may have the same or a different detectable moiety. Each detection agent within the first set of detection agents may have the same detectable moiety. A detectable moiety may comprise a small molecule (such as a fluorescent dye). A detectable moiety may comprise a fluorescently labeled polypeptide, a fluorescently labeled nucleic acid probe, and/or a fluorescently labeled polypeptide complex.
[0365] The method may also allow photobleaching of the first set and the second set of detection agents, thereby generating a subset of detection agents capable of generating a burst of light. A detection profile may be generated from the detection of a set of light bursts, in which the presence or the absence of the at least one product may indicate the activity of the target nucleic acid sequence.
[0366] The nucleic acid sequence may comprise DNA, RNA, polypeptides, or a combination thereof. The nucleic acid sequence may be DNA. The nucleic acid sequence may be RNA. The nucleic acid sequence may be an enhancer RNA (eRNA). The presence of an eRNA may correlate with target gene transcription that is downstream of the eRNA. The nucleic acid sequence may be a DNase I hypersensitive site (DHS). The DHS may be an activated DHS. The pattern of the DHS on a chromatin may correlate to the activity of the chromatin. The nucleic acid sequence may be a polypeptide, such as a transcription factor, a DNA or RNA-binding protein or binding fragment thereof, or a polypeptide that is involved in chemical modification. The nucleic acid sequence may be chromatin.
Image Analysis of Protein Markers (e.g., p53BP1) of Cellular Perturbation and Nano-FISH
[0367] The imaging and image analysis techniques disclosed below can be used to analyze protein markers (e.g., p53BP1) of cellular perturbation and/or Nano-FISH.
A. Epifluorescence Imaging
[0368] One or more far-field or near-field fluorescence techniques may be utilized for the detection, localization, activity determination, and mapping of one or more protein agglomerations or nucleic acid sequences described herein. A microscopy method may be an air or an oil immersion microscopy method used in a conventional microscope, a holographic or tomographic imaging microscope, or an imaging flow cytometer instrument. In such a method, imaging flow cytometers such as the ImageStream (EMD Millipore), conventional microscopes, or commercial high-content imagers (such as the Operetta (Perkin Elmer), IN Cell (GE), etc.) deploying wide-field and/or confocal imaging modes may achieve sub-cellular resolution to detect signals of interest. For example, DAPI (4',6-diamidino-2-phenylindole) stain may be used to identify cell nuclei and another stain may be used to identify cells containing a nuclease protein.
B. Super-resolution Imaging
[0369] A microscopy method may utilize a super-resolution microscopy, which allows images to be taken with a higher resolution than the diffraction limit. A super-resolution microscopy method may utilize a deterministic super-resolution microscopy method, which utilizes a fluorophore’s nonlinear response to excitation to enhance resolution. Exemplary deterministic super-resolution methods may include stimulated emission depletion (STED), ground state depletion (GSD), reversible saturable optical linear fluorescence transitions (RESOLFT), and/or saturated structured illumination microscopy (SSIM). A super-resolution microscopy method may also include a stochastic super-resolution microscopy method, which utilizes a complex temporal behavior of a fluorophore to enhance resolution. Exemplary stochastic super-resolution methods may include super-resolution optical fluctuation imaging (SOFI) and all single-molecule localization methods (SMLM), such as spectral precision determination microscopy (SPDM), SPDMphymod, photo-activated localization microscopy (PALM), fluorescence photo-activated localization microscopy (FPALM), selective plane illumination microscopy (SPIM), stochastic optical reconstruction microscopy (STORM), and dSTORM.
[0370] A microscopy method may be a single-molecule localization method (SMLM). A microscopy method may be a spectral precision determination microscopy (SPDM) method. A SPDM method may rely on stochastic bursts or blinking of fluorophores and subsequent temporal integration of signals to achieve lateral resolution at, for example, between about 10 nm and about 100 nm.
[0371] A microscopy method may be a spatially modulated illumination (SMI) method. A SMI method may utilize phased lasers and interference patterns to illuminate specimens and increase resolution by measuring the signal in fringes of the resulting Moiré patterns.
[0372] A microscopy method may be a synthetic aperture optics (SAO) method. A SAO method may utilize a low magnification, low numerical aperture (NA) lens to achieve large field of view (FOV) and depth of field, without sacrificing spatial resolution. For example, an SAO method may comprise illuminating the detection agent-labeled target (such as a target protein agglomeration or nucleic acid sequence) with a predetermined number (N) of selective excitation patterns, where the number (N) of selective excitation patterns is determined based upon the detection agent’s physical characteristics corresponding to spatial frequency content (such as the size, shape, and/or spacing of the detection agents on the imaging target) from the illuminated target, optically imaging the illuminated target at a resolution insufficient to resolve the objects on the target, and processing optical images of the illuminated target using information on the selective excitation patterns to obtain a final image of the illuminated target at a resolution sufficient to resolve the objects on the target. The number (N) of selective excitation patterns may correspond to the number of k-space sampling points in a k-space sampling space in a frequency domain, with the extent of the k-space sampling space being substantially proportional to an inverse of a minimum distance (Δx) between the objects that is to be resolved by SAO, and with the inverse of the k-space sampling interval between the k-space sampling points being less than a width (w) of a detected area captured by a pixel of a system for said optical imaging. The number (N) may include a function of various parameters of the imaging system (such as a magnification of the objective lens, numerical aperture of the objective lens, wavelength of the light emitted from the imaging target, and/or effective pixel size of the pixel sensitive area of the image detector, etc.).
[0373] A SAO method may analyze a set of detection agent profiles from at least 100, at least 200, at least 250, at least 500, at least 1000, or more cells imaged simultaneously within one field of view utilizing an imaging instrument. The one field of view may be a single wide field of view (FOV) allowing image capture of at least 50, at least 100, at least 200, at least 250, at least 500, at least 1000, or more cells. The single wide field of view may be about 0.70 mm by about 0.70 mm field of view. The SAO imaging instrument may enable a resolution of about 0.25 µm with a 20X/0.45NA lens. The SAO imaging instrument may enable a depth of field of about 2.72 µm with a 20X/0.45NA lens. The imaging instrument may enable a working distance of about 7 mm with a 20X/0.45NA lens. The imaging instrument may enable a z-stack of 1 with a 20X/0.45NA lens. The SAO method may further integrate and interpolate 3-dimensional images from 2-dimensional images. The SAO method may enable the acquisition of cell images at high spatial resolution and FOV. For example, for a given cell type, the SAO method may provide a FOV that is at least about 1.5x, at least about 2x, at least about 3x, at least about 4x, at least about 5x, at least about 6x, at least about 7x, at least about 8x, at least about 9x, at least about 10x, at least about 15x, at least about 20x, or more as compared to a FOV provided by a method of microscope imaging using a 40x or 60x objective. For example, the SAO method may provide a FOV corresponding to a 20x microscope lens with a spatial resolution corresponding to a 100x microscope lens.
[0374] The SAO imaging instrument may be, for example, an SAO instrument as described in U.S. Patent Publication No. 2011/0228073 (Lee et al.). The SAO imaging instrument may be, for example, a StellarVision™ imaging platform supplied by Optical Biosystems, Inc. (Santa Clara, CA).
Analysis of Fluorescence Images
[0375] Fluorescence images may be processed by a method for analysis of, e.g., cell nuclei, target protein agglomerations (e.g., p53BP1), diffused localization of target proteins, and/or FISH signals. The method may comprise obtaining a fluorescence image of one or more probes bound to one or more target proteins or nucleic acid sequences, as described herein. The method may comprise deconvolving the image one or more times, as described herein. The method may comprise generating a region of interest (ROI) from the deconvolved image, as described herein. The method may comprise analyzing the ROI to determine the locations of all target proteins or nucleic acid sequences, as described herein.
[0376] Images obtained using the systems and methods described herein may be subjected to an image analysis method. The images may be obtained using the epifluorescence imaging systems and methods described herein. The image may be obtained using the super-resolution imaging systems and methods described herein.
[0377] The image analysis method may allow a quantitative morphometric analysis to be conducted on regions of interest (ROIs) within the images. The image analysis method may be implemented using Matlab, Octave, Python, Java, Perl, Visual Studio, C, or ImageJ. The image analysis method may be adapted from methods for processing fluorescence microscopy images of cells for segmentation of cell nuclei, protein agglomerations, Nano-FISH signals, and/or nuclease localization. The image analysis method may be fully automated and/or tunable by the user. The image analysis method may be configurable to identify p53BP1 foci regardless of the shapes of the foci. The image analysis method may be configurable to process two-dimensional and/or three-dimensional images. The image analysis method may allow high-throughput estimation of cell count and boundaries in cell populations, which may be obtained with a speed-up of at least about 2 times, at least about 5 times, at least about 10 times, at least about 15 times, at least about 20 times, at least about 25 times, at least about 30 times, at least about 35 times, at least about 40 times, at least about 45 times, at least about 50 times, at least about 100 times, or more, as compared to manual identification and counting of cell populations.
[0378] The image analysis method may comprise a deconvolution of the image. The deconvolution process may improve the contrast and resolution of cell images for further analysis. The image analysis method may comprise an iterative deconvolution of the image. The image analysis method may comprise 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 iterations of deconvolving the image. The image analysis method may comprise more than 1, more than 2, more than 3, more than 4, more than 5, more than 6, more than 7, more than 8, more than 9, or more than 10 iterations of deconvolving the image. The deconvolution procedure may remove or reduce out-of-focus blur or other sources of noise in the epifluorescence images or super-resolution images, thereby enhancing the signal-to-noise ratio (SNR) within ROIs.
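By way of non-limiting illustration, an iterative deconvolution of the kind described above may be sketched as follows. The disclosure does not name a particular deconvolution algorithm; the Richardson-Lucy update used here, the one-dimensional signal, the three-tap point spread function, and the function names `convolve_same` and `richardson_lucy` are all assumptions chosen only to keep the sketch short and free of external libraries.

```python
def convolve_same(signal, kernel):
    """Convolve a 1D signal with an odd-length kernel (zero padding, same length)."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, weight in enumerate(kernel):
            idx = i + half - j
            if 0 <= idx < len(signal):
                acc += weight * signal[idx]
        out.append(acc)
    return out


def richardson_lucy(observed, psf, iterations=10):
    """Assumed stand-in for the iterative deconvolution: Richardson-Lucy updates.

    observed   -- blurred, non-negative intensities
    psf        -- point spread function, odd length, summing to 1
    iterations -- number of deconvolution iterations (e.g., 1 to 10)
    """
    psf_flipped = psf[::-1]
    # Start from a flat, strictly positive estimate with the same total intensity.
    flat = max(sum(observed) / len(observed), 1e-12)
    estimate = [flat] * len(observed)
    for _ in range(iterations):
        blurred = convolve_same(estimate, psf)
        ratio = [o / max(b, 1e-12) for o, b in zip(observed, blurred)]
        correction = convolve_same(ratio, psf_flipped)
        estimate = [e * c for e, c in zip(estimate, correction)]
    return estimate


# Hypothetical example: a single bright focus blurred by a 3-tap PSF.
psf = [0.25, 0.5, 0.25]
truth = [0, 0, 0, 0, 10, 0, 0, 0, 0]
observed = convolve_same(truth, psf)
restored = richardson_lucy(observed, psf, iterations=10)
```

In this sketch each iteration concentrates intensity back toward the focus while preserving total intensity, which is one way the contrast within an ROI may be improved before segmentation.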
[0379] The image analysis method may further comprise an identification of the ROIs (e.g., candidate cells). The ROIs may be identified using an automated detection method. The ROIs may be identified by processing the raw or deconvolved or reconstructed or pre-processed images by applying a segmentation algorithm. This may allow the rapid delineation of ROIs within the epifluorescence or super-resolution images, thereby allowing scalability of processing images. The segmentation of ROIs may comprise planarization of three-dimensional images (e.g., generated by z-stacking to obtain three-dimensional cell volumes) by utilizing a maximum intensity projection image to generate a two-dimensional ROI mask. For rapid segmentation, the two-dimensional ROI mask may act as a template for an initial three-dimensional mask. For instance, the initial three-dimensional mask may be generated by projecting the two-dimensional ROI mask into a third spatial dimension. The projection may be a weighted projection. The initial three-dimensional mask may be further refined to obtain a refined three-dimensional ROI mask. Refinement of the initial three-dimensional mask may be achieved utilizing adaptive thresholding and/or region growing methods. Refinement of the initial three-dimensional mask may be achieved by iteratively applying adaptive thresholding and/or region growing methods. The iterative procedure may result in a final three-dimensional ROI mask. The final three-dimensional ROI mask may comprise information regarding the locations of all fluorescently-labeled proteins or FISH-labeled nucleic acid sequences within each cell in a sample.
[0380] The segmentation may detect ROIs using two-dimensional or three-dimensional computer vision methods such as edge detection and morphology. The ROIs may include cell nuclei, protein (e.g., p53BP1) foci, FISH foci, nuclease localization, or a combination thereof within each cell in a cell population within a field of view (FOV).
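By way of non-limiting illustration, the planarize-then-project segmentation of paragraph [0379] may be sketched as follows. The global threshold, the refinement rule (keeping a voxel only if it reaches a fraction of the projected maximum at the same lateral position, used here as a simple stand-in for adaptive thresholding), and all function and parameter names are assumptions; an unweighted projection is used, and region growing is not shown.

```python
def max_intensity_projection(volume):
    """Collapse volume[z][y][x] to a 2D image of per-pixel maxima along z."""
    depth, rows, cols = len(volume), len(volume[0]), len(volume[0][0])
    return [[max(volume[z][y][x] for z in range(depth)) for x in range(cols)]
            for y in range(rows)]


def threshold_mask(image, threshold):
    """Two-dimensional ROI mask: True where the projection exceeds the threshold."""
    return [[value > threshold for value in row] for row in image]


def project_mask(mask_2d, depth):
    """Initial 3D mask: copy the 2D mask into every z-slice (unweighted projection)."""
    return [[row[:] for row in mask_2d] for _ in range(depth)]


def refine_mask(volume, mask_3d, projection, fraction=0.5):
    """Assumed refinement: keep masked voxels at or above fraction * local maximum."""
    return [[[mask_3d[z][y][x] and volume[z][y][x] >= fraction * projection[y][x]
              for x in range(len(projection[0]))]
             for y in range(len(projection))]
            for z in range(len(volume))]


def segment(volume, threshold, fraction=0.5):
    projection = max_intensity_projection(volume)
    mask_2d = threshold_mask(projection, threshold)
    initial = project_mask(mask_2d, len(volume))
    return refine_mask(volume, initial, projection, fraction)


# Hypothetical 3-slice stack with one focus that is brightest in the middle slice.
volume = [
    [[1, 1, 1], [1, 2, 1], [1, 1, 1]],
    [[1, 1, 1], [1, 10, 1], [1, 1, 1]],
    [[1, 1, 1], [1, 6, 1], [1, 1, 1]],
]
mask = segment(volume, threshold=5)
```

Here the two-dimensional mask selects only the central pixel, and the refinement step removes the dim first slice from the three-dimensional mask while retaining the two brighter slices.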
[0381] The image analysis method may further comprise feature extraction/computation from the segmented ROIs (e.g., detected candidate cells). Such sets of features may be selected to enable high performance (e.g., accuracy, throughput, sensitivity, specificity, etc.) of identifying/counting ROIs. Morphological features/parameters may be extracted from the segmented ROIs, such as count, spatial location, size (area/volume), shape (circularity/sphericity, eccentricity, irregularity (concavity/convexity)), diameter, perimeter/surface area, etc. In addition, other image parameters may also be extracted from the segmented ROIs, such as quantitative measures of image texture that may be pixel-based or region-based over a tunable length scale (e.g., nuclear diameter, nuclear area, nuclear volume, perimeter, surface area, DNA content, DNA texture measures).
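By way of non-limiting illustration, a few of the morphological features listed above may be computed from a two-dimensional binary ROI mask as sketched below. The function name `roi_features`, the pixel-edge perimeter estimate, and the circularity definition 4π·area/perimeter² are assumptions; a pixel-edge perimeter overestimates the boundary of rounded objects, so other estimators may be preferred in practice.

```python
import math


def roi_features(mask):
    """Return area, perimeter, centroid, and circularity for a binary 2D mask.

    mask -- list of rows; truthy entries belong to the ROI.
    """
    rows, cols = len(mask), len(mask[0])
    pixels = [(y, x) for y in range(rows) for x in range(cols) if mask[y][x]]
    area = len(pixels)
    if area == 0:
        return {"area": 0, "perimeter": 0, "centroid": None, "circularity": 0.0}

    # Perimeter: count pixel edges that face background or the image border.
    perimeter = 0
    for y, x in pixels:
        for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            ny, nx = y + dy, x + dx
            inside = 0 <= ny < rows and 0 <= nx < cols and mask[ny][nx]
            if not inside:
                perimeter += 1

    centroid = (sum(y for y, _ in pixels) / area, sum(x for _, x in pixels) / area)
    circularity = 4 * math.pi * area / perimeter ** 2
    return {"area": area, "perimeter": perimeter,
            "centroid": centroid, "circularity": circularity}


# Hypothetical ROI: a 2 x 2 pixel square inside a 4 x 4 image.
square = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
features = roi_features(square)
```

For the square above the sketch yields an area of 4 pixels, a perimeter of 8 pixel edges, and a centroid at (1.5, 1.5).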
[0382] In the case of ROIs that include protein foci, extracted features may include number of protein marker foci, size of protein marker foci, shape of protein marker foci, amount of protein marker per cell, and spatial location and localization pattern of protein marker foci. In the case of ROIs that include nuclease localization, extracted features may include number of nucleases per cell, amount of nuclease per cell, nuclease localization or texture, number of cell engineering tool foci, size of cell engineering tool foci, shape of cell engineering tool foci, amount of cell engineering tool foci per cell, and spatial location and localization pattern of cell engineering tool foci. In addition, in the case of ROIs that include Nano-FISH foci, additional features may be extracted, such as number, size, shape, amount, spatial location and localization pattern of Nano-FISH foci.
[0383] After the image analysis method has analyzed the cell nuclei, target protein agglomerations (e.g., p53BP1), diffused localization of target proteins, and/or FISH signals, further informatics and analysis may be performed based on the image analysis results. For example, specificity analysis may be performed by analyzing locations of co-localization between Nano-FISH-labeled genomic loci and p53BP1. Cell images with high co-localization and similar counts between Nano-FISH-labeled genomic loci and p53BP1 may indicate samples with high potency and specificity of nuclease activity (e.g., with minimal off-target effects), while cell images without co-localization between immunoNanoFISH and p53BP1 may indicate samples with issues such as decreased potency of nuclease activity, decreased specificity of nuclease activity (e.g., with some off-target effects), or that an editing event was not detected by the assay.
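By way of non-limiting illustration, the co-localization analysis described above may be sketched as a nearest-neighbor distance test between detected focus centroids. The distance cutoff `max_distance`, the coordinate units, and the function name `classify_foci` are assumptions; the disclosure does not specify how co-localization is scored.

```python
import math


def classify_foci(protein_foci, fish_spots, max_distance):
    """Split protein foci into co-localized and non-co-localized groups.

    protein_foci -- (x, y, z) centroids of, e.g., p53BP1 foci in one cell
    fish_spots   -- (x, y, z) centroids of Nano-FISH-labeled target loci
    max_distance -- assumed cutoff, in the same units as the coordinates
    """
    colocalized, not_colocalized = [], []
    for focus in protein_foci:
        near_target = any(math.dist(focus, spot) <= max_distance
                          for spot in fish_spots)
        (colocalized if near_target else not_colocalized).append(focus)
    return colocalized, not_colocalized


# Hypothetical cell: two target loci and three protein foci.
fish_spots = [(10.0, 10.0, 5.0), (30.0, 12.0, 6.0)]
protein_foci = [(10.5, 10.0, 5.0), (30.0, 12.5, 6.0), (50.0, 40.0, 5.0)]
on_target, off_target = classify_foci(protein_foci, fish_spots, max_distance=1.0)
```

In this sketch two foci lie within the cutoff of a labeled locus and one does not; a cell with similar counts of loci and co-localized foci and few remaining foci would be consistent with the high-specificity case described above.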
[0384] The image analysis method may analyze acquired image data comprising a cell population to generate an output of estimating a count and/or boundaries (e.g., segmented ROIs) of the cell population. For example, the image analysis method may apply a prediction algorithm (e.g., a predictive analytics algorithm) to the acquired data to generate output of estimating a count and/or boundaries (e.g., segmented ROIs) of the cell population. The prediction algorithm may comprise an artificial intelligence based predictor, such as a machine learning based predictor, configured to process the acquired image data comprising a cell population to generate the output of estimating a count and/or boundaries (e.g., segmented ROIs) of the cell population. The machine learning predictor may be trained using datasets from one or more sets of images of known cell populations as inputs and known counts and/or boundaries (e.g., segmented ROIs) of the cell populations as outputs to the machine learning predictor.
[0385] The machine learning predictor may comprise one or more machine learning algorithms. Examples of machine learning algorithms may include a support vector machine (SVM), a naive Bayes classification, a random forest, a neural network, deep learning, or other supervised learning algorithm or unsupervised learning algorithm for classification and regression. The machine learning predictor may be trained using one or more training datasets corresponding to image data comprising cell populations.
[0386] Training datasets may be generated from, for example, one or more sets of image data having common characteristics (features) and outcomes (labels). Training datasets may comprise a set of features and labels corresponding to the features. Features may comprise characteristics such as, for example, certain ranges or categories of cell measurements, such as morphological features/parameters (count, size, diameter, area, volume, perimeter length, circularity, irregularity, eccentricity, etc.), other image parameters (contrast, correlation, entropy, energy, and homogeneity/uniformity, etc.), nuclear size (diameter, area, or volume), perimeter or surface area, shape (e.g., circularity, irregularity, eccentricity, etc.), DNA content, DNA texture measures, characteristics of p53BP1 foci (e.g., number, size, shape, etc.), amount of p53BP1 protein per cell, spatial location and localization pattern of p53BP1 foci, amount of nuclease per cell, nuclease localization or texture, and characteristics of FISH signals (number, size, shape, amount, spatial location and localization pattern). Labels may comprise outcomes such as, for example, estimated or actual counts and boundaries of cells in a cell population or nuclease specificity or its activity.
[0387] Training sets (e.g., training datasets) may be selected by random sampling of a set of data corresponding to one or more sets of image data. Alternatively, training sets (e.g., training datasets) may be selected by proportionate sampling of a set of data corresponding to one or more sets of image data. The machine learning predictor may be trained until certain predetermined conditions for accuracy or performance are satisfied, such as having minimum desired values corresponding to cell identification accuracy measures. For example, the cell identification accuracy measure may correspond to estimated or actual counts and boundaries (e.g., segmented ROIs) of cells in a cell population. Examples of cell identification accuracy measures may include sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), accuracy, and area under the curve (AUC) of a Receiver Operating Characteristic (ROC) curve corresponding to the accuracy of generating estimated or actual counts and boundaries (e.g., segmented ROIs) of cells in a cell population.
[0388] For example, such a predetermined condition may be that the sensitivity of identifying a cell of interest comprises a value of, for example, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99%.
[0389] As another example, such a predetermined condition may be that the specificity of identifying a cell of interest comprises a value of, for example, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99%.
[0390] As another example, such a predetermined condition may be that the positive predictive value (PPV) of identifying a cell of interest comprises a value of, for example, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99%.
[0391] As another example, such a predetermined condition may be that the negative predictive value (NPV) of identifying a cell of interest comprises a value of, for example, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99%.
[0392] As another example, such a predetermined condition may be that the area under the curve (AUC) of a Receiver Operating Characteristic (ROC) curve of identifying a cell of interest comprises a value of at least about 0.50, at least about 0.55, at least about 0.60, at least about 0.65, at least about 0.70, at least about 0.75, at least about 0.80, at least about 0.85, at least about 0.90, at least about 0.95, at least about 0.96, at least about 0.97, at least about 0.98, or at least about 0.99.
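By way of non-limiting illustration, the cell identification accuracy measures named above may be computed from confusion-matrix counts and predictor scores as sketched below. The function names, the example counts, and the example minimum values passed to `meets_conditions` are assumptions; the AUC is computed here by the standard pairwise-ranking (Mann-Whitney) formulation.

```python
def accuracy_measures(tp, fp, tn, fn):
    """Sensitivity, specificity, PPV, NPV, and accuracy from confusion counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


def roc_auc(positive_scores, negative_scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties = 0.5)."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


def meets_conditions(measures, minimums):
    """True when every measure named in `minimums` reaches its minimum value."""
    return all(measures[name] >= floor for name, floor in minimums.items())


# Hypothetical validation result for a trained predictor.
measures = accuracy_measures(tp=90, fp=20, tn=80, fn=10)
measures["auc"] = roc_auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2])
done = meets_conditions(measures, {"sensitivity": 0.85, "specificity": 0.80, "auc": 0.85})
```

Training may, for example, continue until a check such as `meets_conditions` returns true for the chosen set of minimum values.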
[0393] In some embodiments, image analysis can also be carried out as shown in FIG. 1, which illustrates an assay workflow for cellular imaging of phospho-53BP1 (p53BP1) foci.
[0394] The image analysis method may be implemented in an automated manner, such as using the digital processing devices described herein.
[0395] In certain aspects, % nuclease specificity for a nuclease can be computed from the per-cell p53BP1 foci count data. The data distributions for the nuclease-treated and the corresponding untreated reference (background) cell samples are computed. Given the detection efficiency of the p53BP1 assay (PD) at the target site and the proliferating cell fraction (Fp), a theoretical on-target distribution is calculated for the on-target activity of the nuclease. Subsequently, the distribution of the nuclease-treated sample is normalized by the distribution of the control sample and the theoretical on-target distribution using a process of non-negative least squares deconvolution. Lastly, the specificity is calculated as follows from the distribution of the background-normalized cell population: given the ploidy (PT) of the editing target, nuclease specificity is the % fraction of background-normalized cells containing from 0 to PT p53BP1 foci. For simplicity in modeling, Fp and PD are set to 0 and 1, respectively.
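By way of non-limiting illustration, only the final step of the calculation in paragraph [0395] is sketched below: the % fraction of background-normalized cells containing from 0 to PT foci. The non-negative least squares deconvolution that produces the background-normalized distribution is not shown; the input histogram, the function name `percent_specificity`, and the example ploidy are assumptions.

```python
def percent_specificity(normalized_distribution, ploidy):
    """Percent of background-normalized cells with 0..ploidy foci.

    normalized_distribution -- non-negative weights; index k is the weight of
                               cells containing k foci after background
                               normalization (assumed to be precomputed)
    ploidy                  -- PT, the copy number of the editing target
    """
    total = sum(normalized_distribution)
    if total <= 0:
        raise ValueError("distribution must contain at least one cell")
    within_ploidy = sum(normalized_distribution[: ploidy + 1])
    return 100.0 * within_ploidy / total


# Hypothetical background-normalized histogram for foci counts 0, 1, 2, 3, 4.
distribution = [50, 20, 10, 15, 5]
specificity = percent_specificity(distribution, ploidy=2)
```

With the hypothetical histogram above and a diploid target, 80 of 100 normalized cells carry no more foci than there are target copies.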
[0396] A baseline level or threshold level above which a DNA binding domain of a gene editing tool (e.g., a nuclease) is deemed to be non-specific can be determined empirically by carrying out the imaging assays described herein. Such a baseline or threshold level may be application-specific and can be determined by the requirements of an application, either as a set threshold on the magnitude of change in protein load in response to treatment (relative to background protein load in reference untreated cells) beyond which a cell engineering tool is deemed non-specific, or as a relative ranking of cell engineering tools in a screening application in which one or several best-performing tools are picked.
[0397] In one case, protein indicative of cellular response is stained and imaged in fixed cells, and total protein load is calculated by measuring intensity of protein staining within a cell. Change in total protein load is used as a measure of cell response to treatment.
[0398] In another case, protein indicative of cellular response is stained and imaged in fixed cells, and protein accumulation at distinct locations within the cell is detected and enumerated. Change in the number of protein foci is used as a measure of cell response to treatment. In some instances, this change can be expressed as a specificity score.
[0399] In yet another case, protein indicative of cellular response is stained with immunofluorescence and target DNA loci are stained with nanoFISH and imaged in fixed cells. Protein accumulation at distinct locations and co-localization with nanoFISH spots within the cell are detected and enumerated. Change in the number of protein foci not co-localized with target nanoFISH spots is used as a measure of off-target cell response to treatment.
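By way of non-limiting illustration, the change-based measures of paragraphs [0396] to [0399] may be sketched as a difference in per-cell means between treated and untreated reference cells, compared against an application-specific threshold. The use of the mean, the function names, and the example threshold are assumptions; the same functions may be applied to per-cell total staining intensity, to per-cell foci counts, or to per-cell counts of foci not co-localized with target spots.

```python
from statistics import mean


def response_change(treated_per_cell, untreated_per_cell):
    """Change in the mean per-cell measurement, treated minus untreated."""
    return mean(treated_per_cell) - mean(untreated_per_cell)


def is_non_specific(change, threshold):
    """Assumed decision rule: a change beyond the set threshold flags the tool."""
    return change > threshold


# Hypothetical per-cell counts of foci not co-localized with target spots.
treated = [3, 4, 5]
untreated = [1, 1, 1]
change = response_change(treated, untreated)
flagged = is_non_specific(change, threshold=2)
```

In a screening application the same `change` values may instead be sorted to rank candidate cell engineering tools, with the smallest off-target change ranked first.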
A. Digital Processing Device
[0400] The systems, apparatus, and methods described herein may include a digital processing device, or use of the same. The digital processing device may include one or more hardware central processing units (CPU) that carry out the device’s functions. The digital processing device may further comprise an operating system configured to perform executable instructions. In some instances, the digital processing device is optionally connected to a computer network, is optionally connected to the Internet such that it accesses the World Wide Web, or is optionally connected to a cloud computing infrastructure. In other instances, the digital processing device is optionally connected to an intranet. In other instances, the digital processing device is optionally connected to a data storage device.
[0401] In accordance with the description herein, suitable digital processing devices may include, by way of non-limiting examples, server computers, desktop computers, laptop computers, notebook computers, sub-notebook computers, netbook computers, netpad computers, set-top computers, media streaming devices, handheld computers, Internet appliances, mobile smartphones, tablet computers, personal digital assistants, video game consoles, and vehicles. Those of skill in the art will recognize that many smartphones are suitable for use in the system described herein. Those of skill in the art will also recognize that select televisions, video players, and digital music players with optional computer network connectivity are suitable for use in the system described herein. Suitable tablet computers may include those with booklet, slate, and convertible configurations, known to those of skill in the art.
[0402] The digital processing device may include an operating system configured to perform executable instructions. The operating system may be, for example, software, including programs and data, which may manage the device’s hardware and provide services for execution of applications. Those of skill in the art will recognize that suitable server operating systems may include, by way of non-limiting examples, FreeBSD, OpenBSD, NetBSD®, Linux, Apple® Mac OS X Server®, Oracle® Solaris®, Windows Server®, and Novell® NetWare®. Those of skill in the art will recognize that suitable personal computer operating systems include, by way of non-limiting examples, Microsoft® Windows®, Apple® Mac OS X®, UNIX®, and UNIX-like operating systems such as GNU/Linux®. In some cases, the operating system is provided by cloud computing.
Those of skill in the art will also recognize that suitable mobile smart phone operating systems include, by way of non-limiting examples, Nokia® Symbian® OS, Apple® iOS®, Research In Motion® BlackBerry OS®, Google® Android®, Microsoft® Windows Phone® OS, Microsoft® Windows Mobile® OS, Linux®, and Palm® WebOS®. Those of skill in the art will also recognize that suitable media streaming device operating systems include, by way of non-limiting examples, Apple TV®, Roku®, Boxee®, Google TV®, Google Chromecast®, Amazon Fire®, and Samsung® HomeSync®. Those of skill in the art will also recognize that suitable video game console operating systems include, by way of non-limiting examples, Sony® PS3®, Sony® PS4®, Microsoft® Xbox 360®, Microsoft Xbox One, Nintendo® Wii®, Nintendo® Wii U®, and Ouya®.
[0403] In some instances, the device may include a storage and/or memory device. The storage and/or memory device may be one or more physical apparatuses used to store data or programs on a temporary or permanent basis. In some instances, the device is volatile memory and requires power to maintain stored information. The volatile memory may comprise dynamic random-access memory (DRAM). In other instances, the device is non-volatile memory and retains stored information when the digital processing device is not powered. In still other instances, the non-volatile memory comprises flash memory. The non-volatile memory may comprise ferroelectric random access memory (FRAM). The non-volatile memory may comprise phase-change random access memory (PRAM). The device may be a storage device including, by way of non-limiting examples, CD-ROMs, DVDs, flash memory devices, magnetic disk drives, magnetic tape drives, optical disk drives, and cloud computing based storage. The storage and/or memory device may also be a combination of devices such as those disclosed herein.
[0404] The digital processing device may include a display to send visual information to a user. The display may be a cathode ray tube (CRT). The display may be a liquid crystal display (LCD). Alternatively, the display may be a thin film transistor liquid crystal display (TFT-LCD). The display may further be an organic light emitting diode (OLED) display. In various cases, an OLED display is a passive-matrix OLED (PMOLED) or active-matrix OLED (AMOLED) display. The display may be a plasma display. The display may be a video projector. The display may be a combination of devices such as those disclosed herein.
[0405] The digital processing device may also include an input device to receive information from a user. For example, the input device may be a keyboard. The input device may be a pointing device including, by way of non-limiting examples, a mouse, trackball, track pad, joystick, game controller, or stylus. The input device may be a touch screen or a multi-touch screen. The input device may be a microphone to capture voice or other sound input. The input device may be a video camera or other sensor to capture motion or visual input. Alternatively, the input device may be a Kinect™, Leap Motion™, or the like. In further aspects, the input device may be a combination of devices such as those disclosed herein.
B. Non-transitory computer readable storage medium
[0406] In some instances, the systems, apparatus, and methods disclosed herein may include one or more non-transitory computer readable storage media encoded with a program including instructions executable by the operating system of an optionally networked digital processing device. In further instances, a computer readable storage medium is a tangible component of a digital processing device. In still further instances, a computer readable storage medium is optionally removable from a digital processing device. A computer readable storage medium may include, by way of non-limiting examples, CD-ROMs, DVDs, flash memory devices, solid state memory, magnetic disk drives, magnetic tape drives, optical disk drives, cloud computing systems and services, and the like. In some cases, the program and instructions are permanently, substantially permanently, semi-permanently, or non-transitorily encoded on the media.
C. Computer program
[0407] The systems, apparatus, and methods disclosed herein may include at least one computer program, or use of the same. A computer program includes a sequence of instructions, executable in the digital processing device’s CPU, written to perform a specified task. In some embodiments, computer readable instructions are implemented as program modules, such as functions, objects, Application Programming Interfaces (APIs), data structures, and the like, that perform particular tasks or implement particular abstract data types. In light of the disclosure provided herein, those of skill in the art will recognize that a computer program, in certain embodiments, is written in various versions of various languages.
[0408] The functionality of the computer readable instructions may be combined or distributed as desired in various environments. A computer program may comprise one sequence of instructions. A computer program may comprise a plurality of sequences of instructions. In some instances, a computer program is provided from one location. In other instances, a computer program is provided from a plurality of locations. In additional cases, a computer program includes one or more software modules. Sometimes, a computer program may include, in part or in whole, one or more web applications, one or more mobile applications, one or more standalone applications, one or more web browser plug-ins, extensions, add-ins, or add-ons, or combinations thereof.
D. Web application
[0409] A computer program may include a web application. In light of the disclosure provided herein, those of skill in the art will recognize that a web application, in various aspects, utilizes one or more software frameworks and one or more database systems. In some cases, a web application is created upon a software framework such as Microsoft® .NET or Ruby on Rails (RoR). In some cases, a web application utilizes one or more database systems including, by way of non-limiting examples, relational, non-relational, object oriented, associative, and XML database systems. Sometimes, suitable relational database systems may include, by way of non-limiting examples, Microsoft® SQL Server, mySQL™, and Oracle®. Those of skill in the art will also recognize that a web application, in various instances, is written in one or more versions of one or more languages. A web application may be written in one or more markup languages, presentation definition languages, client-side scripting languages, server-side coding languages, database query languages, or combinations thereof. A web application may be written to some extent in a markup language such as Hypertext Markup Language (HTML), Extensible Hypertext Markup Language (XHTML), or eXtensible Markup Language (XML). In some embodiments, a web application is written to some extent in a presentation definition language such as Cascading Style Sheets (CSS). A web application may be written to some extent in a client-side scripting language such as Asynchronous Javascript and XML (AJAX), Flash® Actionscript, Javascript, or Silverlight®. A web application may be written to some extent in a server-side coding language such as Active Server Pages (ASP), ColdFusion®, Perl, Java™, JavaServer Pages (JSP), Hypertext Preprocessor (PHP), Python™, Ruby, Tcl, Smalltalk, WebDNA®, or Groovy. Sometimes, a web application may be written to some extent in a database query language such as Structured Query Language (SQL). Other times, a web application may integrate enterprise server products such as IBM® Lotus Domino®. In some instances, a web application includes a media player element. In various further instances, a media player element utilizes one or more of many suitable multimedia technologies including, by way of non-limiting examples, Adobe® Flash®, HTML 5, Apple® QuickTime®, Microsoft® Silverlight®, Java™, and Unity®.
E. Mobile application
[0410] A computer program may include a mobile application provided to a mobile digital processing device. In some cases, the mobile application is provided to a mobile digital processing device at the time it is manufactured. In other cases, the mobile application is provided to a mobile digital processing device via the computer network described herein.
[0411] In view of the disclosure provided herein, a mobile application is created by techniques known to those of skill in the art using hardware, languages, and development environments known to the art. Those of skill in the art will recognize that mobile applications are written in several languages. Suitable programming languages include, by way of non-limiting examples, C, C++, C#, Objective-C, Java™, Javascript, Pascal, Object Pascal, Python™, Ruby, VB.NET, WML, and XHTML/HTML with or without CSS, or combinations thereof.
[0412] Suitable mobile application development environments are available from several sources. Commercially available development environments include, by way of non-limiting examples, AirplaySDK, alcheMo, Appcelerator®, Celsius, Bedrock, Flash Lite, .NET Compact Framework, Rhomobile, and WorkLight Mobile Platform. Other development environments are available without cost including, by way of non-limiting examples, Lazarus, MobiFlex, MoSync, and Phonegap. Also, mobile device manufacturers distribute software developer kits including, by way of non-limiting examples, iPhone and iPad (iOS) SDK, Android™ SDK, BlackBerry® SDK, BREW SDK, Palm® OS SDK, Symbian SDK, webOS SDK, and Windows® Mobile SDK.
[0413] Those of skill in the art will recognize that several commercial forums are available for distribution of mobile applications including, by way of non-limiting examples, Apple® App Store, Android™ Market, BlackBerry® App World, App Store for Palm devices, App Catalog for webOS, Windows® Marketplace for Mobile, Ovi Store for Nokia® devices, Samsung® Apps, and Nintendo® DSi Shop.
F. Standalone application
[0414] A computer program may include a standalone application, which is a program that is run as an independent computer process, not an add-on to an existing process, e.g., not a plug-in. Those of skill in the art will recognize that standalone applications are often compiled. A compiler is a computer program(s) that transforms source code written in a programming language into binary object code such as assembly language or machine code. Suitable compiled programming languages include, by way of non-limiting examples, C, C++, Objective-C, COBOL, Delphi, Eiffel, Java™, Lisp, Python™, Visual Basic, and VB.NET, or combinations thereof. Compilation is often performed, at least in part, to create an executable program. A computer program may include one or more executable compiled applications.
Web browser plug-in
[0415] The computer program may include a web browser plug-in. In computing, a plug-in is one or more software components that add specific functionality to a larger software application. Makers of software applications support plug-ins to enable third-party developers to create abilities which extend an application, to support easily adding new features, and to reduce the size of an application. When supported, plug-ins enable customizing the functionality of a software application. For example, plug-ins are commonly used in web browsers to play video, generate interactivity, scan for viruses, and display particular file types. Those of skill in the art will be familiar with several web browser plug-ins including Adobe® Flash® Player, Microsoft® Silverlight®, and Apple® QuickTime®. In some embodiments, the toolbar comprises one or more web browser extensions, add-ins, or add-ons. In some embodiments, the toolbar comprises one or more explorer bars, tool bands, or desk bands.
[0416] In view of the disclosure provided herein, those of skill in the art will recognize that several plug-in frameworks are available that enable development of plug-ins in various programming languages, including, by way of non-limiting examples, C++, Delphi, Java™, PHP, Python™, and VB.NET, or combinations thereof.
[0417] Web browsers (also called Internet browsers) may be software applications, designed for use with network-connected digital processing devices, for retrieving, presenting, and traversing information resources on the World Wide Web. Suitable web browsers include, by way of non-limiting examples, Microsoft® Internet Explorer®, Mozilla® Firefox®, Google® Chrome, Apple® Safari®, Opera Software® Opera®, and KDE Konqueror. In some embodiments, the web browser is a mobile web browser. Mobile web browsers (also called microbrowsers, mini-browsers, and wireless browsers) are designed for use on mobile digital processing devices including, by way of non-limiting examples, handheld computers, tablet computers, netbook computers, subnotebook computers, smartphones, music players, personal digital assistants (PDAs), and handheld video game systems. Suitable mobile web browsers include, by way of non-limiting examples, Google® Android® browser, RIM BlackBerry® Browser, Apple® Safari®, Palm® Blazer, Palm® WebOS® Browser, Mozilla® Firefox® for mobile, Microsoft® Internet Explorer® Mobile, Amazon® Kindle® Basic Web, Nokia® Browser, Opera Software® Opera® Mobile, and Sony® PSP™ browser.
A. Software modules
[0418] The systems and methods disclosed herein may include software, server, and/or database modules, or use of the same. In view of the disclosure provided herein, software modules may be created by techniques known to those of skill in the art using machines, software, and languages known to the art. The software modules disclosed herein may be implemented in a multitude of ways. A software module may comprise a file, a section of code, a programming object, a programming structure, or combinations thereof. A software module may comprise a plurality of files, a plurality of sections of code, a plurality of programming objects, a plurality of programming structures, or combinations thereof. In various aspects, the one or more software modules comprise, by way of non-limiting examples, a web application, a mobile application, and a standalone application. In some instances, software modules are in one computer program or application. In other instances, software modules are in more than one computer program or application. In some cases, software modules are hosted on one machine. In other cases, software modules are hosted on more than one machine. Sometimes, software modules may be hosted on cloud computing platforms. Other times, software modules may be hosted on one or more machines in one location. In additional cases, software modules are hosted on one or more machines in more than one location.
B. Databases
[0419] The methods, apparatus, and systems disclosed herein may include one or more databases, or use of the same. In view of the disclosure provided herein, those of skill in the art will recognize that many databases are suitable for storage and retrieval of analytical information described elsewhere herein. In various aspects described herein, suitable databases may include, by way of non-limiting examples, relational databases, non-relational databases, object oriented databases, object databases, entity-relationship model databases, associative databases, and XML databases. A database may be internet-based. A database may be web-based. A database may be cloud computing-based. Alternatively, a database may be based on one or more local computer storage devices.
C. Services
[0420] Methods and systems described herein may further be performed as a service. For example, a service provider may obtain a sample that a customer wishes to analyze. The service provider may then encode the sample to be analyzed by any of the methods described herein, perform the analysis, and provide a report to the customer. The customer may also perform the analysis and provide the results to the service provider for decoding. In some instances, the service provider then provides the decoded results to the customer. In other instances, the customer may receive encoded analysis of the samples from the provider and decode the results by interacting with software installed locally (at the customer’s location) or remotely (e.g., on a server reachable through a network). Sometimes, the software may generate a report and transmit the report to the customer. Exemplary customers include clinical laboratories, hospitals, industrial manufacturers and the like. Sometimes, a customer or party may be any suitable customer or party with a need or desire to use the methods provided herein.
D. Server
[0421] The methods provided herein may be processed on a server (or a computer server). The server may include a central processing unit (CPU, also “processor”) which may be a single core processor, a multi core processor, or a plurality of processors for parallel processing. A processor used as part of a control assembly may be a microprocessor. The server may also include memory (e.g., random access memory, read-only memory, flash memory); an electronic storage unit (e.g., hard disk); a communications interface (e.g., network adaptor) for communicating with one or more other systems; and peripheral devices which include cache, other memory, data storage, and/or electronic display adaptors. The memory, storage unit, interface, and peripheral devices may be in communication with the processor through a communications bus (solid lines), such as a motherboard. The storage unit may be a data storage unit for storing data. The server may be operatively coupled to a computer network (“network”) with the aid of the communications interface. A processor with the aid of additional hardware may also be operatively coupled to a network. The network may be the Internet, an intranet and/or an extranet, an intranet and/or extranet that is in communication with the Internet, or a telecommunication or data network. The network, with the aid of the server, may implement a peer-to-peer network, which may enable devices coupled to the server to behave as a client or a server. The server may be capable of transmitting and receiving computer-readable instructions (e.g., device/system operation protocols or parameters) or data (e.g., sensor measurements, raw data obtained from detecting metabolites, analysis of raw data obtained from detecting metabolites, interpretation of raw data obtained from detecting metabolites, etc.) via electronic signals transported through the network. Moreover, a network may be used, for example, to transmit or receive data across an international border. The server may be in communication with one or more output devices such as a display or printer, and/or with one or more input devices such as, for example, a keyboard, mouse, or joystick. The display may be a touch screen display, in which case it functions as both a display device and an input device. Different and/or additional input devices may be present, such as an enunciator, a speaker, or a microphone. The server may use any one of a variety of operating systems, such as, for example, any one of several versions of Windows®, or of MacOS®, or of Unix®, or of Linux®.
[0422] The storage unit may store files or data associated with the operation of a device, systems or methods described herein. The server may communicate with one or more remote computer systems through the network. The one or more remote computer systems may include, for example, personal computers, laptops, tablets, telephones, smartphones, or personal digital assistants. A control assembly may include a single server. In other situations, the system may include multiple servers in communication with one another through an intranet, extranet and/or the Internet. The server may be adapted to store device operation parameters, protocols, methods described herein, and other information of potential relevance. Such information may be stored on the storage unit or the server, and such data may be transmitted through a network.
Kits
[0423] A composition described herein may be supplied in the form of a kit. A composition may be materials and software for image analysis of a protein marker (e.g., p53BP1) of a cellular response induced by a cellular perturbation. Materials can include a detectable agent that binds to the protein (e.g., a primary antibody-fluorophore conjugate or a primary antibody against the protein and a secondary antibody-fluorophore conjugate). Materials can further include a detectable agent that binds to a cell engineering tool (e.g., genome editing complex, gene regulator) to be tested (e.g., a primary antibody-fluorophore conjugate or a primary antibody against the protein and a secondary antibody-fluorophore conjugate). A composition can be an oligonucleotide Nano-FISH probe set designed for a target nucleic acid sequence. The kits of the present disclosure may further comprise instructions regarding the method of using the detectable agents to detect protein (e.g., p53BP1) load, cell engineering tool, or probe set to detect the target nucleic acid sequence.
[0424] The components of the kit may be in dry or liquid form. If they are in dry form, the kit may include a solution to solubilize the dried material. The kit may also include transfer factor in liquid or dry form. In some embodiments, if the transfer factor is in dry form, the kit includes a solution to solubilize the transfer factor. The kit may also include containers for mixing and preparing the components. The kits as described herein also may include a means for containing compositions of the present disclosure in close confinement for commercial sale and distribution.
[0425] Unless defined otherwise, all technical and scientific terms used herein have the same meaning as is commonly understood by one of skill in the art to which the claimed subject matter belongs. It is to be understood that the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of any subject matter claimed. In this application, the use of the singular includes the plural unless specifically stated otherwise. It must be noted that, as used in the specification and the appended claims, the singular forms “a,” “an” and “the” include plural referents unless the context clearly dictates otherwise. In this application, the use of “or” means “and/or” unless stated otherwise. Furthermore, use of the term “including” as well as other forms, such as “include,” “includes,” and “included,” is not limiting.
[0426] As used herein, ranges and amounts may be expressed as “about” a particular value or range. About also includes the exact amount. Hence “about 5 µL” means “about 5 µL” and also “5 µL.” Generally, the term “about” includes an amount that would be expected to be within experimental error.
[0427] The section headings used herein are for organizational purposes only and are not to be construed as limiting the subject matter described.
EXAMPLES
[0428] These examples are provided for illustrative purposes only and not to limit the scope of the claims provided herein.
EXAMPLE 1
Assay Workflow for Cellular Imaging of p53BP1 Foci
[0429] This example illustrates an assay workflow for cellular imaging of phospho-53BP1 (p53BP1) foci. FIG. 1 shows a brief summary of the assay workflow including the steps of nuclease transfection in cells, immunolabeling, imaging, processing raw images by deconvolution, enhancement, or reconstruction and segmentation, feature computation (e.g., count, amount, size, location), and informatics and analysis (determining nuclease load and/or specificity, cytotoxicity, and/or heterogeneity) from the extracted/computed features.
[0430] A nuclease (e.g., TALENs or Cas9) was delivered to cells by electroporation. Cells were incubated for a period of time, such as 24 hours, necessary for nuclease activity and cell response to nuclease-induced DNA double-stranded breaks.
[0431] The cells were sampled for evaluation of nuclease specificity. Cells were fixed onto glass slides, coverslips, or glass-bottom well-plates, stained with fluorescently labeled antibodies against p53BP1 and the nuclease protein, and imaged with a fluorescence microscope (e.g., Nikon). For microscopy on a Nikon, raw fluorescence microscopy images were deconvolved (e.g., by processing the raw images with a deconvolution algorithm), regions of interest such as cell nuclei, p53BP1 foci, and nuclease localization were algorithmically delineated (e.g., by processing the deconvolved images with a segmentation algorithm), and morphological features/parameters (such as count, size, diameter, area, volume, perimeter length, circularity, irregularity, eccentricity, etc.) and other image parameters (such as contrast, correlation, entropy, energy, and homogeneity/uniformity) were computed for each cell (e.g., by applying one or more feature extraction algorithms to the segmented images). The measured per-cell feature information was statistically analyzed to produce quantitative specificity metrics for the tested nuclease(s). FIG. 17 shows an assay workflow for microscopy on a Stellar-Vision microscope. Images were captured on the Stellar-Vision microscope and reconstructed, the reconstructed images were segmented for regions of interest such as cell nuclei, p53BP1 foci, and nuclease localization, and features were computed (such as count, size, diameter, area, volume, perimeter length, circularity, irregularity, eccentricity, etc.). The measured per-cell feature information was statistically analyzed to produce quantitative specificity metrics for the tested nuclease(s).
[0432] FIG. 2 shows further details on image analysis including the steps of obtaining a fluorescence microscopy image, image deconvolution, delineation/segmentation of cell nuclei, p53BP1 foci, and nuclease protein, morphological data estimation, and informatics/analysis as described in FIG. 1. Acquired cell images were first deconvolved to minimize the effect of out-of-focus blurring caused by the widefield imaging optics. Subsequently, automated 2D/3D computer vision methods were used to delineate regions of interest (ROIs) such as the nucleus, p53BP1 foci, and nuclease protein localization within every cell in the field of view (FOV). The derived ROI masks were used to estimate per-cell morphological parameters (or features) such as count, size, amount, location, and heterogeneity as needed. The estimated morphological parameters and other image parameters of the cells were analyzed using informatics methods to obtain statistical inferences on the activity and specificity of the delivered nuclease relative to control cell samples.
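By way of non-limiting illustration, the delineation and per-cell counting steps described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions and is not the algorithm used in the examples: it omits deconvolution, segments by a fixed intensity threshold, and treats each 4-connected region as one object. The function names, the threshold values, and the small synthetic images are hypothetical and serve only to show how nucleus masks and foci masks can be combined into a foci-per-nucleus count.

```python
from collections import deque


def label_regions(mask):
    """Label 4-connected True regions of a 2D boolean grid.

    Returns (labels, count), where labels holds 0 for background and
    1..count for each region.
    """
    rows, cols = len(mask), len(mask[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and labels[r][c] == 0:
                count += 1
                labels[r][c] = count
                queue = deque([(r, c)])
                while queue:  # breadth-first flood fill of one region
                    y, x = queue.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and labels[ny][nx] == 0):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count


def foci_per_nucleus(dapi, marker, nucleus_threshold, focus_threshold):
    """Return {nucleus_label: number_of_foci} for one field of view."""
    nuclei_mask = [[v > nucleus_threshold for v in row] for row in dapi]
    nuclei, n_nuclei = label_regions(nuclei_mask)
    # Only marker pixels that lie inside a nucleus can belong to a focus.
    foci_mask = [
        [marker[r][c] > focus_threshold and nuclei[r][c] > 0
         for c in range(len(marker[r]))]
        for r in range(len(marker))
    ]
    foci, _ = label_regions(foci_mask)
    counts = {n: 0 for n in range(1, n_nuclei + 1)}
    seen = set()
    for r, row in enumerate(foci):
        for c, focus in enumerate(row):
            if focus and focus not in seen:
                seen.add(focus)
                counts[nuclei[r][c]] += 1
    return counts


# Synthetic 5 x 8 field of view (hypothetical intensities): two nuclei in the
# DAPI channel, with two foci in the first nucleus and one in the second.
dapi = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 9, 9, 9, 0, 9, 9, 0],
    [0, 9, 9, 9, 0, 9, 9, 0],
    [0, 9, 9, 9, 0, 9, 9, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]
marker = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 8, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 8, 0],
    [0, 0, 0, 8, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]
counts = foci_per_nucleus(dapi, marker, nucleus_threshold=5, focus_threshold=5)
print(counts)  # {1: 2, 2: 1}
```

In practice, an image analysis library would typically replace the hand-written labeling routine, and additional per-ROI features (size, intensity, location) could be accumulated in the same pass over the label arrays.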
EXAMPLE 2
Transfection of Cells with Nucleases
[0433] This example illustrates transfection of cells with nucleases. For all transfections a BTX ECM830 device with a 2 mm gap cuvette was used. TALEN mRNAs were prepared using a mMessageMachine T7 Ultra Kit (#AM1345, Ambion). For each transfection, 0.2 × 10^6 cells were washed twice with PBS and centrifuged. Cell pellets were resuspended in 100 µl BTXpress solution (BTX Harvard Apparatus, Cat#45-0805) and 2 µg mRNA per TALEN monomer was added. Cell/mRNA mixtures were transferred to a transfection cuvette and electroporated with one pulse of 250 V for 5 msec. Following electroporation, cells were transferred to pre-warmed media. K562 cells or A549 cells were transferred to 2 mL of pre-warmed IMDM/10% FBS/1% PS (for K562 cells) or 2 mL of pre-warmed F-12K/10% FBS/1% PS (for A549 cells), and CD34 cells were transferred to 600 µl xvivo/CC110/IL6. Cells were incubated at 30°C for 24 hours prior to imaging. Genotyping was performed 24 and 48 hours post-transfection.
EXAMPLE 3
T Cell Stimulation and Transfection Methods
[0434] This example illustrates T cell stimulation and transfection methods. Human CD4+ T lymphocytes were isolated from peripheral blood mononuclear cells (PBMCs) of non-mobilized healthy donors by negative selection. Human CD4+ T lymphocyte culture medium was prepared with X-VIVO 15 (Lonza, Basel, Switzerland) supplemented with 10% FBS, 2 mM L-glutamine, 1% penicillin/streptomycin, and 20 ng/ml IL2 (PeproTech, Rocky Hill, NJ, USA). Cell washing media was prepared with 10% FBS in PBS. Cells were cultured by pre-warming the culture media and washing media to 37°C. Cell tubes were filled with 30 ml washing media and cells were counted. Cells were centrifuged at 400 x g for 8 minutes at room temperature, resuspended in complete culture media to a concentration of 1-2 × 10^6 cells/mL, and placed in a 37°C, 5% CO2 humidified incubator for further experimentation.
[0435] T cells were activated with Anti-CD3/CD28-Dynabeads (Life Technologies, Cat# 11132D). Dynabeads washing buffer was prepared containing PBS with 0.1% BSA and 2 mM EDTA, pH 7.4. Anti-CD3/CD28-Dynabeads were resuspended and transferred to a tube. An equal volume of Dynabeads washing buffer was added, the tube was placed on a magnet for 1 min, and the supernatant was discarded. Washed Dynabeads were resuspended in culture media. Washed Dynabeads were added to the CD4+ T cell culture suspension at a bead to cell ratio of 1:1 and the cells were mixed with a pipette. Plates were incubated in a 37°C, 5% CO2 humidified incubator for 24 hours to activate T cells. Activated T cells were mixed and placed on the magnet for 5 min and supernatants containing cells were collected. This step was repeated 2-3 times to obtain activated T cells (without Dynabeads) for further experimentation. For transfection of T cells, an after-transfection cell maintenance medium was prepared containing X-VIVO 15 (Lonza, Basel, Switzerland) supplemented with 10% FBS, 2 mM L-glutamine, 1% penicillin/streptomycin, 20 ng/ml IL2 (PeproTech, Rocky Hill, NJ, USA), and 20 ng/ml IL7 (PeproTech, Rocky Hill, NJ, USA).
[0436] Electroporation settings included a chosen mode of LV, a set voltage of 250 V, a set pulse length of 5 ms, a set number of pulses of 1, a BTX Disposable Cuvette (2 mm gap) electrode type, and a desired field strength of 3000 V/cm. Cell culture plates were prepared with after-transfection cell maintenance medium by filling the appropriate number of wells with the desired 800 µl. Plates were pre-incubated/equilibrated in a humidified 37°C, 5% CO2 incubator. 1-2 µg of TALEN mRNA was aliquoted in a separate tube. BTXpress high performance electroporation solution (BTX, Holliston, MA, USA) was brought to room temperature. Activated CD4+ T cells were collected and counted to determine cell density. Total cells needed (0.2-0.5 × 10^6 cells per sample) were centrifuged at 300 x g for 8 minutes at room temperature and washed twice with PBS. For transfection, CD4+ T cells were resuspended in BTXpress high performance electroporation solution (Harvard Apparatus, Holliston, MA, USA) to a final density of 2-5 × 10^6 cells/mL. 100 µl of cells was mixed with the aliquoted mRNA. The cell-mRNA mixture was added to a well of a MOS Multi-Well Electroporation Plate, sealed, and placed into the HT Electroporation System. T cells were electroporated in a BTX ECM830 Square Wave electroporator using a single pulse of 250 V for 5 ms. Electroporated CD4+ T cells were placed in an Axygen Deep 96-well plate or 12/24 well Falcon Polystyrene Microplates with pre-warmed cell maintenance medium. Cells were “cold shocked” in a humidified 30°C, 5% CO2 incubator for 16-24 hours, then incubated in a humidified 37°C, 5% CO2 incubator until analysis. Gene expression or down regulation was detectable as early as 4-8 hours post electroporation. For imaging, cells were collected 24 hours after transfection. For genomic DNA isolation, cells were incubated for around 48-72 hours. For RNA collection, cells were incubated up to 4-5 days.
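As a worked check of the quantities in the preceding paragraph, the relationship between the resuspension density, the 100 µl volume mixed with mRNA, and the number of cells per sample can be sketched as follows. This is an illustrative calculation only; the function name is hypothetical, and the inputs are the density range and volume stated above.

```python
def cells_per_sample(density_cells_per_ml, volume_ul):
    """Number of cells contained in volume_ul of a suspension at the given density."""
    return density_cells_per_ml * volume_ul / 1000.0  # 1000 µl per mL


# Density range and per-sample volume taken from the paragraph above.
low = cells_per_sample(2e6, 100)
high = cells_per_sample(5e6, 100)
print(low, high)  # 200000.0 500000.0
```

The result, 0.2-0.5 × 10^6 cells per sample, matches the total cells needed per sample stated in the protocol.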
EXAMPLE 4
p53BP1 Immunofluorescence Imaging
[0437] This example illustrates p53BP1 immunofluorescence analysis using the compositions and methods of the present disclosure.
Coverslip Format
[0438] Cell preparation. Cells were prepared for immunofluorescence staining and image analysis on a coverslip and in 24 well plates. For preparation of cells on coverslips, cells were seeded onto a poly-L-lysine coated #1.5 glass coverslip (12 mm round or 18 mm square). First, coverslips were placed into a well of a 6-well tissue culture plate. Cells were pre-washed with PBS, resuspended to ~2,000,000 cells/mL in PBS, and 50-100 µL of cells were spotted onto the center of each coverslip. Cells were allowed to settle for 10-15 minutes at room temperature. Next, cells were fixed in 2 mL/well of fresh fixative (4% formaldehyde in 1x PBS) and incubated for 10 minutes at room temperature. Cells were washed twice with 3 mL/well of 1x PBS over 5 minutes and permeabilized in 2 mL/well of 0.5% Triton X-100, 1x PBS for 15 minutes at room temperature. Cells were washed three times for 5 minutes per wash with 3 mL/well of 1x PBS. Cells were stored at 4°C in 1x PBS prior to staining.
[0439] Staining. Blocking buffer was prepared to contain 2% BSA (from 10% BSA/PBS), 0.05% Tween-20, and 1x PBS. Cells were blocked with 1.5 mL/well blocking buffer (in a 6-well plate) for 30 minutes at room temperature. Primary antibody incubation was carried out as follows. Primary antibodies were diluted in blocking buffer at the following ratios: 1:500 for anti-p53BP1 (tagging for p53BP1, which accumulates at the site of double strand breaks) and 1:2000 for anti-FLAG (tagging for FLAG label on a nuclease). A humidified chamber was prepared and a sheet of Parafilm was placed inside with 100 µL spots of the primary antibody solution. Coverslips were removed from the 6-well plate, inverted onto the primary antibody spots inside the humidified chamber, and incubated for 2 hours at room temperature. Coverslips were returned into the original 6-well plate with blocking buffer and cells were washed with 2 mL/well of 1x PBS three times for 5 minutes per wash. Samples were protected from light for subsequent steps performed with the secondary antibody labeled with a fluorophore. Secondary antibody incubation was carried out as follows. The secondary antibodies (donkey-anti-rabbit-Cy3 and donkey-anti-mouse-AF647) were diluted in blocking buffer at 1:500. A new sheet of Parafilm was placed inside the humidified chamber with 100 µl spots of the secondary antibody solution. Coverslips were removed from the 6-well plate and inverted onto secondary antibody spots. Coverslips were incubated for 1.5 hours at room temperature. Coverslips were returned into the original 6-well plate and washed three times with 3 mL/well of 1x PBS for 5 minutes per wash. Finally, cells were stained with DAPI for visualization of the nucleus. Cells were incubated in 1.5 mL/well of 1x PBS with 100 ng/mL of DAPI for 10 minutes at room temperature. Cells were washed once with 1x PBS.
[0440] Mounting. 10 µl of Prolong Gold was dropped onto a clean microscope slide (up to 2 coverslips per slide), coverslips were removed from the 6-well plate using tweezers and inverted onto Prolong Gold, and Prolong Gold was allowed to cure for 24 hours at room temperature. After 24 hours, the edges of coverslips were further sealed with nail polish and coverslips were cleaned with water and wiped dry prior to imaging.
24 Well Format
[0441] Plate Coating with PLL. 0.5 mL/well of poly-L-lysine solution (0.1%, Sigma-Aldrich, cat. no. P8920) was added to 24-well glass-bottom plates (#1.5H, Cellvis, cat. no. P24-1.5H-N) and incubated for 1-2 hours at room temperature. PLL was aspirated, the plate was rinsed with 0.5 mL/well of ddH2O three times, water was removed from wells, and plates were dried overnight at room temperature.
[0442] Cell Preparation. Cells were seeded onto PLL coated glass bottom 24 well plates as follows. Cells were pre-washed with PBS and resuspended to ~2,000,000 cells/mL in PBS. 20-50 µL of cells were spotted onto the center of each well and allowed to settle for 10-15 minutes at room temperature. Cells were fixed in 0.5 mL/well of fresh fixative (4% formaldehyde in 1x PBS) as follows. 500 µL was added to each well, plates were shaken to dislodge poorly attached cells, and incubated for 10 minutes at room temperature. Cells were washed twice with 0.5 mL/well of 1x PBS for 5 minutes each, permeabilized in 0.5 mL/well of 0.5% Triton X-100, 1x PBS for 15 minutes at room temperature, washed with 0.5 mL/well of 1x PBS three times for 5 minutes each, and stored at 4°C in 1x PBS prior to staining.
[0443] Staining. A blocking buffer was prepared containing 2% BSA (from 10% BSA/PBS), 0.05% Tween-20, and 1x PBS. Cells were blocked with 0.4 mL/well blocking buffer for 30 minutes at room temperature. Primary antibody incubation was carried out as follows. Primary antibodies were diluted in blocking buffer (1:500 for anti-p53BP1, 1:2000 for anti-FLAG), blocking buffer was removed from cells, and 300 µL/well of the primary antibody solution was added to cells. Cells were incubated for 2 hours at room temperature and washed three times with 0.5 mL/well of 1x PBS for 5 minutes each. Samples were protected from light for subsequent steps performed with the secondary antibody labeled with a fluorophore. Secondary antibody incubation was carried out as follows. Secondary antibody diluted in blocking buffer at a ratio of 1:500 was added at 300 µL/well. Cells were incubated for 1.5 hours at room temperature and washed three times with 0.5 mL/well of 1x PBS for 5 minutes per wash. Cells were stained with DAPI for visualization of the nucleus by incubating cells in 0.3 mL/well of 1x PBS + 100 ng/mL DAPI for 10 minutes at room temperature. Cells were washed once with 1x PBS.
[0444] Mounting. A 10 µL drop of Prolong Gold was placed on 12 mm round glass coverslips, PBS was aspirated from wells, coverslips with Prolong Gold were inverted onto cells in a well, and Prolong Gold was allowed to cure for 24 hours at room temperature.
96 Well Format
[0445] Cell Preparation. Cells were seeded onto coated glass bottom 96 well plates (e.g., PLL-coated plates, CC2 Nunc Micro-well plates) as follows. Cells were pre-washed with PBS and resuspended to ~2,000,000 cells/mL in PBS. 10 µL of cells were spotted onto the center of each well and allowed to settle for 10-15 minutes at room temperature. Cells were fixed in 0.1 mL/well of fresh fixative (4% formaldehyde in 1x PBS) as follows. 100 µL was added to each well, plates were shaken to dislodge poorly attached cells, and incubated for 10 minutes at room temperature. Cells were washed twice with 0.1 mL/well of 1x PBS for 5 minutes each, permeabilized in 0.1 mL/well of 0.5% Triton X-100, 1x PBS for 15 minutes at room temperature, washed with 0.1 mL/well of 1x PBS three times for 5 minutes each, and stored at 4°C in 1x PBS prior to staining.
[0446] Staining. A blocking buffer was prepared containing 2% BSA (from 10% BSA/PBS), 0.05% Tween-20, and 1x PBS. Cells were blocked with 75 µL/well blocking buffer for 30 minutes at room temperature. Primary antibody incubation was carried out as follows. Primary antibodies were diluted in blocking buffer (1:500 for anti-p53BP1, 1:2000 for anti-FLAG), blocking buffer was removed from cells, and 75 µL/well of the primary antibody solution was added to cells. Cells were incubated for 2 hours at room temperature and washed three times with 0.1 mL/well of 1x PBS for 5 minutes each. Samples were protected from light for subsequent steps performed with the secondary antibody labeled with a fluorophore. Secondary antibody incubation was carried out as follows. Secondary antibody diluted in blocking buffer at a ratio of 1:500 was added at 75 µL/well. Cells were incubated for 1.5 hours at room temperature and washed three times with 0.1 mL/well of 1x PBS for 5 minutes per wash. Cells were stained with DAPI for visualization of the nucleus by incubating cells in 0.1 mL/well of 1x PBS + 100 ng/mL DAPI for 10 minutes at room temperature. Cells were washed once with 1x PBS.
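For plate-scale staining, the antibody volumes implied by the dilutions above can be estimated with a short calculation such as the following. This is a minimal sketch under stated assumptions: a ratio of 1:N is interpreted as one part antibody stock in N parts of final solution, a full 96-well plate is assumed, and the function name and the optional overage parameter are hypothetical.

```python
def antibody_mix(n_wells, volume_per_well_ul, dilution_factor, overage=0.0):
    """Volumes (in µl) for a staining solution at a 1:dilution_factor dilution.

    Assumes 1:N means one part stock in N parts of final solution.
    overage is a hypothetical fractional excess (e.g., 0.1 for 10% extra).
    """
    total = n_wells * volume_per_well_ul * (1.0 + overage)
    stock = total / dilution_factor
    return {"total_ul": total, "stock_ul": stock}


# 75 µL/well of primary antibody solution, as stated above, for 96 wells.
anti_p53bp1 = antibody_mix(96, 75, 500)
anti_flag = antibody_mix(96, 75, 2000)
print(anti_p53bp1["total_ul"], anti_p53bp1["stock_ul"], anti_flag["stock_ul"])
```

Under these assumptions, a full plate requires 7200 µL of solution containing about 14.4 µL of anti-p53BP1 stock and about 3.6 µL of anti-FLAG stock, with the balance being blocking buffer.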
[0447] Mounting. No mounting was applied for the 96 well format. The plate was filled with 0.1 mL/well of 1x PBS and stored at 4°C prior to imaging. Imaging was performed at room temperature with wells filled with 1x PBS.
EXAMPLE 5
Dose Response Assessment of Nucleases in Multiple Cell Types using p53BP1 Analysis
[0448] This example illustrates dose response assessment of nucleases in multiple cell types using p53BP1 analysis. Several TALENs (GA6, GA7, AAVS1) were tested for editing efficiency (quantification of the number of target sites with indels over the total number of target sites) and dose dependent generation of double stranded breaks, as determined by imaging for and counting p53BP1 foci. TALENs were transfected in cells as described in EXAMPLE 2 and p53BP1 was stained for and imaged as described in EXAMPLE 4 and EXAMPLE 1.
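The editing efficiency defined in the preceding paragraph is a simple ratio, which can be written out as follows. This is an illustrative sketch only; the function name is hypothetical and the counts in the usage line are made-up values, not data from the examples.

```python
def editing_efficiency(sites_with_indels, total_sites):
    """Fraction of target sites carrying indels, per the definition above."""
    if total_sites <= 0:
        raise ValueError("total_sites must be positive")
    return sites_with_indels / total_sites


# Hypothetical counts for illustration only.
efficiency = editing_efficiency(450, 1000)
print(f"{efficiency:.0%}")  # 45%
```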
[0449] TABLE 4 below shows the nuclease designs including the left TALEN arm (bold), the right TALEN arm (italics), and the target sequence (underlined).
TABLE 4 - TALEN Nuclease Constructs
Figure imgf000145_0001
[0450] FIG. 3, FIG. 4, and FIG. 5 illustrate dose response assessments of GA7 TALENs in primary CD34+ hematopoietic stem cells, GA6 TALENs in immortalized K562 cells, and AAVS1 TALENs in immortalized K562 cells. FIG. 3A shows the number of p53BP1 foci per cell for CD34+ primary cells treated with a blank transfection control, 0.5 µg GA7 per TALEN monomer, 1 µg GA7 per TALEN monomer, 2 µg GA7 per TALEN monomer, and 4 µg GA7 per TALEN monomer. FIG. 3B shows the total p53BP1 content (fluorescence intensity) per nucleus normalized by the nuclear size versus total FLAG tag content per nucleus normalized by the nuclear size indicative of a nuclease for CD34+ primary cells treated with a blank transfection control, 0.5 µg GA7 per TALEN monomer, 1 µg GA7 per TALEN monomer, 2 µg GA7 per TALEN monomer, and 4 µg GA7 per TALEN monomer.
[0451] FIG. 4A shows the number of p53BP1 foci per cell for immortalized K562 cells treated with a blank transfection control, 0.5 µg GA6 per TALEN monomer, 1 µg GA6 per TALEN monomer, 2 µg GA6 per TALEN monomer, and 4 µg GA6 per TALEN monomer. FIG. 4B shows the total p53BP1 content (fluorescence intensity) per nucleus normalized by the nuclear size versus total FLAG tag content per nucleus normalized by the nuclear size indicative of a nuclease for immortalized K562 cells treated with a blank transfection control, 0.5 µg GA6 per TALEN monomer, 1 µg GA6 per TALEN monomer, 2 µg GA6 per TALEN monomer, and 4 µg GA6 per TALEN monomer.
[0452] FIG. 5A shows the number of p53BP1 foci per cell for immortalized K562 cells treated with a blank transfection control, 0.5 µg AAVS1 per TALEN monomer, 1 µg AAVS1 per TALEN monomer, 2 µg AAVS1 per TALEN monomer, and 4 µg AAVS1 per TALEN monomer. FIG. 5B shows the total p53BP1 content (fluorescence intensity) per nucleus normalized by the nuclear size versus total FLAG tag content per nucleus normalized by the nuclear size indicative of a nuclease for immortalized K562 cells treated with a blank transfection control, 0.5 µg AAVS1 per TALEN monomer, 1 µg GA6, 2 µg AAVS1 per TALEN monomer, and 4 µg AAVS1 per TALEN monomer.
[0453] The corresponding editing efficiencies of GA7 TALENs, GA6 TALENs, and AAVS1 TALENs are shown below in TABLE 5.
TABLE 5 - Gene Editing Efficiency
Figure imgf000146_0001
[0454] Nuclease specificity was assessed for each of GA7, GA6, and AAVS1-targeting TALENs by evaluating the impact of nuclease dose on off-target cutting activity. A TALEN that exhibited a dose-dependent increase in the number of p53BP1 foci, indicative of double stranded breaks, was considered a nuclease with low specificity. For example, as shown in FIG. 3, CD34+ primary progenitor cells treated with a GA7-targeting TALEN exhibited only minimal increases in the DNA damage response, as indicated by the number of p53BP1 foci, as the delivered dose of the TALEN was increased. In contrast, the less specific GA6 (FIG. 4) and AAVS1 (FIG. 5)-targeting TALENs resulted in increased off-target activity (increased number of p53BP1 foci) as the delivered dose of each of the TALENs was increased in K562 cells. The editing efficiency of each of the TALENs did not markedly change as dose was increased. Thus, the p53BP1-based image analysis disclosed herein was used to examine off-target activity and to optimize the nuclease dosage for low off-target activity while maintaining gene editing efficiency.
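The dose comparison described above can be sketched numerically. The following Python fragment is an illustrative sketch only, not the analysis used to generate FIG. 3 through FIG. 5: it fits a least-squares slope of mean p53BP1 foci per cell against delivered dose, so that a slope near zero corresponds to the behavior reported for GA7 and a larger slope to that reported for GA6 and AAVS1. The function name and all foci counts are hypothetical.

```python
from statistics import mean


def foci_dose_slope(foci_by_dose):
    """Least-squares slope of mean foci per cell versus dose.

    foci_by_dose maps a dose (e.g., µg per TALEN monomer) to a list of
    per-cell foci counts measured at that dose.
    """
    doses = sorted(foci_by_dose)
    means = [mean(foci_by_dose[d]) for d in doses]
    d_bar, m_bar = mean(doses), mean(means)
    num = sum((d - d_bar) * (m - m_bar) for d, m in zip(doses, means))
    den = sum((d - d_bar) ** 2 for d in doses)
    return num / den


# Hypothetical per-cell foci counts at 0.5, 1, 2, and 4 µg per monomer.
flat_response = {0.5: [1, 2, 1], 1: [2, 1, 1], 2: [1, 2, 2], 4: [2, 2, 1]}
steep_response = {0.5: [2, 3, 2], 1: [4, 5, 4], 2: [8, 9, 7], 4: [15, 16, 14]}

# A larger slope suggests more dose-dependent off-target cutting.
print(foci_dose_slope(flat_response), foci_dose_slope(steep_response))
```

A slope is only one possible summary; comparing the full per-dose distributions against the blank transfection control would retain more information.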
EXAMPLE 6
Time Course Assessment of Nuclease Activity using p53BP1 Analysis
[0455] This example illustrates a time course assessment of nuclease activity using the p53BP1 analysis of the present disclosure. Nuclease specificity was used to study the cellular response to nuclease activity at various times after treatment of immortalized K562 cells. K562 cells were transfected with mRNA encoding TALENs targeting the AAVS1 DNA locus. Cells were transfected as described in EXAMPLE 2 and p53BP1 was stained for and imaged as described in EXAMPLE 4 and EXAMPLE 1. Cells were sampled and imaged at 6 hours, 12 hours, 24 hours, 48 hours, and 72 hours post-transfection. FIG. 6 shows a graph of the number of p53BP1 foci per K562 cell at 6 hours, 12 hours, 24 hours, 48 hours, and 72 hours as compared to a control at each time point. The editing efficiency was determined to be 91% at 48 hours. Peak activity was observed for the AAVS1-targeting TALENs at 24 hours, and persisted beyond the 72 hour post-transfection time point. Additionally, an initial increase in the DNA damage response triggered by electroporation was detected in control cells. In a separate experiment, AAVS1-targeting TALENs transfected in CD4+ T cells ceased all activity by 48 hours post-transfection, as shown in FIG. 16. FIG. 16 shows a graph of the number of p53BP1 foci per CD4+ T cell at 24 hours and 48 hours post-transfection with AAVS1-targeting TALENs as compared to blank transfection controls at each time point.
EXAMPLE 7
Utility of p53BP1 Analysis for Pan-Cell Type Assessment of AAVS1-Targeting TALEN Specificity
[0456] This example illustrates the utility of p53BP1 analysis of the present disclosure for pan-cell type assessment of AAVS1-targeting TALEN specificity. To demonstrate that nuclease specificity as determined by p53BP1 analysis can be measured across several cell types, TALENs targeting the AAVS1 region were transfected in adherent immortalized A549 cells, suspension immortalized K562 cells, and primary cell samples isolated from blood including CD34+ progenitor cells and CD4+ T cells. Non-T cells were transfected as described in EXAMPLE 2, T cells were transfected as described in EXAMPLE 3, and p53BP1 was stained for and imaged as described in EXAMPLE 4 and EXAMPLE 1. All cells were transfected with 2 mRNAs encoding the respective TALEN monomers (one targeting a top strand of the target DNA genomic locus and the second targeting a bottom strand of the target DNA genomic locus). Cells were sampled for evaluation of p53BP1 foci 24 hours post-transfection.
[0457] FIG. 7 shows the results of control transfection and AAVS1-targeting TALEN transfection in various cell types. FIG. 7A shows the number of p53BP1 foci in adherent immortalized A549 cells transfected with a control and with an AAVS1-targeting TALEN 24 hours post-transfection. FIG. 7B shows the number of p53BP1 foci in suspension immortalized K562 cells transfected with a control and with an AAVS1-targeting TALEN 24 hours post-transfection. FIG. 7C shows the number of p53BP1 foci in primary CD34+ progenitor cells transfected with a control and with an AAVS1-targeting TALEN 24 hours post-transfection. FIG. 7D shows the number of p53BP1 foci in primary CD4+ T cells transfected with a control and with an AAVS1-targeting TALEN 24 hours post-transfection. FIG. 7E shows representative images of cells treated with AAVS1 TALENs versus untreated controls. Cells were stained for p53BP1 with an antibody and are visualized in green. TALENs were detected via their FLAG tag and are visualized in red. Nuclei were stained with DAPI and are visualized in grey. The scale bar indicates a size of 5 µm.
[0458] TABLE 6 below shows the gene editing efficiency of AAVS1-targeting TALENs in A549 cells, K562 cells, CD34+ cells, and CD4+ T cells.
TABLE 6 - Gene Editing Efficiency of AAVS1-targeting TALENs in A549 cells, K562 cells, CD34+ cells, and CD4+ T cells
Figure imgf000148_0001
[0459] All cells exhibited an increase in the number of p53BP1 DNA repair foci upon treatment with TALENs in comparison to untreated controls. Moreover, p53BP1 image analysis revealed differences in the level of background DNA repair activity as well as the magnitude of response to nuclease treatment between different cell types.
EXAMPLE 8
Utility of p53BP1 Analysis for Pan-Nuclease Type Assessment of Genome Editing Specificity
[0460] This example illustrates the utility of p53BP1 analysis for pan-nuclease type assessment of genome editing specificity. To demonstrate that nuclease specificity as determined by p53BP1 analysis can be measured across various types of nucleases, TALENs and Cas9 nucleases targeting the AAVS1 genomic locus were transfected in K562 cells. For Cas9 treatment, K562 cells were transfected with Cas9 protein along with AAVS1-targeting guide RNAs and incubated at 37°C for 24 hours prior to sampling. For treatment with TALENs, K562 cells were transfected with 2 mRNAs encoding the respective TALEN monomers (one targeting a top strand of the target DNA genomic locus and the second targeting a bottom strand of the target DNA genomic locus) and incubated at 30°C for 24 hours prior to sampling. Cells were transfected as described in EXAMPLE 2 and p53BP1 was stained for and imaged as described in EXAMPLE 4 and EXAMPLE 1.
[0461] FIG. 8 illustrates assessment of nuclease specificity in K562 cells for TALENs and Cas9 nucleases targeting the AAVS1 genomic locus. FIG. 8A illustrates the number of p53BP1 foci per cell for K562 cells transfected with Cas9 protein along with AAVS1 guide RNAs as compared to a blank transfection control. FIG. 8B illustrates the number of p53BP1 foci per cell for K562 cells transfected with AAVS1-targeting TALENs as compared to a blank transfection control.
[0462] TABLE 7 below shows the editing efficiency of AAVS1-targeting Cas9 and AAVS1-targeting TALENs.
TABLE 7 - Editing Efficiency of AAVS1-Targeting Cas9 and TALENs
Figure imgf000149_0001
[0463] Both Cas9 and TALENs produced measurable DNA damage responses as indicated by the increased number of p53BP1 foci relative to the untreated controls.
EXAMPLE 9
Utility of p53BP1 Analysis for Assessing Nuclease Activity in Diverse Cell Types and Several Genomic Loci
[0464] This example illustrates the utility of p53BP1 analysis for assessing nuclease activity in diverse cell types targeting various genomic loci. To demonstrate that nuclease specificity as determined by p53BP1 analysis can be used to screen multiple nucleases in diverse cell types, the performance of TALENs targeting GA6, AAVS1, and GA7 in CD34+ progenitor cells and the performance of TALENs targeting TP150, AAVS1, and TP171 in stimulated CD4+ T cells were evaluated. Non-T cells were transfected as described in EXAMPLE 2, T cells were transfected as described in EXAMPLE 3, and p53BP1 was stained for and imaged as described in EXAMPLE 4 and EXAMPLE 1. The performance of GA6 and GA7-targeting TALENs with a homodimeric FokI nuclease domain was compared to TALENs with the obligate heterodimeric ELD/KKR FokI nuclease domains (GA6-EK and GA7-EK) in primary CD34+ progenitor cells.
[0465] FIG. 9 shows the DNA damage response, as measured by p53BP1 foci quantification, in CD34+ cells and T cells with TALENs targeting various genomic loci. FIG. 9A shows the number of p53BP1 foci per cell in primary CD34+ progenitor cells after transfection with GA6-targeting TALENs, AAVS1-targeting TALENs, GA7-targeting TALENs, GA6-EK-targeting TALENs, and GA7-EK-targeting TALENs. Controls include blank transfection controls. FIG. 9B shows the number of p53BP1 foci per cell in primary stimulated CD4+ T cells after transfection with TP150-targeting TALENs, AAVS1-targeting TALENs, and TP171-targeting TALENs. Controls include non-electroporated naive T cells, non-electroporated stimulated T cells, and untreated blank transfection control stimulated T cells.
[0466] TABLE 8 below shows the editing efficiency of several TALENs targeting different genomic loci after transfection of primary CD34+ progenitor cells.
TABLE 8 - Editing Efficiency of TALENs in Primary CD34+ Progenitor Cells
Figure imgf000150_0001
[0467] TABLE 9 below shows the editing efficiency of several TALENs targeting different genomic loci after transfection of CD4+ T cells.
TABLE 9 - Editing Efficiency of TALENs in CD4+ T cells
Figure imgf000151_0001
[0468] Determination of nuclease specificity by p53BP1 foci analysis showed a range of cell responses to different nucleases, from minimal activation of DNA repair with more specific GA7-EK TALEN activity to substantially higher levels of DNA repair with less specific GA6 TALEN activity.
EXAMPLE 10
Use of p53BP1 Analysis for Improving Nuclease Design
[0469] This example illustrates the use of p53BP1 analysis for improving nuclease design. Specificity was assessed using the p53BP1 tools and methods of analysis of the present disclosure to evaluate different designs of nucleases targeting the same genomic locus. Non-T cells were transfected as described in EXAMPLE 2 and p53BP1 was stained for and imaged as described in EXAMPLE 4 and EXAMPLE 1.
[0470] K562 cells were transfected with GA6-targeting TALENs having homodimeric FokI nuclease domains (GA6) or GA6-targeting TALENs with the obligate heterodimeric ELD/KKR FokI nuclease domains (GA6 EK). ELD FokI has a sequence of
OT VK SET EEKK SET RHK T K YVPHFY1F1 TET ARN STODRTT EMK VMEFFMK VYGYRG KHLGGSRKPDGAIYT V GSPID YGVI VD TK A Y S GGYN LPI GQ ADEMERY VEEN Q TRD KHLNPNEWWK VYP S S VTEFKFLF VSGHFKGNYK AQ LTRLNFH TN CN GA VLS VEELLI GGEMIK AGTLTLEEVRRKFNN GEINFRS (SEQ ID NO: 1066) and KKRFokI has a sequence of
OT VK SET EEKK SET RHK T K YVPHFY1F1 TET ARN STODRTT EMK VMEFFMK VYGYRG KHLGGSRKPDGAIYT V GSPID YGVI VD TK AY S GGYN LPI GQ ADEMQR Y VKEN Q TRN KHINPNEWWK VYP S S VTEFKFLF VS GHFK GN YK AQL TRLN RK TN CN GAVLS VEELLI GGEMIK AGTLTLEEVRRKFNN GEINFRS (SEQ ID NO: 1067).
[0471] FIG. 12 shows the number of p53BP1 foci per cell in K562 cells transfected with GA6 or GA6 EK TALENs.
[0472] TABLE 11 below shows the genome editing efficiency of GA6 and GA6 EK.
TABLE 11 - Genome Editing Efficiency of GA6 and GA6 EK
Figure imgf000152_0001
[0473] The results showed substantial off-target activity by GA6 (TALEN with homodimeric FokI), as evident from the large number of p53BP1 foci formed in response to transfection, and also showed the high specificity of GA6 EK (TALEN with heterodimeric FokI).
[0474] In another experiment, the p53BP1 tools and methods of analysis of the present disclosure were used to evaluate the contribution of individual components of a nuclease. For example, the specificity of individual monomers of GA6 TALEN (GA6 L (left TALEN) and GA6 R (right TALEN)) was measured in K562 cells and compared to GA6 homodimers (GA6 LR (left and right TALENs)) and a blank transfection control. Cells were transfected with mRNA encoding either GA6 L, GA6 R, or both GA6 L + GA6 R (GA6 LR) and incubated at 30°C for 24 hours prior to sampling. FIG. 11 shows the number of p53BP1 foci per cell in K562 cells transfected with GA6 L, GA6 R, or GA6 LR versus untreated control cells. The genome editing efficiency of GA6 LR was 54%. The genome editing efficiencies of the individual monomers of the GA6 TALEN were 0% for both GA6 L and GA6 R.
[0475] The results demonstrated substantial off-target DNA cutting by the GA6 homodimer, as evident from a large number of phospho-53BP1 foci forming in response to TALEN treatment. At the same time, it was evident that the GA6 L monomer alone contributed to the lack of specificity, being responsible for the majority of nuclease-induced DNA repair response while failing to produce DNA cleavage at the target site. Thus, it was possible to pinpoint the component responsible for the lack of nuclease specificity and guide design efforts in order to reduce off-target activity.
[0476] In another experiment, nuclease performance was optimized by varying the length of the DNA binding domain in a homodimeric FokI GA6-targeting TALEN. As described above, the GA6 L monomer appeared responsible for the lack of specificity and high number of p53BP1 foci per cell, as shown in FIG. 11. To investigate if the specificity of the homodimeric FokI GA6-targeting TALEN could be improved, the DNA binding domain was extended from 14 repeat units (GA6 L14) to 17 repeat units (GA6 L17) and 19 repeat units (GA6 L19). FIG. 10 shows the number of p53BP1 foci per cell in K562 cells transfected with GA6 L14, GA6 L17, and GA6 L19.
[0477] TABLE 12 below shows the nuclease designs including the left TALEN arm (bold), the right TALEN arm (italics), and the target sequence (underlined).
TABLE 12 - TALEN Nuclease Constructs
Figure imgf000153_0001
[0478] TABLE 13 below shows the genome editing efficiency of each GA6 L monomer with its corresponding GA6 R monomer.
TABLE 13 - Genome Editing Efficiency
Figure imgf000153_0002
[0479] Assessment of p53BP1 foci showed that as the TALEN was tuned to have longer DNA binding domains, there was a dramatic reduction in off-target activity. At the same time, when combined with a matched GA6 R monomer, GA6 L19 still exhibited unperturbed, high on-target editing efficiency.
EXAMPLE 11
Multiplexed p53BP1, FLAG, and Nano-FISH Staining and Analysis and Use of p53BP1 Analysis and Nano-FISH to Dissect On-Target versus Off-Target Activity of Nucleases for Genome Editing
[0480] This example illustrates multiplexed p53BP1, FLAG, and Nano-FISH staining and analysis and the use of p53BP1 analysis and Nano-FISH to dissect on-target and off-target activity of nucleases for genome editing.
Multiplexed p53BP1, FLAG, and Nano-FISH Staining and Analysis
[0481] Nuclease specificity was assessed in a site-specific manner at the genomic locus of interest by imaging and analyzing nuclease (tagged with FLAG) induced double strand breaks (indicated by staining for p53BP1) at a particular genomic locus of interest, which is visualized by oligonucleotide Nano-FISH probe sets.
[0482] Cell Preparation. Cells were prepared for co-staining by seeding onto poly-L-lysine coated #1.5 glass coverslips (12 mm round or 18 mm square). Coverslips were placed into each well of a 6-well tissue culture plate. Cells were prewashed with PBS and resuspended to ~2,000,000 cells/mL in PBS. Cells were spotted (50-100 µL) onto the center of each coverslip and allowed to settle for 10-15 minutes at room temperature. Cells were fixed in 2 mL/well of fresh fixative (4% formaldehyde in 1x PBS) and incubated for 10 minutes at room temperature. Cells were washed twice with 3 mL/well of 1x PBS, each over 5 minutes. Cells were permeabilized in 2 mL/well of 0.5% Triton X-100, 1x PBS for 15 minutes at room temperature, washed twice with 3 mL/well of 1x PBS for 5 minutes each, incubated with 1.5 mL/well of 0.1 M HCl for 4 minutes at room temperature, and washed twice with 3 mL/well of 2x SSC over 5 minutes. Cells were incubated in 1.5 mL/well of 2x SSC + 25 µg/mL RNase A for 30 minutes at 37°C and washed twice with 3 mL/well of 2x SSC for 5 minutes each. Finally, cells were pre-equilibrated with 1.5 mL/well of 50% formamide, 2x SSC [pH 7] for at least 30 minutes at room temperature prior to denaturation.
[0483] Denaturation/Hybridization. Denaturation solution (70% formamide, 2x SSC) was added at 3 mL/well in a new 6-well plate and the well-plate was heated for at least 30 minutes on a hotplate set to 78°C. Denaturation was carried out as follows. Coverslips were transferred into the well plate with preheated denaturation solution and incubated for 4.5 minutes at 78°C, then immediately transferred onto hybridization solution. All subsequent steps were carried out so that samples were protected from light.
Hybridization solution with oligonucleotide Nano-FISH probes was prepared as follows. A hybridization buffer was prepared containing 50% formamide, 10% dextran sulfate, 0.05% Tween-20, and 2x SSC. Oligonucleotide Nano-FISH probes at a concentration of 10 µM were diluted in hybridization buffer at a ratio of 1:40, such that the final concentration was 250 nM. Oligonucleotide Nano-FISH probes were synthesized to include the Quasar-670 dye, which was imaged in the Cy5 channel. A humidified chamber was set up by placing a sheet of Parafilm onto a wet paper towel inside a dark plastic container. On the sheet of Parafilm, hybridization solution was spotted at a volume of 80 µL. Hybridization was carried out by removing coverslips from the denaturation solution, inverting onto hybridization solution spots inside the humidified chamber, and incubating overnight at 37°C.
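The probe dilution stated above (10 µM stock diluted 1:40 to 250 nM) follows from simple proportionality. A brief Python sketch of that arithmetic is given here for convenience; the function names are illustrative, and the ratio 1:40 is taken to mean one part stock in forty parts total volume, which is the reading consistent with the stated 250 nM final concentration.

```python
def diluted_concentration_nM(stock_uM, dilution_factor):
    """Final concentration in nM after diluting a stock given in µM."""
    return stock_uM * 1000.0 / dilution_factor


def stock_volume_uL(final_volume_uL, dilution_factor):
    """Volume of stock needed to make final_volume_uL of diluted solution."""
    return final_volume_uL / dilution_factor


# 10 µM probe stock at 1:40 gives 250 nM.
print(diluted_concentration_nM(10, 40))
# One 80 µL hybridization spot would contain 2 µL of probe stock.
print(stock_volume_uL(80, 40))
```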
[0484] TABLE 10 below shows the oligonucleotide Nano-FISH probe set for AAVS1.
TABLE 10 - AAVS1 Oligonucleotide Nano-FISH Probe Set
Figure imgf000155_0001
Figure imgf000156_0001
Figure imgf000157_0001
Figure imgf000158_0001
[0485] Post-hybridization washes. Coverslips were transferred from the humidified chamber into a new 6-well plate filled with 3 mL/well of 2x SSC and the plate was gently rocked to mix the remaining hybridization solution with SSC. SSC was aspirated and cells were washed with 3 mL/well of 2x SSC three times, each for 10 minutes, at room temperature. Cells were washed twice with 2 mL/well of wash buffer (0.2x SSC, 0.2% Tween-20) on a digital hot plate set to 56°C for 7 minutes. Cells were washed with 2 mL/well of 4x SSC, 0.2% Tween-20 for 5 minutes at room temperature and cells were subsequently washed twice with 2x SSC for 5 minutes per wash.
[0486] IF Staining for p53BP1 and FLAG. Blocking buffer was prepared containing 2% BSA (from 10% BSA/PBS), 0.05% Tween-20, and 1x PBS. Cells were blocked with 1.5 mL/well of blocking buffer in a 6-well plate for 30 minutes at room temperature. Primary antibody incubation was carried out by first diluting the primary antibody in blocking buffer at the following ratios: 1:500 for anti-p53BP1 and 1:2000 for anti-FLAG. A humidified chamber was prepared and, on a sheet of Parafilm inside the humidified chamber, 100 µL spots of primary antibody solution were placed. Coverslips were removed from the 6-well plate, inverted onto primary antibody spots, and incubated for 2 hours at room temperature. Coverslips were returned into the original 6-well plate with blocking buffer and cells were washed three times with 3 mL/well of 1x PBS for 5 minutes each. Secondary antibody incubation was carried out by first diluting secondary antibodies (donkey-anti-rabbit-AF488 and donkey-anti-mouse-AF594) in blocking buffer at a ratio of 1:500. On a new sheet of Parafilm inside the humidified chamber, secondary antibody solution was spotted at a volume of 100 µL. Coverslips were removed from the 6-well plate, inverted onto the secondary antibody spots, and incubated for 1.5 hours at room temperature. Coverslips were returned into the original 6-well plate and cells were washed three times with 3 mL/well of 1x PBS for 5 minutes each. Cells were stained with DAPI to visualize the nuclei by incubating cells in 1.5 mL/well of 1x PBS + 100 ng/mL DAPI for 10 minutes at room temperature and cells were washed once with 1x PBS.
[0487] Mounting. Prolong Gold was placed in 10 µL drops onto a pre-cleaned microscope slide. Coverslips were removed from the 6-well plate with tweezers, inverted onto Prolong Gold, and allowed to cure for 24 hours at room temperature. After 24 hours, coverslips were further sealed with nail polish, cleaned with water, and wiped dry prior to imaging.
Use of p53BP1 Analysis and Nano-FISH to Dissect On-Target versus Off-Target Activity of Nucleases for Genome Editing
[0488] The combination of Nano-FISH imaging methods and p53BP1 imaging disclosed herein allows for in situ visualization of on-target versus off-target nuclease cutting activity. Fluorophore-conjugated oligonucleotide Nano-FISH probes were designed to hybridize to a target DNA genomic locus of interest. K562 cells were transfected with AAVS1-targeting TALEN for 24 hours as described in EXAMPLE 2. A fluorescently labeled Nano-FISH oligonucleotide probe was allowed to hybridize to the AAVS1 genomic locus in K562 cells and cells were additionally stained for p53BP1, as described above.
[0489] FIG. 13 shows fluorescence microscopy images of control cells and AAVS1-targeting TALEN treated cells. A DAPI stain (gray) was used to visualize nuclei, p53BP1 is shown in green, and the AAVS1 oligonucleotide Nano-FISH probe was visualized in red. Imaging showed that in cells transfected with AAVS1-targeting TALEN, spots indicative of double stranded breaks (indicated by p53BP1 foci) co-localized with AAVS1 oligonucleotide Nano-FISH probe spots. These results showed that the AAVS1-targeting TALEN exhibited nuclease specificity, as confirmed by co-localization of DNA repair signals at the genomic locus of interest.
[0490] After imaging at high magnification on a fluorescence microscope, the pairwise distances between all AAVS1 Nano-FISH spots and p53BP1 foci were measured and quantified. FIG. 14 shows histograms of the proportion of pairwise distances between AAVS1 Nano-FISH spots and p53BP1 foci. FIG. 14A shows histograms of control and AAVS1 TALEN treated cells at pairwise distances of 0.1 to 0.5. FIG. 14B shows histograms of control and AAVS1 TALEN treated cells at pairwise distances of 0 to 0.025. FIG. 14C shows histograms of control and AAVS1 TALEN treated cells at pairwise distances of 0 - -0.08. Histograms showed a significantly higher co-localization between AAVS1 loci and sites of DNA repair in TALEN-treated cells relative to untreated control cells. Thus, the combination of Nano-FISH and p53BP1 foci visualization enables the measurement of off-target activity (the number of p53BP1 foci not co-localized with their target genomic loci).
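The pairwise-distance measurement described above can be sketched as follows. This Python fragment is a hedged illustration and not the image analysis pipeline of EXAMPLE 1: spot coordinates are assumed to have already been extracted from the images, the function names are hypothetical, and the co-localization threshold is an arbitrary assumed parameter in the same (unspecified) units as the coordinates.

```python
from math import dist


def pairwise_distances(fish_spots, foci):
    """All distances between Nano-FISH spots and p53BP1 foci in one nucleus."""
    return [dist(s, f) for s in fish_spots for f in foci]


def colocalized_fraction(fish_spots, foci, threshold):
    """Fraction of p53BP1 foci within `threshold` of any Nano-FISH spot."""
    if not foci:
        return 0.0
    hits = sum(
        1 for f in foci if any(dist(s, f) <= threshold for s in fish_spots)
    )
    return hits / len(foci)


# Hypothetical 3D coordinates: one FISH spot, one nearby focus, one distant.
spots = [(0.0, 0.0, 0.0)]
foci = [(0.0, 0.0, 0.01), (3.0, 4.0, 0.0)]
print(pairwise_distances(spots, foci))
print(colocalized_fraction(spots, foci, threshold=0.1))
```

Under these assumptions, foci that are not counted as co-localized correspond to the off-target foci referred to above, and the list of pairwise distances is the quantity that would be binned into histograms such as those of FIG. 14.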
EXAMPLE 12
Use of p53BP1 Analysis for Diverse Micro Imaging Platforms and Small Cell Samples
[0491] This example illustrates the use of p53BP1 analysis for diverse micro imaging platforms and small cell samples. Nuclease specificity has also been determined using the compositions and methods described herein on several types of imaging platforms and in smaller sample sizes. Samples were imaged using a Nikon microscope or the Stellar-Vision microscope, as described in EXAMPLE 1.
[0492] FIG. 15 shows evaluation of nuclease specificity by counting p53BP1 foci in cells transfected with AAVS1-targeting TALENs. FIG. 15A illustrates the number of p53BP1 foci on the x-axis versus the proportion of cells with p53BP1 foci on the y-axis in cells transfected with AAVS1-targeting TALENs and imaged, in 3D, on a Nikon widefield fluorescence microscope with a 60x magnification lens using oil immersion contact techniques. “Ref” samples indicate control cells that were not transfected with TALENs. Biological replicates are shown for control and transfected cells (indicated by set x). The number of cells analyzed in each sample is indicated by “n.”
[0493] FIG. 15B illustrates the number of p53BP1 foci on the x-axis versus the proportion of cells with p53BP1 foci on the y-axis in cells transfected with AAVS1-targeting TALENs and imaged, in 3D, on a Nikon widefield fluorescence microscope with a 40x magnification lens using non-contact techniques. “Ref” samples indicate control cells that were not transfected with TALENs. Biological replicates are shown for control and transfected cells. The number of cells analyzed in each sample is indicated by “n.”
[0494] FIG. 15C illustrates the number of p53BP1 foci on the x-axis versus the proportion of cells with p53BP1 foci on the y-axis in cells transfected with AAVS1-targeting TALENs and imaged on a Stellar-Vision (SV) fluorescence microscope using non-contact techniques. “Ref” samples indicate control cells that were not transfected with TALENs. Biological replicates are shown for control and transfected cells. The number of cells analyzed in each sample is indicated by “n.”
[0495] TABLE 14 below shows p-values from several statistical tests including a t-test, Kolmogorov-Smirnov (KS) test, and Wilcoxon-smith (WS) test comparing p53BP1 spots in transfected cells and control cells.
TABLE 14
Figure imgf000161_0001
[0496] TABLE 15 below shows p-values from a t-test comparing p53BP1 spots in transfected cells and control cells for different sample sizes. The results below show a high degree of statistical significance even when analyzing a small number of cells across all imaging modalities. These results demonstrated the utility of using p53BP1 analysis for clinically relevant applications that involve the use of small sample sizes to screen nucleases for lead candidates.
TABLE 15
Figure imgf000161_0002
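As an illustration of how foci-count distributions from transfected and control cells might be compared, the following Python sketch computes the two-sample Kolmogorov-Smirnov statistic (the maximum gap between the two empirical cumulative distributions). It is an assumption-laden sketch rather than the procedure used to generate TABLE 14 or TABLE 15: it returns only the statistic and not a p-value, and the function name and foci counts are hypothetical.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: max |ECDF_a(x) - ECDF_b(x)| over all x."""
    a_sorted, b_sorted = sorted(sample_a), sorted(sample_b)
    d = 0.0
    for x in set(a_sorted) | set(b_sorted):
        # Empirical CDF value = fraction of observations <= x.
        fa = bisect_right(a_sorted, x) / len(a_sorted)
        fb = bisect_right(b_sorted, x) / len(b_sorted)
        d = max(d, abs(fa - fb))
    return d


# Hypothetical per-cell foci counts from a small sample of cells.
control_cells = [0, 1, 1, 2]
treated_cells = [5, 6, 7, 8]
print(ks_statistic(control_cells, treated_cells))
```

A statistic near 1 indicates nearly non-overlapping distributions. Converting the statistic to a p-value would typically be delegated to a statistics library, for example the `ks_2samp` routine in SciPy.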
EXAMPLE 13
Screening of Nucleases for Specificity
[0497] This example illustrates screening of nucleases for a nuclease with high specificity using the compositions and methods disclosed herein for staining, imaging, and analyzing a protein (e.g., p53BP1) that accumulates at the site of a double strand break. Several nucleases of various types (e.g., TALENs, Cas9) are screened for nuclease specificity in immortalized cells (e.g., K562, A549) and primary cells (e.g., CD34+ progenitor cells, naive or stimulated T cells). Nucleases are transfected in immortalized or primary cells, as described in EXAMPLE 2 or EXAMPLE 3. Cells are stained for p53BP1 using the methods as set forth in EXAMPLE 4. Imaging, image analysis, and informatics are carried out using the methods set forth in EXAMPLE 1. p53BP1 foci are automatically counted and plotted against a parameter of interest for each nuclease (dose of nuclease, RVD length, etc.). Nuclease specificity is assessed for each nuclease tested by quantifying the total p53BP1 load (e.g., number of protein foci or total protein content within the nucleus). A high p53BP1 load indicates nucleases with relatively poor specificity. A lower p53BP1 load indicates nucleases with better specificity.
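The ranking step of such a screen can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: p53BP1 load is summarized as the mean number of foci per cell, the blank transfection control mean is subtracted as a baseline, and the nuclease names and counts are hypothetical placeholders rather than measured data.

```python
from statistics import mean


def excess_load(foci_by_nuclease, control_counts):
    """Mean foci per cell for each nuclease, minus the control mean."""
    baseline = mean(control_counts)
    return {
        name: mean(counts) - baseline
        for name, counts in foci_by_nuclease.items()
    }


def rank_by_specificity(foci_by_nuclease, control_counts):
    """Nuclease names ordered from lowest to highest excess p53BP1 load."""
    excess = excess_load(foci_by_nuclease, control_counts)
    return sorted(excess, key=excess.get)


# Hypothetical per-cell foci counts for three candidate nucleases.
candidates = {"nuclease_A": [5, 7], "nuclease_B": [1, 3], "nuclease_C": [3, 3]}
control = [1, 1]
print(rank_by_specificity(candidates, control))
```

Candidates at the front of the returned list have the lowest excess load and, by the criterion above, the better apparent specificity; editing efficiency would still need to be considered separately.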
EXAMPLE 14
Confirming Specificity of Genome Editing with a Nuclease
[0498] This example illustrates confirming specificity of genome editing with a nuclease. A genome editing complex comprising a nuclease (e.g., TALENs, zinc finger nucleases (ZFNs), or CRISPR/Cas9) targeting a therapeutic gene of interest for genome editing is transfected in immortalized or primary cells as set forth in EXAMPLE 2 or EXAMPLE 3. The nuclease induces double stranded breaks. Cells are stained and analyzed as described in EXAMPLE 11 with an oligonucleotide Nano-FISH probe set for the particular genomic locus of the therapeutic gene of interest and for p53BP1, indicative of double strand breaks induced by the nuclease. Cells are imaged and analyzed as described in EXAMPLE 1. Co-localization of oligonucleotide Nano-FISH probes and all double strand breaks is observed, indicating a nuclease with high specificity and no off-target activity.
EXAMPLE 15
Screening of Epigenomic Repressors for Specificity
[0499] This example illustrates screening of repressors for a repressor with high specificity using the compositions and methods disclosed herein for staining, imaging, and analyzing a protein (e.g., KAP1, H3K9me3 or HP1) that accumulates at the site of repression (e.g., by KRAB). Repressors of various types (e.g., KRAB, Sin3a, LSD1, SUV39H1, G9A (EHMT2), DNMT1, DNMT3A-DNMT3L, DNMT3B, KOX, TGF-beta-inducible early gene (TIEG), v-erbA, SID, MBD2, MBD3, Rb, or MeCP2) are screened for specificity in immortalized cells (e.g., K562, A549) and primary cells (e.g., CD34+ progenitor cells, naive or stimulated T cells). Repressors coupled to a binding domain (e.g., RVDs for TALENs, guide RNAs for CRISPR/dCas9 systems) are transfected in immortalized or primary cells, as described in EXAMPLE 2 or EXAMPLE 3. Cells are stained for a protein (e.g., KAP1) using the methods as set forth in EXAMPLE 4 with antibodies specific to the protein. Imaging, image analysis, and informatics are carried out using the methods set forth in EXAMPLE 1. Protein (e.g., KAP1) foci are automatically counted and plotted against a parameter of interest for each repressor (e.g., dose of repressor, RVD length, etc.). Repressor specificity is assessed for each repressor tested by counting protein (e.g., KAP1) foci. A high number of protein (e.g., KAP1) foci indicates repressors with relatively low specificity. A lower number of protein (e.g., KAP1) foci indicates repressors with better specificity. Site-specific detection of proteins such as H3K9me3 or HP1 can be confirmed by combination imaging with Nano-FISH, as described in EXAMPLE 11.
EXAMPLE 16
Detecting Chromosomal Translocation Events Using p53BP1 Foci Analysis
[0500] This example illustrates the detection of translocation events using the image-based analyses of p53BP1 load disclosed herein. A genome editing complex (e.g., TALEN, ZFN, CRISPR/Cas9, megaTAL, meganuclease) is transfected into an immortalized or primary cell, as described in EXAMPLE 2 or EXAMPLE 3. Cells are stained for p53BP1 as described in EXAMPLE 4 with a first detectable agent and subsequently administered an oligonucleotide Nano-FISH probe set with a second detectable agent for the target genomic locus and a different oligonucleotide Nano-FISH probe set with a third detectable agent for an off-target genomic locus. Samples are imaged as set forth in EXAMPLE 1. Foci of p53BP1 are visualized by signal from the first detectable agent, indicating a double strand break and gene editing with the genome editing complex. Foci of the first oligonucleotide Nano-FISH probe set are visualized by signal from the second detectable agent, indicating the target genomic locus. Foci of the second oligonucleotide Nano-FISH probe set are visualized by signal from the third detectable agent, indicating the off-target genomic locus. In the absence of a translocation event, co-localization of the signal from the first detectable agent and the second detectable agent is observed, indicating co-localization of p53BP1 with the oligonucleotide Nano-FISH probe set for the target genomic locus. When chromosomal translocation occurs, co-localization of the signal from the first detectable agent, the second detectable agent, and the third detectable agent is observed, indicating co-localization of p53BP1 with the oligonucleotide Nano-FISH probe set for the target genomic locus and the oligonucleotide Nano-FISH probe set for the off-target genomic locus.
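The three-signal decision logic described above can be sketched as a simple classifier. The Python fragment below is illustrative only: it assumes spot coordinates have already been extracted for each detectable agent, uses a single assumed distance threshold to define co-localization, and uses hypothetical function and label names.

```python
from math import dist


def classify_focus(focus, target_spots, off_target_spots, threshold):
    """Label one p53BP1 focus by which Nano-FISH probe sets it co-localizes with."""
    near_target = any(dist(focus, s) <= threshold for s in target_spots)
    near_off = any(dist(focus, s) <= threshold for s in off_target_spots)
    if near_target and near_off:
        # All three signals coincide: consistent with a translocation event.
        return "possible translocation"
    if near_target:
        # p53BP1 with the target probe set only: no translocation indicated.
        return "on-target break"
    return "not at target locus"


# Hypothetical 2D coordinates for one nucleus.
target = [(1.0, 1.0)]
off_target = [(1.05, 1.0), (9.0, 9.0)]
print(classify_focus((1.0, 1.02), target, off_target, threshold=0.1))
print(classify_focus((1.0, 1.02), target, [(9.0, 9.0)], threshold=0.1))
```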
EXAMPLE 17
Determining Specificity of Genome Editing with a Transthyretin (TTR)-Targeting Nuclease
[0501] This example illustrates determining specificity of genome editing with a transthyretin (TTR)-targeting nuclease. A genome editing complex (e.g., TALEN, ZFN, CRISPR/Cas9, megaTAL, meganuclease) targeting TTR is transfected in immortalized or primary cells as set forth in EXAMPLE 2 or EXAMPLE 3. The nuclease induces double stranded breaks. Cells are stained and analyzed as described in EXAMPLE 11 with an oligonucleotide Nano-FISH probe set for TTR and for p53BP1, indicative of double strand breaks induced by the nuclease. Cells are imaged and analyzed as described in EXAMPLE 1. Co-localization of signal from oligonucleotide Nano-FISH probes and p53BP1 is quantified to determine the specificity of the nuclease for TTR and any off-target activity of the nuclease. A nuclease with high specificity for TTR and low to no off-target activity is administered to a subject in need thereof. The subject has transthyretin amyloidosis (ATTR).
EXAMPLE 18
Determining Specificity of Genome Editing with a CCR5-Targeting Nuclease
[0502] This example illustrates determining specificity of genome editing with a CCR5-targeting nuclease. A genome editing complex (e.g., TALEN, ZFN, CRISPR/Cas9, megaTAL, meganuclease) targeting CCR5 is transfected into immortalized or primary cells as set forth in EXAMPLE 2 or EXAMPLE 3. The nuclease induces double-strand breaks. Cells are stained and analyzed as described in EXAMPLE 11 with an oligonucleotide Nano-FISH probe set for CCR5 and for p53BP1, indicative of double-strand breaks induced by the nuclease. Cells are imaged and analyzed as described in EXAMPLE 1. Co-localization of signal from oligonucleotide Nano-FISH probes and p53BP1 is quantified to determine the specificity of the nuclease for CCR5 and any off-target activity of the nuclease. A nuclease with high specificity for CCR5 and little to no off-target activity is administered to a subject in need thereof. The subject has HIV.
EXAMPLE 19
Determining Specificity of Genome Editing with a Glucocorticoid Receptor (NR3C1)-Targeting Nuclease
[0503] This example illustrates determining specificity of genome editing with a glucocorticoid receptor (NR3C1)-targeting nuclease. A genome editing complex (e.g., TALEN, ZFN, CRISPR/Cas9, megaTAL, meganuclease) targeting NR3C1 is transfected into immortalized or primary cells as set forth in EXAMPLE 2 or EXAMPLE 3. The nuclease induces double-strand breaks. Cells are stained and analyzed as described in EXAMPLE 11 with an oligonucleotide Nano-FISH probe set for NR3C1 and for p53BP1, indicative of double-strand breaks induced by the nuclease. Cells are imaged and analyzed as described in EXAMPLE 1. Co-localization of signal from oligonucleotide Nano-FISH probes and p53BP1 is quantified to determine the specificity of the nuclease for NR3C1 and any off-target activity of the nuclease. A nuclease with high specificity for NR3C1 and little to no off-target activity is administered to a subject in need thereof. The subject has glioblastoma multiforme.
EXAMPLE 20
Determining Specificity of Genome Editing with a TRA-Targeting Nuclease and/or a
CD52-Targeting Nuclease
[0504] This example illustrates determining specificity of genome editing with a TRA-targeting nuclease and/or a CD52-targeting nuclease. A genome editing complex (e.g., TALEN, ZFN, CRISPR/Cas9, megaTAL, meganuclease) targeting TRA and a genome editing complex (e.g., TALEN, ZFN, CRISPR/Cas9, megaTAL, meganuclease) targeting CD52 are transfected into immortalized or primary cells as set forth in EXAMPLE 2 or EXAMPLE 3. The nuclease induces double-strand breaks. Cells are stained and analyzed as described in EXAMPLE 1 with an oligonucleotide Nano-FISH probe set for TRA and/or CD52 and for p53BP1, indicative of double-strand breaks induced by the nuclease. Cells are imaged and analyzed as described in EXAMPLE 1. Co-localization of signal from oligonucleotide Nano-FISH probes and p53BP1 is quantified to determine the specificity of the nuclease for TRA and/or CD52 and any off-target activity of the nuclease. A nuclease with high specificity for TRA and/or CD52 and little to no off-target activity is administered to cells ex vivo to generate a universal T cell therapy, to be administered to a subject in need thereof. The subject has a cancer, such as acute lymphoblastic leukemia or acute myeloid leukemia.
EXAMPLE 21
Determining Specificity of Genome Editing with a Nuclease Targeting the Erythroid
Specific Enhancer of BCL11A
[0505] This example illustrates determining specificity of genome editing with a nuclease targeting the erythroid specific enhancer of BCL11A. A genome editing complex (e.g., TALEN, ZFN, CRISPR/Cas9, megaTAL, meganuclease) targeting the erythroid specific enhancer of BCL11A is transfected into immortalized or primary cells as set forth in EXAMPLE 2 or EXAMPLE 3. The nuclease induces double-strand breaks. Cells are stained and analyzed as described in EXAMPLE 11 with an oligonucleotide Nano-FISH probe set for the erythroid specific enhancer of BCL11A and for p53BP1, indicative of double-strand breaks induced by the nuclease. Cells are imaged and analyzed as described in EXAMPLE 1. Co-localization of signal from oligonucleotide Nano-FISH probes and p53BP1 is quantified to determine the specificity of the nuclease for the erythroid specific enhancer of BCL11A and any off-target activity of the nuclease. A nuclease with high specificity for the erythroid specific enhancer of BCL11A and little to no off-target activity is used to engineer hematopoietic stem cells ex vivo, to be administered to a subject in need thereof. The subject has beta-thalassemia or sickle cell disease.
EXAMPLE 22
Determining Specificity of Genome Editing with a Nuclease to Insert Alpha-L
Iduronidase (IDUA)
[0506] This example illustrates determining specificity of genome editing with a nuclease disclosed herein to insert alpha-L iduronidase (IDUA). A genome editing complex (e.g., TALEN, ZFN, CRISPR/Cas9, megaTAL, meganuclease) targeting a desired genomic locus for insertion of an ectopic nucleic acid encoding IDUA is transfected into immortalized or primary cells as set forth in EXAMPLE 2 or EXAMPLE 3. The nuclease induces double-strand breaks to insert a functional IDUA gene. Cells are stained and analyzed as described in EXAMPLE 11 with an oligonucleotide Nano-FISH probe set for IDUA and for p53BP1, indicative of double-strand breaks induced by the nuclease. Cells are imaged and analyzed as described in EXAMPLE 1. Co-localization of signal from oligonucleotide Nano-FISH probes and p53BP1 is quantified to determine the specificity of the nuclease and any off-target activity of the nuclease. A nuclease with high specificity and little to no off-target activity is administered to a subject in need thereof. The subject has MPSI.
EXAMPLE 23
Determining Specificity of Genome Editing with a Nuclease to Insert Iduronate-2-Sulfatase (IDS)
[0507] This example illustrates determining specificity of genome editing with a nuclease disclosed herein to insert iduronate-2-sulfatase (IDS). A genome editing complex (e.g., TALEN, ZFN, CRISPR/Cas9, megaTAL, meganuclease) targeting a desired genomic locus for insertion of an ectopic nucleic acid encoding IDS is transfected into immortalized or primary cells as set forth in EXAMPLE 2 or EXAMPLE 3. The nuclease induces double-strand breaks to insert a functional IDS gene. Cells are stained and analyzed as described in EXAMPLE 11 with an oligonucleotide Nano-FISH probe set for IDS and for p53BP1, indicative of double-strand breaks induced by the nuclease. Cells are imaged and analyzed as described in EXAMPLE 1. Co-localization of signal from oligonucleotide Nano-FISH probes and p53BP1 is quantified to determine the specificity of the nuclease and any off-target activity of the nuclease. A nuclease with high specificity and little to no off-target activity is administered to a subject in need thereof. The subject has MPSII.
EXAMPLE 24
Determining Specificity of Genome Editing with a Nuclease to Insert Factor IX
[0508] This example illustrates determining specificity of genome editing with a nuclease to insert Factor IX. A genome editing complex (e.g., TALEN, ZFN, CRISPR/Cas9, megaTAL, meganuclease) targeting a desired genomic locus for insertion of an ectopic nucleic acid encoding Factor 9 is transfected into immortalized or primary cells as set forth in EXAMPLE 2 or EXAMPLE 3. The nuclease induces double-strand breaks to insert a functional Factor 9 gene. Cells are stained and analyzed as described in EXAMPLE 11 with an oligonucleotide Nano-FISH probe set for Factor 9 and for p53BP1, indicative of double-strand breaks induced by the nuclease. Cells are imaged and analyzed as described in EXAMPLE 1. Co-localization of signal from oligonucleotide Nano-FISH probes and p53BP1 is quantified to determine the specificity of the nuclease and any off-target activity of the nuclease. A nuclease with high specificity and little to no off-target activity is administered to a subject in need thereof. The subject has Hemophilia B.
EXAMPLE 25
Determining Specificity of Genome Editing with a PDCD1-Targeting Nuclease, a TRA-Targeting Nuclease, and/or a TRB-Targeting Nuclease
[0509] This example illustrates determining specificity of genome editing with a PDCD1-targeting nuclease, a TRA-targeting nuclease, and/or a TRB-targeting nuclease. A genome editing complex (e.g., TALEN, ZFN, CRISPR/Cas9, megaTAL, meganuclease) targeting PDCD1, TRA, and/or TRB is transfected into immortalized or primary cells as set forth in EXAMPLE 2 or EXAMPLE 3. The nuclease induces double-strand breaks. Cells are stained and analyzed as described in EXAMPLE 11 with an oligonucleotide Nano-FISH probe set for PDCD1, TRA, and/or TRB and for p53BP1, indicative of double-strand breaks induced by the nuclease. Cells are imaged and analyzed as described in EXAMPLE 1. Co-localization of signal from oligonucleotide Nano-FISH probes and p53BP1 is quantified to determine the specificity of the nuclease for PDCD1, TRA, and/or TRB and any off-target activity of the nuclease. A nuclease with high specificity for PDCD1, TRA, and/or TRB and little to no off-target activity is used to engineer CAR T cells ex vivo, to be administered to a subject in need thereof. The subject has cancer, such as multiple myeloma, melanoma, or sarcoma.
EXAMPLE 26
Determining Specificity of Genome Editing with a TRA-Targeting Nuclease, a TRB-Targeting Nuclease, and/or a CS-1-Targeting Nuclease
[0510] This example illustrates determining specificity of genome editing with a TRA-targeting nuclease, a TRB-targeting nuclease, and/or a CS-1-targeting nuclease. A genome editing complex (e.g., TALEN, ZFN, CRISPR/Cas9, megaTAL, meganuclease) targeting TRA, TRB, and/or CS-1 is transfected into immortalized or primary cells as set forth in EXAMPLE 2 or EXAMPLE 3. The nuclease induces double-strand breaks. Cells are stained and analyzed as described in EXAMPLE 11 with an oligonucleotide Nano-FISH probe set for TRA, TRB, and/or CS-1 and for p53BP1, indicative of double-strand breaks induced by the nuclease. Cells are imaged and analyzed as described in EXAMPLE 1. Co-localization of signal from oligonucleotide Nano-FISH probes and p53BP1 is quantified to determine the specificity of the nuclease for TRA, TRB, and/or CS-1 and any off-target activity of the nuclease. A nuclease with high specificity for TRA, TRB, and/or CS-1 and little to no off-target activity is used to engineer CAR T cells ex vivo, to be administered to a subject in need thereof. The subject has cancer, such as multiple myeloma.
EXAMPLE 27
Determining Specificity of Genome Editing with a TRA-Targeting Nuclease and/or a
TRB-Targeting Nuclease
[0511] This example illustrates determining specificity of genome editing with a TRA-targeting nuclease and/or a TRB-targeting nuclease. A genome editing complex (e.g., TALEN, ZFN, CRISPR/Cas9, megaTAL, meganuclease) targeting TRA and/or TRB is transfected into immortalized or primary cells as set forth in EXAMPLE 2 or EXAMPLE 3. The nuclease induces double-strand breaks. Cells are stained and analyzed as described in EXAMPLE 11 with an oligonucleotide Nano-FISH probe set for TRA and/or TRB and for p53BP1, indicative of double-strand breaks induced by the nuclease. Cells are imaged and analyzed as described in EXAMPLE 1. Co-localization of signal from oligonucleotide Nano-FISH probes and p53BP1 is quantified to determine the specificity of the nuclease for TRA and/or TRB and any off-target activity of the nuclease. A nuclease with high specificity for TRA and/or TRB and little to no off-target activity is used to engineer CAR T cells ex vivo, to be administered to a subject in need thereof. The subject has cancer, such as acute lymphoblastic leukemia.
EXAMPLE 28
Determining Specificity of Genome Editing with a CEP290-Targeting Nuclease
[0512] This example illustrates determining specificity of genome editing with a CEP290-targeting nuclease. A genome editing complex (e.g., TALEN, ZFN, CRISPR/Cas9, megaTAL, meganuclease) targeting CEP290 is transfected into immortalized or primary cells as set forth in EXAMPLE 2 or EXAMPLE 3. The nuclease induces double-strand breaks. Cells are stained and analyzed as described in EXAMPLE 11 with an oligonucleotide Nano-FISH probe set for CEP290 and for p53BP1, indicative of double-strand breaks induced by the nuclease. Cells are imaged and analyzed as described in EXAMPLE 1. Co-localization of signal from oligonucleotide Nano-FISH probes and p53BP1 is quantified to determine the specificity of the nuclease for CEP290 and any off-target activity of the nuclease. A nuclease with high specificity for CEP290 and little to no off-target activity is administered to a subject in need thereof. The subject has Leber congenital amaurosis (LCA10).
EXAMPLE 29
Determining Specificity of Genome Editing with a TRA-Targeting Nuclease, a TRB-Targeting Nuclease, and/or a B2M-Targeting Nuclease
[0513] This example illustrates determining specificity of genome editing with a TRA-targeting nuclease, a TRB-targeting nuclease, and/or a B2M-targeting nuclease. A genome editing complex (e.g., TALEN, ZFN, CRISPR/Cas9, megaTAL, meganuclease) targeting TRA, TRB, and/or B2M is transfected into immortalized or primary cells as set forth in EXAMPLE 2 or EXAMPLE 3. The nuclease induces double-strand breaks. Cells are stained and analyzed as described in EXAMPLE 11 with an oligonucleotide Nano-FISH probe set for TRA, TRB, and/or B2M and for p53BP1, indicative of double-strand breaks induced by the nuclease. Cells are imaged and analyzed as described in EXAMPLE 1. Co-localization of signal from oligonucleotide Nano-FISH probes and p53BP1 is quantified to determine the specificity of the nuclease for TRA, TRB, and/or B2M and any off-target activity of the nuclease. A nuclease with high specificity for TRA, TRB, and/or B2M and little to no off-target activity is used to engineer CAR T cells ex vivo, to be administered to a subject in need thereof. The subject has cancer, such as CD19 malignancies or BCMA-related malignancies.
EXAMPLE 30
Multiplexed p53BP1, FLAG, and Nano-FISH Staining for Fine Structural Analysis
[0514] This example shows multiplexed p53BP1, FLAG, and Nano-FISH staining and analysis for fine structural analysis of specific genomic loci within the nucleus. Fine structural analysis using Nano-FISH is carried out as follows. For example, probe pools are designed to target a 1.6 kb region of chromosome 19 and a 1.4 kb region of chromosome 18. Distinct spots are produced by Nano-FISH probes targeting specific loci on these chromosomes. To measure the relative localization of the detected loci, the relative radial distance (RRD), a normalized measure of the position of the detected spot with respect to the nuclear centroid, is calculated. Distributions are obtained across 2,396 chromosome 18 signals and 3,388 chromosome 19 signals. The differences in the distribution of signals with respect to the nuclear centroid are readily apparent in the histograms. Fine structural analysis using Nano-FISH is extended to the multiplexed p53BP1, FLAG, and Nano-FISH staining and analysis disclosed herein to spatially resolve the target genomic locus within the nucleus in 2D or 3D.
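As a rough illustration of the relative radial distance (RRD) described above, the following Python sketch computes a normalized spot-to-centroid distance in 2D. The text does not state how the distance is normalized, so dividing by the radius of a circle with the same area as the segmented nucleus is an assumption made here, as are the function names and the pixel-list representation of the nucleus.

```python
from math import dist, pi, sqrt

def nuclear_centroid(nucleus_pixels):
    """Mean (x, y) position of the pixels belonging to one segmented nucleus."""
    n = len(nucleus_pixels)
    mean_x = sum(x for x, _ in nucleus_pixels) / n
    mean_y = sum(y for _, y in nucleus_pixels) / n
    return (mean_x, mean_y)

def relative_radial_distance(spot, nucleus_pixels):
    """Spot-to-centroid distance divided by an equivalent nuclear radius.

    Assumption: the normalizer is the radius of a circle whose area equals
    the pixel count of the nucleus. Roughly 0 at the centroid and roughly 1
    at the periphery of a round nucleus.
    """
    centroid = nuclear_centroid(nucleus_pixels)
    equivalent_radius = sqrt(len(nucleus_pixels) / pi)
    return dist(spot, centroid) / equivalent_radius

# Hypothetical round nucleus of radius 10 pixels centered at the origin.
disk = [(x, y) for x in range(-10, 11) for y in range(-10, 11) if x * x + y * y <= 100]
rrd_halfway = relative_radial_distance((5, 0), disk)
```

For elongated or irregular nuclei, a direction-dependent normalizer (distance from the centroid to the nuclear boundary along the spot direction) may be more appropriate than the equivalent radius used in this sketch.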
EXAMPLE 31
Examination of Enhancer-Promoter Interactions Using Multiplexed p53BP1, FLAG, and Nano-FISH Staining
[0515] This example shows multiplexed p53BP1, FLAG, and Nano-FISH staining and analysis for examining the interaction of a gene enhancer with its target gene promoter. The positioning of a known enhancer is examined. Nano-FISH probes targeting the enhancer and promoter are designed and synthesized. The normalized inter-spot distance (NID) between two genomic loci is compared. The small size of the genomic regions targeted by Nano-FISH permits fine-scale localization of regulatory DNA regions and provides a granular view of their spatial localizations within nuclei. Examination of enhancer-promoter interactions using Nano-FISH is extended to the multiplexed p53BP1, FLAG, and Nano-FISH staining and analysis disclosed herein to examine enhancer-promoter interactions after editing cells with a genome editing complex (e.g., TALEN, ZFN, CRISPR/Cas9, megaTAL, meganuclease).
EXAMPLE 32
Fine Scale Genome Localization Using Multiplexed p53BP1, FLAG, and Nano-FISH Staining and Super-Resolution Microscopy
[0516] This example shows multiplexed p53BP1, FLAG, and Nano-FISH staining and analysis with super-resolution microscopy to obtain very fine-scale genome localization. Fine scale genome localization using Nano-FISH and super-resolution microscopy is carried out as follows. A custom automated stimulated emission depletion (STED) microscope is utilized to efficiently acquire multiple measurements of the physical distance between the HS2 and HS3 genomic loci, which are separated by 4.1 kb of linear genomic distance. Pairwise measurements of other closely situated genomic segments such as HS1-HS4 (~12 kb) and HS2-HBG2 (~25 kb) are also readily obtained and reveal non-linear compaction of the β-globin locus control region and the surrounding genome, which contains its target genes. Importantly, the high-throughput STED microscopy approach enables calculation of the distribution of actual distances between these various loci. Nano-FISH is suitable for super-resolution STED microscopy experiments. Examination of fine scale genome localization using Nano-FISH is extended to the multiplexed p53BP1, FLAG, and Nano-FISH staining and analysis disclosed herein to examine fine scale genome localization after editing cells with a genome editing complex (e.g., TALEN, ZFN, CRISPR/Cas9, megaTAL, meganuclease).
EXAMPLE 33
Determining Specificity of Genome Editing with a CBLB-Targeting Nuclease
[0517] This example illustrates determining specificity of genome editing with a CBLB-targeting nuclease. A genome editing complex (e.g., TALEN, ZFN, CRISPR/Cas9, megaTAL, meganuclease) targeting CBLB is transfected into immortalized or primary cells as set forth in EXAMPLE 2 or EXAMPLE 3. The nuclease induces double-strand breaks. Cells are stained and analyzed as described in EXAMPLE 11 with an oligonucleotide Nano-FISH probe set for CBLB and for p53BP1, indicative of double-strand breaks induced by the nuclease. Cells are imaged and analyzed as described in EXAMPLE 1. Co-localization of signal from oligonucleotide Nano-FISH probes and p53BP1 is quantified to determine the specificity of the nuclease for CBLB and any off-target activity of the nuclease. A nuclease with high specificity for CBLB and little to no off-target activity is used to engineer CAR T cells ex vivo, to be administered to a subject in need thereof. The subject has cancer.
EXAMPLE 34
Determining Specificity of Genome Editing with a TGFBR-Targeting Nuclease
[0518] This example illustrates determining specificity of genome editing with a TGFBR-targeting nuclease. A genome editing complex (e.g., TALEN, ZFN, CRISPR/Cas9, megaTAL, meganuclease) targeting TGFBR is transfected into immortalized or primary cells as set forth in EXAMPLE 2 or EXAMPLE 3. The nuclease induces double-strand breaks. Cells are stained and analyzed as described in EXAMPLE 11 with an oligonucleotide Nano-FISH probe set for TGFBR and for p53BP1, indicative of double-strand breaks induced by the nuclease. Cells are imaged and analyzed as described in EXAMPLE 1. Co-localization of signal from oligonucleotide Nano-FISH probes and p53BP1 is quantified to determine the specificity of the nuclease for TGFBR and any off-target activity of the nuclease. A nuclease with high specificity for TGFBR and little to no off-target activity is used to engineer CAR T cells ex vivo, to be administered to a subject in need thereof. The subject has multiple myeloma.
EXAMPLE 35
Determining Specificity of Genome Editing with a DMD-Targeting Nuclease
[0519] This example illustrates determining specificity of genome editing with a DMD-targeting nuclease. A genome editing complex (e.g., TALEN, ZFN, CRISPR/Cas9, megaTAL, meganuclease) targeting DMD is transfected into immortalized or primary cells as set forth in EXAMPLE 2 or EXAMPLE 3. The nuclease induces double-strand breaks. Cells are stained and analyzed as described in EXAMPLE 11 with an oligonucleotide Nano-FISH probe set for DMD and for p53BP1, indicative of double-strand breaks induced by the nuclease. Cells are imaged and analyzed as described in EXAMPLE 1. Co-localization of signal from oligonucleotide Nano-FISH probes and p53BP1 is quantified to determine the specificity of the nuclease for DMD and any off-target activity of the nuclease. A nuclease with high specificity for DMD and little to no off-target activity is administered to a subject in need thereof. The subject has Duchenne muscular dystrophy (DMD).
EXAMPLE 36
Determining Specificity of Genome Editing with a CFTR-Targeting Nuclease
[0520] This example illustrates determining specificity of genome editing with a CFTR-targeting nuclease. A genome editing complex (e.g., TALEN, ZFN, CRISPR/Cas9, megaTAL, meganuclease) targeting CFTR is transfected into immortalized or primary cells as set forth in EXAMPLE 2 or EXAMPLE 3. The nuclease induces double-strand breaks. Cells are stained and analyzed as described in EXAMPLE 11 with an oligonucleotide Nano-FISH probe set for CFTR and for p53BP1, indicative of double-strand breaks induced by the nuclease. Cells are imaged and analyzed as described in EXAMPLE 1. Co-localization of signal from oligonucleotide Nano-FISH probes and p53BP1 is quantified to determine the specificity of the nuclease for CFTR and any off-target activity of the nuclease. A nuclease with high specificity for CFTR and little to no off-target activity is administered to a subject in need thereof. The subject has cystic fibrosis.
EXAMPLE 37
Determining Specificity of Genome Editing with a Serpina1-Targeting Nuclease
[0521] This example illustrates determining specificity of genome editing with a serpina1-targeting nuclease. A genome editing complex (e.g., TALEN, ZFN, CRISPR/Cas9, megaTAL, meganuclease) targeting serpina1 is transfected into immortalized or primary cells as set forth in EXAMPLE 2 or EXAMPLE 3. The nuclease induces double-strand breaks. Cells are stained and analyzed as described in EXAMPLE 11 with an oligonucleotide Nano-FISH probe set for serpina1 and for p53BP1, indicative of double-strand breaks induced by the nuclease. Cells are imaged and analyzed as described in EXAMPLE 1. Co-localization of signal from oligonucleotide Nano-FISH probes and p53BP1 is quantified to determine the specificity of the nuclease for serpina1 and any off-target activity of the nuclease. A nuclease with high specificity for serpina1 and little to no off-target activity is administered to a subject in need thereof. The subject has alpha-1 antitrypsin deficiency (α1AT def).
EXAMPLE 38
Determining Specificity of Genome Editing with an IL2Rg-Targeting Nuclease
[0522] This example illustrates determining specificity of genome editing with an IL2Rg-targeting nuclease. A genome editing complex (e.g., TALEN, ZFN, CRISPR/Cas9, megaTAL, meganuclease) targeting IL2Rg is transfected into immortalized or primary cells as set forth in EXAMPLE 2 or EXAMPLE 3. The nuclease induces double-strand breaks. Cells are stained and analyzed as described in EXAMPLE 11 with an oligonucleotide Nano-FISH probe set for IL2Rg and for p53BP1, indicative of double-strand breaks induced by the nuclease. Cells are imaged and analyzed as described in EXAMPLE 1. Co-localization of signal from oligonucleotide Nano-FISH probes and p53BP1 is quantified to determine the specificity of the nuclease for IL2Rg and any off-target activity of the nuclease. A nuclease with high specificity for IL2Rg and little to no off-target activity is administered to a subject in need thereof. The subject has X-linked severe combined immunodeficiency (X-SCID).
EXAMPLE 39
Determining Specificity of Genome Editing with Nuclease Targeting HBV Genomic
DNA in Infected Cells
[0523] This example illustrates determining specificity of genome editing with a nuclease targeting HBV genomic DNA in infected cells. A genome editing complex (e.g., TALEN, ZFN, CRISPR/Cas9, megaTAL, meganuclease) targeting HBV genomic DNA is transfected into immortalized or primary cells as set forth in EXAMPLE 2 or EXAMPLE 3. The nuclease induces double-strand breaks. Cells are stained and analyzed as described in EXAMPLE 11 with an oligonucleotide Nano-FISH probe set for HBV genomic DNA and for p53BP1, indicative of double-strand breaks induced by the nuclease. Cells are imaged and analyzed as described in EXAMPLE 1. Co-localization of signal from oligonucleotide Nano-FISH probes and p53BP1 is quantified to determine the specificity of the nuclease for HBV genomic DNA and any off-target activity of the nuclease. A nuclease with high specificity for HBV genomic DNA and little to no off-target activity is administered to a subject in need thereof. The subject has Hepatitis B.
EXAMPLE 40
Calculation of Nuclease Specificity
[0524] A modular software framework of image processing methods has been developed to quantify the amount and localization of proteins (such as p53BP1) on a per-cell basis in response to a perturbant such as a nuclease. For the protein of interest, morphometric data (such as foci (spot) count, foci size, foci intensity, overall nuclear expression (load), spatial localization patterns of foci, etc.) are automatically estimated from the image data on a per-cell basis for the nuclease-treated and mock-treated (control) samples. A generalizable informatics framework of statistical methods to model and analyze the data distributions has also been developed. The informatics framework ultimately yields a numerical estimate (in the range [0,1], or expressed as a percentage) for the specificity of the nuclease. The framework is depicted in Fig. 18. This framework thus provides an objective route for high-throughput screening of nucleases to identify lead nucleases against therapeutically useful genomic targets.
EXAMPLE 41
Calculation of Nuclease Specificity Using per-cell p53BP1 Foci Counts
[0525] Per-cell spot counts for the p53BP1 protein in control and nuclease-treated cells can be modeled and analyzed using the informatics framework detailed in Fig. 18 to yield numerical estimates of the nuclease specificity. The model incorporates parameters to reflect the sensitivity of the protein marker used and the ploidy of the target locus that is being edited. The nuclease-treated cell distribution was normalized relative to the distribution of the control sample, and the fraction of cells with p53BP1 foci above the ploidy of the target genomic locus was computed as the promiscuity of the nuclease. Nuclease specificity was estimated as 1 minus the promiscuity value. A method for calculation of nuclease specificity based on p53BP1 foci counts is depicted in Fig. 19.
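The final step of this calculation, turning a foci-count distribution into promiscuity and specificity, can be sketched in Python as below. The sketch assumes the distribution passed to `specificity_from_distribution` has already been normalized against the control sample; that normalization, and the marker-sensitivity parameter mentioned above, are not modeled here. Function names and the example counts are hypothetical.

```python
from collections import Counter

def foci_distribution(foci_counts):
    """Fraction of cells at each foci count, indexed 0..max(foci_counts)."""
    tally = Counter(foci_counts)
    n_cells = len(foci_counts)
    return [tally[k] / n_cells for k in range(max(foci_counts) + 1)]

def specificity_from_distribution(normalized_distribution, ploidy):
    """Specificity = 1 - promiscuity.

    Promiscuity is the fraction of cells with more foci than the ploidy of
    the target locus. The input is assumed to be control-normalized already.
    """
    promiscuity = sum(normalized_distribution[ploidy + 1:])
    return 1.0 - promiscuity

# Hypothetical per-cell foci counts for ten cells and a diploid target locus.
example_distribution = foci_distribution([0, 1, 2, 2, 3, 5, 1, 2, 0, 2])
example_specificity = specificity_from_distribution(example_distribution, ploidy=2)
```

Here two of the ten hypothetical cells carry more than two foci, so the promiscuity is 0.2 and the specificity estimate is 0.8.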
EXAMPLE 42
Calculation of Nuclease Specificity Using per-cell p53BP1 Foci Counts vs. Guide-Seq
[0526] Guide-seq is a bulk-cell genomic sequencing-based assay that is generally considered the de facto method for deriving the specificity of nucleases. The imaging assay disclosed herein provides a complementary estimate of the nuclease specificity, but at a fraction of the time and expense of the guide-seq assay.
[0527] The specificity estimated by the p53BP1 imaging assay was compared with that from guide-seq in K562 cells for 3 nucleases that are considered to have high on-target potency but differing specificities. The p53BP1 imaging-based assay mirrors the specificity profiles provided by guide-seq, but at a fraction of the time and cost of the guide-seq assay. See Fig. 20.
EXAMPLE 43
p53BP1 Imaging Based Optimization of Nuclease Specificity by Altering DNA Binding Domain
[0528] The p53BP1 imaging assay was utilized to optimize the specificity of nucleases in primary cells by modifying their design. CD34+ cells were treated with either TALENs featuring homodimeric FokI nuclease domains (GA6_14) or their variants that contained more repeat units (i.e., GA6_17 and GA6_19) in one of the monomers (the left monomer in this case) to enhance specific recognition of their target genomic locus. The assay revealed a dramatic reduction in off-target activity by using longer GA6 L monomers while still providing a comparable on-target editing efficiency (58% for GA6_14, 54% for GA6_17, and 52% for GA6_19). See Fig. 21.
EXAMPLE 44
p53BP1 Imaging Based Optimization of Nuclease Specificity by Altering Nuclease Domain
[0529] The p53BP1 imaging assay was utilized to optimize the specificity of nuclease action in primary cells. CD34+ cells were treated with either TALENs featuring homodimeric FokI nuclease domains (GA6, GA7) or their variants that contained obligate heterodimeric ELD/KKR FokI nuclease domains (GA6-EK, GA7-EK). The assay revealed a substantial decrease in the off-target nuclease activity of the obligate heterodimer variant of the GA6 TALEN. The improved specificity comes at the cost of lower editing efficiency (47% for GA6 and 58% for GA7 vs. 29% for GA6-EK and 21% for GA7-EK). See Fig. 22.
EXAMPLE 45
p53BP1 Imaging Based Optimization of Nuclease Specificity by Altering Nuclease Domain
[0530] By multiplexing immunofluorescence with NanoFISH, the p53BP1 imaging assay can be used to assess both on- and off-target activity on a per-cell basis. K562 cells or CD34+ progenitor cells were treated with AAVS1 and GA6 TALENs that target distinct genomic regions. Untransfected and mock-transfected cells were used as controls. An mRNA dose of 2 µg per monomer was used for the TALENs. At 24 hours post transfection, all cells were subjected to p53BP1/FLAG immunofluorescence and NanoFISH with a pool of 115 oligoprobes that were designed to target the 5 kb genomic region adjacent to the AAVS1 TALEN cut site. K562 cell experiments were conducted in duplicate. Colocalization analysis of the AAVS1 FISH probes and the p53BP1 protein foci revealed a significantly higher colocalization of AAVS1 FISH foci with p53BP1 foci in the AAVS1 TALEN treated cells compared to all the other conditions in both cell types. See Figs. 23A and 23B. These results highlight the utility of the assay for a per-allele, per-cell readout of on- and off-target activity of a nuclease.
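A per-condition colocalization readout of the kind compared in this example could be computed along the following lines. This Python sketch is illustrative only: it treats foci as point coordinates, uses a hypothetical distance cutoff of 0.3 in arbitrary units, and does not reproduce the statistical comparison between conditions.

```python
from math import dist

def colocalized_fraction(fish_foci, break_foci, max_distance=0.3):
    """Fraction of FISH foci lying within `max_distance` of any break-marker focus.

    Foci are coordinate tuples in consistent units; the cutoff is an
    assumption for illustration, not a value taken from the text.
    """
    if not fish_foci:
        return 0.0
    hits = sum(
        1
        for fish in fish_foci
        if any(dist(fish, brk) <= max_distance for brk in break_foci)
    )
    return hits / len(fish_foci)

# Hypothetical foci pooled over cells for two conditions.
treated_fraction = colocalized_fraction(
    fish_foci=[(0, 0, 0), (2, 2, 2), (4, 4, 4)],
    break_foci=[(0.1, 0, 0), (4, 4, 4.2)],
)
mock_fraction = colocalized_fraction(
    fish_foci=[(0, 0, 0), (2, 2, 2), (4, 4, 4)],
    break_foci=[(7, 7, 7)],
)
```

Computing this fraction separately for nuclease-treated, mock-transfected, and untransfected samples gives the quantities that are compared across conditions.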
EXAMPLE 46
Imaging-based specificity screen to identify lead nucleases for therapeutic genetic targets
[0531] The p53BP1 imaging assay was used to rapidly identify lead nucleases against therapeutically relevant genomic loci. TALENs against the first constant exon of the TCR-alpha gene and the first exon of the PDCD1 gene were designed, and their on-target potency and specificity in primary CD3+ T cells were evaluated. Although multiple TALENs provided comparable on-target potency, TALEN #6 had the highest specificity. See Figs. 24A and 24B. Thus, the p53BP1 imaging assay identified TALEN #6 as the lead nuclease for these genes.
[0532] Figs. 24A-24B: Primary CD3+ T cells were transfected with a set of 8 TALENs against either TCR-alpha (Fig. 24A) or PDCD-1 (Fig. 24B), at a dose of 2 µg per monomer. TALEN mRNA was used for the transfection. Transfected cells were subjected to cold shock (30 °C) for 24 hours, after which they were retrieved, washed with PBS, seeded onto PLL-coated, glass-bottom 24-well plates, stained for p53BP1 and FLAG, and imaged in 3D using a Nikon epifluorescence microscope fitted with an Andor Zyla camera and a 60x, 1.4 NA oil objective.
[0533] % on-target potency: On-target potency is a measure of the cutting efficacy of the nuclease at the intended genomic target site. Genomic DNA is retrieved from cells 72-96 hours post transfection, amplicons are generated for the intended target site, and these are sequenced on the MiniSeq (up to 500,000 reads). The on-target potency value is calculated from the sequencing data as the proportion of reads that contain either insertions or deletions at the edited target genomic locus relative to the total number of reads sequenced for the sample.
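The on-target potency definition above reduces to a simple ratio, sketched here in Python. The function name and the read counts in the usage line are hypothetical; classifying a read as containing an insertion or deletion is assumed to have been done upstream by the amplicon-sequencing analysis.

```python
def on_target_potency_percent(indel_reads, total_reads):
    """Percentage of sequenced reads carrying an insertion or deletion at the target site."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= indel_reads <= total_reads:
        raise ValueError("indel_reads must lie between 0 and total_reads")
    return 100.0 * indel_reads / total_reads

# Hypothetical counts: 290,000 indel-containing reads out of 500,000 sequenced.
example_potency = on_target_potency_percent(290_000, 500_000)
```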
[0534] % nuclease specificity is computed from the per-cell p53BP1 foci count data. The data distributions for the nuclease-treated and the corresponding untreated reference (background) cell samples are computed. Given the detection efficiency of the p53BP1 assay (PD) at the target site and the proliferating cell fraction (Fp), a theoretical on-target distribution is calculated for the on-target activity of the nuclease. Subsequently, the distribution of the nuclease-treated sample is normalized by the distribution of the control sample and the theoretical on-target distribution using non-negative least squares deconvolution. Lastly, the specificity is calculated from the distribution of the background-normalized cell population as follows: given the ploidy (PT) of the editing target, nuclease specificity is the percentage of background-normalized cells containing from 0 to PT p53BP1 foci. For simplicity in modeling, Fp and PD are set to 0 and 1, respectively.
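The final step of the specificity calculation can be sketched as follows. This is only an illustration of the last step: it assumes the background-normalized distribution has already been obtained (the non-negative least squares deconvolution itself is not shown), and the function name, histogram values, and ploidy are hypothetical.

```python
def nuclease_specificity(normalized_hist, ploidy):
    """Percent of background-normalized cells with 0..ploidy foci.

    normalized_hist[k] is the background-normalized weight of cells
    that contain exactly k foci; ploidy is PT, the copy number of the
    editing target.
    """
    total = sum(normalized_hist)
    if total <= 0:
        raise ValueError("distribution must have positive total weight")
    # Cells with at most PT foci are consistent with on-target cutting only.
    within_ploidy = sum(normalized_hist[: ploidy + 1])
    return 100.0 * within_ploidy / total

# Hypothetical weights for cells with 0, 1, 2, 3, and 4 foci.
hist = [10, 30, 40, 15, 5]
print(nuclease_specificity(hist, ploidy=2))  # 80.0
```

Cells with more foci than target alleles are counted against specificity, which is why the sum stops at PT.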
EXAMPLE 47
Imaging-based dose titration for identification of optimal nuclease dosing
[0535] The p53BP1 imaging assay can be used to optimize nuclease doses and thereby further reduce off-target effects of potent nucleases. The lead TALEN against the first constant exon of the TCR-alpha gene was evaluated for the effect of varying its dosage between 0.1 ug and 2 ug per monomer in primary CD3+ T cells. The off-target effects became more pronounced above a dose of 1 ug per monomer, while the on-target potency did not considerably increase. See Fig. 25. Thus, the nuclease dosage for a nuclease against a therapeutically relevant target was optimized using the p53BP1 imaging assay.
[0536] Fig. 25: Primary CD3+ T cells were transfected with a high-specificity TALEN against TCR-alpha, at doses of 0, 0.1, 0.25, 0.5, 1, and 2 ug per monomer. TALEN mRNA was used for the transfection. Transfected cells were subjected to cold shock (30°C) for 24 hours, after which they were retrieved, washed with PBS, seeded onto PLL-coated, glass-bottom 24-well plates, stained for p53BP1 and FLAG, and imaged in 3D using a Nikon epifluorescence microscope fitted with an Andor Zyla camera and a 60x, 1.4 NA oil objective. % on-target potency and % nuclease specificity were calculated as detailed above.
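One way to turn a titration like this into a dose choice is sketched below. The selection rule (keep doses whose potency is within a tolerance of the best observed potency, then prefer the highest specificity) is an assumption made for illustration, not a rule stated in the disclosure, and every number in the table is invented.

```python
def pick_dose(records, potency_tolerance=5.0):
    """Choose a dose from (dose_ug, potency_pct, specificity_pct) records.

    Doses whose potency is within potency_tolerance percentage points of
    the best potency are treated as being on the plateau; among those,
    the most specific (then the lowest) dose is returned.
    """
    treated = [r for r in records if r[0] > 0]  # drop the 0 ug control
    best_potency = max(r[1] for r in treated)
    plateau = [r for r in treated if best_potency - r[1] <= potency_tolerance]
    return max(plateau, key=lambda r: (r[2], -r[0]))

# Hypothetical titration values, not data from Fig. 25.
titration = [
    (0.0, 0.0, 100.0),
    (0.1, 20.0, 98.0),
    (0.25, 45.0, 97.0),
    (0.5, 70.0, 95.0),
    (1.0, 82.0, 93.0),
    (2.0, 84.0, 75.0),
]
print(pick_dose(titration))  # (1.0, 82.0, 93.0)
```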
EXAMPLE 48
High throughput screening of nucleases for clinically relevant applications
[0537] The p53BP1 imaging assay was used to rapidly screen nucleases on the basis of their specificity. 47 TALENs for a clinically relevant genomic target in the vicinity of the human gamma hemoglobin gene were generated, and their specificity was evaluated in human erythroid HUDEP2 cells. A subset of TALENs that were highly specific while still being potent was identified. See Fig. 26.
[0538] Fig. 26: HUDEP2 cells were transfected with 47 TALENs against the HBG1/2 gene promoter locus, each at a dose of 2.5 ug per monomer. TALEN mRNA was used for the transfection. Transfected cells were subjected to cold shock (30°C) for 24 hours, after which they were retrieved, washed with PBS, seeded onto PLL-coated, glass-bottom 24-well plates or 96-well plates, stained for p53BP1 and FLAG, and imaged in 3D using a Nikon epifluorescence microscope fitted with an Andor Zyla camera and a 40x, 0.9 NA air objective. % on-target potency and % nuclease specificity were calculated as detailed above. % indel rates were calculated from cells retrieved 14 days post transfection.
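A screen of this kind amounts to filtering candidates on two measurements and ranking the survivors. The sketch below assumes simple cut-offs for potency and specificity; the thresholds, candidate names, and values are hypothetical and do not correspond to the 47 TALENs of Fig. 26.

```python
from typing import NamedTuple

class Candidate(NamedTuple):
    name: str
    potency_pct: float
    specificity_pct: float

def shortlist(candidates, min_potency=50.0, min_specificity=90.0):
    """Keep candidates that pass both cut-offs, most specific first."""
    keep = [
        c for c in candidates
        if c.potency_pct >= min_potency and c.specificity_pct >= min_specificity
    ]
    return sorted(keep, key=lambda c: (c.specificity_pct, c.potency_pct), reverse=True)

# Hypothetical screening results.
screen = [
    Candidate("TALEN-A", 72.0, 96.0),
    Candidate("TALEN-B", 88.0, 61.0),  # potent but not specific
    Candidate("TALEN-C", 35.0, 99.0),  # specific but not potent
    Candidate("TALEN-D", 65.0, 92.0),
]
print([c.name for c in shortlist(screen)])  # ['TALEN-A', 'TALEN-D']
```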
EXAMPLE 49
Analysis of Cellular Perturbation
[0539] The methods provided herein can be used to evaluate the variation in any protein that responds to an external stimulus or perturbation. The change in foci spot distributions for 4 different DNA repair proteins (p53BP1, gamma-H2AX, BRCA1, and MRE-11) in 3 cell types (K562, HUDEP2, and CD3+ T cells) was analyzed. All of these proteins could be used to estimate nuclease specificity in a cell-type-specific manner. See Fig. 27.
[0540] The examples and embodiments described herein are for illustrative purposes only and various modifications or changes suggested to persons skilled in the art are to be included within the spirit and purview of this application and scope of the appended claims.
[0541] For reasons of completeness, certain embodiments of the methods of the present disclosure are set out in the following numbered aspects:
1. A method of quantifying a protein load, the method comprising quantifying a protein that accumulates in a primary cell in response to a cellular perturbation on a per allele per cell basis.
2. A method of quantifying a protein load, the method comprising quantifying a protein that accumulates in a plurality of cells in response to a cellular perturbation in less than 24 hours on a per allele per cell basis.
3. A method of screening a plurality of cell engineering tools for specificity, the method comprising quantifying a protein load in an intact cell in less than 24 hours and determining the specificity of the cell engineering tool for a target genomic locus based on the protein load.
4. A method of producing a potent and specific cell engineering tool, the method comprising:
a) administering a cell engineering tool to a cell;
b) determining specificity, activity, or a combination thereof of the cell engineering tool for a target genomic locus by quantifying a protein load;
c) quantifying potency of the cell engineering tool by measuring gene editing efficiency, activation of gene expression, or repression of gene expression; and
d) adjusting a parameter of the cell engineering tool to increase specificity for the target genomic locus.
5. The method of any one of aspects 3-4, wherein the protein accumulates in response to a cellular perturbation.
6. The method of any one of aspects 3-5, wherein the method further comprises quantifying the protein load on a per allele per cell basis.
7. The method of any one of aspects 3 or 5-6, wherein the intact cell comprises an intact primary cell.
8. The method of any one of aspects 1 or 4-6, wherein the cell or primary cell comprises an intact primary cell.
9. The method of any one of aspects 1 or 5-8, wherein the cellular perturbation comprises administering a cell engineering tool.
10. The method of aspect 9, the method further comprising determining specificity of the cell engineering tool for a target genomic locus.
11. The method of any one of aspects 1-2 or 5-10, the method further comprising quantifying gene editing efficiency, activation of gene expression, or repression of gene expression.
12. The method of aspect 2, wherein the plurality of cells comprises at least 5 cells, at least 10 cells, at least 20 cells, at least 50 cells, at least 100 cells, at least 200 cells, at least 500 cells, or at least 1000 cells.
13. The method of any one of aspects 1-12, wherein the protein indicates a cellular response.
14. The method of aspect 13, wherein the cellular response comprises a double strand break, activation of transcription, repression of transcription, or chromosome translocation.
15. The method of any one of aspects 1-14, wherein the cell or intact cell comprises an immortalized cell.
16. The method of any one of aspects 4 or 9-15, wherein the cell engineering tool comprises a genome editing complex or a gene regulator.
17. The method of aspect 16, wherein the gene regulator comprises a gene activator or a gene repressor.
18. The method of any one of aspects 1-17, wherein the protein comprises phosphorylated 53BP1 (p53BP1), γH2AX, 53BP1, H3K4me1, H3K4me2, H3K27ac, KAP1, H3K9me3, H3K27me3, or HP1.
19. The method of any one of aspects 1-18, wherein the protein comprises p53BP1.
20. The method of any one of aspects 1-19, the method further comprising staining the cell for the protein.
21. The method of aspect 20, wherein the staining the cell for the protein comprises labeling with a primary antibody against the protein and a secondary antibody conjugated to a first fluorophore.
22. The method of aspect 20, wherein the staining the cell for the protein comprises direct labeling with a primary antibody conjugated to a first fluorophore.
23. The method of any one of aspects 21-22, the method further comprising imaging the cell for one or more protein foci comprising the first fluorophore.
24. The method of any one of aspects 21-23, the method further comprising image analysis of the cell for the one or more protein foci comprising the first fluorophore.
25. The method of aspect 24, the method further comprising quantifying the protein load from the one or more protein foci comprising the first fluorophore.
26. The method of any one of aspects 1-25, wherein the protein load comprises a number of protein foci, total protein content within the nucleus, spatial localization pattern, or any combination thereof.
27. The method of any one of aspects 3-26, wherein the cell engineering tool further comprises a polypeptide tag.
28. The method of aspect 27, wherein the polypeptide tag is a FLAG tag.
29. The method of any one of aspects 3-28, the method further comprising staining the cell for the cell engineering tool.
30. The method of aspect 29, wherein the staining the cell for the cell engineering tool comprises staining with a primary antibody against the polypeptide tag and a secondary antibody conjugated to a second fluorophore.
31. The method of aspect 29, wherein the staining the cell for the cell engineering tool comprises direct labeling with a primary antibody conjugated to a second fluorophore.
32. The method of aspect 29, wherein the staining of the cell for the cell engineering tool comprises staining with a primary antibody against the nuclease and a secondary antibody conjugated to a second fluorophore.
33. The method of aspect 29, wherein the staining the cell for the cell engineering tool comprises direct labeling with a primary antibody conjugated to a second fluorophore.
34. The method of aspect 33, further comprising imaging the cell for one or more cell engineering tool foci comprising the second fluorophore.
35. The method of aspect 34, further comprising image analysis of the cell for the one or more cell engineering tool foci comprising the second fluorophore.
36. The method of aspect 35, the method further comprising quantifying cell engineering tool load from the one or more cell engineering tool foci comprising the second fluorophore.
37. The method of aspect 36, wherein the cell engineering tool load comprises a number of cell engineering tool foci, total content of the cell engineering tool within the nucleus, spatial localization pattern, or any combination thereof.
38. The method of any one of aspects 1-37, the method further comprising hybridizing a probe set comprising a plurality of probes to the cell, wherein the probe set targets and binds to a target genomic locus.
39. The method of aspect 38, wherein each probe of the plurality of probes comprises a third fluorophore.
40. The method of any one of aspects 38-39, wherein the probe set comprises an oligonucleotide probe set.
41. The method of aspect 40, further comprising imaging the cell for one or more Nano-FISH foci comprising the third fluorophore.
42. The method of aspect 41, further comprising image analysis of the cell for the one or more Nano-FISH foci comprising the third fluorophore.
43. The method of any one of aspects 39-42, wherein co-localization of signal from the first fluorophore and the third fluorophore indicates that the cellular perturbation occurs at the target genomic locus.
44. The method of any one of aspects 1-43, the method further comprising hybridizing a second probe set comprising a second plurality of probes to the cell, wherein the second probe set targets and binds to an off-target genomic locus.
45. The method of aspect 44, wherein each probe of the second plurality of probes comprises a fourth fluorophore.
46. The method of any one of aspects 44-45, wherein the second probe set comprises a second oligonucleotide probe set.
47. The method of aspect 46, further comprising imaging the cell for one or more Nano-FISH foci comprising the fourth fluorophore.
48. The method of aspect 47, further comprising image analysis of the cell for the one or more Nano-FISH foci comprising the fourth fluorophore.
49. The method of any one of aspects 44-48, wherein co-localization of signal from the first fluorophore, the third fluorophore, and the fourth fluorophore indicates a chromosome translocation.
50. The method of any one of aspects 23-49, wherein imaging the cell comprises acquiring images of the cell by a microscopy mode selected from the group consisting of: epifluorescence, widefield, confocal, selective plane illumination, tomography, holography, super-resolution, and synthetic aperture optics (SAO).
51. The method of aspect 50, further comprising processing the acquired images to identify regions of interest (ROIs) comprising cell nuclei, protein marker foci, sites of cell engineering tool localization, or a combination thereof.
52. The method of aspect 51, further comprising processing the ROIs to extract a plurality of features selected from the group consisting of count, spatial location, size (area/volume), shape (circularity/sphericity, eccentricity, irregularity (concavity/convexity)), diameter, perimeter/surface area, quantitative measures of image texture that are pixel-based or region-based over a tunable length scale, nuclear diameter, nuclear area, nuclear volume, perimeter, surface area, DNA content, DNA texture measures, number of protein marker foci, size of protein marker foci, shape of protein marker foci, amount of protein marker per cell, spatial location and localization pattern of protein marker foci, number of nuclease per cell, amount of nuclease per cell, nuclease localization or texture, number of cell engineering tool foci, size of cell engineering tool foci, shape of cell engineering tool foci, amount of cell engineering tool foci per cell, spatial location and localization pattern of cell engineering tool foci, number of Nano-FISH foci, size of Nano-FISH foci, shape of Nano-FISH foci, amount of Nano-FISH foci, spatial location of Nano-FISH foci, and localization pattern of Nano-FISH foci.
53. The method of aspect 52, further comprising processing the extracted plurality of features to measure a degree of co-localization between the one or more Nano-FISH foci and the one or more protein marker foci, thereby determining specificity of the genome editing complex or the gene regulator.
54. The method of any one of aspects 52-53, further comprising applying a machine learning predictor to the extracted plurality of features to evaluate performance of cell engineering tools by predicting a distinction capability of nucleases.
55. The method of any one of aspects 16-54, wherein the genome editing complex comprises a DNA binding domain and a nuclease.
56. The method of aspect 55, wherein the genome editing complex further comprises a linker.
57. The method of any one of aspects 17-54, wherein the gene activator comprises a DNA binding domain and an activation domain.
58. The method of aspect 57, wherein the gene activator further comprises a linker.
59. The method of any one of aspects 17-54, wherein the gene repressor comprises a DNA binding domain and a repressor domain.
60. The method of aspect 59, wherein the gene repressor further comprises a linker.
61. The method of any one of aspects 55-60, wherein the DNA binding domain comprises a transcription activator-like effector (TALE) protein, a zinc finger protein (ZFP), or a single guide RNA (sgRNA).
62. The method of any one of aspects 16-54 or 55-56, wherein the genome editing complex is a TALEN, a ZFN, a CRISPR/Cas9, a megaTAL, or a meganuclease.
63. The method of any one of aspects 53-54 or 59-60, wherein the nuclease comprises FokI.
64. The method of aspect 63, wherein FokI has at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97%, or at least 99% sequence identity to SEQ ID NO: 1062.
65. The method of any one of aspects 56-64, wherein the linker comprises the naturally occurring C-terminus of a TALE protein or any truncation thereof.
66. The method of any one of aspects 56-64, wherein the linker comprises 0-15 residues of glycine, methionine, aspartic acid, alanine, lysine, serine, leucine, threonine, tryptophan, or any combination thereof.
67. The method of any one of aspects 57-66, wherein the activation domain comprises VP16, VP64, p65, p300 catalytic domain, TET1 catalytic domain, TDG, Ldb1 self-associated domain, SAM activator (VP64, p65, HSF1), or VPR (VP64, p65, Rta).
68. The method of any one of aspects 59-66, wherein the repressor domain comprises KRAB, Sin3a, LSD1, SUV39H1, G9A (EHMT2), DNMT1, DNMT3A-DNMT3L, DNMT3B, KOX, TGF-beta-inducible early gene (TIEG), v-erbA, SID, MBD2, MBD3, Rb, or MeCP2.
69. The method of any one of aspects 16-68, wherein a parameter of the genome editing complex or the gene regulator is adjusted to improve specificity.
70. The method of aspect 69, wherein the parameter is a sequence of the DNA binding domain or length of the DNA binding domain.
71. The method of any one of aspects 1-70, wherein the protein load is quantified in at least 50 to 100,000 cells.
72. The method of aspect 71, wherein the protein load is quantified in no more than 1000, no more than 500, no more than 100, or no more than 50 cells.
73. The method of any one of aspects 1-72, wherein the cell comprises a hematopoietic stem cell (HSC), a T cell, or a chimeric antigen receptor T cell (CAR T cell).
74. The method of any one of aspects 1-72, wherein the cell is from a normal solid tissue or a tumorigenic solid tissue.
75. The method of any one of aspects 1-74, wherein the target genomic locus is within a PDCD1 gene, a CTLA4 gene, a LAG3 gene, a TET2 gene, a BTLA gene, a HAVCR2 gene, a CCR5 gene, a CXCR4 gene, a TRA gene, a TRB gene, a B2M gene, an albumin gene, a HBB gene, a HBA1 gene, a TTR gene, a NR3C1 gene, a CD52 gene, an erythroid specific enhancer of the BCL11A gene, a CBLB gene, a TGFBR1 gene, a SERPINA1 gene, a HBV genomic DNA in infected cells, a CEP290 gene, a DMD gene, a CFTR gene, an IL2RG gene, or a combination thereof.
76. The method of any one of aspects 1-75, wherein a chimeric antigen receptor (CAR), engineered T cell receptor (TCR), alpha-L-iduronidase (IDUA), iduronate-2-sulfatase (IDS), IL-12, or Factor 9 (F9) is inserted upon cleavage of a region of the target nucleic acid sequence.

Claims

WHAT IS CLAIMED IS:
1. A method comprising:
contacting a live cell with a cell engineering tool comprising a DNA binding domain and a nuclease domain, a gene repressor, or a gene activator, wherein the live cell comprises genomic DNA comprising a target genomic locus for the DNA binding domain of the cell engineering tool;
fixing the cell and contacting the fixed cell with a plurality of nucleic acid probes complementary to the target genomic locus and assaying for presence of a protein indicative of cellular response to the contacting; and
assaying for colocalization of the probes and the protein, wherein detection of the colocalization indicates activity of the cell engineering tool at the target genomic locus and absence of the colocalization indicates activity of the cell engineering tool at an off-target site.
2. The method of claim 2, wherein assaying for colocalization comprises imaging the cell at 40X or higher magnification.
3. The method of any one of claims 1-3, wherein the fixing of the cell is performed within 24 hours or less of the contacting.
4. The method of any one of claims 1-3, wherein the cell engineering tool comprises a DNA binding domain and a nuclease domain.
5. The method of claim 4, wherein the nuclease domain induces a double strand break in the genomic DNA and wherein the protein indicative of cellular response to the contacting comprises a DNA repair protein.
6. The method of claim 5, wherein the DNA repair protein comprises p53BP1, γH2AX, MRE-11, BRCA1, RAD-51, phospho-ATM, or MDC1.
7. The method of any one of claims 1-3, wherein the cell engineering tool comprises a DNA binding domain and a gene repressor.
8. The method of claim 7, wherein the gene repressor comprises KRAB, Sin3a, LSD1, SUV39H1, G9A (EHMT2), DNMT1, DNMT3A-DNMT3L, DNMT3B, KOX, TGF-beta-inducible early gene (TIEG), v-erbA, SID, MBD2, MBD3, Rb, or MeCP2.
9. The method of any one of claims 1-3, wherein the cell engineering tool comprises a DNA binding domain and a gene activator.
10. The method of claim 9, wherein the gene activator comprises VP16, VP64, p65, p300 catalytic domain, TET1 catalytic domain, TDG, Ldb1 self-associated domain, SAM activator (VP64, p65, HSF1), or VPR (VP64, p65, Rta).
11. The method of any one of claims 1-10, wherein the DNA binding domain comprises a transcription activator-like effector (TALE) protein, a zinc finger protein (ZFP), or a single guide RNA (sgRNA).
12. The method of any one of claims 1-11, wherein the cell is a primary cell.
13. The method of any one of claims 1-11, wherein the cell is a hematopoietic stem cell (HSC), a T cell, or a chimeric antigen receptor T cell (CAR T cell).
14. The method of any one of claims 1-11, wherein the cell is from a normal solid tissue or a tumorigenic solid tissue.
15. The method of any one of claims 1-11, wherein the cell is an immortalized cell.
16. The method of any one of claims 1-15, wherein the target genomic locus is within a PDCD1 gene, a CTLA4 gene, a LAG3 gene, a TET2 gene, a BTLA gene, a HAVCR2 gene, a CCR5 gene, a CXCR4 gene, a TRA gene, a TRB gene, a B2M gene, an albumin gene, a HBB gene, a HBA1 gene, a TTR gene, a NR3C1 gene, a CD52 gene, an erythroid specific enhancer of the BCL11A gene, a CBLB gene, a TGFBR1 gene, a SERPINA1 gene, a HBV genomic DNA in infected cells, a CEP290 gene, a DMD gene, a CFTR gene, or an IL2RG gene.
17. The method of any one of claims 1-16, wherein assaying for the colocalization comprises imaging the cell by a microscopy mode selected from the group consisting of epifluorescence, widefield, confocal, selective plane illumination, tomography, holography, super-resolution, and synthetic aperture optics (SAO).
18. The method of any one of claims 1-17, wherein the plurality of nucleic acid probes are 30-60 bases in length.
19. The method of any one of claims 1-18, wherein the plurality of nucleic acid probes comprise 20-200 probes having distinct sequences.
20. The method of any one of claims 1-19, wherein the plurality of nucleic acid probes bind to a 1 kilobase (kb) to 5 kb region comprising the target genomic locus.
21. The method of any one of claims 1-20, wherein when the absence of colocalization is detected, the method further comprises adjusting a parameter of the genome editing tool to improve specificity.
22. The method of claim 21, wherein the parameter is a sequence of the DNA binding domain or length of the DNA binding domain.
23. The method of claim 21, wherein the parameter is an amount of the genome editing tool introduced into the cell.
24. A method comprising:
contacting a live cell with a cell engineering tool comprising a DNA binding domain and a nuclease domain, a gene repressor, or a gene activator, wherein the live cell comprises genomic DNA comprising a target genomic locus for the DNA binding domain of the cell engineering tool;
fixing the cell and assaying for presence of a measurable change in nuclear protein load of a protein indicative of cellular response to the contacting, wherein the measurement reflects the total activity of the cell engineering tool.
25. The method of claim 24, further comprising contacting the fixed cell with a plurality of nucleic acid probes complementary to the target genomic locus; and assaying for colocalization of the probes and the protein indicative of cellular response, wherein detection of the colocalization indicates activity of the cell engineering tool at the target genomic locus and absence of the colocalization indicates activity of the cell engineering tool at an off-target site.
26. The method of claim 24 or 25, wherein assaying for the change in nuclear protein load comprises imaging the cell by a microscopy mode selected from the group consisting of epifluorescence, widefield, confocal, selective plane illumination, tomography, holography, super-resolution, and synthetic aperture optics (SAO) and comparing to nuclear protein load in a reference cell not contacted with the cell engineering tool.
27. The method of any one of claims 24-26, wherein when a measured change in protein load above an application-specific baseline level is detected, the method further comprises adjusting a parameter of the genome editing tool to improve specificity.
28. The method of claim 1, wherein assaying comprises imaging the cell at 40X or higher magnification.
29. The method of any one of claims 24-28, wherein the fixing of the cell is performed within 24 hours or less of the contacting.
30. The method of any one of claims 24-29, wherein the cell engineering tool comprises a DNA binding domain and a nuclease domain.
31. The method of claim 30, wherein the nuclease domain induces a double strand break in the genomic DNA and wherein the protein indicative of cellular response to the contacting comprises a DNA repair protein.
32. The method of claim 31, wherein the DNA repair protein comprises p53BP1, γH2AX, MRE-11, BRCA1, RAD-51, phospho-ATM, or MDC1.
33. The method of any one of claims 24-28, wherein the cell engineering tool comprises a DNA binding domain and a gene repressor.
34. The method of claim 33, wherein the gene repressor comprises KRAB, Sin3a, LSD1, SUV39H1, G9A (EHMT2), DNMT1, DNMT3A-DNMT3L, DNMT3B, KOX, TGF-beta-inducible early gene (TIEG), v-erbA, SID, MBD2, MBD3, Rb, or MeCP2.
35. The method of any one of claims 24-28, wherein the cell engineering tool comprises a DNA binding domain and a gene activator.
36. The method of claim 35, wherein the gene activator comprises VP16, VP64, p65, p300 catalytic domain, TET1 catalytic domain, TDG, Ldb1 self-associated domain, SAM activator (VP64, p65, HSF1), or VPR (VP64, p65, Rta).
37. The method of any one of claims 24-36, wherein the DNA binding domain comprises a transcription activator-like effector (TALE) protein, a zinc finger protein (ZFP), or a single guide RNA (sgRNA).
38. The method of any one of claims 24-37, wherein the cell is a primary cell.
39. The method of any one of claims 24-37, wherein the cell is a hematopoietic stem cell (HSC), a T cell, or a chimeric antigen receptor T cell (CAR T cell).
40. The method of any one of claims 24-37, wherein the cell is from a normal solid tissue or a tumorigenic solid tissue.
41. The method of any one of claims 24-37, wherein the cell is an immortalized cell.
42. The method of any one of claims 24-41, wherein the target genomic locus is within a PDCD1 gene, a CTLA4 gene, a LAG3 gene, a TET2 gene, a BTLA gene, a HAVCR2 gene, a CCR5 gene, a CXCR4 gene, a TRA gene, a TRB gene, a B2M gene, an albumin gene, a HBB gene, a HBA1 gene, a TTR gene, a NR3C1 gene, a CD52 gene, an erythroid specific enhancer of the BCL11A gene, a CBLB gene, a TGFBR1 gene, a SERPINA1 gene, a HBV genomic DNA in infected cells, a CEP290 gene, a DMD gene, a CFTR gene, or an IL2RG gene.
43. The method of any one of claims 25-42, wherein the plurality of nucleic acid probes are 30-60 bases in length.
44. The method of any one of claims 25-43, wherein the plurality of nucleic acid probes comprise 20-200 probes having distinct sequences.
45. The method of any one of claims 25-44, wherein the plurality of nucleic acid probes bind to a 1 kilobase (kb) to 5 kb region comprising the target genomic locus.
46. The method of any one of claims 25-45, wherein when the absence of colocalization is detected, the method further comprises adjusting a parameter of the genome editing tool to improve specificity.
47. The method of claim 46, wherein the parameter is a sequence of the DNA binding domain or length of the DNA binding domain.
48. The method of claim 46, wherein the parameter is an amount of the genome editing tool introduced into the cell.
PCT/US2019/028200 2018-04-18 2019-04-18 Methods for assessing specificity of cell engineering tools WO2019204661A1 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
EP19788289.7A EP3781704A4 (en) 2018-04-18 2019-04-18 Methods for assessing specificity of cell engineering tools
CA3098427A CA3098427A1 (en) 2018-04-18 2019-04-18 Methods for assessing specificity of cell engineering tools
US17/047,456 US20210147922A1 (en) 2018-04-18 2019-04-18 Methods for assessing specificity of cell engineering tools

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US201862659664P 2018-04-18 2018-04-18
US62/659,664 2018-04-18
US201862690908P 2018-06-27 2018-06-27
US62/690,908 2018-06-27

Publications (1)

Publication Number Publication Date
WO2019204661A1 true WO2019204661A1 (en) 2019-10-24

Family

ID=68239932

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2019/028200 WO2019204661A1 (en) 2018-04-18 2019-04-18 Methods for assessing specificity of cell engineering tools

Country Status (4)

Country Link
US (1) US20210147922A1 (en)
EP (1) EP3781704A4 (en)
CA (1) CA3098427A1 (en)
WO (1) WO2019204661A1 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020106633A1 (en) * 2018-11-19 2020-05-28 Altius Institute For Biomedical Sciences Compositions and methods for detection of cleavage of genomic dna by a nuclease
US11661459B2 (en) 2020-12-03 2023-05-30 Century Therapeutics, Inc. Artificial cell death polypeptide for chimeric antigen receptor and uses thereof
TWI820630B (en) * 2021-04-28 2023-11-01 友達光電股份有限公司 Cell manipulation panel, cell isolation device and method of performing cell manipulation
US11883432B2 (en) 2020-12-18 2024-01-30 Century Therapeutics, Inc. Chimeric antigen receptor system with adaptable receptor specificity

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020129979A1 (en) * 2018-12-18 2020-06-25 富士フイルム株式会社 Image processing device, method, and program
DE102022131448A1 (en) 2022-11-28 2024-05-29 Carl Zeiss Microscopy Gmbh Method and apparatus for processing data to identify analytes
DE102022131449A1 (en) 2022-11-28 2024-05-29 Carl Zeiss Microscopy Gmbh Method and apparatus for processing data to identify analytes
DE102022131447A1 (en) 2022-11-28 2024-05-29 Carl Zeiss Microscopy Gmbh Method and apparatus for processing data to identify analytes
DE102022131444A1 (en) 2022-11-28 2024-05-29 Carl Zeiss Microscopy Gmbh Method for identifying analytes in an image sequence
DE102022131446A1 (en) 2022-11-28 2024-05-29 Carl Zeiss Microscopy Gmbh Method and apparatus for processing data to identify analytes
DE102022131450A1 (en) 2022-11-28 2024-05-29 Carl Zeiss Microscopy Gmbh Method and apparatus for processing data to identify analytes
DE102022131451A1 (en) 2022-11-28 2024-05-29 Carl Zeiss Microscopy Gmbh Method and device for determining a signal composition of signal sequences of an image sequence
DE102022131445A1 (en) 2022-11-28 2024-05-29 Carl Zeiss Microscopy Gmbh Method and apparatus for processing data to identify analytes

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140073520A1 (en) * 2011-12-23 2014-03-13 California Institute Of Technology Imaging chromosome structures by super-resolution fish with single-dye labeled oligonucleotides
WO2016061523A1 (en) * 2014-10-17 2016-04-21 Howard Hughes Medical Institute Genomic probes
WO2018017774A1 (en) * 2016-07-19 2018-01-25 Altius Institute For Biomedical Sciences Methods for fluorescence imaging microscopy and nano-fish

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200283743A1 (en) * 2016-08-17 2020-09-10 The Broad Institute, Inc. Novel crispr enzymes and systems

Non-Patent Citations (2)

Title
CHEN, B ET AL.: "Dynamic Imaging Of Genomic Loci In Living Human Cells By An Optimized CRISPR/Cas System", CELL, vol. 155, no. 7, 19 December 2013 (2013-12-19), pages 1 - 23, XP055181416 *
See also references of EP3781704A4 *

Cited By (4)

Publication number Priority date Publication date Assignee Title
WO2020106633A1 (en) * 2018-11-19 2020-05-28 Altius Institute For Biomedical Sciences Compositions and methods for detection of cleavage of genomic dna by a nuclease
US11661459B2 (en) 2020-12-03 2023-05-30 Century Therapeutics, Inc. Artificial cell death polypeptide for chimeric antigen receptor and uses thereof
US11883432B2 (en) 2020-12-18 2024-01-30 Century Therapeutics, Inc. Chimeric antigen receptor system with adaptable receptor specificity
TWI820630B (en) * 2021-04-28 2023-11-01 友達光電股份有限公司 Cell manipulation panel, cell isolation device and method of performing cell manipulation

Also Published As

Publication number Publication date
EP3781704A1 (en) 2021-02-24
CA3098427A1 (en) 2019-10-24
EP3781704A4 (en) 2021-12-15
US20210147922A1 (en) 2021-05-20

Similar Documents

Publication Publication Date Title
US20210147922A1 (en) Methods for assessing specificity of cell engineering tools
US20220397526A1 (en) Methods for Fluorescence Imaging Microscopy and Nano-Fish
Hsieh et al. Enhancer–promoter interactions and transcription are largely maintained upon acute loss of CTCF, cohesin, WAPL or YY1
Yarrington et al. Nucleosomes inhibit target cleavage by CRISPR-Cas9 in vivo
Cai et al. RIC-seq for global in situ profiling of RNA–RNA spatial interactions
Yan et al. BLISS is a versatile and quantitative method for genome-wide profiling of DNA double-strand breaks
Dequeker et al. MCM complexes are barriers that restrict cohesin-mediated loop extrusion
Liu et al. Multiplexed imaging of nucleome architectures in single cells of mammalian tissue
Lungu et al. Modular fluorescence complementation sensors for live cell detection of epigenetic signals at endogenous genomic sites
Hwang et al. Protein induced fluorescence enhancement as a single molecule assay with short distance sensitivity
Fraser et al. An overview of genome organization and how we got there: from FISH to Hi-C
EP3822362B1 (en) On-slide staining by primer extension
Townshend et al. High-throughput cellular RNA device engineering
Holtzer et al. Nucleic acid templated chemical reaction in a live vertebrate
Zhang et al. Resolving cadherin interactions and binding cooperativity at the single-molecule level
Gutierrez-Escribano et al. A conserved ATP-and Scc2/4-dependent activity for cohesin in tethering DNA molecules
Jayathilaka et al. Inhibition of the function of class IIa HDACs by blocking their interaction with MEF2
Koh et al. ATP-independent diffusion of double-stranded RNA binding proteins
McFadden et al. Biochemical methods to investigate lncRNA and the influence of lncRNA: protein complexes on chromatin
JP2020511648A (en) Analyte detection
You et al. Dynamic submicroscopic signaling zones revealed by pair correlation tracking and localization microscopy
Galli et al. DNA G-quadruplex recognition in vitro and in live cells by a structure-specific nanobody
Bauer et al. The application of aptamers for immunohistochemistry
Barshad et al. RNA polymerase II dynamics shape enhancer–promoter interactions
Hu et al. Chromatin tracing: imaging 3D genome and nucleome

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 19788289; Country of ref document: EP; Kind code of ref document: A1)
ENP Entry into the national phase (Ref document number: 3098427; Country of ref document: CA)
NENP Non-entry into the national phase (Ref country code: DE)
ENP Entry into the national phase (Ref document number: 2019788289; Country of ref document: EP; Effective date: 20201118)