WO2022173828A1 - Identification of cell types in multiplexed in situ images by combining expression profiles and spatial information - Google Patents

Identification of cell types in multiplexed in situ images by combining expression profiles and spatial information

Info

Publication number
WO2022173828A1
Authority
WO
WIPO (PCT)
Prior art keywords
cell
cells
imaging
tissue
multiplexed
Prior art date
Application number
PCT/US2022/015819
Other languages
English (en)
Inventor
Sylvia K. Plevritis
Weiruo ZHANG
Irene Li
Original Assignee
The Board Of Trustees Of The Leland Stanford Junior University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by The Board Of Trustees Of The Leland Stanford Junior University

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/60 Type of objects
    • G06V20/69 Microscopic objects, e.g. biological cells or cellular parts
    • G06V20/698 Matching; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/0002 Inspection of images, e.g. flaw detection
    • G06T7/0012 Biomedical image inspection
    • G06T7/0014 Biomedical image inspection using an image reference approach
    • G06T7/0016 Biomedical image inspection using an image reference approach involving temporal comparison
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/762 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using clustering, e.g. of similar faces in social networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10024 Color image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10056 Microscopic image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10064 Fluorescence image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20076 Probabilistic image processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30004 Biomedical image processing
    • G06T2207/30024 Cell structures in vitro; Tissue sections in vitro

Definitions

  • Clustering groups cells with similar protein expression, but it introduces bias through the choice of the number of clusters, and individual clusters may contain a mixture of cell types, thereby compromising single-cell resolution (Shekhar et al. Proc. Natl. Acad. Sci. U.S.A. 111, 202-207 (2014)).
  • Subjective manual assessment is still required to assign cell types to the clusters, and clusters with cell type mixtures demand particularly substantial manual input. Such manual interventions can limit the reproducibility of cell type identification and are time-consuming.
  • Both gating and clustering approaches use only protein expression profiles to identify cell types, without using the spatial information that can be obtained from the in situ images. Protein expression measurements are susceptible to technical artifacts introduced by the imaging platform (Kim et al. J. Opt. 15, 1-30 (2013); Rich et al. Anal. Bioanal. Chem. 405, 2065-2075 (2013)).
  • Methods, systems, and devices, including computer programs encoded on a computer storage medium are provided for identifying cell types of individual cells within a tissue using both expression profiles for cellular markers and spatial information.
  • a machine learning algorithm referred to as “CELESTA” is provided that automates identification of cell types in multiplexed in situ images.
  • a method for identifying cell types and spatial locations of cells within a tissue sample comprising: i) performing multiplexed in situ imaging of a plurality of cellular markers in the tissue to produce an image; ii) segmenting the image to generate a plurality of image segments, wherein each image segment contains an image of a single cell of the tissue, wherein the spatial location of the single cell in each image segment is quantified by X and Y coordinates, and wherein an expression profile of the single cell in each image segment is determined from the in situ imaging of the plurality of cellular markers by analysis of the segmented image; iii) comparing the expression profile of the single cell in each image segment to reference expression profiles for the plurality of cellular markers in an initial cell-type signature matrix, wherein the initial cell-type signature matrix defines a plurality of known cell types based on prior knowledge of cellular markers known to be expressed in specific cell types, wherein for each cell type, the initial cell-type signature matrix indicates whether a cellular
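
The comparison of a cell's expression profile against a binary cell-type signature matrix can be illustrated with a minimal Python sketch. It assumes that per-marker expression probabilities between 0 and 1 are already available for each segmented cell. The marker names, cell-type names, and the scoring rule (one minus the mean squared difference) are illustrative assumptions and not the scoring function of the application.

```python
from typing import Dict

# Hypothetical signature matrix: 1 = marker expected to be expressed,
# 0 = marker expected to be absent. All names are illustrative only.
SIGNATURE: Dict[str, Dict[str, int]] = {
    "T cell": {"CD3": 1, "CD31": 0, "Cytokeratin": 0},
    "Endothelial cell": {"CD3": 0, "CD31": 1, "Cytokeratin": 0},
    "Tumor cell": {"CD3": 0, "CD31": 0, "Cytokeratin": 1},
}


def match_scores(expr_prob: Dict[str, float]) -> Dict[str, float]:
    """Score each cell type as 1 minus the mean squared difference
    between the cell's marker probabilities and the signature row.
    This rule is an assumption made for illustration."""
    scores = {}
    for cell_type, row in SIGNATURE.items():
        squared = [(expr_prob[m] - expected) ** 2 for m, expected in row.items()]
        scores[cell_type] = 1.0 - sum(squared) / len(squared)
    return scores


# One segmented cell with made-up marker probabilities.
cell = {"CD3": 0.95, "CD31": 0.05, "Cytokeratin": 0.10}
scores = match_scores(cell)
best = max(scores, key=scores.get)
print(best, round(scores[best], 3))
```

A cell whose best score is not clearly separated from the others would, in an iterative scheme, be left unassigned for a later round rather than forced into a type.
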
  • the method further comprises performing filtering on the plurality of image segments to remove artifactual cell-like objects, such as, but not limited to, cellular debris misidentified as cells, adjacent cells merged in the same image segment, and auto-fluorescent non-cell objects.
  • the method further comprises identifying cell-type co-localization patterns based on the spatial locations of the cells in each image segment and the identified cell neighborhood cell types.
  • the method further comprises performing spatial analysis on the identified cell types to locate substructures within the tissue.
  • the method further comprises adjusting a bandwidth parameter to eliminate cells from the cell neighborhood of each index cell that are beyond a set distance from the index cell.
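
The effect of a bandwidth parameter on a cell neighborhood can be sketched as a simple distance filter. This is a minimal illustration that assumes Euclidean distance on the X and Y coordinates; the function name `neighbors_within` and the example coordinates are hypothetical.

```python
import math
from typing import List, Tuple


def neighbors_within(
    index_xy: Tuple[float, float],
    cells_xy: List[Tuple[float, float]],
    bandwidth: float,
) -> List[int]:
    """Return positions in cells_xy of cells no farther than `bandwidth`
    from the index cell. Cells at distance 0 (the index cell itself) are
    excluded."""
    ix, iy = index_xy
    return [
        k
        for k, (x, y) in enumerate(cells_xy)
        if 0 < math.hypot(x - ix, y - iy) <= bandwidth
    ]


# Made-up coordinates; the first entry is the index cell.
cells = [(0.0, 0.0), (3.0, 4.0), (10.0, 0.0), (0.0, 6.0)]
kept = neighbors_within(cells[0], cells, bandwidth=6.0)
print(kept)
```

Lowering the bandwidth shrinks the neighborhood, so fewer distant cells influence the index cell.
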
  • the method further comprises setting a user defined convergence limit for assigning the cell types to the index cells iteratively, wherein when the number of index cells without assigned cell types is smaller than the user defined convergence limit, the index cells without assigned cell types are assigned as having an unknown cell type.
  • the multiplexed in situ imaging comprises performing multiplexed antibody-based protein imaging.
  • imaging can be performed after multiplexed antibody staining, multiplexed DNA-tagged antibody staining, or multiplexed metal-isotope-tagged antibody staining of the tissue sample.
  • the imaging is performed with a labeled primary antibody or a labeled secondary antibody.
  • the labeled primary antibody or the labeled secondary antibody comprises a fluorescent label, a chromogenic label, or a metal- isotope label (e.g., a lanthanide).
  • Any suitable method known in the art may be used for imaging a tissue sample including, without limitation, fluorescence microscopy, confocal microscopy, two-photon microscopy, multiphoton microscopy, light-field microscopy, expansion microscopy, and light sheet microscopy.
  • Exemplary techniques for performing multiplexed in situ imaging of the tissue include, without limitation, multiplexed fluorescence imaging, multiplexed immunofluorescence imaging, multiplexed immunohistochemistry, multiplexed immunohistochemistry/immunofluorescence (mIHC/IF) imaging, multiplexed ion beam imaging (MIBI), mass spectrometry imaging (MSI), multi-epitope-ligand cartography (MELC), multiplex modified hapten-based imaging, co-detection by indexing (CODEX) imaging, and NanoString digital spatial profiling (DSP).
  • MIBI multiplexed ion beam imaging
  • MSI mass spectrometry imaging
  • MELC multi-epitope-ligand cartography
  • CODEX co-detection by indexing
  • the tissue sample is a biopsy or surgical tissue specimen.
  • the tissue sample may be a biopsy of a tumor or a resected tumor specimen.
  • the tissue sample is a thin tissue section (e.g., having a thickness of 5-20 μm).
  • the tissue sample is a thick tissue section (e.g., having a thickness of 50-200 μm).
  • the tissue sample comprises live tissue, fixed tissue, or permeabilized tissue.
  • a fixed tissue sample may be a formalin-fixed, paraffin-embedded (FFPE) tissue section.
  • FFPE formalin-fixed, paraffin-embedded
  • the method further comprises performing single-cell RNA-sequencing (scRNA-seq) analysis on a cell of interest in the tissue sample.
  • scRNA-seq single-cell RNA-sequencing
  • a computer implemented method for identifying cell types and spatial locations of cells within a tissue sample comprising: receiving a multiplexed in situ image of a plurality of cellular markers in the tissue sample; segmenting the image to generate a plurality of image segments, wherein each image segment contains an image of a single cell of the tissue, wherein the spatial location of the single cell in each image segment is quantified by X and Y coordinates, and wherein an expression profile of the single cell in each image segment is determined from the in situ imaging of the plurality of cellular markers by analysis of the segmented image; providing an initial cell-type signature matrix comprising reference expression profiling data for a plurality of known cell types, wherein the initial cell-type signature matrix defines a plurality of cell types based on prior knowledge of cellular markers known to be expressed in specific cell types, wherein for each cell type, the initial cell-type signature matrix indicates whether a cellular marker is expressed or not expressed in that cell type; using a marker scoring function to assess
  • the computer implemented method further comprises performing filtering on the plurality of image segments to remove artifactual cell-like objects, such as, but not limited to, cellular debris misidentified as cells, adjacent cells merged in the same image segment, and auto-fluorescent non-cell objects.
  • the computer implemented method further comprises identifying cell-type co-localization patterns based on the spatial locations of the cells in each image segment and the identified cell neighborhood cell types.
  • the computer implemented method further comprises performing spatial analysis on the identified cell types to locate substructures within the tissue sample.
  • the computer implemented method further comprises adjusting a bandwidth parameter to eliminate cells from the cell neighborhood of each index cell that are beyond a set distance from the index cell.
  • the computer implemented method further comprises setting a user defined convergence limit for assigning the cell types to the index cells iteratively, wherein when the number of index cells without assigned cell types is smaller than the user defined convergence limit, the index cells without assigned cell types are assigned as having an unknown cell type.
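
The role of the user-defined convergence limit can be sketched as the stopping rule of an iterative loop. In this minimal sketch, `try_assign` is a hypothetical stand-in for whatever scoring decides whether a cell can be typed in the current round, and the early exit when no progress is made is a safety stop added for the sketch, not something taken from the source.

```python
from typing import Callable, Dict, Iterable, Optional


def assign_iteratively(
    cell_ids: Iterable[int],
    try_assign: Callable[[int, Dict[int, str]], Optional[str]],
    convergence_limit: int,
    max_rounds: int = 100,
) -> Dict[int, str]:
    """Assign cell types round by round; stop once fewer than
    `convergence_limit` cells remain unassigned, then label the
    remainder as "unknown"."""
    labels: Dict[int, str] = {}
    unassigned = list(cell_ids)
    for _ in range(max_rounds):
        if len(unassigned) < convergence_limit:
            break
        still_unassigned = []
        for cell in unassigned:
            cell_type = try_assign(cell, labels)
            if cell_type is None:
                still_unassigned.append(cell)
            else:
                labels[cell] = cell_type
        if len(still_unassigned) == len(unassigned):
            break  # no progress in this round (safety stop of this sketch)
        unassigned = still_unassigned
    for cell in unassigned:
        labels[cell] = "unknown"
    return labels


def toy_rule(cell: int, labels: Dict[int, str]) -> Optional[str]:
    """Toy rule: cell 0 is an anchor; cells 1-7 can be typed once their
    predecessor has been typed; cells 8 and 9 can never be typed."""
    if cell == 0:
        return "T cell"
    if cell < 8 and (cell - 1) in labels:
        return "T cell"
    return None


labels = assign_iteratively(range(9, -1, -1), toy_rule, convergence_limit=3)
print(labels[7], labels[8], labels[9])
```

In the toy run, one additional cell is typed per round until only two cells remain, which is below the limit of three, so those two are labeled as unknown.
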
  • the computer implemented method further comprises displaying a listing of cell types identified in the tissue sample.
  • the computer implemented method further comprises displaying a listing of cell lineages identified in the tissue sample.
  • the computer implemented method further comprises displaying cell type labels superimposed on an image of the tissue based on the spatial location determined for the single cell in each image segment as quantified by the X and Y coordinates of the single cell.
  • the cell type labels are color coded to differentiate different cell types and/or different cell lineages.
  • the image of the tissue sample is produced by multiplexed fluorescence imaging, multiplexed immunofluorescence imaging, multiplexed immunohistochemistry, multiplexed immunohistochemistry/immunofluorescence (mIHC/IF) imaging, multiplexed ion beam imaging (MIBI), mass spectrometry imaging (MSI), multi-epitope-ligand cartography (MELC), multiplex modified hapten-based imaging, co-detection by indexing (CODEX) imaging, or NanoString digital spatial profiling (DSP).
  • DSP NanoString digital spatial profiling
  • a non-transitory computer-readable medium comprising program instructions that, when executed by a processor in a computer, cause the processor to perform a computer implemented method for identifying cell types and spatial locations of cells within a tissue sample, as described herein.
  • a kit comprising the non-transitory computer-readable medium and instructions for identifying cell types and spatial locations of cells within a tissue sample is provided.
  • a system for identifying cell types and spatial locations of cells within a tissue sample comprising: a processor programmed to identify cell types and spatial locations of cells within a tissue sample according to the computer implemented method described herein; and a display component for displaying information regarding the identified cell types and the spatial locations of the identified cells within the tissue sample.
  • the processor is provided by a computer or handheld device (e.g., a cell phone or tablet).
  • the system further comprises an imaging device to perform multiplexed in situ imaging of the tissue.
  • the imaging device is a fluorescence microscope, a confocal microscope, a laser-scanning microscope, a high-resolution laser ablation system, a mass spectrometer, a charge-coupled device (CCD), an active-pixel sensor (APS), or a CMOS sensor.
  • FIGS. 1A-1C Image analysis pipeline with CELESTA.
  • FIG. 1A Typical analysis pipeline for multiplexed in situ image data.
  • FIG. 1B Schematic illustration showing that CELESTA’s cell type assignment is an iterative process.
  • FIG. 1C CELESTA’s iterative assignments on an image tile from a real tissue sample.
  • FIGS. 2A-2F Overview of CELESTA.
  • FIG. 2A CELESTA flowchart.
  • FIG. 2B CELESTA’s inputs and preprocessing steps.
  • FIG. 2C Illustration of CELESTA’s marker-scoring function.
  • FIG. 2D Illustration of CELESTA’s spatial-scoring function, using spatial neighborhood information for each non-anchor cell i. The cell type information from the spatially nearest neighboring cells of cell i is derived using the energy function of the Potts model.
  • Each non-anchor cell C is associated with an unknown state S which is the cell type to be inferred. Cells are represented as nodes in an undirected graph with edges connecting N-nearest spatial neighbors.
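
The graph and Potts-model ideas in these captions can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the neighbor search is brute force, each cell uses its own nearest-neighbor list rather than a symmetrized edge set, and the energy is the standard Potts form (negative `beta` times the number of neighbors sharing the candidate type). The function names and the `beta` parameter are hypothetical.

```python
import math
from typing import Dict, List, Tuple


def nearest_neighbors(coords: List[Tuple[float, float]], n: int) -> List[List[int]]:
    """For every cell, return the indices of its n spatially nearest cells."""
    result = []
    for i, (xi, yi) in enumerate(coords):
        dists = sorted(
            (math.hypot(xj - xi, yj - yi), j)
            for j, (xj, yj) in enumerate(coords)
            if j != i
        )
        result.append([j for _, j in dists[:n]])
    return result


def potts_energy(
    cell: int,
    candidate_type: str,
    neighbors: List[List[int]],
    labels: Dict[int, str],
    beta: float = 1.0,
) -> float:
    """Standard Potts energy for giving `cell` the state `candidate_type`:
    lower (more negative) when more labeled neighbors share that state."""
    agree = sum(1 for j in neighbors[cell] if labels.get(j) == candidate_type)
    return -beta * agree


# Made-up coordinates; cell 0 is the non-anchor cell whose state is unknown.
coords = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
labels = {1: "T cell", 2: "T cell", 3: "Tumor cell"}
nbrs = nearest_neighbors(coords, n=2)
e_t = potts_energy(0, "T cell", nbrs, labels)
e_tumor = potts_energy(0, "Tumor cell", nbrs, labels)
print(nbrs[0], e_t, e_tumor)
```

Here the two nearest neighbors of cell 0 are both T cells, so the T cell state has the lower energy and is favored by the spatial term.
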
  • FIG. 2F Illustration of the cell-type resolution strategy used by CELESTA, based on our HNSCC imaging panel marker in parentheses. Abbreviations: Markov Random Field (MRF), cytokeratin (CK), natural killer (NK) cell, conventional dendritic cell (cDC), plasmacytoid dendritic cell (pDC).
  • MRF Markov Random Field
  • CK cytokeratin
  • NK natural killer
  • cDC conventional dendritic cell
  • pDC plasmacytoid dendritic cell
  • FIGS. 3A-3F CELESTA applied to a published CODEX dataset generated from a tissue microarray (TMA) of colorectal cancer primary samples (Schurch et al.).
  • TMA tissue microarray
  • FIG. 3A Representative TMA core with seven channel overlay CODEX image (Left Panel), image using CELESTA-assigned cell types (Middle Panel) and image using Schurch et al. annotated cell types (Right Panel).
  • FIG. 3B Cell type composition from CELESTA-assigned cell types versus Schurch et al. annotations, across the 70 cores of entire TMA.
  • a cell type is defined as rare if it has, on average, fewer than 100 cells per core.
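
The rarity criterion stated above (on average fewer than 100 cells per core) can be written as a short check. The function name and the per-core counts below are made up for illustration.

```python
from typing import Dict, List


def rare_cell_types(counts_per_core: List[Dict[str, int]], threshold: float = 100) -> List[str]:
    """Return cell types whose mean count per core is below `threshold`.
    A cell type missing from a core counts as zero cells in that core."""
    n_cores = len(counts_per_core)
    cell_types = {t for core in counts_per_core for t in core}
    means = {
        t: sum(core.get(t, 0) for core in counts_per_core) / n_cores
        for t in cell_types
    }
    return sorted(t for t, m in means.items() if m < threshold)


# Made-up counts for three cores.
cores = [
    {"Tumor cell": 900, "NK cell": 40},
    {"Tumor cell": 1100, "NK cell": 120},
    {"Tumor cell": 700},
]
rare = rare_cell_types(cores)
print(rare)
```
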
  • FIG. 3E CELESTA cell type assignments on a cluster which Schurch et al. annotated as a mixture of vasculature or immune cells. CELESTA cell type compositions are shown on the left panel and average canonical marker expressions for each cell type in the cluster are shown on the right panel.
  • FIG. 3F CELESTA cell type assignments on a cluster which Schurch et al. annotated as a mixture of tumor or immune cells.
  • CELESTA cell type compositions are shown on the left panel and average canonical marker expressions for each cell type in the cluster are shown on the right panel.
  • FIGS. 4A-4F CELESTA applied to CODEX data generated from fresh-frozen HNSCC primary tumor samples.
  • FIG. 4A CODEX image overlay (Left Panel) and CELESTA (Right Panel) for a primary tumor HNSCC sample associated with lymph node metastasis (N+).
  • FIG. 4B CODEX image overlay (Left Panel) and CELESTA (Right Panel) for a primary tumor sample of HNSCC not associated with lymph node metastasis (N0).
  • FIG. 4C Cell-type compositions from scRNA-seq data (Left Panel) and CELESTA inferred cell types on CODEX data (Middle Panel), by HNSCC sample.
  • scRNA-seq data were obtained from a tissue section proximal to the section used to generate CODEX data, for four patient samples. Correlation (Pearson correlation test) between CELESTA inferred cell compositions and scRNA-seq cell compositions on the same four samples (Right Panel).
  • FIG. 4D Adjusted Rand Index (ARI) to assess CELESTA’s performance against manual gating for each HNSCC sample. Error bars indicate standard deviations (SD) calculated based on 50 runs of random sampling and centers indicate mean values.
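
The Adjusted Rand Index comparison with repeated random sampling can be sketched as follows. The ARI formula is the standard pair-counting definition; the sampling scheme (drawing a fixed-size subset of cells without replacement in each of 50 runs) is an assumption, since the caption does not specify it, and the function names are hypothetical.

```python
import random
from collections import Counter
from math import comb
from statistics import mean, stdev
from typing import List, Sequence, Tuple


def adjusted_rand_index(a: Sequence, b: Sequence) -> float:
    """Standard pair-counting ARI between two labelings of the same cells.
    Requires at least two cells."""
    sum_ij = sum(comb(c, 2) for c in Counter(zip(a, b)).values())
    sum_a = sum(comb(c, 2) for c in Counter(a).values())
    sum_b = sum(comb(c, 2) for c in Counter(b).values())
    expected = sum_a * sum_b / comb(len(a), 2)
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. a single cluster in both labelings
    return (sum_ij - expected) / (max_index - expected)


def ari_over_samples(
    a: List, b: List, n_runs: int = 50, sample_size: int = 0, seed: int = 0
) -> Tuple[float, float]:
    """Mean and SD of the ARI over `n_runs` random subsets of cells.
    Defaults to subsets of half the cells (an assumption of this sketch)."""
    rng = random.Random(seed)
    k = sample_size or len(a) // 2
    scores = []
    for _ in range(n_runs):
        idx = rng.sample(range(len(a)), k)
        scores.append(adjusted_rand_index([a[i] for i in idx], [b[i] for i in idx]))
    return mean(scores), stdev(scores)


# Made-up labels: algorithm output vs. manual gating for 12 cells.
algo = ["T", "T", "T", "B", "B", "B", "Tumor", "Tumor", "Tumor", "T", "B", "Tumor"]
gate = ["T", "T", "B", "B", "B", "B", "Tumor", "Tumor", "T", "T", "B", "Tumor"]
ari_mean, ari_sd = ari_over_samples(algo, gate, n_runs=50, sample_size=8)
print(round(ari_mean, 3), round(ari_sd, 3))
```

An ARI of 1 indicates identical partitions, while values near 0 indicate agreement no better than chance.
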
  • FIG. 4E Correlation (Pearson correlation test) between CELESTA-inferred cell compositions and manual gating compositions.
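
A composition comparison of this kind can be sketched by turning each labeling into cell-type fractions and computing the Pearson correlation coefficient between the two fraction vectors. The labels below are made up, and the sketch computes only the coefficient, not the p-value of a correlation test.

```python
from collections import Counter
from math import sqrt
from typing import List, Sequence


def composition(labels: Sequence[str], cell_types: Sequence[str]) -> List[float]:
    """Fraction of cells of each type, in the order given by `cell_types`."""
    counts = Counter(labels)
    return [counts[t] / len(labels) for t in cell_types]


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


cell_types = ["Tumor cell", "T cell", "Endothelial cell"]
# Made-up labelings of ten cells by two approaches.
inferred = ["Tumor cell"] * 6 + ["T cell"] * 3 + ["Endothelial cell"] * 1
gated = ["Tumor cell"] * 5 + ["T cell"] * 4 + ["Endothelial cell"] * 1

x = composition(inferred, cell_types)
y = composition(gated, cell_types)
r = pearson_r(x, y)
print(x, y, round(r, 3))
```
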
  • FIGS. 5A-5G Spatial pairwise cell-type co-localization analysis based on CELESTA-identified cell types in the HNSCC study cohort.
  • FIG. 5A Schematic representation of two different pairwise cell-type spatial patterns: low pairwise cell-type co-localization (Left Panel) and high pairwise cell-type colocalization (Right Panel).
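
Low versus high pairwise co-localization can be made concrete with a simple score. The metric below (the fraction of cells of one type that have at least one cell of another type within a given radius) is an illustrative assumption and is not stated in the source; the function name and coordinates are hypothetical.

```python
import math
from typing import List, Tuple

Cell = Tuple[float, float, str]  # (x, y, cell_type)


def colocalization_fraction(cells: List[Cell], type_a: str, type_b: str, radius: float) -> float:
    """Fraction of type_a cells with at least one type_b cell within `radius`.
    Assumes type_a and type_b are different cell types."""
    a_cells = [(x, y) for x, y, t in cells if t == type_a]
    b_cells = [(x, y) for x, y, t in cells if t == type_b]
    if not a_cells:
        return 0.0
    hits = sum(
        1
        for ax, ay in a_cells
        if any(math.hypot(ax - bx, ay - by) <= radius for bx, by in b_cells)
    )
    return hits / len(a_cells)


# Made-up cells: one Treg sits next to a tumor cell, the other does not.
cells = [
    (0.0, 0.0, "Treg"),
    (1.0, 0.0, "Tumor cell"),
    (10.0, 10.0, "Treg"),
    (50.0, 50.0, "Tumor cell"),
]
score = colocalization_fraction(cells, "Treg", "Tumor cell", radius=2.0)
print(score)
```

A value near 0 corresponds to the low co-localization pattern and a value near 1 to the high co-localization pattern.
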
  • FIG. 5C Graphical illustration of inferred spatial architectural differences of cell-cell co-localizations in N0 samples (Top Panel) vs N+ samples (Bottom Panel). Created with BioRender.
  • FIG. 5D Representative regions of an N0 sample (Left Panel) and an N+ sample (Right Panel) depicted as three-color overlay images with FOXP3 (yellow), Cytokeratin (blue) and CD8 (magenta).
  • FIG. 5E Representative regions for an N0 sample (Left) and an N+ sample (Right) depicted as four-color overlay images with Cytokeratin (red), CD4 (green), CD8 (blue) and CD31 (magenta).
  • FIG. 5F Representative HNSCC TMA cores for N0 and N+ patients depicted as overlay images with Cytokeratin and FOXP3 staining. Each TMA core is approximately 1 mm in diameter.
  • Center line of box plot defines data median, top value indicates largest value within 1.5 times interquartile range above 75th percentile, bottom value indicates smallest value within 1.5 times interquartile range below 25th percentile, and upper and lower bounds of the box plot indicate 75th and 25th percentile respectively.
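
The box plot conventions described above can be reproduced numerically. This minimal sketch uses Python's `statistics.quantiles` with the inclusive method for the quartiles; the choice of quartile method is an assumption, since several conventions exist and the caption does not name one.

```python
from statistics import median, quantiles
from typing import Dict, Sequence


def boxplot_stats(values: Sequence[float]) -> Dict[str, float]:
    """Median, quartiles, and whiskers at the most extreme data points
    lying within 1.5 times the interquartile range of the box."""
    data = sorted(values)
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    upper = max(v for v in data if v <= q3 + 1.5 * iqr)
    lower = min(v for v in data if v >= q1 - 1.5 * iqr)
    return {
        "median": median(data),
        "q1": q1,
        "q3": q3,
        "lower_whisker": lower,
        "upper_whisker": upper,
    }


# Made-up data with one outlier (100) that falls beyond the upper whisker.
stats = boxplot_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])
print(stats)
```
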
  • FIGS. 6A-6J scRNA-seq analysis guided by spatial biology reveals cell-cell interactions unique to primary HNSCC associated with lymph node metastasis.
  • FIG. 6A UMAP of identified cell type clusters using HNSCC scRNA-seq data.
  • FIG. 6B UMAP of malignant cells (Cluster 11) with node status (Left) and CXCL10 expression (Right).
  • FIG. 6E Graphical illustration shows cell-cell crosstalk with identified chemokine ligand-receptor pairs mediating cellular spatial co-localization in N+ samples. Created with BioRender.
  • FIG. 6H Schematic workflow of in vivo experiments. Created with BioRender.
  • Center line of box plot defines data median, top value indicates largest value within 1.5 times interquartile range above 75th percentile, bottom value indicates smallest value within 1.5 times interquartile range below 25th percentile, and upper and lower bounds of the box plot indicate 75th and 25th percentile respectively.
  • FIGS. 7A-7E Example illustrating differences on tumor cell identification between CELESTA and Schurch et al. annotations on the colorectal cancer dataset.
  • FIG. 7A Nuclei staining of one example core region (032).
  • FIG. 7B Cytokeratin staining of region 032.
  • FIG. 7C Tumor cells identified in Schurch et al. annotations of region 032 (yellow crosses) overlaid on cytokeratin staining.
  • FIG. 7D Tumor cells identified by CELESTA of region 032 (yellow crosses) overlaid on cytokeratin staining.
  • FIGS. 8A-8L Manual assessment of CELESTA identified cell types on an example HNSCC sample. Identified cells are shown as crosses using the X and Y coordinates overlaid on canonical marker staining (white) CODEX images. For each cell type, nuclei staining and three example markers (positive and negative) important for the cell type are shown.
  • FIG. 9 Gating strategies on the head and neck squamous cell carcinoma (HNSCC) samples. Gating strategies used to identify key cell types relevant to the HNSCC study including malignant cells, endothelial cells and subtypes of T cells.
  • HNSCC head and neck squamous cell carcinoma
  • FIGS. 10A-10G Additional scRNA-seq analysis.
  • FIG. 10A UMAP plot of identified cell type clusters with node status.
  • FIGS. 10B-10C UMAP plots of FOXP3, IL2RA, CXCR3, CD4 and CD8A.
  • FIG. 10D CXCR3 expression in different T cell clusters, showing that CXCR3 is differentially expressed only in Treg cells.
  • FIG. 10E Violin plot of STAT1 expression in the Treg cell cluster between N+ and N0 samples. STAT1 is a CXCR3 inducer.
  • FIG. 10F Violin plot of CXCL9 and CXCL11 in the malignant cell cluster between N+ and N0 samples.
  • CXCL9 and CXCL11 are both ligands of CXCR3, but they are not differentially expressed in our data.
  • FIG. 10G Heatmap showing expression of CD274 (PD-L1), MUC1, EMT markers (CDH1 and VIM) and stemness markers (CD44 and CD24).
  • PD-L1 CD274
  • EMT markers CDH1 and VIM
  • stemness markers CD44 and CD24
  • FIGS. 11A-11H Additional scRNA-seq analysis using public domain data from Puram et al. (2017).
  • FIG. 11A UMAP plot of identified cell type clusters.
  • FIG. 11B UMAP plot of identified cell type clusters with node status.
  • FIGS. 11C-11F UMAP plots of CD4, CD8A, FOXP3, and IL2RA.
  • FIG. 11G UMAP plot of CXCR3 and violin plots of CXCR3 in the T cell clusters.
  • FIG. 11H Violin plot of CXCL10 in malignant cell cluster 0. *: adjusted p-value < 0.05, **: adjusted p-value < 0.01, ***: adjusted p-value < 0.005, ****: adjusted p-value < 0.001.
  • FIG. 12 Example gating strategies used for mouse model studies.

DETAILED DESCRIPTION OF THE INVENTION

  • Methods, systems, and devices, including computer programs encoded on a computer storage medium are provided for identifying cell types of individual cells within a tissue using both expression profiles for cellular markers and spatial information.
  • a machine learning algorithm referred to as “CELESTA” is provided that automates identification of cell types in multiplexed in situ images.
  • a cell includes a plurality of such cells and reference to “the fluorophore” includes reference to one or more fluorophores and equivalents thereof, e.g., fluorescent dyes, fluorescent labels, and the like, known to those skilled in the art, and so forth.
  • peptide oligopeptide
  • polypeptide protein
  • amino acid polymers in which one or more amino acid residue is an artificial chemical mimetic of a corresponding naturally occurring amino acid, as well as to naturally occurring amino acid polymers and non-naturally occurring amino acid polymers. Both full-length proteins and fragments thereof are encompassed by the definition.
  • the terms also include post-expression modifications of the polypeptide, for example, phosphorylation, glycosylation, acetylation, hydroxylation, oxidation, and the like as well as chemically or biochemically modified or derivatized amino acids and polypeptides having modified peptide backbones.
  • the terms also include fusion proteins, including, but not limited to, fusion proteins with a heterologous amino acid sequence, fusions with heterologous and homologous leader sequences, with or without N-terminal methionine residues; immunologically tagged proteins; and the like.
  • the terms include polypeptides including one or more of a fatty acid moiety, a lipid moiety, a sugar moiety, and a carbohydrate moiety.
  • isolated, when referring to a protein, polypeptide, or peptide, means that the indicated molecule is separate and discrete from the whole organism with which the molecule is found in nature or is present in the substantial absence of other biological macromolecules of the same type.
  • isolated with respect to a polynucleotide is a nucleic acid molecule devoid, in whole or part, of sequences normally associated with it in nature; or a sequence, as it exists in nature, but having heterologous sequences in association therewith; or a molecule disassociated from the chromosome.
  • vertebrate any member of the subphylum Chordata, including, without limitation, humans and other primates, including non-human primates such as chimpanzees and other apes and monkey species; farm animals such as cattle, sheep, pigs, goats and horses; domestic mammals such as dogs and cats; laboratory animals including rodents such as mice, rats and guinea pigs; birds, including domestic, wild and game birds such as chickens, turkeys and other gallinaceous birds, ducks, geese, and the like.
  • the term does not denote a particular age. Thus, both adult and newborn individuals are intended to be covered.
  • the terms “specific binding,” “specifically binds,” and the like, refer to non-covalent or covalent preferential binding to a molecule relative to other molecules or moieties in a solution or reaction.
  • the affinity of one molecule for another molecule to which it specifically binds is characterized by a K_D (dissociation constant) of 10⁻⁵ M or less (e.g., 10⁻⁶ M or less, 10⁻⁷ M or less, 10⁻⁸ M or less, 10⁻⁹ M or less, 10⁻¹⁰ M or less, 10⁻¹¹ M or less, 10⁻¹² M or less).
  • affinity refers to the strength of binding, increased binding affinity being correlated with a lower K_D.
  • affinity is determined by surface plasmon resonance (SPR), e.g., as used by Biacore systems.
  • SPR surface plasmon resonance
  • the affinity of one molecule for another molecule is determined by measuring the binding kinetics of the interaction, e.g., at 25°C.
  • tumor refers to a cell or population of cells whose growth, proliferation or survival is greater than growth, proliferation or survival of a normal counterpart cell, e.g., a cell proliferative, hyperproliferative or differentiative disorder. Typically, the growth is uncontrolled.
  • malignancy refers to invasion of nearby tissue.
  • metastasis or a secondary, recurring or recurrent tumor, cancer or neoplasia refers to spread or dissemination of a tumor, cancer or neoplasia to other sites, locations or regions within the subject, in which the sites, locations or regions are distinct from the primary tumor or cancer.
  • Neoplasia, tumors and cancers include benign, malignant, metastatic and non-metastatic types, and include any stage (I, II, III, IV or V) or grade (G1, G2, G3, etc.) of neoplasia, tumor, or cancer, or a neoplasia, tumor, cancer or metastasis that is progressing, worsening, stabilized or in remission.
  • carcinomas such as squamous cell carcinoma, adenocarcinoma, adenosquamous carcinoma, anaplastic carcinoma, large cell carcinoma, and small cell carcinoma
  • cancers such as, but are not limited to, pancreatic cancer, lung cancer (non-small cell lung cancer, small cell lung cancer), gastric cancer, ovarian cancer, endometrial cancer, colorectal cancer, oral cancer, skin cancer, cholangiocarcinoma, head and neck cancer, breast cancer, ovarian cancer, melanoma, peripheral neuroma, glioblastoma, adrenocortical carcinoma, AIDS-related lymphoma, anal cancer, bladder cancer, meningioma, glioma, astrocytoma, cervical cancer, chronic myeloproliferative disorders, colon cancer, endometrial cancer, ependymoma, esophage
  • the terms “detectable label”, “detection agent”, “diagnostic agent”, and “detectable moiety” are used interchangeably and refer to a molecule or substance capable of detection, including, but not limited to, fluorescers, chemiluminescers, chromophores, bioluminescent proteins, enzymes, enzyme substrates, enzyme cofactors, enzyme inhibitors, isotopic labels, semiconductor nanoparticles, dyes, metal ions, metal sols, ligands (e.g., biotin, streptavidin or haptens) and the like.
  • fluorescer refers to a substance or a portion thereof which is capable of exhibiting fluorescence in the detectable range.
  • detectable labels which may be used in the practice of the invention include isotopic labels, including radioactive and non-radioactive isotopes, such as ³H, ²H, ¹²⁰I, ¹²³I, ¹²⁴I, ¹²⁵I, ¹³¹I, ³⁵S, ¹¹C, ¹³C, ¹⁴C, ³²P, ¹⁵N, ¹³N, ¹¹⁰In, ¹¹¹In, ¹⁷⁷Lu, ¹⁸F, ⁵²Fe, ⁶²Cu, ⁶⁴Cu, ⁶⁷Cu, ⁶⁷Ga, ⁶⁸Ga, ⁸⁶Y, ⁹⁰Y, ⁸⁹Zr, ⁹⁴ᵐTc, ⁹⁴Tc, ⁹⁹ᵐTc, ¹⁵⁴Gd, ¹⁵⁵Gd, ¹⁵⁶Gd, ¹⁵⁷Gd, ¹⁵⁸Gd, ¹⁵O, ¹⁸⁶Re, ¹⁸⁸Re, ⁵¹Mn, ⁵²ᵐMn, ⁵⁵Co, ⁷²As, ⁷⁵Br, ⁷⁶Br, ⁸²ᵐRb, and ⁸³Sr.
  • detectable labels may comprise positron-emitting radionuclides suitable for PET imaging such as, but not limited to, ⁶⁴Cu, ⁸⁹Zr, ⁶⁸Ga, ¹⁷⁷Lu, ⁸²Rb, ¹¹C, ¹³N, ¹⁵O, and ¹⁸F; or gamma-emitting radionuclides suitable for single photon emission computed tomography (SPECT) imaging such as, but not limited to, ⁶⁷Ga, ⁹⁹ᵐTc, ¹²³I, and ¹³¹I.
  • Detectable labels may also include lanthanide isotopes suitable for multiplexed ion beam imaging (MIBI) such as, but not limited to, ¹³⁹La, ¹⁴³Nd, ¹⁴⁷Sm, ¹⁵⁴Sm, ¹⁵⁸Gd, ¹⁶²Dy, ¹⁶⁶Er, ¹⁶⁸Er, and ¹⁷⁶Yb.
  • Detectable labels may also include non-radioactive, paramagnetic metal ions suitable for magnetic resonance imaging (MRI) such as, but not limited to, Mn²⁺, Fe³⁺, Fe²⁺, Gd³⁺, Ti²⁺, Cr³⁺, Co²⁺, Ni²⁺, and Cu²⁺.
  • Detectable labels may also include fluorophores including, without limitation, SYBR green, SYBR gold, a CAL Fluor dye such as CAL Fluor Gold 540, CAL Fluor Orange 560, CAL Fluor Red 590, CAL Fluor Red 610, and CAL Fluor Red 635, a Quasar dye such as Quasar 570, Quasar 670, and Quasar 705, an Alexa Fluor such as Alexa Fluor 350, Alexa Fluor 488, Alexa Fluor 546, Alexa Fluor 555, Alexa Fluor 594, Alexa Fluor 647, Alexa Fluor 660, Alexa Fluor 700, Alexa Fluor 750, Alexa Fluor 784, and Alexa Fluor 790, a cyanine dye such as Cy3, Cy3.5, Cy5, Cy5.5, and Cy7, IRDye dyes such as IRDye 800CW, IRDye 680RD, IRDye 700, IRDye 750, and IRDy
  • Enzyme tags are used with their cognate substrate.
  • the terms also include chemiluminescent labels such as luminol, isoluminol, acridinium esters, and peroxyoxalate and bioluminescent proteins such as firefly luciferase, bacterial luciferase, Renilla luciferase, and aequorin.
  • microspheres with xMAP technology produced by Luminex (Austin, TX)
  • microspheres containing quantum dot nanocrystals, for example, containing different ratios and combinations of quantum dot colors e.g., Qdot nanocrystals produced by Life Technologies (Carlsbad, CA)
  • glass-coated metal nanoparticles (see, e.g., SERS nanotags produced by Nanoplex Technologies, Inc.)
  • SonoVue microbubbles comprising sulfur hexafluoride
  • Optison microbubbles comprising an albumin shell and octafluoropropane gas core
  • Levovist microbubbles comprising a lipid/galactose shell and an air core
  • Perflexane lipid microspheres comprising perfluorocarbon microbubbles
  • Perflutren lipid microspheres comprising octafluoropropane encapsulated in an outer lipid shell
  • magnetic resonance imaging (MRI) contrast agents e.g., gadodiamide, gadobenic acid, gadopentetic acid, gadoteridol, gadofosveset, gadoversetamide, gadoxetic acid
  • radiocontrast agents such as for computed tomography (CT), radiography, or fluoroscopy (e.g., diatrizoic acid, metrizoic acid, iodamide, iotalamic acid,
  • a “ligand” or “binding agent” is any molecule that specifically binds to a cellular marker or other target.
  • the ligand or binding agent is a molecule that selectively binds to a target analyte of interest (e.g., cellular marker) with high binding affinity.
  • by high binding affinity is meant a binding affinity of at least about 10⁻⁴ M, usually at least about 10⁻⁶ M or higher, e.g., 10⁻⁹ M or higher.
  • the binding agent may be any of a variety of different types of molecules, as long as it exhibits the requisite binding affinity for the target analyte when conjugated to a detectable label (e.g., fluorophore, chromophore, or isotope).
  • the binding agent has medium or even low affinity for its target analyte, e.g., less than about 10⁻⁴ M.
  • the binding agent or ligand may be a small molecule or large molecule.
  • by small molecule is meant a molecule having a size of less than 10,000 daltons, usually ranging in size from about 50 to about 5,000 daltons, and more usually from about 100 to about 1000 daltons in molecular weight.
  • by large molecule is meant a molecule having a size of more than 10,000 daltons in molecular weight.
  • a small molecule binding agent or ligand may be any molecule, as well as binding portion or fragment thereof, that is capable of binding with the requisite affinity to the target analyte of interest (e.g., cellular marker).
  • the small molecule is a small organic molecule that is capable of binding to the target analyte of interest.
  • the small molecule will include one or more functional groups necessary for structural interaction with the target analyte, e.g., groups necessary for hydrophobic, hydrophilic, electrostatic or even covalent interactions.
  • the drug moiety will include functional groups necessary for structural interaction with proteins, such as hydrogen bonding, hydrophobic-hydrophobic interactions, electrostatic interactions, etc., and will typically include at least an amine, amide, sulfhydryl, carbonyl, hydroxyl or carboxyl group, preferably at least two of the functional chemical groups.
  • the small molecule will also comprise a region that may be modified and/or participate in conjugation to a fluorophore, without substantially adversely affecting the small molecule's ability to bind to its target analyte.
  • Small molecule ligands may comprise cyclical carbon or heterocyclic structures and/or aromatic or polyaromatic structures substituted with one or more of the above functional groups.
  • Small molecule ligands may also include organic compounds comprising alkyl groups (including alkanes, alkenes, alkynes and heteroalkyl), aryl groups (including arenes and heteroaryl), alcohols, ethers, amines, aldehydes, ketones, acids, esters, amides, cyclic compounds, heterocyclic compounds (including purines, pyrimidines, benzodiazepines, beta-lactams, tetracyclines, cephalosporins, and carbohydrates), steroids (including estrogens, androgens, cortisone, ecdysone, etc.), alkaloids (including ergots, vinca, curare, pyrrolizidine, and mitomycins), organometallic compounds, hetero-atom bearing compounds, amino acids, and nucleosides.
  • the small molecule may be derived from a naturally occurring or synthetic compound that may be obtained from a wide variety of sources, including libraries of synthetic or natural compounds. For example, numerous means are available for random and directed synthesis of a wide variety of organic compounds and biomolecules, including the preparation of randomized oligonucleotides and oligopeptides. Alternatively, libraries of natural compounds in the form of bacterial, fungal, plant and animal extracts are available or readily produced. Additionally, natural or synthetically produced libraries and compounds are readily modified through conventional chemical, physical and biochemical means, and may be used to produce combinatorial libraries. Known small molecules may be subjected to directed or random chemical modifications, such as acylation, alkylation, esterification, amidification, etc. to produce structural analogs.
  • the small molecule may be obtained from a library of naturally occurring or synthetic molecules, including a library of compounds produced through combinatorial means, i.e., a compound diversity combinatorial library.
  • the small molecule employed will have demonstrated some desirable affinity for the protein target in a convenient binding affinity assay.
  • Combinatorial libraries, as well as methods for the production and screening, are known in the art and described in: U.S. Pat. Nos.
  • Small molecule ligands may also include known drugs that selectively bind to receptors on cells, including, without limitation, growth factor receptors, receptor tyrosine kinases, receptor protein serine/threonine kinases, G-protein coupled receptors, cytokine receptors, lectin receptors, and folate receptors.
  • anti-cancer drugs that bind to such cellular receptors may be used as ligands to target fluorophores to cancer cells.
  • Exemplary drugs that may be used as ligands to target cancer cells include, without limitation, Acitinib, Afatinib, Axitinib, Erlotinib, Cabozantinib, Crizotinib, Gefitinib, Imatinib, Ibrutinib, Lapatinib, Neovastat, Nilotinib, Pazopanib, Perifosine, Ponatinib, Regorafenib, Sorafenib, Sunitinib, Trametinib, and Vandetanib.
  • the ligand or binding agent can also be a large molecule.
  • large molecule binding agents are antibodies, as well as binding fragments and mimetics thereof.
  • peptoids and aptamers are also suitable for use as binding agents.
  • the ligand or binding agent may include a domain or moiety that can be covalently attached to a detectable label without substantially abolishing the binding affinity for its target analyte (e.g., cellular marker).
  • antibody encompasses monoclonal antibodies as well as hybrid antibodies, altered antibodies, chimeric antibodies, and humanized antibodies.
  • the term antibody includes: hybrid (chimeric) antibody molecules (see, for example, Winter et al. (1991) Nature 349:293-299; and U.S. Pat. No. 4,816,567); F(ab') and F(ab) fragments; Fv molecules (noncovalent heterodimers, see, for example, Inbar et al. (1972) Proc Natl Acad Sci USA 69:2659-2662; and Ehrlich et al.
  • Fv is an antibody fragment which contains an antigen-recognition and -binding site. This region consists of a dimer of one heavy- and one light-chain variable domain in tight, non-covalent association. It is in this configuration that the three CDRs of each variable domain interact to define an antigen-binding site on the surface of the VH-VL dimer. Collectively, the six CDRs confer antigen binding specificity to the antibody. However, even a single variable domain (or half of an Fv comprising only three CDRs specific for an antigen) has the ability to recognize and bind antigen, although often at a lower affinity than the entire binding site.
  • Single-chain Fv or “scFv” antibody fragments comprise the VH and VL domains of an antibody, wherein these domains are present in a single polypeptide chain.
  • the Fv polypeptide further comprises a polypeptide linker between the VH and VL domains which enables the scFv to form the desired structure for antigen binding.
  • for a review of scFv see, for example, Pluckthun, A. in The Pharmacology of Monoclonal Antibodies, vol. 113, Rosenburg and Moore eds., Springer-Verlag, New York, pp. 269-315 (1994).
  • diabodies refers to small antibody fragments with two antigen-binding sites, which fragments comprise a heavy-chain variable domain (VH) connected to a light-chain variable domain (VL) on the same polypeptide chain (VH-VL).
  • the domains are forced to pair with the complementary domains of another chain and create two antigen-binding sites.
  • Diabodies are described more fully in, for example, EP 404,097; WO 93/11161; and Holliger et al., (1993) Proc. Natl. Acad. Sci. USA, 90: 6444-6448.
  • affibody molecule refers to a molecule that consists of three alpha helices with 58 amino acids and has a molar mass of about 6 kDa.
  • a monoclonal antibody, for comparison, is 150 kDa, and a single-domain antibody, the smallest type of antigen-binding antibody fragment, 12-15 kDa.
  • the phrase "specifically (or selectively) binds" with reference to binding of an antibody or other binding agent to an antigen or analyte refers to a binding reaction that is determinative of the presence of the antigen or analyte in a heterogeneous population of proteins and other biologics.
  • an antigen or analyte e.g., cellular marker such as a tumor-marker or immune activation marker
  • the specified antibodies or other binding agents bind to a particular antigen or analyte at a level at least two times the background and do not substantially bind in a significant amount to other molecules present in the sample.
  • Specific binding to an antigen or analyte under such conditions may require an antibody or other binding agent that is selected for its specificity for a particular antigen or analyte.
  • antibodies raised to an antigen from specific species such as rat, mouse, or human can be selected to obtain only those antibodies that are specifically immunoreactive with the antigen and not with other proteins, except for polymorphic variants and alleles. This selection may be achieved by subtracting out antibodies that cross-react with molecules from other species.
  • a variety of immunoassay formats may be used to select antibodies specifically immunoreactive with a particular antigen.
  • solid-phase ELISA immunoassays are routinely used to select antibodies specifically immunoreactive with a protein (see, e.g., Harlow & Lane, Antibodies, A Laboratory Manual (1988), for a description of immunoassay formats and conditions that can be used to determine specific immunoreactivity).
  • a specific or selective reaction will be at least twice background signal or noise and more typically more than 10 to 100 times background.
  • conjugated refers to the joining by covalent or noncovalent means of two compounds or agents (e.g., binding agent specific for a cellular marker conjugated to a fluorophore or other detectable label).
  • the system may include: a processor programmed to identify cell types and spatial locations of cells within a tissue sample based on analysis of a segmented multiplexed in situ image of the tissue; and a display component for displaying information regarding the identified cell types and the spatial locations of the identified cells within the tissue sample.
  • the system may also comprise one or more graphic boards for processing and outputting graphical information of a tissue image to the display component.
  • the system further comprises an imaging device to perform multiplexed in situ imaging of a tissue.
  • the imaging device may include, without limitation, a fluorescence microscope, a confocal microscope, a laser-scanning microscope, a high-resolution laser ablation system, a mass spectrometer, a charge-coupled device (CCD), an active-pixel sensor (APS), or a CMOS sensor.
  • the system may also include reagents for performing multiplexed in situ imaging of the tissue such as detectably labeled binding agents that specifically bind to cellular markers of interest in the tissue sample.
  • the system comprises reagents for performing multiplexed fluorescence imaging, multiplexed immunofluorescence imaging, multiplexed immunohistochemistry, multiplexed immunohistochemistry/immunofluorescence (mIHC/IF) imaging, multiplexed ion beam imaging (MIBI), mass spectrometry imaging (MSI), multi-epitope-ligand cartography (MELC), multiplex modified hapten-based imaging, or co-detection by indexing (CODEX) imaging.
  • the system may include fluorescently labeled antibodies, DNA-tagged antibodies, or metal-isotope-tagged antibodies that specifically bind to cellular markers of interest.
  • a computer implemented method for analyzing a multiplexed in situ image of a tissue to identify cell types and spatial locations of cells within the tissue.
  • the processor may be programmed to perform steps of the computer implemented method comprising: receiving a multiplexed in situ image of a plurality of cellular markers in a tissue sample; segmenting the image to generate a plurality of image segments, wherein each image segment contains an image of a single cell of the tissue, wherein the spatial location of the single cell in each image segment is quantified by X and Y coordinates, and wherein an expression profile of the single cell in each image segment is determined from the in situ imaging of the plurality of cellular markers by analysis of the segmented image; providing an initial cell-type signature matrix comprising reference expression profiling data for a plurality of known cell types, wherein the initial cell-type signature matrix defines a plurality of known cell types based on prior knowledge of cellular markers known to be expressed in specific cell types, wherein for each cell type, the initial cell-type signature
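The inputs named in the step above (per-cell X and Y coordinates, a per-cell expression profile, and an initial cell-type signature matrix built from prior marker knowledge) can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed method: the marker panel, the cell-type names, the 1/0 encoding of the signature matrix, and the use of Pearson correlation as a matching score are all hypothetical choices made for the example.

```python
from math import sqrt

# Hypothetical marker panel (illustrative only, not taken from the source).
MARKERS = ["CD3", "CD8", "CD20", "PanCK"]

# Assumed encoding of the initial signature matrix:
# 1 = marker expected to be expressed, 0 = marker expected to be absent.
SIGNATURE = {
    "T cell":     [1, 1, 0, 0],
    "B cell":     [0, 0, 1, 0],
    "tumor cell": [0, 0, 0, 1],
}

def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)

def best_match(expression):
    """Return the cell type whose signature row correlates best with the profile."""
    scores = {name: pearson(expression, row) for name, row in SIGNATURE.items()}
    return max(scores, key=scores.get)

# One segmented cell: spatial location plus a normalized expression profile.
cell = {"x": 120.5, "y": 88.0, "expression": [0.9, 0.8, 0.1, 0.05]}
label = best_match(cell["expression"])
```

The expression-only score shown here ignores the spatial information that the method also uses; it only illustrates how a signature matrix and a per-cell profile can be compared.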
  • the computer implemented method further comprises performing filtering on the plurality of image segments to remove artifactual cell-like objects, such as, but not limited to, cellular debris misidentified as cells, adjacent cells merged in the same image segment, and auto-fluorescent non-cell objects.
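One way such a filtering step could look is sketched below. The source does not state the filtering criteria, so the three rules used here (a minimum area for debris, a maximum area for merged cells, and uniformly high signal in every channel for auto-fluorescent objects) and all threshold values are assumptions chosen purely for illustration.

```python
def filter_segments(segments, min_area=30.0, max_area=400.0, autofluor_floor=0.8):
    """Drop segments that look like artifacts under the assumed rules.

    All thresholds are hypothetical defaults, not values from the source.
    """
    kept = []
    for seg in segments:
        if seg["area"] < min_area:
            continue  # too small: treated as cellular debris
        if seg["area"] > max_area:
            continue  # too large: treated as adjacent cells merged in one segment
        if min(seg["expression"]) > autofluor_floor:
            continue  # bright in every channel: treated as an auto-fluorescent object
        kept.append(seg)
    return kept

segments = [
    {"id": 1, "area": 120.0, "expression": [0.9, 0.1, 0.0]},
    {"id": 2, "area": 8.0, "expression": [0.2, 0.1, 0.0]},
    {"id": 3, "area": 900.0, "expression": [0.7, 0.6, 0.1]},
    {"id": 4, "area": 150.0, "expression": [0.95, 0.9, 0.92]},
]
kept_ids = [seg["id"] for seg in filter_segments(segments)]
```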
  • the computer implemented method further comprises identifying cell-type co-localization patterns based on the spatial locations of the cells in each image segment and the identified cell neighborhood cell types.
  • the computer implemented method further comprises performing spatial analysis on the identified cell types to locate substructures within the tissue sample.
  • the computer implemented method further comprises adjusting a bandwidth parameter to eliminate cells from the cell neighborhood of each index cell that are beyond a set distance from the index cell.
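The effect of the bandwidth parameter can be sketched as a simple distance cutoff on the X and Y coordinates. The function name, the dictionary layout of a cell, and the use of Euclidean distance are assumptions made for this example.

```python
from math import hypot

def neighborhood(index_cell, cells, bandwidth):
    """Return the cells, other than the index cell, within `bandwidth` of it.

    Distance is Euclidean distance on the X/Y coordinates (an assumption).
    """
    return [
        c for c in cells
        if c["id"] != index_cell["id"]
        and hypot(c["x"] - index_cell["x"], c["y"] - index_cell["y"]) <= bandwidth
    ]

cells = [
    {"id": 1, "x": 0.0, "y": 0.0},
    {"id": 2, "x": 3.0, "y": 4.0},    # distance 5 from cell 1
    {"id": 3, "x": 10.0, "y": 0.0},   # distance 10 from cell 1
]
near = [c["id"] for c in neighborhood(cells[0], cells, bandwidth=5.0)]
wide = [c["id"] for c in neighborhood(cells[0], cells, bandwidth=10.0)]
```

Lowering the bandwidth shrinks each neighborhood, so fewer distant cells influence the index cell.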
  • the computer implemented method further comprises setting a user defined convergence limit for assigning the cell types to the index cells iteratively, wherein when the number of index cells without assigned cell types is smaller than the user defined convergence limit, the index cells without assigned cell types are assigned as having an unknown cell type.
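The stopping rule described above can be sketched as a loop that ends once fewer cells than the convergence limit remain unassigned, after which the remaining cells are labeled as unknown. The per-round assignment logic is represented by a hypothetical callback, `assign_round`; the `max_rounds` cap and the early exit when a round makes no progress are safeguards added for the example and are not taken from the source.

```python
def assign_iteratively(cell_ids, assign_round, convergence_limit, max_rounds=100):
    """Assign cell types round by round until the convergence limit is reached.

    `assign_round(unassigned, labels)` is a hypothetical callback returning a
    dict {cell_id: cell_type} for the cells it could assign in this round.
    """
    labels = {cid: None for cid in cell_ids}
    for _ in range(max_rounds):
        unassigned = [cid for cid, lab in labels.items() if lab is None]
        if len(unassigned) < convergence_limit:
            break  # converged: too few unassigned cells remain
        newly_assigned = assign_round(unassigned, labels)
        if not newly_assigned:
            break  # no progress this round (safeguard, an assumption)
        labels.update(newly_assigned)
    # Any cell still unassigned is given the unknown cell type.
    return {cid: (lab if lab is not None else "unknown") for cid, lab in labels.items()}

def toy_round(unassigned, labels):
    """Toy stand-in for a real assignment round: labels the first two cells."""
    return {cid: "T cell" for cid in unassigned[:2]}

result = assign_iteratively(["a", "b", "c", "d", "e"], toy_round, convergence_limit=2)
```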
  • the computer implemented method further comprises displaying a listing of cell types identified in the tissue sample.
  • the display may be color coded to differentiate different cell types and different cell lineages (e.g., each cell type or cell lineage having a different color).
  • the display may be adjustable to allow control over how many cell types are listed. In certain embodiments, the display can be adjusted to list only certain cell types or cell lineages and their spatial locations in the tissue, or any other desired listing.
  • the display may further show cell type labels superimposed on an image of the tissue based on the spatial location determined for the single cell in each image segment as quantified by its X and Y coordinates.
  • the image of the tissue sample is produced by multiplexed fluorescence imaging, multiplexed immunofluorescence imaging, multiplexed immunohistochemistry, multiplexed immunohistochemistry/immunofluorescence (mIHC/IF) imaging, multiplexed ion beam imaging (MIBI), mass spectrometry imaging (MSI), multi-epitope-ligand cartography (MELC), multiplex modified hapten-based imaging, or co-detection by indexing (CODEX) imaging, as described further below.
  • the method can be implemented in digital electronic circuitry, or in computer software, firmware, or hardware.
  • the disclosed and other embodiments can be implemented as one or more computer program products, i.e., one or more modules of computer program instructions encoded on a computer readable medium for execution by, or to control the operation of, a data processing apparatus.
  • the computer readable medium can be a machine-readable storage device, a machine-readable storage substrate, a memory device, a composition of matter effecting a machine-readable propagated signal, or any combination thereof.
  • a computer program (also known as a program, software, software application, script, or code) can be written in any form of programming language, including compiled or interpreted languages, and it can be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment.
  • a computer program does not necessarily correspond to a file in a file system.
  • a program can be stored in a portion of a file that holds other programs or data (e.g., one or more scripts stored in a markup language document), in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, sub programs, or portions of code).
  • a computer program can be deployed to be executed on one computer or on multiple computers that are located at one site or distributed across multiple sites and interconnected by a communication network.
  • the system for performing the computer implemented method may include a processor, a storage component (i.e., memory), a display component, and other components typically present in general purpose computers.
  • the processor is provided by a computer or handheld device (e.g., a cell phone or tablet).
  • the storage component stores information accessible by the processor, including instructions that may be executed by the processor and data that may be retrieved, manipulated or stored by the processor.
  • the storage component includes instructions.
  • the storage component includes instructions for analyzing a multiplexed in situ image of tissue to identify cell types and spatial locations of cells within the tissue.
  • the computer processor is coupled to the storage component and configured to execute the instructions stored in the storage component in order to receive a multiplexed in situ image of a plurality of cellular markers in a tissue sample and analyze the data according to one or more algorithms, as described herein.
  • the display component displays information regarding the identified cell types and the spatial locations of the identified cells within the tissue sample.
  • the storage component may be of any type capable of storing information accessible by the processor, such as a hard-drive, memory card, ROM, RAM, DVD, CD-ROM, USB Flash drive, write-capable, and read-only memories.
  • the processor may be any well-known processor, such as processors from Intel Corporation. Alternatively, the processor may be a dedicated controller such as an ASIC.
  • the instructions may be any set of instructions to be executed directly (such as machine code) or indirectly (such as scripts) by the processor.
  • the terms “instructions,” “steps” and “programs” may be used interchangeably herein.
  • the instructions may be stored in object code form for direct processing by the processor, or in any other computer language including scripts or collections of independent source code modules that are interpreted on demand or compiled in advance.
  • Data may be retrieved, stored or modified by the processor in accordance with the instructions.
  • the data may be stored in computer registers, in a relational database as a table having a plurality of different fields and records, XML documents, or flat files.
  • the data may also be formatted in any computer-readable format such as, but not limited to, binary values, ASCII or Unicode.
  • the data may comprise any information sufficient to identify the relevant information, such as numbers, descriptive text, proprietary codes, pointers, references to data stored in other memories (including other network locations) or information which is used by a function to calculate the relevant data.
  • the processor and storage component may comprise multiple processors and storage components that may or may not be stored within the same physical housing.
  • some of the instructions and data may be stored on removable CD-ROM and others within a read-only computer chip. Some or all of the instructions and data may be stored in a location physically remote from, yet still accessible by, the processor.
  • the processor may comprise a collection of processors which may or may not operate in parallel.
  • the method can be performed using a cloud computing system.
  • the image data files and the programming can be exported to a cloud computer, which runs the program, and returns an output to the user.
  • tissue sample can be imaged using any of a number of different types of microscopy such as, but not limited to, fluorescence microscopy, confocal microscopy, two-photon microscopy, multi-photon microscopy, light-field microscopy, expansion microscopy, and light sheet microscopy.
  • images of the sample may be taken at different focal planes and used to reconstruct a three-dimensional image of the tissue sample.
  • Exemplary multiplexed in situ imaging techniques include, without limitation, multiplexed fluorescence imaging, multiplexed immunofluorescence imaging, multiplexed immunohistochemistry, multiplexed immunohistochemistry/immunofluorescence (mIHC/IF) imaging, multiplexed ion beam imaging (MIBI), mass spectrometry imaging (MSI), multi-epitope-ligand cartography (MELC), multiplex modified hapten-based imaging, co-detection by indexing (CODEX) imaging, and NanoString digital spatial profiling (DSP).
  • multiplexed in situ imaging techniques see, e.g., Tan et al. (2020) Cancer Commun (Lond). 40(4):135-153, Parra (2016) J. Cancer Treatment Diagn. 2(1):43-53, and Macheast et al. (2021) Genes (Basel) 12(4):538; herein incorporated by reference.
  • the multiplexed in situ imaging comprises performing multiplexed antibody-based protein imaging.
  • imaging can be performed after multiplexed antibody staining, multiplexed DNA-tagged antibody staining, or multiplexed metal-isotope-tagged antibody staining of cellular markers in the tissue sample.
  • the imaging is performed with labeled primary antibodies or labeled secondary antibodies that bind specifically to cellular markers.
  • a labeled primary antibody or a labeled secondary antibody comprises a fluorescent label, a chromogenic label, or a metal-isotope label (e.g., a lanthanide).
  • Multiplexed immunofluorescence (IF) methods can be used to detect multiple markers on cells simultaneously using fluorescently labeled antibodies that specifically bind to cellular markers on cells.
  • the fluorophores conjugated to antibodies emit fluorescent light upon exposure to light at an excitation wavelength.
  • a fluorescence microscope can be used for imaging the fluorescent signals to detect the locations where the fluorescently labeled antibodies bind to markers on cells of the tissue.
  • the fluorescence microscope can be any type of microscope that uses fluorescence to generate an image, including, without limitation, an epifluorescence microscope or a confocal microscope.
  • a fluorophore may be conjugated directly to a primary antibody that specifically binds to a cellular marker.
  • an unlabeled primary antibody that specifically binds to a cellular marker may be used in combination with a fluorescently labeled secondary antibody that specifically binds to the primary antibody.
  • multiple secondary antibodies that bind to a primary antibody are used to amplify the fluorescent signal.
  • Fluorescence imaging can be performed with other methods of fluorescent staining.
  • fluorophores can be conjugated to non-antibody binding agents such as, but not limited to, aptamers, antibody mimetics, proteins, peptoids, or ligands that specifically bind to a cellular marker, as described further below. Immunofluorescence can be used in combination with such non-antibody-based fluorescent staining methods.
  • 4',6-diamidino-2-phenylindole (DAPI) is a fluorescent stain that can be used to label DNA in the nuclei of cells and can be used in combination with antibody-based and non-antibody-based methods of fluorescent staining.
  • a fluorophore conjugate comprises a fluorophore conjugated to a binding agent that selectively binds directly or indirectly to a marker on a cell of interest in the tissue.
  • the tissue is illuminated to provide excitation light to the fluorophore conjugates bound to the tissue.
  • Multiple light sources can be used that emit light at different excitation wavelengths suitable for generating fluorescence from multiple fluorophore conjugates bound to different target markers on cells of interest.
  • Fluorescent light emitted from the fluorophores is detected, and a fluorescence image of the tissue is recorded using, for example, a fluorescence microscope, a charge-coupled device (CCD), an active-pixel sensor (APS), a CMOS sensor, or other image sensor.
  • Fluorescence imaging may also be performed with one or more fluorophores having a fluorescence emission in the visible, near-infrared (NIR), or infrared (IR) regions of the light spectrum, in the ranges from about 380 nm to 750 nm, 750 nm-1100 nm, and 1100 nm to 1500 nm, respectively.
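The approximate ranges given above can be expressed as a small lookup. This is only a sketch: the function name is hypothetical, and because the stated ranges share their endpoints (750 nm and 1100 nm), the example arbitrarily assigns each shared endpoint to the longer-wavelength region.

```python
def spectral_region(wavelength_nm):
    """Classify an emission wavelength using the approximate ranges stated above.

    Shared endpoints (750 nm, 1100 nm) go to the longer-wavelength region,
    which is an arbitrary choice made for this example.
    """
    if 380 <= wavelength_nm < 750:
        return "visible"
    if 750 <= wavelength_nm < 1100:
        return "NIR"
    if 1100 <= wavelength_nm <= 1500:
        return "IR"
    return "outside listed ranges"

regions = [spectral_region(w) for w in (520, 800, 1300, 300)]
```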
  • Preferred imaging wavelengths are within the optical window of tissue, extending from approximately 500 nm to 900 nm.
  • fluorescence imaging is performed with one or more fluorophores having a fluorescence emission in the NIR region of the light spectrum, which ranges from about 700 nm to 1700 nm.
  • The use of NIR fluorophores is advantageous in minimizing interference from tissue autofluorescence and enhancing tissue penetration compared to other fluorophores.
  • the fluorescent emission wavelength is in a region of the spectrum where blood and tissue absorb minimally, and tissue penetration is maximal, such as in the range from 700 nm to 1000 nm.
  • Any NIR fluorophore with an emission in the NIR region of the spectrum may be used, including, but not limited to, fluorophores with fluorescence emissions at about 700, 720, 740, 760, 780, 800, 820, 840, 860, 880, 900, 920, 940, 960, 980, or 1000 nm, or any wavelength in between.
  • Exemplary NIR fluorophores include, without limitation, IRDye dyes (e.g., IRDye 800CW, IRDye 680RD, IRDye 700, IRDye 750, and IRDye 800RS), CF dyes (e.g., CF680, CF680R, CF750, CF770, and CF790), Tracy dyes (e.g., Tracy 645 and Tracy 652), Alexa dyes (e.g., Alexa Fluor® 660 dye, Alexa Fluor® 700 dye, Alexa Fluor® 750 dye, and Alexa Fluor® 790), cyanine dyes (e.g., Cy7 and Cy7.5), thienothiadiazole dyes, phthalocyanine dyes, squaraine dyes, rhodamine dyes and analogues (e.g., Si-pyronine, Si-rhodamine, Te-rhodamine, and Changsha), borondipyrrome
  • Exemplary fluorophores with emissions in the visible region of the light spectrum include, without limitation, SYBR green, SYBR gold, a CAL Fluor dye such as CAL Fluor Gold 540, CAL Fluor Orange 560, CAL Fluor Red 590, CAL Fluor Red 610, and CAL Fluor Red 635, a Quasar dye such as Quasar 570, Quasar 670, and Quasar 705, an Alexa Fluor such as Alexa Fluor 488, Alexa Fluor 546, Alexa Fluor 555, Alexa Fluor 594, Alexa Fluor 647, and Alexa Fluor 784, a cyanine dye such as Cy3, Cy3.5, Cy5, Cy5.5, and Cy7, fluorescein, 2', 4', 5', 7'-tetrachloro-4-7-dichlorofluorescein (TET), carboxyfluorescein (FAM), fluorescein isothiocyanate (FITC), 6-carboxy-4',5'
  • an antioxidant compound is included in imaging buffers (i.e., "antifade buffers") to reduce photobleaching during fluorescence imaging.
  • exemplary antioxidants include, without limitation, propyl-gallate, tertiary butylhydroquinone, butylated hydroxyanisole, butylated hydroxytoluene, glutathione, ascorbic acid, and tocopherols.
  • Such antioxidants have an antifade effect on fluorophores. That is, the antioxidant reduces photobleaching and enhances the signal-to-noise ratio of sensitive fluorophores, which improves imaging particularly of thicker tissue samples.
  • including an antioxidant increases the concentration of the non-bleached fluorophore during exposure to light and allows longer exposure times to be used (i.e., extends the fluorophore lifetime before photobleaching).
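  • The antifade effect can be illustrated with a simple kinetic picture. The text does not specify a model; the expression below assumes, purely for illustration, first-order photobleaching in which the amount of non-bleached fluorophore N decays from its initial value N_0 with a rate constant k during illumination.

```latex
N(t) = N_0 \, e^{-k t}, \qquad t_{1/2} = \frac{\ln 2}{k}
```

  • Under this assumption, an antioxidant that lowers k raises N(t) at every exposure time t and lengthens the half-life t_{1/2}, which is consistent with the longer usable exposure times described above.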
  • multi-epitope-ligand cartography (MELC) is used for multiplexed in situ fluorescence imaging.
  • MELC uses multiple rounds of fluorescent detection to map the location of different proteins of cells in tissue.
  • fluorescently labeled antibodies are added and phase contrast and fluorescence images are acquired using a charge-coupled device (CCD) sensor.
  • Repetitive incubation-imaging-photobleaching cycles are used, which allows the same fluorescence channel to be used after photobleaching so that the same fluorescent dye can be conjugated to different antibodies specific for different cellular markers in different cycles. See, e.g., Schubert et al. (2006) Nat. Biotechnol. 24(10):1270-1278, Friedenberger et al. (2007) Nat. Protoc. 2(9):2285-94, Dornieden et al. (2021) J. Am. Soc. Nephrol. 32(9):2223-2241; herein incorporated by reference.
  • CO-Detection by indEXing (CODEX) uses multiple rounds of multiplexed DNA-tagged antibody staining.
  • the DNA tags comprise DNA barcodes and fluorescent deoxynucleoside triphosphate (dNTP) analogs
  • Antibodies or other binding agents
  • the cells are stained with a mixture of tagged antibodies specific for different cellular markers.
  • the cells are contacted with a nucleotide mix that contains one of two non-fluorescent “index” nucleotides and two fluorescent labeling nucleotides.
  • the 5’ overhangs include a region to be filled by the index nucleotides and a position for a labeled dye nucleotide.
  • the antibodies to be visualized first generally have shorter overhangs than the antibodies to be visualized later
  • the index nucleotide fills in a first index position across all antibodies bound to the cells.
  • the DNA tags are designed such that only the first two antibodies are capable of being labeled with one of the two fluorescent dNTPs, and labeling only occurs if the index nucleotide has been previously incorporated.
  • the two labeled antibodies are imaged by fluorescence microscopy.
  • the fluorophores are cleaved and removed by washing the tissue sample, and another cycle can be performed on the tissue in which a different indexing nucleotide is used.
  • a multiparameter image is constructed. See, e.g., Goltsev et al. (2018) Cell 174:968-981, Black et al. (2021) Nat. Protoc. 16(8):3802-3835; herein incorporated by reference.
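  • The cycling scheme described above, in which two antibodies are labeled and imaged per cycle before the fluorophores are cleaved, can be sketched as a simple scheduling step. The function name, the channel names, and the example marker panel are hypothetical and are not taken from the cited protocols; the sketch only shows how a panel maps onto cycles, not the underlying chemistry.

```python
from typing import Dict, List


def schedule_cycles(markers: List[str], per_cycle: int = 2) -> List[Dict[str, str]]:
    """Assign markers to imaging cycles, `per_cycle` markers at a time.

    Each cycle maps a fluorescent channel to the marker imaged in it.
    Markers listed first are imaged first.
    """
    if per_cycle < 1:
        raise ValueError("per_cycle must be at least 1")
    channels = [f"dye_{i + 1}" for i in range(per_cycle)]  # hypothetical channel names
    cycles = []
    for start in range(0, len(markers), per_cycle):
        batch = markers[start:start + per_cycle]
        cycles.append(dict(zip(channels, batch)))
    return cycles


# Hypothetical five-marker panel; two markers are visualized per cycle.
panel = ["CD3", "CD8", "CD20", "PanCK", "Ki67"]
plan = schedule_cycles(panel)

if __name__ == "__main__":
    for number, cycle in enumerate(plan, start=1):
        print(f"cycle {number}: {cycle}")
```

  • With two fluorescent channels per cycle, a panel of N markers needs N/2 cycles, rounded up; the per-cycle images are then stacked into the multiparameter image.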
  • NanoString digital spatial profiling can be used for multiplexed imaging of protein and/or RNA targets in tissues.
  • the method uses oligonucleotide barcodes conjugated through a photocleavable linker to primary antibodies or nucleic acid probes. See, e.g., Decalf et al. (2019) J. Pathol. 247(5):650-661; herein incorporated by reference.
  • UltraPlex hapten-based fluorescent immunohistochemistry uses primary antibodies in combination with a panel of anti-hapten secondary antibodies, each labeled with a different fluorophore to provide multiplex signal amplification. Images can be acquired with a standard fluorescent microscope or a digital slide scanner. See, e.g., Levin et al. (2021) Methods Mol. Biol. 2350:267-287; herein incorporated by reference.
  • Mass spectrometry imaging can be used to vaporize cellular markers within specific regions of a tissue into gas phase-ions and measure their mass. An image of the cellular markers that initially resided in each region of a tissue, prior to vaporization, can be reconstructed. See, e.g., Matros et al. (2013) Front Plant Sci 4:89 27; herein incorporated by reference.
  • mIHC/IF imaging allows simultaneous detection of multiple markers on a single tissue section.
  • Primary antibodies are bound to cellular markers followed by incubation of the tissue with secondary antibodies labeled with horseradish peroxidase (HRP).
  • the HRP is reacted with a substrate bound to a chromogenic dye, which results in generation of a colored precipitate at the site of the marker.
  • Tyramine chemistry-based chromogenic dyes with different colors can be used in this method for in situ analysis with conventional brightfield microscopes for multiplex imaging.
  • a fluorophore conjugated to a tyramide molecule serves as the substrate for HRP, resulting in a fluorescence signal at the site of the cellular marker that can be imaged by fluorescence microscopy.
  • Metal-based immunohistochemistry techniques can also be used for multiplexed imaging.
  • imaging mass cytometry uses a primary antibody tagged with a metal isotope of known molecular mass. Analysis is carried out using laser ablation coupled to mass cytometry. See, e.g., Wang et al. (2019) Cell Metabolism 29:769-783; herein incorporated by reference.
  • Multiplexed ion beam imaging (MIBI)
  • Primary antibodies are conjugated to stable metal isotopes, typically lanthanide isotopes, to provide a unique metal isotope label for each antibody for multiplex imaging.
  • Lanthanide isotopes suitable for use for MIBI include, without limitation, ¹³⁹La, ¹⁴³Nd, ¹⁴⁷Sm, ¹⁵⁴Sm, ¹⁵⁸Gd, ¹⁶²Dy, ¹⁶⁶Er, ¹⁶⁸Er, and ¹⁷⁶Yb.
  • the metal-isotope labeled antibodies are incubated with a tissue specimen, which is imaged using time-of-flight secondary ion mass spectrometry, and the masses of detected metal isotope labels are assigned to target cellular markers. See, e.g., Angelo et al. (2014) Nat Med. 20(4):436-442, Ptacek et al. (2020) Laboratory Investigation 100:1111-1123; herein incorporated by reference.
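  • The final step above, assigning detected isotope masses to target markers, amounts to a tolerance-based lookup. The sketch below is illustrative only: the isotope mass numbers come from the list above, but the marker names, the tolerance, and the use of nominal mass numbers rather than exact isotopic masses are assumptions.

```python
from typing import Dict, Optional

# Isotope mass numbers are from the text; the marker pairings are hypothetical.
PANEL: Dict[int, str] = {
    139: "marker_A",  # 139La-labeled antibody
    143: "marker_B",  # 143Nd-labeled antibody
    158: "marker_C",  # 158Gd-labeled antibody
    176: "marker_D",  # 176Yb-labeled antibody
}


def assign_marker(
    detected_mass: float,
    panel: Dict[int, str] = PANEL,
    tolerance: float = 0.3,
) -> Optional[str]:
    """Return the marker whose isotope label is nearest to `detected_mass`.

    Returns None when no label lies within `tolerance` mass units
    (the tolerance value is an assumption, not an instrument specification).
    """
    nearest = min(panel, key=lambda mass: abs(mass - detected_mass))
    if abs(nearest - detected_mass) <= tolerance:
        return panel[nearest]
    return None


if __name__ == "__main__":
    for mass in (138.9, 150.0, 175.94):
        print(mass, assign_marker(mass))
```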
  • Any suitable method known in the art can be used for image segmentation, which involves the identification of the boundaries of individual cells in an image.
  • Automatic or semiautomatic image analysis methods may be used for image segmentation.
  • cell-membrane markers, conventional thresholding, and watershed segmentation may be used to identify cells in images.
  • a supervised classifier is used to automate identification of single cells in images.
  • fully automatic segmentation sometimes yields poor results, especially for complicated images.
  • Various factors can complicate image analysis, including noise, autofluorescence, low resolution, blur, unstable brightness, overlapping targets, unclear boundaries, deformation, etc.
  • human intervention may be needed to accurately identify separate cells in an image.
  • a human may outline at least some of the single-cells in an image to produce a set of single cells that can be used to train machine learning algorithms.
  • Various software programs are currently available for image segmentation, including, but not limited to, the ilastik toolkit, which uses a random forest classifier for cell segmentation, DeepCell, which uses a deep-learning algorithm utilizing deep convolutional neural networks for cell segmentation, Open Segmentation Framework (OpSeF), which semi-automates image segmentation using deep learning convolutional neural networks with the user manually providing some training data, CellSeg, which uses a mask region-convolutional neural network (R-CNN) for image segmentation, CODEX image processing pipeline software, which uses reference cellular markers, a reference nuclear stain, and a reference membrane stain to aid image segmentation, and CellProfiler, which uses conventional thresholding to classify a pixel as foreground if it is brighter than a certain “thre
  • further image processing may be performed after segmentation such as filtering image segments to remove artifactual cell-like objects, including, but not limited to, cellular debris misidentified as cells, adjacent cells merged in the same image segment, and auto-fluorescent non-cell objects.
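  • As a rough illustration of conventional thresholding followed by post-segmentation filtering, the sketch below labels connected foreground regions in a small intensity grid and then discards segments whose area is implausible for a single cell. It is a minimal pure-Python stand-in, not watershed segmentation and not the algorithm of any of the software packages named above; the function names, the 4-connectivity choice, and the area limits are assumptions.

```python
from collections import deque
from typing import List, Tuple

Pixel = Tuple[int, int]


def threshold_segments(image: List[List[float]], threshold: float) -> List[List[Pixel]]:
    """Label 4-connected regions of pixels brighter than `threshold`."""
    rows = len(image)
    cols = len(image[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    segments: List[List[Pixel]] = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or image[r][c] <= threshold:
                continue
            # Breadth-first flood fill from an unvisited foreground pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            region: List[Pixel] = []
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and not seen[ny][nx] and image[ny][nx] > threshold:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            segments.append(region)
    return segments


def filter_by_area(segments: List[List[Pixel]], min_area: int, max_area: int) -> List[List[Pixel]]:
    """Keep segments whose pixel count is plausible for one cell.

    Very small segments stand in for debris; very large ones stand in for
    merged adjacent cells. Both limits are illustrative assumptions.
    """
    return [s for s in segments if min_area <= len(s) <= max_area]


# Toy image: one cell-sized blob, one single-pixel speck, one oversized blob.
image = [
    [0, 0, 0, 0, 0, 0],
    [0, 9, 9, 0, 0, 0],
    [0, 9, 9, 0, 0, 7],
    [0, 0, 0, 0, 0, 0],
    [8, 8, 8, 8, 8, 0],
    [8, 8, 8, 8, 8, 0],
]
segments = threshold_segments(image, threshold=5)
cells = filter_by_area(segments, min_area=2, max_area=8)

if __name__ == "__main__":
    print("segment areas:", [len(s) for s in segments])
    print("kept segments:", len(cells))
```

  • In practice the area limits would be chosen from the expected cell size at the imaging resolution, and intensity-based criteria would be needed to reject auto-fluorescent non-cell objects.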
  • the methods disclosed include identifying cell types and spatial locations of cells within a tissue sample based on analysis of multiplexed in situ imaging of cellular markers.
  • the tissue may be any type of tissue where imaging is desired such as diseased or damaged tissue, cancerous tissue, inflamed tissue, or tissue at risk of future disease requiring periodic or continuous monitoring (e.g., precancerous tissue, transplanted tissue, tissue at high risk for tumor or disease development due to an underlying genetic abnormality or mutation, such as breast tissue in a BRCA mutation carrier, or autoimmunity).
  • Tissue specimens suitable for use with the methods described herein generally include any type of tissue specimens collected from living or dead subjects, such as biopsy specimens, surgical specimens, and autopsy specimens, including, but not limited to, epithelium, muscle, connective, and nervous tissue. Tissue specimens may be collected and imaged immediately or may be preserved and imaged at a future time, e.g., after storage for an extended period of time. In some embodiments, the methods described herein may be used to preserve tissue specimens in a stable, accessible and fully intact form for future analysis. In some embodiments, the methods described herein may be used to analyze a previously preserved or stored tissue specimen. In some embodiments, the tissue includes cancerous tissue such as a resected tumor specimen.
  • the tissue is a thin slice with a thickness of 5-20 µm, including, but not limited to, e.g., 5-18 µm, 5-15 µm, or 5-10 µm.
  • the intact tissue is a thick slice with a thickness of 50-200 µm, including, but not limited to, e.g., 50-150 µm, 50-100 µm, or 50-80 µm.
  • fixation is the process of preserving biological material (e.g., tissues, cells, organelles, molecules, etc.) from decay and/or degradation. Fixation may be accomplished using any convenient protocol. Fixation can include contacting the sample with a fixation reagent (i.e., a reagent that contains at least one fixative). Samples can be contacted with a fixation reagent for a wide range of times, which can depend on the temperature, the nature of the sample, and on the fixative(s).
  • a sample can be contacted with a fixation reagent for 24 or less hours, 18 or less hours, 12 or less hours, 8 or less hours, 6 or less hours, 4 or less hours, 2 or less hours, 60 or less minutes, 45 or less minutes, 30 or less minutes, 25 or less minutes, 20 or less minutes, 15 or less minutes, 10 or less minutes, 5 or less minutes, or 2 or less minutes.
  • a sample can be contacted with a fixation reagent for a period of time in a range of from 5 minutes to 24 hours, e.g., from 10 minutes to 20 hours, from 10 minutes to 18 hours, from 10 minutes to 12 hours, from 10 minutes to 8 hours, from 10 minutes to 6 hours, from 10 minutes to 4 hours, from 10 minutes to 2 hours, from 15 minutes to 20 hours, from 15 minutes to 18 hours, from 15 minutes to 12 hours, from 15 minutes to 8 hours, from 15 minutes to 6 hours, from 15 minutes to 4 hours, from 15 minutes to 2 hours, from 15 minutes to 1.5 hours, from 15 minutes to 1 hour, from 10 minutes to 30 minutes, from 15 minutes to 30 minutes, from 30 minutes to 2 hours, from 45 minutes to 1.5 hours, or from 55 minutes to 70 minutes.
  • a sample can be contacted with a fixation reagent at various temperatures, depending on the protocol and the reagent used.
  • a sample can be contacted by a fixation reagent at a temperature ranging from -22°C to 55°C, where specific ranges of interest include, but are not limited to 50 to 54°C, 40 to 44°C, 35 to 39°C, 28 to 32°C, 20 to 26°C, 0 to 6°C, and -18 to -22°C.
  • a sample can be contacted by a fixation reagent at a temperature of -20°C, 4°C, room temperature (22-25°C), 30°C, 37°C, 42°C, or 52°C.
  • Any convenient fixation reagent can be used.
  • Common fixation reagents include crosslinking fixatives, precipitating fixatives, oxidizing fixatives, mercurials, and the like.
  • Crosslinking fixatives chemically join two or more molecules by a covalent bond and a wide range of cross-linking reagents can be used.
  • suitable cross-linking fixatives include but are not limited to aldehydes (e.g., formaldehyde, also commonly referred to as “paraformaldehyde” and “formalin”; glutaraldehyde; etc.), imidoesters, NHS (N-hydroxysuccinimide) esters, and the like.
  • suitable precipitating fixatives include but are not limited to alcohols (e.g., methanol, ethanol, etc.), acetone, acetic acid, etc.
  • the fixative is formaldehyde (i.e., paraformaldehyde or formalin).
  • a suitable final concentration of formaldehyde in a fixation reagent is 0.1 to 10%, 1-8%, 1-4%, 1-2%, 3-5%, or 3.5-4.5%, including about 1.6% for 10 minutes.
  • the sample is fixed in a final concentration of 4% formaldehyde (as diluted from a more concentrated stock solution, e.g., 38%, 37%, 36%, 20%, 18%, 16%, 14%, 10%, 8%, 6%, etc.). In some embodiments the sample is fixed in a final concentration of 10% formaldehyde. In some embodiments the sample is fixed in a final concentration of 1% formaldehyde. In some embodiments, the fixative is glutaraldehyde. A suitable concentration of glutaraldehyde in a fixation reagent is 0.1 to 1%. A fixation reagent can contain more than one fixative in any combination. For example, in some embodiments the sample is contacted with a fixation reagent containing both formaldehyde and glutaraldehyde.
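  • Diluting a concentrated stock to a final fixative concentration follows the usual C1·V1 = C2·V2 relation. The helper below is a hypothetical convenience function, not part of the described methods; the 16% stock and 10 mL final volume in the example are illustrative values chosen from the ranges mentioned above.

```python
def stock_volume_needed(stock_pct: float, final_pct: float, final_volume_ml: float) -> float:
    """Volume of stock (mL) to bring to `final_volume_ml` for a `final_pct` solution.

    Uses C1 * V1 = C2 * V2, with concentrations as percentages.
    """
    if stock_pct <= 0 or final_pct <= 0 or final_volume_ml <= 0:
        raise ValueError("concentrations and volume must be positive")
    if final_pct > stock_pct:
        raise ValueError("final concentration cannot exceed the stock concentration")
    return final_pct * final_volume_ml / stock_pct


# Example: 10 mL of 4% formaldehyde from a 16% stock.
stock_ml = stock_volume_needed(stock_pct=16.0, final_pct=4.0, final_volume_ml=10.0)
diluent_ml = 10.0 - stock_ml

if __name__ == "__main__":
    print(f"stock: {stock_ml} mL, diluent: {diluent_ml} mL")
```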
  • permeabilization refers to the process of rendering the cells (cell membranes etc.) of a sample permeable to experimental reagents such as nucleic acid probes, antibodies, chemical substrates, etc. Any convenient method and/or reagent for permeabilization can be used. Suitable permeabilization reagents include detergents (e.g., Saponin, Triton X-100, Tween-20, etc.), organic fixatives (e.g., acetone, methanol, ethanol, etc.), enzymes, etc. Detergents can be used at a range of concentrations.
  • 0.001%-1% detergent, 0.05%-0.5% detergent, or 0.1%-0.3% detergent can be used for permeabilization (e.g., 0.1% Saponin, 0.2% Tween-20, 0.1-0.3% Triton X-100, etc.).
  • methanol on ice for at least 10 minutes is used to permeabilize.
  • the same solution can be used as the fixation reagent and the permeabilization reagent.
  • the fixation reagent contains 0.1%-10% formaldehyde and 0.001%-1% saponin. In some embodiments, the fixation reagent contains 1% formaldehyde and 0.3% saponin.
  • a sample can be contacted with a permeabilization reagent for a wide range of times, which can depend on the temperature, the nature of the sample, and on the permeabilization reagent(s).
  • a sample can be contacted by a permeabilization reagent for 24 or more hours, 24 or less hours, 18 or less hours, 12 or less hours, 8 or less hours, 6 or less hours, 4 or less hours, 2 or less hours, 60 or less minutes, 45 or less minutes, 30 or less minutes, 25 or less minutes, 20 or less minutes, 15 or less minutes, 10 or less minutes, 5 or less minutes, or 2 or less minutes.
  • a sample can be contacted by a permeabilization reagent at various temperatures, depending on the protocol and the reagent used.
  • a sample can be contacted by a permeabilization reagent at a temperature ranging from -82°C to 55°C, where specific ranges of interest include, but are not limited to: 50 to 54°C, 40 to 44°C, 35 to 39°C, 28 to 32°C, 20 to 26°C, 0 to 6°C, -18 to -22 °C, and -78 to -82°C.
  • a sample can be contacted by a permeabilization reagent at a temperature of -80°C, -20°C, 4°C, room temperature (22-25°C), 30°C, 37°C, 42°C, or 52°C.
  • a sample is contacted with an enzymatic permeabilization reagent.
  • Enzymatic permeabilization reagents permeabilize a sample by partially degrading extracellular matrix or surface proteins that hinder the permeation of the sample by assay reagents.
  • Contact with an enzymatic permeabilization reagent can take place at any point after fixation and prior to target detection.
  • the enzymatic permeabilization reagent is proteinase K, a commercially available enzyme. In such cases, the sample is contacted with proteinase K prior to contact with a post-fixation reagent.
  • Proteinase K treatment (i.e., contact by proteinase K; also commonly referred to as “proteinase K digestion”) can be performed over a range of times, at a range of temperatures, and over a range of enzyme concentrations that are empirically determined for each cell type or tissue type under investigation.
  • a sample can be contacted by proteinase K for 30 or less minutes, 25 or less minutes, 20 or less minutes, 15 or less minutes, 10 or less minutes, 5 or less minutes, or 2 or less minutes.
  • a sample can be contacted by 1 µg/ml or less, 2 µg/ml or less, 4 µg/ml or less, 8 µg/ml or less, 10 µg/ml or less, 20 µg/ml or less, 30 µg/ml or less, 50 µg/ml or less, or 100 µg/ml or less proteinase K.
  • a sample can be contacted by proteinase K at a temperature ranging from 2°C to 55°C, where specific ranges of interest include, but are not limited to: 50 to 54°C, 40 to 44°C, 35 to 39°C, 28 to 32°C, 20 to 26°C, and 0 to 6°C.
  • a sample can be contacted by proteinase K at a temperature of 4°C, room temperature (22-25°C), 30°C, 37°C, 42°C, or 52°C.
  • a sample is not contacted with an enzymatic permeabilization reagent.
  • a sample is not contacted with proteinase K.
  • Before imaging, the tissue is contacted with detectably labeled binding agents that specifically bind to cellular markers of interest.
  • the binding agents may bind to any type of molecule, including proteins, lipids, polysaccharides, proteoglycans, metabolites, or the like.
  • the binding agents are conjugated to detectable labels, which may include any molecule or substance capable of detection for multiplexed in situ imaging, including, but not limited to, fluorescers, chemiluminescers, chromophores, bioluminescent proteins, enzymes, enzyme substrates, enzyme cofactors, enzyme inhibitors, isotopic labels, semiconductor nanoparticles, dyes, metal ions, metal sols, ligands (e.g., biotin, streptavidin or haptens) and the like.
  • Detectable labels may be conjugated to any agent that specifically binds to a marker of interest (e.g., tumor marker or immune activation marker).
  • the binding agent binds to a marker of interest with high affinity.
  • binding agents include, without limitation, antibodies, antibody fragments, antibody mimetics, and aptamers as well as small molecules, peptides, peptoids, or ligands that bind selectively to cellular markers.
  • the conjugates used in the subject methods include at least one detectable label attached to the binding agent.
  • a conjugate is used that comprises a binding agent that selectively binds to a cell-specific marker.
  • multiple conjugates are used, wherein the different conjugates bind to different markers on cells of the same cell-type or different cell-types.
  • the binding agent comprises an antibody that specifically binds to the marker of interest.
  • Any type of antibody may be used in conjugates, including, without limitation, monoclonal antibodies, polyclonal antibodies, as well as hybrid antibodies, altered antibodies, chimeric antibodies, and humanized antibodies.
  • Antibodies may include hybrid (chimeric) antibody molecules (see, for example, Winter et al. (1991) Nature 349:293-299; and U.S. Pat. No. 4,816,567); F(ab') and F(ab) fragments; Fv molecules (noncovalent heterodimers, see, for example, Inbar et al.
  • the binding agent comprises an aptamer that specifically binds to the marker of interest.
  • Any type of aptamer may be used, including a DNA, RNA, xeno-nucleic acid (XNA), or peptide aptamer that specifically binds to the tumor antigen.
  • Such aptamers can be identified, for example, by screening a combinatorial library.
  • Nucleic acid aptamers (e.g., DNA or RNA aptamers) can be selected by systematic evolution of ligands by exponential enrichment (SELEX).
  • Peptide aptamers that bind to a marker of interest may be isolated from a combinatorial library and improved by directed mutation or repeated rounds of mutagenesis and selection.
  • Aptamers: Tools for Nanotherapy and Molecular Imaging (R.N. Veedu ed., Pan Stanford, 2016)
  • Nucleic Acid and Peptide Aptamers: Methods and Protocols (Methods in Molecular Biology, G. Mayer ed., Humana Press, 2009)
  • Nucleic Acid Aptamers: Selection, Characterization, and Application (Methods in Molecular Biology, G.
  • the binding agent comprises an antibody mimetic.
  • Any type of antibody mimetic may be used, including, but not limited to, affibody molecules (Nygren (2008) FEBS J. 275(11):2668-2676), affilins (Ebersbach et al. (2007) J. Mol. Biol. 372(1):172-185), affimers (Johnson et al. (2012) Anal. Chem. 84(15):6553-6560), affitins (Krehenbrink et al. (2008) J. Mol. Biol. 383(5):1058-1068), alphabodies (Desmet et al.
  • the binding agent comprises a small molecule ligand.
  • Small molecule ligands encompass numerous chemical classes, e.g., small organic compounds having a molecular weight of less than about 10,000 daltons, less than about 5,000 daltons, or less than about 2,500 daltons.
  • the small molecule will include one or more functional groups necessary for structural interaction with the target analyte, e.g., groups necessary for hydrophobic, hydrophilic, electrostatic or even covalent interactions.
  • the ligand will include functional groups necessary for structural interaction with proteins, such as hydrogen bonding, hydrophobic-hydrophobic interactions, electrostatic interactions, etc., and will typically include at least an amine, amide, sulfhydryl, carbonyl, hydroxyl or carboxyl group, or preferably at least two of the functional chemical groups.
  • the small molecule may also comprise a region that may be modified and/or participate in conjugation to a detectable label, without substantially adversely affecting the small molecule's ability to bind to its target analyte.
  • Small molecule ligands can comprise cyclical carbon or heterocyclic structures and/or aromatic or polyaromatic structures substituted with one or more of the above functional groups.
  • Small molecule ligands may also include organic compounds comprising alkyl groups (including alkanes, alkenes, alkynes and heteroalkyl), aryl groups (including arenes and heteroaryl), alcohols, ethers, amines, aldehydes, ketones, acids, esters, amides, cyclic compounds, heterocyclic compounds (including purines, pyrimidines, benzodiazepines, beta-lactams, tetracyclines, cephalosporins, and carbohydrates), steroids (including estrogens, androgens, cortisone, ecdysone, etc.), alkaloids (including ergots, vinca, curare, pyrrolizidine, and mitomycins), organometallic compounds, hetero-atom bearing compounds, amino acids, and nucleosides.
  • Small molecule ligands are also found among biomolecules including peptides, carbohydrates, fatty acids, steroids, purines, pyrimidines, derivatives, structural analogs, or combinations thereof.
  • the small molecule may be derived from a naturally occurring or synthetic compound that may be obtained from a wide variety of sources, including libraries of synthetic or natural compounds. For example, numerous means are available for random and directed synthesis of a wide variety of organic compounds and biomolecules, including the preparation of randomized oligonucleotides and oligopeptides. Alternatively, libraries of natural compounds in the form of bacterial, fungal, plant and animal extracts are available or readily produced.
  • natural or synthetically produced libraries and compounds are readily modified through conventional chemical, physical and biochemical means, and may be used to produce combinatorial libraries.
  • Known small molecules may be subjected to directed or random chemical modifications, such as acylation, alkylation, esterification, amidification, etc. to produce structural analogs.
  • the small molecule may be obtained from a library of naturally occurring or synthetic molecules, including a library of compounds produced through combinatorial means, i.e., a compound diversity combinatorial library.
  • the small molecule employed will have demonstrated some desirable affinity for the protein target in a convenient binding affinity assay.
  • Combinatorial libraries, as well as methods for the production and screening, are known in the art and described in: U.S. Pat. Nos.
  • Small molecule ligands may also include known drugs that selectively bind to receptors on cells, including, without limitation, growth factor receptors, receptor tyrosine kinases, receptor protein serine/threonine kinases, G-protein coupled receptors, cytokine receptors, lectin receptors, folate receptors, prostate-specific membrane antigen (PSMA), carbonic anhydrase IX receptor, and biotin receptors.
  • anti-cancer drugs that bind to such cellular receptors may be used as ligands to target a detectable label to cancer cells.
  • Exemplary drugs that may be used as ligands to target cancer cells include, without limitation, Acitinib, Afatinib, Axitinib, Erlotinib, Cabozantinib, Crizotinib, Gefitinib, Imatinib, Ibrutinib, Lapatinib, Neovastat, Nilotinib, Pazopanib, Perifosine, Ponatinib, Regorafenib, Sorafenib, Sunitinib, Trametinib, and Vandetanib.
  • the binding agent comprises a membrane-targeted cleavable probe that becomes activated when it encounters a protease.
  • probes comprise a synthetic peptide substrate comprising a protease cleavage site coupled to a detectable label and a membrane targeting domain. Upon cleavage by a protease, the detectable label is deposited in cell membranes.
  • For a description of protease-activated peptide probes, see, e.g., Page et al. (2015) Nature Communications 6 (8448), Backes et al. (2000) Nat. Biotechnol. 18:187-193; herein incorporated by reference.
  • Detectable labels may be conjugated to binding agents by any suitable method.
  • the detectable label and binding agent may be directly linked, e.g., via a single bond, or indirectly linked e.g., through the use of a suitable linker, e.g., a polymer linker, a chemical linker, or one or more linking molecules or moieties.
  • attachment of the detectable label and binding agent may be by way of one or more covalent interactions.
  • the detectable label or binding agent may be functionalized, e.g., by addition or creation of a reactive functional group. Functionalized detectable labels or binding agents may be modified to contain any convenient reactive functional group for conjugation such as an amine functional group, a carboxylic functional group, a sulfhydryl group, a thiol functional group, and the like.
  • Any convenient method of bioconjugation may be used including, but not limited to, glutaraldehyde crosslinking, carbodiimide crosslinking, succinimide ester crosslinking, imidoester crosslinking, maleimide crosslinking, iodoacetamide crosslinking, benzidine crosslinking, periodate crosslinking, isothiocyanate crosslinking, and the like.
  • Such conjugation methods may optionally use a reactive sidechain group of an amino acid residue of the binding agent (e.g., a reactive side-chain group of a Lys, Cys, Ser, Thr, Tyr, His or Arg amino acid residue of the protein, i.e., a polypeptide linking group may be amino-reactive, thiol-reactive, hydroxyl-reactive, imidazolyl-reactive or guanidinyl-reactive).
  • a chemoselective reactive functional group may be utilized.
  • conjugation reagents that can be used include, but are not limited to, e.g., homobifunctional conjugation reagents (e.g., bis(2-[succinimidooxycarbonyloxy]ethyl) sulfone, 1,4-di-(3'-[2'-pyridyldithio]-propionamido)butane, disuccinimidyl suberate, disuccinimidyl tartrate, sulfodisuccinimidyl tartrate, dithiobis(succinimidyl propionate), 3,3'-dithiobis(sulfosuccinimidyl propionate), ethylene glycol bis(succinimidyl succinate), and the like), heterobifunctional conjugation reagents (e.g., m-maleimidobenzoyl-N-hydroxysuccinimide ester, m-maleimidobenzoyl-N
  • a functional linker refers to any suitable linker that has one or more functional groups for the attachment of one molecule to another.
  • the functional linker comprises an amino functional group, a thiol functional group, a hydroxyl functional group, an imidazolyl functional group, a guanidinyl functional group, an alkyne functional group, an azide functional group, or a strained alkyne functional group.
  • Further exemplary functional groups and methods of crosslinking and conjugation are described in, e.g., Hermanson, Bioconjugate Techniques (Academic Press, 3rd edition, 2013), herein incorporated by reference in its entirety.
  • imaging is performed with detectable labels conjugated to cancer-targeted binding agents.
  • the cancer-targeting agent may comprise, for example, an antibody, an antibody mimetic, a peptide, a peptoid, an aptamer, or a small molecule ligand that selectively binds to a tumor-specific antigen or a tumor-associated antigen on cancerous cells.
  • Multiplexed in situ imaging of a tissue can be used to simultaneously detect conjugates targeting different tumor antigens to allow imaging of multiple cell types in tumors.
  • Exemplary tumor-specific antigens and tumor-associated antigens include, without limitation, oncogene protein products, mutated or dysregulated tumor suppressor proteins, oncovirus proteins, oncofetal antigens, mutated or dysregulated differentiation antigens, overexpressed or aberrantly expressed cellular proteins (e.g., mutated or aberrantly expressed growth factors, mitogens, receptor tyrosine kinases, cytoplasmic tyrosine kinases, serine/threonine kinases and their regulatory subunits, G proteins, and transcription factors), and altered cell surface glycolipids and glycoproteins on cancerous cells.
  • tumor-specific antigens and tumor-associated antigens may include, without limitation, dysregulated or mutated RAS, WNT, MYC, ERK, TRK, CTAG1B, MAGEA1, Bcr-Abl, p53, c-Sis, epidermal growth factor receptor (EGFR), platelet-derived growth factor receptor (PDGFR), vascular endothelial growth factor receptor (VEGFR), HER2/neu, Src-family, Syk-ZAP-70 family proteins, and BTK family of tyrosine kinases, Abl, Raf kinase, cyclin-dependent kinases, alphafetoprotein (AFP), carcinoembryonic antigen (CEA), CA-125, MUC-1, epithelial tumor antigen (ETA), tyrosinase, melanoma-associated antigen (MAGE), and other abnormal or dysregulated proteins expressed on cancerous cells.
  • the cancer-targeted binding agent binds to
  • the tumor marker targeted by a binding agent is the urokinase plasminogen activator receptor (uPAR) or urokinase plasminogen activator (uPA).
  • a number of anti-uPAR antibodies are available including the 2G10 antibody, which inhibits the uPAR interaction with urokinase plasminogen activator, and the anti-uPAR antibody 3C6, which inhibits the association of uPAR with β1 integrin (see, e.g., LeBeau et al. (2013) Cancer Res. 73(7):2070-2081).
  • Anti-uPAR and anti-uPA antibodies can be conjugated to detectable labels for use in multiplexed imaging, according to the methods described herein, for detection of cancerous cells expressing uPAR or uPA, respectively, including, without limitation, those of breast cancer including triple-negative breast cancer, pancreatic cancer, prostate cancer, and melanoma.
  • the tumor marker targeted by a binding agent is PD-L1.
  • a number of anti-PD-L1 antibodies are commercially available including durvalumab, pembrolizumab, atezolizumab and avelumab.
  • Other anti-PD-L1 antibodies include C4 and DFO-C4 (see, e.g., Truillet C et al. (2018) Bioconjug. Chem. 29(1):96-103).
  • Such anti-PD-L1 antibodies can be conjugated to detectable labels for use in multiplexed imaging, according to the methods described herein, for detection of cancerous cells expressing PD-L1, including, without limitation, those of melanoma, lung cancer, including non-small cell lung cancer (NSCLC) and small cell lung cancer (SCLC), head and neck cancer, Hodgkin lymphoma, stomach cancer, prostate cancer, bladder cancer, urothelial carcinoma, breast cancer including triple-negative breast cancer (TNBC), hepatocellular carcinoma (HCC), Merkel cell carcinoma, and renal cell carcinoma.
  • the tumor marker targeted by a binding agent is the epidermal growth factor receptor (EGFR).
  • a number of anti-EGFR antibodies are available including panitumumab, cetuximab, zalutumumab, nimotuzumab, and matuzumab, which can be conjugated to detectable labels for use in multiplexed imaging, according to the methods described herein, for detection of cancerous cells expressing EGFR, including, without limitation, those of head and neck cancer, colorectal cancer, lung cancer, ovarian cancer, breast cancer, endometrial cancer, cervical cancer, bladder cancer, gastric cancer, and esophageal cancer.
  • a number of small molecule drugs are also available that target EGFR including, without limitation, gefitinib, erlotinib, lapatinib, sorafenib, and vandetanib, which can be conjugated to detectable labels for use in multiplexed imaging of cancerous cells expressing EGFR, according to the methods described herein.
  • the tumor marker targeted by a binding agent is HER2.
  • a number of anti-HER2 antibodies are also available including trastuzumab, pertuzumab, and margetuximab, which can be conjugated to detectable labels for use in multiplexed imaging, according to the methods described herein, for detection of cancerous cells expressing HER2, including, without limitation, those of breast cancer, ovarian cancer, stomach cancer, lung cancer, uterine cancer, gastric cancer, colon cancer, head and neck cancer, and salivary duct carcinoma.
  • a number of small molecule drugs are also available that target HER2 including, without limitation, lapatinib and neratinib, which can be conjugated to detectable labels for use in multiplexed imaging of cancerous cells expressing HER2, according to the methods described herein.
  • the tumor marker targeted by a binding agent is the epithelial cell adhesion molecule (EpCAM) 17-1A.
  • a number of anti-EpCAM 17-1A antibodies are also available including edrecolomab, catumaxomab, and nofetumomab, which can be conjugated to detectable labels for use in multiplexed imaging of cancerous cells expressing EpCAM 17-1A to detect cancerous cells in epithelial-derived neoplasms and various carcinomas, such as lung cancer, gastrointestinal cancer, breast cancer, ovarian cancer, pancreatic cancer, renal cancer, cervical cancer, colorectal cancer, and bladder cancer.
  • the tumor marker targeted by a binding agent is CD20.
  • a number of anti-CD20 antibodies are also available including rituximab, tositumomab, ocrelizumab, obinutuzumab, ocaratuzumab, ofatumumab, ibritumomab tiuxetan, ublituximab, and veltuzumab, which can be conjugated to detectable labels for use in multiplexed imaging, according to the methods described herein, for detection of cancerous cells expressing CD20, including, without limitation, those of lymphoma such as, but not limited to, marginal zone lymphoma, Hodgkin's lymphoma, non-Hodgkin's lymphoma; leukemia such as, but not limited to, chronic lymphocytic leukemia, acute lymphoblastic leukemia, myelogenous leukemia, and chemotherapy-resistant hairy cell leukemia; and thyroid cancer.
  • the tumor marker targeted by a binding agent is CD52.
  • a number of anti-CD52 antibodies are also available including alemtuzumab, which can be conjugated to detectable labels for use in multiplexed imaging, according to the methods described herein, for detection of cancerous cells expressing CD52, including, without limitation, those of lymphoma such as, but not limited to, cutaneous T-cell lymphoma (CTCL) and T-cell lymphoma and chronic lymphocytic leukemia (CLL).
  • the tumor marker targeted by a binding agent is CD22.
  • anti-CD22 antibodies are also available including inotuzumab, which can be conjugated to detectable labels for use in multiplexed imaging, according to the methods described herein, for detection of cancerous cells expressing CD22, including, without limitation, those of leukemia such as, but not limited to, lymphoblastic leukemia and hairy cell leukemia; lymphoma, and lung cancer.
  • the tumor antigen targeted by a binding agent is CD19.
  • a number of anti-CD19 antibodies are also available including blinatumomab, MEDI-551 and MOR-208, which can be conjugated to detectable labels for use in multiplexed imaging, according to the methods described herein, for detection of cancerous cells expressing CD19, including, without limitation, those of B-cell neoplasms, non-Hodgkin lymphoma (NHL), chronic lymphocytic leukemia (CLL), acute lymphoblastic leukemia (ALL), and multiple myeloma (MM).
  • the tumor marker targeted by a binding agent is carcinoembryonic antigen (CEA).
  • a number of anti-CEA antibodies are available including arcitumomab, which can be conjugated to detectable labels for use in multiplexed imaging, according to the methods described herein, for detection of cancerous cells expressing CEA, including, without limitation, those of colorectal carcinoma, gastric carcinoma, pancreatic carcinoma, lung carcinoma, breast carcinoma, and medullary thyroid carcinoma.
  • the tumor marker targeted by a binding agent is prostate-specific membrane antigen (PSMA).
  • a number of anti-PSMA antibodies are available including capromab, PSMA30 nanobody, and IAB2M minibody, which can be conjugated to detectable labels for use in multiplexed imaging, according to the methods described herein, for detection of cancerous cells expressing PSMA, including, without limitation, those of prostate cancer.
  • a number of small molecule drugs are also available that target PSMA including, without limitation, zinc binding compounds linked to a glutamate isostere or glutamate, phosphonate-, phosphate-, and phosphoramidates and ureas, fluciclovine (Axumin), MIP-1072, MIP-1095, and N-(N-((S)-1,3-dicarboxypropyl)carbamoyl)-4-(18F)fluorobenzyl-L-cysteine (18F-DCFBC), which can be conjugated to detectable labels for use in multiplexed imaging of cancerous cells expressing PSMA, according to the methods described herein.
  • the tumor marker targeted by a binding agent is the folate receptor (FR).
  • a number of anti-FR antibodies are available including farletuzumab and m909, which can be conjugated to detectable labels for use in multiplexed imaging, according to the methods described herein, for detection of cancerous cells expressing FR, including, without limitation, those of ovarian cancer, breast cancer, lung cancer, pleura cancer, cervical cancer, endometrial cancer, kidney cancer, bladder cancer and brain cancer.
  • the small molecule folate can also be conjugated to detectable labels for use in multiplexed imaging of cancerous cells expressing FR, according to the methods described herein.
  • the tumor marker targeted by a binding agent is a matrix metalloproteinase (MMP), including, without limitation, MMP1, MMP3, MMP7, MMP9, MMP10, MMP11, MMP12, MMP13, and MMP14.
  • anti-MMP antibodies are available, which can be conjugated to detectable labels for use in multiplexed imaging, according to the methods described herein, for detection of cancerous cells expressing MMPs, including, without limitation, those of ovarian cancer, breast cancer, lung cancer, prostate cancer, stomach cancer, thyroid cancer, skin cancer, brain cancer, kidney cancer, colon cancer, bladder cancer, esophageal cancer, endometrial cancer, hepatocellular cancer, and head and neck cancer.
  • Endogenous glycoprotein inhibitors such as tissue inhibitors of metalloproteinases (TIMPs), including TIMP-1, TIMP-2, TIMP-3, and TIMP-4, as well as a number of small molecule drugs are available that target MMPs including, without limitation, doxycycline, marimastat (BB-2516), and cipemastat, which can be conjugated to detectable labels for use in multiplexed imaging of cancerous cells expressing MMPs, according to the methods described herein.
  • the binding agent selectively binds to an immune activation marker, which may include adaptive immunity activation markers and innate immunity activation markers.
  • immune activation markers include, without limitation, B220, CTLA-4, PD-1, CD1c, CD3, CD5, CD8, CD11b, CD11c, CD13, CD14, CD16, CD18, CD20, CD21, CD23, CD25, CD27, CD28, CD32, CD38, CD40, CD41, CD43, CD44, CD45RA, CD45RO, CD54, CD56, L-selectin (CD62L), CD63, CD66b, CD68, CD69, CD80, CD83, CD86, CD88, CD95, CD107a, CD161, CD163, CD164, granzymes, perforin, IL-1, IL-1β, IL-2, IL-4, IL-5, IL-6, IL-8, IL-10, IL-12, IL-13, IL-17,
  • the tissue sample may be stained using a cytological stain, either before or after performing multiplexed in situ imaging of the cellular markers.
  • the stain may be, for example, phalloidin, gadodiamide, acridine orange, Bismarck brown, carmine, Coomassie blue, cresyl violet, crystal violet, DAPI, hematoxylin, eosin, ethidium bromide, acid fuchsine, haematoxylin, Hoechst stains, iodine, malachite green, methyl green, methylene blue, neutral red, Nile blue, Nile red, osmium tetroxide (formal name: osmium tetraoxide), rhodamine, safranin, phosphotungstic acid, osmium tetroxide, ruthenium tetroxide, ammonium moly
  • the stain may be specific for any feature of interest, such as a protein or class of proteins, phospholipids, DNA (e.g., dsDNA, ssDNA), RNA, an organelle (e.g., cell membrane, mitochondria, endoplasmic reticulum, Golgi body, nuclear envelope, and so forth), or a compartment of the cell (e.g., cytosol, nuclear fraction, and so forth).
  • the stain may enhance contrast or imaging of intracellular or extracellular structures.
  • the sample may be stained with haematoxylin and eosin (H&E).
  • Kits are also provided for carrying out the methods described herein.
  • the kit comprises software for carrying out the computer implemented methods for identifying cell types and spatial locations of cells within a tissue sample, as described herein.
  • the kit may comprise a non-transitory computer-readable medium and instructions for identifying cell types and spatial locations of cells within a tissue sample, as described herein.
  • the kit comprises a system comprising a processor programmed to identify cell types and spatial locations of cells within a tissue sample according to a computer implemented method described herein; and a display component for displaying information regarding the identified cell types and the spatial locations of the identified cells within the tissue sample.
  • the kit may also include reagents for performing multiplexed in situ imaging of a tissue such as detectably labeled binding agents that specifically bind to cellular markers of interest in the tissue sample.
  • the kit comprises reagents for performing multiplexed fluorescence imaging, multiplexed immunofluorescence imaging, multiplexed immunohistochemistry, multiplexed immunohistochemistry/immunofluorescence (mIHC/IF) imaging, multiplexed ion beam imaging (MIBI), mass spectrometry imaging (MSI), multi-epitope-ligand cartography (MELC), multiplex modified hapten-based imaging, or co-detection by indexing (CODEX) imaging.
  • the kit may include fluorescently labeled antibodies, DNA-tagged antibodies, or metal-isotope-tagged antibodies that specifically bind to cellular markers of interest.
  • the kit further comprises an imaging device to perform multiplexed in situ imaging of the tissue.
  • the imaging device may include, without limitation, a fluorescence microscope, a confocal microscope, a laser-scanning microscope, a high-resolution laser ablation system, a mass spectrometer, a charge-coupled device (CCD), an active-pixel sensor (APS), or a CMOS sensor.
  • kits may further include (in certain embodiments) instructions for practicing the subject methods.
  • These instructions may be present in the subject kits in a variety of forms, one or more of which may be present in the kit.
  • instructions may be present as printed information on a suitable medium or substrate, e.g., a piece or pieces of paper on which the information is printed, in the packaging of the kit, in a package insert, and the like.
  • Another form of these instructions is a computer readable medium, e.g., diskette, compact disk (CD), flash drive, and the like, on which the information has been recorded.
  • Yet another form of these instructions that may be present is a website address which may be used via the internet to access the information at a removed site.
  • the methods described herein provide accurate and reliable cell type assignments to cells within a tissue.
  • the subject methods further enable statistically significant cell-type co-localization patterns to be determined and tissue architectures to be defined.
  • Tissue architectures can be associated with tissue development, disease progression, and treatment response and may have clinical relevance.
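  • The text above does not say which statistical test is used to call a co-localization pattern significant. As a minimal sketch, assuming a label-permutation test on cell centroids (one common choice, not necessarily the one used by the subject methods), the idea can be illustrated as follows. The function names, the radius, and the toy coordinates are all hypothetical.

```python
import math
import random

def pair_count(cells, type_a, type_b, radius):
    """Count pairs of cells of types (type_a, type_b) lying within `radius` of each other.

    cells: list of (x, y, cell_type) tuples.
    """
    count = 0
    for i, (xi, yi, ti) in enumerate(cells):
        for xj, yj, tj in cells[i + 1:]:
            if {ti, tj} == {type_a, type_b} and math.dist((xi, yi), (xj, yj)) <= radius:
                count += 1
    return count

def colocalization_p_value(cells, type_a, type_b, radius, n_perm=999, seed=0):
    """Permutation test: shuffle cell-type labels over fixed positions.

    Returns (observed pair count, one-sided p-value). The p-value is the share
    of shuffles whose pair count is at least as large as the observed count.
    """
    rng = random.Random(seed)
    observed = pair_count(cells, type_a, type_b, radius)
    labels = [t for _, _, t in cells]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        shuffled = [(x, y, t) for (x, y, _), t in zip(cells, labels)]
        if pair_count(shuffled, type_a, type_b, radius) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)

# Toy data (hypothetical): three tumor / T cell pairs sitting side by side.
cells = [
    (0, 0, "tumor"), (1, 0, "T cell"),
    (10, 0, "tumor"), (11, 0, "T cell"),
    (20, 0, "tumor"), (21, 0, "T cell"),
    (30, 0, "stroma"), (31, 0, "stroma"),
]
observed, p_value = colocalization_p_value(cells, "tumor", "T cell", radius=1.5)
```

    Shuffling labels while keeping positions fixed preserves the tissue geometry, so a small p-value reflects how the cell types are arranged rather than how densely the cells are packed.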
  • the subject methods include displaying a listing of cell types and/or cell lineages identified in a tissue sample.
  • the subject methods include displaying cell type labels superimposed on an image of the tissue based on the spatial locations determined for individual cells in the tissue.
  • the cell type labels may be color coded to differentiate different cell types or different cell lineages.
  • the multiplexed in situ images analyzed by the methods described herein may be viewed side-by-side or, in some embodiments, the images may be superimposed or combined. In some cases, the images may be in color, where the colors used in the images may correspond to the color of the detectable labels used to stain the cellular markers in the tissue.
  • the methods described herein find general use in a wide variety of applications for analysis of any tissue sample (e.g., in the analysis of tissue specimens, tissue sections, whole tissues or parts thereof, tissue arrays, etc.).
  • the method may be used to analyze any tissue, including tissue that has been clarified, e.g., through lipid elimination, for example.
  • the sample may be prepared using expansion microscopy methods (see, e.g., Chozinski et al. Nature Methods 2016 13: 485-488), which involves creating polymer replicas of a biological system created through selective co-polymerization of organic polymer and cell components.
  • the method may have many biomedical applications in high throughput screening and drug discovery and the like. Further, the method has a variety of clinical applications, including, but not limited to, diagnostics, prognostics, disease stratification, personalized medicine, clinical trials and drug accompanying tests.
  • the sample may be a section of a tissue biopsy obtained from a patient.
  • Biopsies of interest include both tumor and non-neoplastic biopsies of skin (melanomas, carcinomas, etc.), soft tissue, bone, breast, colon, liver, kidney, adrenal, gastrointestinal, pancreatic, gall bladder, salivary gland, cervical, ovary, uterus, testis, prostate, lung, thymus, thyroid, parathyroid, pituitary (adenomas, etc.), brain, spinal cord, ocular, nerve, and skeletal muscle, etc.
  • the detectably labeled binding agents used for multiplexed in situ imaging of cellular markers in tissue may bind to any type of molecule, including proteins, lipids, polysaccharides, proteoglycans, metabolites, or the like.
  • the binding agents specifically bind to cellular markers, including cancer markers, that may be proteinaceous.
  • Exemplary cancer markers include, but are not limited to carcinoembryonic antigen (for identification of adenocarcinomas), cytokeratins (for identification of carcinomas but may also be expressed in some sarcomas), CD15 and CD30 (for Hodgkin's disease), alpha fetoprotein (for yolk sac tumors and hepatocellular carcinoma), CD117 (for gastrointestinal stromal tumors), CD10 (for renal cell carcinoma and acute lymphoblastic leukemia), prostate specific antigen (for prostate cancer), estrogens and progesterone (for tumour identification), CD20 (for identification of B-cell lymphomas) and CD3 (for identification of T-cell lymphomas).
  • the above-described methods can be used to analyze tissue from a subject to determine, for example, whether the tissue is normal or not, or to determine whether the tissue is responding to a treatment.
  • the method may be employed to determine the degree of dysplasia of cells within a tissue.
  • a tissue sample may be isolated from an individual, e.g., from a soft tissue.
  • the method may be used to distinguish different types of cancer cells in FFPE tissue samples.
  • the methods described above find particular utility in examining tissue samples using a plurality of antibodies, each antibody recognizing a different cellular marker.
  • the method may involve obtaining a multiplexed in situ image as described above (an electronic form of which may have been forwarded from a remote location), and the image may be analyzed using the methods described herein by a doctor or other medical professional to determine whether a patient has abnormal cells (e.g., cancerous cells) or which type of abnormal cells are present.
  • the image may be used as a diagnostic to determine whether the subject has a disease or condition, e.g., a cancer.
  • the method may be used to determine the stage of a cancer, subtype a cancer, identify metastasized cells, or to monitor a patient's response to a treatment.
  • data can be forwarded to a "remote location", where "remote location" means a location other than the location at which the image is generated.
  • a remote location could be another location (e.g., office, lab, etc.) in the same city, another location in a different city, another location in a different state, another location in a different country, etc.
  • the two items can be in the same room but separated, or at least in different rooms or different buildings, and can be at least one mile, ten miles, or at least one hundred miles apart.
  • Communicating information refers to transmitting the data representing that information as electrical signals over a suitable communication channel (e.g., a private or public network).
  • Forwarding an item refers to any means of getting that item from one location to the next, whether by physically transporting that item or otherwise (where that is possible) and includes, at least in the case of data, physically transporting a medium carrying the data or communicating the data. Examples of communicating media include radio or infra-red transmission channels as well as a network connection to another computer or networked device, and the internet, including email transmissions and information recorded on websites and the like.
  • the image may be analyzed by an MD or other qualified medical professional, and a report based on the results of the analysis of the image may be forwarded to the patient from which the sample was obtained.
  • the method may be employed in a variety of diagnostic, drug discovery, and research applications that include, but are not limited to, diagnosis or monitoring of a disease or condition (where the image identifies a marker or cell type associated with the disease or condition), discovery of drug targets (where the marker or cell type in the image may be targeted for drug therapy), drug screening (where the effects of a drug are monitored by a marker, cell type, or tissue substructure identified from analysis of the image of a tissue sample), determining drug susceptibility (where drug susceptibility is associated with a marker, cell type, or tissue substructure identified from analysis of the image of a tissue sample) and basic research (where it is desirable to measure changes in tissue cell types, cell co-localization patterns, or architecture associated with, for example, development, disease progression, environmental changes, or exposure to chemicals, toxins, or pathogens).
  • two different tissue samples may be compared using the above methods.
  • the different samples may be composed of an "experimental" sample, i.e., a sample of interest, and a "control" sample to which the experimental sample may be compared.
  • the different tissue samples are analyzed to determine the presence, number, or spatial location of a cell type of interest, e.g., an abnormal cell or a normal cell.
  • Cell types of interest may include, for example, abnormal cells of a tissue biopsy (e.g., from a tissue having a disease such as cancer or infected with a pathogen, etc.) and normal cells from the same tissue, usually from the same patient; cells infected with a pathogen, or treated (e.g., with environmental or chemical agents such as peptides, hormones, altered temperature, growth condition, physical stress, cellular transformation, etc.), and a normal cell from a tissue (e.g., a tissue that is otherwise identical to the experimental tissue except that it is not cancerous, infected, or treated, etc.); a cell in tissue isolated from a mammal with a cancer, a disease, a geriatric mammal, or a mammal exposed to a condition, and a cell in tissue from a mammal of the same species, preferably from the same family, that is healthy or young; and differentiated cells and non-differentiated cells from tissue of the same mammal (e.g., one cell
  • Tissue from any organism, e.g., from plants and animals, such as fish, birds, reptiles, amphibians and mammals, may be used in the subject methods.
  • mammalian tissue i.e., tissue from mice, rabbits, primates, or humans, or cultured derivatives thereof, may be used.
  • a method for identifying cell types and spatial locations of cells within a tissue sample comprising: performing multiplexed in situ imaging of a plurality of cellular markers in the tissue to produce an image; segmenting the image to generate a plurality of image segments, wherein each image segment contains an image of a single cell of the tissue, wherein the spatial location of the single cell in each image segment is quantified by X and Y coordinates, and wherein an expression profile of the single cell in each image segment is determined from the in situ imaging of the plurality of cellular markers by analysis of the segmented image; comparing the expression profile of the single cell in each image segment to reference expression profiles for the plurality of cellular markers in an initial cell-type signature matrix, wherein the initial cell-type signature matrix defines a plurality of known cell types based on prior knowledge of cellular markers known to be expressed in specific cell types, wherein for each cell type, the initial cell-type signature matrix indicates whether a cellular marker is expressed or not expressed in that cell type; using a marker scoring function to assess how well the
  • imaging comprises imaging the tissue using fluorescence microscopy, confocal microscopy, two-photon microscopy, multi-photon microscopy, light-field microscopy, expansion microscopy, or light sheet microscopy.
  • imaging is multiplexed fluorescence imaging, multiplexed immunofluorescence imaging, multiplexed immunohistochemistry, multiplexed immunohistochemistry/immunofluorescence (mIHC/IF) imaging, multiplexed ion beam imaging (MIBI), mass spectrometry imaging (MSI), multi-epitope-ligand cartography (MELC), multiplex modified hapten-based imaging, co-detection by indexing (CODEX) imaging, or NanoString digital spatial profiling (DSP).
  • the labeled primary antibody or the labeled secondary antibody comprises a fluorescent label, a chromogenic label, or a metal-isotope label.
  • tissue sample is a biopsy or surgical tissue specimen.
  • tissue sample comprises live tissue, fixed tissue, or permeabilized tissue.
  • the fixed tissue is a formalin-fixed, paraffin-embedded (FFPE) tissue section.
  • a computer implemented method for identifying cell types and spatial locations of cells within a tissue sample comprising: receiving a multiplexed in situ image of a plurality of cellular markers in the tissue sample; segmenting the image to generate a plurality of image segments, wherein each image segment contains an image of a single cell of the tissue, wherein the spatial location of the single cell in each image segment is quantified by X and Y coordinates, and wherein an expression profile of the single cell in each image segment is determined from the in situ imaging of the plurality of cellular markers by analysis of the segmented image; providing an initial cell-type signature matrix comprising reference expression profiling data for a plurality of known cell types, wherein the initial cell-type signature matrix defines a plurality of known cell types based on prior knowledge of cellular markers known to be expressed in specific cell types, wherein for each cell type, the initial cell-type signature matrix indicates whether a cellular marker is expressed or not expressed in that cell type; using a marker scoring function to assess how well the expression profile of the single
  • the image is produced by multiplexed fluorescence imaging, multiplexed immunofluorescence imaging, multiplexed immunohistochemistry, multiplexed immunohistochemistry/immunofluorescence (mIHC/IF) imaging, multiplexed ion beam imaging (MIBI), mass spectrometry imaging (MSI), multi-epitope-ligand cartography (MELC), multiplex modified hapten-based imaging, co-detection by indexing (CODEX) imaging, or NanoString digital spatial profiling (DSP).
  • a kit comprising the non-transitory computer-readable medium of aspect 38 and instructions for identifying cell types and spatial locations of cells within a tissue sample.
  • kit of aspect 39 further comprising an imaging device to perform multiplexed in situ imaging of the tissue sample.
  • the imaging device is a fluorescence microscope, a confocal microscope, a laser-scanning microscope, a high-resolution laser ablation system, a mass spectrometer, a charge-coupled device (CCD), an active-pixel sensor (APS), or a CMOS sensor.
  • kit of any one of aspects 39-41 further comprising reagents for performing multiplexed fluorescence imaging, multiplexed immunofluorescence imaging, multiplexed immunohistochemistry, multiplexed immunohistochemistry/immunofluorescence (mIHC/IF) imaging, multiplexed ion beam imaging (MIBI), mass spectrometry imaging (MSI), multi-epitope-ligand cartography (MELC), multiplex modified hapten-based imaging, or co-detection by indexing (CODEX) imaging.
  • kit of any one of aspects 39-42 further comprising fluorescently labeled antibodies, DNA-tagged antibodies, or metal-isotope-tagged antibodies that specifically bind to cellular markers of interest.
  • a system comprising: a processor programmed to identify cell types and spatial locations of cells within a tissue sample according to the computer implemented method of any one of aspects 25-37; and a display component for displaying information regarding the identified cell types and the spatial locations of the identified cells within the tissue sample.
  • the imaging device is a fluorescence microscope, a confocal microscope, a laser-scanning microscope, a high-resolution laser ablation system, a mass spectrometer, a charge-coupled device (CCD), an active-pixel sensor (APS), or a CMOS sensor.
  • the system of any one of aspects 44-48 further comprising reagents for performing multiplexed fluorescence imaging, multiplexed immunofluorescence imaging, multiplexed immunohistochemistry, multiplexed immunohistochemistry/immunofluorescence (mIHC/IF) imaging, multiplexed ion beam imaging (MIBI), mass spectrometry imaging (MSI), multi-epitope-ligand cartography (MELC), multiplex modified hapten-based imaging, or co-detection by indexing (CODEX) imaging.
  • CELESTA: CELl typE identification with SpaTiAl information
  • CELESTA is a robust and fast (on the order of minutes) algorithm for cell type identification that assigns individual cells to their most likely cell types through an optimization framework leveraging prior knowledge in a transparent manner.
  • CODEX: CO-Detection by indEXing
  • CODEX is an immunofluorescence-based imaging technology that could quantify over fifty proteins, across tens of thousands of cells in a tissue slice.
  • To evaluate CELESTA's performance against extant methods, we applied CELESTA to a published CODEX dataset generated on colorectal cancer, where cell type identification, which we adopted as the gold standard, was based on clustering and manual assessment by a pathologist 6.
  • CELESTA provides cell type assignments comparable to the gold standard, in a manner that can be robustly evaluated.
  • A typical image analysis pipeline often starts with segmenting pixel-based images into cells, followed by cell type identification and spatial analysis (FIG. 1A).
  • CELESTA first assigns cell types to cells whose marker expressions match prior knowledge of cell-type marker expressions; these cells are defined as “anchor cells”. Remaining cells, whose marker expressions do not clearly associate with a cell type, are referred to as “non-anchor cells”.
  • For each non-anchor cell, CELESTA uses the cell’s neighboring cell-type information, in addition to the cell’s marker expressions, to identify the cell type. Because cells are organized in coherent spatial patterns, we reason that spatial location is valuable information, in addition to marker expressions, for inferring cell type.
  • CELESTA uses an iterative optimization framework to assign cell types for non-anchor cells (FIGS. 1B-1C).
  • CELESTA (a flowchart of the algorithm is shown in FIG. 2A) requires two main inputs.
  • the first input is an image segmented into individual cells. Each cell is defined by its marker expressions and spatial location (FIG. 2B).
  • CELESTA determines whether a marker is over- or under-expressed in a given cell by fitting a two-mode Gaussian mixture model to the marker expression distribution (FIG. 7B) derived from the cells in a sample 17.
  • CELESTA converts marker expression into a probability using a sigmoid function, where the expression levels are scaled between 0 and 1 and the midpoint is the intersection of two-mode Gaussian distributions.
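  • As a rough illustration of this conversion, the Python sketch below locates the point between the two fitted Gaussian modes where their weighted densities are equal and uses it as the midpoint of a sigmoid. It assumes the mixture has already been fitted and that expression is scaled to [0, 1]. The function names, the bisection search, the use of mixture weights at the intersection, and the `slope` parameter are assumptions made for illustration; they are not taken from the CELESTA program.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def component_intersection(w_low, mean_low, var_low, w_high, mean_high, var_high, tol=1e-9):
    """Bisection search, between the two means, for the point where the
    weighted densities of the low and high components are equal."""
    def diff(x):
        return (w_high * gaussian_pdf(x, mean_high, var_high)
                - w_low * gaussian_pdf(x, mean_low, var_low))

    lo, hi = mean_low, mean_high
    if diff(lo) >= 0 or diff(hi) <= 0:
        raise ValueError("components do not cross between their means")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if diff(mid) < 0:
            lo = mid   # low component still dominates: move right
        else:
            hi = mid   # high component dominates: move left
    return (lo + hi) / 2.0


def expression_probability(x, midpoint, slope=10.0):
    """Sigmoid centred on `midpoint`; `slope` is a hypothetical steepness."""
    return 1.0 / (1.0 + math.exp(-slope * (x - midpoint)))


# Hypothetical fitted parameters for one marker (expression scaled to [0, 1]).
midpoint = component_intersection(0.5, 0.2, 0.01, 0.5, 0.8, 0.01)
prob_high = expression_probability(0.9, midpoint)
```

With a symmetric mixture such as the hypothetical one above, the midpoint falls halfway between the two means, and a cell whose scaled expression lies well above it receives a probability close to 1.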
  • the second input to CELESTA is a cell-type-signature matrix that relies on prior knowledge of markers known to have high or low expression in specific cell types.
  • the cell-type-signature matrix is initialized with 1 if the marker has a high probability of expression for a given cell type and with 0 if it has a low probability of expression.
  • a marker is denoted as “NA” if considered irrelevant for cell type identification.
  • the cell-type-signature matrix is updated as more cells are assigned.
  • An example initial cell-type-signature matrix is provided in Table 1, with the final cell-type-signature matrix in FIG. 7C.
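  • One way to picture such a matrix in code is a nested mapping from cell type to marker, with 1 for high expression, 0 for low expression, and `None` standing in for “NA”. The rows below are a made-up toy example built from markers named elsewhere in this description; they are not the contents of Table 1, and the helper name `markers_by_role` is hypothetical.

```python
# Toy cell-type-signature matrix: 1 = expected high, 0 = expected low,
# None = "NA" (marker considered irrelevant for that cell type).
SIGNATURE = {
    "endothelial cell": {"CD31": 1, "Cytokeratin": 0, "CD3": 0, "CD4": None},
    "tumor cell":       {"CD31": 0, "Cytokeratin": 1, "CD3": 0, "CD4": None},
    "T cell":           {"CD31": 0, "Cytokeratin": 0, "CD3": 1, "CD4": None},
}


def markers_by_role(signature, cell_type):
    """Split one row of the matrix into high, low and ignored markers."""
    row = signature[cell_type]
    high = sorted(m for m, v in row.items() if v == 1)
    low = sorted(m for m, v in row.items() if v == 0)
    ignored = sorted(m for m, v in row.items() if v is None)
    return high, low, ignored


high, low, ignored = markers_by_role(SIGNATURE, "tumor cell")
```

Keeping “NA” distinct from 0 matters here: a 0 asserts that the marker should be low, whereas “NA” removes the marker from consideration for that cell type.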
  • CELESTA matches a cell’s marker expression probability profile to the cell-type signatures using a marker-scoring function (Methods and FIG. 2C).
  • CELESTA assigns the corresponding cell type to that cell and defines it as an “anchor cell”.
  • For a cell whose cell type cannot be identified using marker expressions alone (“non-anchor cell”), CELESTA leverages cell-type information from its N-nearest spatial neighbors (FIG. 2D) using a spatial-scoring function that utilizes the Potts model energy function.
  • the Potts model has been used for image segmentation 18-20, clustering of spatial transcriptomics data 21, and analysis of pathological images 22.
  • CELESTA represents each non-anchor cell as a node in an undirected graph with edges connecting to its N-nearest neighbors.
  • CELESTA associates each node with a hidden state, which is the cell type to be inferred, and assumes the joint distribution of the hidden states satisfies a discrete Markov random field (FIG. 2E).
  • CELESTA employs a pseudo-EM algorithm using mean field approximation 23. In each iteration, if the thresholds are met (Methods), cell types with maximum probabilities are assigned to the non-anchor cells.
  • CELESTA re-evaluates the cell on the next iteration as additional neighboring cells have been assigned. The process is repeated until a user-defined convergence threshold is met, whereupon unassigned cells are labeled as “unknown.”
  • CELESTA introduces a “cell-type resolution” strategy whereby cell type assignment is performed in multiple rounds where each round increases cell-type resolution based on known cell lineages (FIG. 2F). This strategy reduces computational complexity and improves robustness when cell types from different lineages share marker expressions. More details on CELESTA are provided in Methods.
  • CELESTA achieved average accuracy scores (Rand Index) around 0.9, average precisions between 0.6 and 0.8, and F1 scores between 0.6 and 0.7 across the major cell types (FIG. 3D). For rare populations, CELESTA achieved average precision and F1 scores between 0.4 and 0.6. Notably, two clusters are assigned as cell-type mixtures in the benchmarked annotations (FIGS. 3E-3F); for cells in these two clusters, CELESTA assigned cell types consistent with canonical marker expression patterns.
  • To evaluate the mismatched assignments, a confusion matrix comparing the cell types between CELESTA and benchmarked assignments is shown in FIG. 8. While there is high agreement between CELESTA and the benchmark annotations for most cell types, we found that tumor cells assigned in the benchmarked annotations but not by CELESTA expressed little to no cytokeratin, which is the tumor-specific marker defined in CELESTA's cell-type-signature matrix. CELESTA assigned the majority (around 80%) of those cells to the unknown category (FIG. 8). It is possible that cell morphology from the H&E images was used to identify low-cytokeratin-expressing malignant cells. While the benchmarked annotations included morphological features, CELESTA does not use morphology in its current implementation.
  • CELESTA applied to primary HNSCC tumors imaged by CODEX.
  • the input cell-type-signature matrix is shown in Table 4.
  • CELESTA achieved an adjusted Rand index between 0.6 and 0.9 (FIG. 4D). Due to imaging artifacts and lower tissue quality, cell type identification was harder in some samples. In terms of cell-type compositions, CELESTA and gating were highly correlated (FIG. 4E). CELESTA achieved average F1 scores around 0.7 and accuracy scores around 0.9 for malignant, endothelial and T cells (FIG. 4F). For T cell subtypes, CELESTA achieved average F1 scores around 0.55 (FIG. 4F).
  • FIG. 5B Representative CODEX images illustrate that FOXP3 (Treg marker) is more co-localized with cytokeratin (tumor marker) staining, and CD4 and CD8 (T cell markers) are more co-localized with CD31 (endothelial marker) staining, in N+ vs. N0 HNSCC (FIGS. 5D-5E).
  • CELESTA, an unsupervised machine learning method for facilitating cell type identification on multiplexed images.
  • CELESTA can process a tissue sample with 100K cells on the order of minutes on a typical laptop.
  • CELESTA converts marker expressions into probabilities to facilitate assessing whether a marker expression is high or low in a cell, thereby reducing subjectiveness for this step found in most existing methods.
  • CELESTA incorporates the cell’s spatial information. We showed cells with the same cell types were enriched in each other’s nearest spatial neighborhoods. Such important information is often ignored in cell type identification.
  • CELESTA uses a pseudo-EM algorithm for iterative cell-type assignment. CELESTA is not based on manual gating or clustering and instead assigns the cell type to individual cells based on probabilities, preserving single-cell resolution.
  • CELESTA uses a “cell-type resolution” strategy that incorporates cell lineage information, and improves computational speed and robustness when cell types from different lineages have shared markers. Users define the inputs required by CELESTA, and the effect of these inputs can be transparently evaluated through sensitivity analyses. While our current analysis prioritized accuracy over the number of cells classified, the users can choose the parameters that trade-off accuracy and quantity of cells classified.
  • CELESTA was applied to images generated on CODEX, but CELESTA could also be extended to other imaging platforms.
  • CELESTA requires segmented cells as input and thereby relies on the performance of the segmentation algorithm. For rare cell types, because their neighborhoods could be enriched with a different, more abundant cell type, we recommend using smaller neighborhood sizes (5 cells or fewer). Technical artifacts from the imaging platform could add noise to the marker expression 39,40; in such cases, some manual intervention may still be needed after CELESTA's fast assessment.
  • CELESTA relies on markers in the user-defined cell-type-signature matrix. A poorly informed cell-type-signature matrix will negatively impact the results, as would mislabeling a cell cluster. In addition, too few anchor cells assigned for a cell type may not provide enough spatial information to identify non-anchor cells for that cell type.
  • CELESTA does not account for morphological features. Despite this, CELESTA demonstrated relatively low misclassification rates compared to cell-type assignments of a pathologist whose assessment included morphological features from H&E images. Future additions to improve CELESTA could include morphological features.
  • CELESTA as a fast and robust cell type identification method for multiplexed in situ images.
  • the scRNA-seq data are deposited at GEO: GSE140042.
  • HNSCC imaging data are hosted at Synapse.org SageBionetworks at https://doi.org/10.7303/syn26242593.
  • the benchmark public imaging data can be found at doi.org/10.7937/tcia.2020.fqn0-0326.
  • Marker-scoring function assesses how well a cell’s marker expression profile matches the cell-type markers defined by the cell-type-signature matrix. To apply the marker-scoring function, we first need to quantify whether a marker has high or low expression in a cell. We apply a two-mode Gaussian mixture model to fit each marker’s expressions across the cells in a sample:
  • M: total number of markers
  • x_m: expressions across cells for marker m
  • φ: the mixing probabilities, which sum up to one
  • μ: the mean
  • σ²: the variance
  • CELESTA assigns the corresponding cell type to that cell and defines it as an “anchor cell”. For example, with the cell type probability threshold set to 0.5 and the high and low expression probability thresholds set to 0.7 and 0.3, a cell must have a marker score of 0.5 or greater in Equation 5 to be a tumor cell. In addition, it needs to have a cytokeratin expression probability of 0.7 or greater, and the expression probabilities for all other measured markers need to be 0.3 or lower.
  • the high and low thresholds for expression probability provide the user flexibility to reduce artifacts due to, for example, doublets or noise from nonspecific staining.
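  • The threshold logic in the tumor-cell example can be sketched as a small Python check. The marker score itself is passed in as a number because Equation 5 is not reproduced here. The function name `is_anchor` and its argument layout are assumptions for illustration, while the default values 0.5, 0.7 and 0.3 come from the example above.

```python
def is_anchor(marker_score, expr_prob, high_markers, low_markers,
              score_threshold=0.5, high_threshold=0.7, low_threshold=0.3):
    """Return True when a cell passes all three anchor-cell checks.

    marker_score : precomputed score for the candidate cell type
    expr_prob    : dict mapping marker name to expression probability
    high_markers : markers expected to be expressed for the cell type
    low_markers  : markers expected not to be expressed for the cell type
    """
    if marker_score < score_threshold:
        return False
    if any(expr_prob[m] < high_threshold for m in high_markers):
        return False
    if any(expr_prob[m] > low_threshold for m in low_markers):
        return False
    return True


# Hypothetical cell: strong cytokeratin, weak everything else.
cell = {"Cytokeratin": 0.85, "CD31": 0.10, "CD3": 0.20}
tumor_anchor = is_anchor(0.6, cell, ["Cytokeratin"], ["CD31", "CD3"])
```

A doublet that also stains for CD3 at, say, 0.4 would fail the low-marker check and would be left for the spatial step rather than being anchored.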
  • For the cells whose marker expression probability profile is ambiguous (non-anchor cells), CELESTA is designed to maximize the joint probability distribution using a Markov Random Field (MRF) 47 that includes a spatial-scoring function component accounting for cell spatial information together with a marker-scoring function component accounting for the marker expression profile.
  • Spatial-scoring function: We use the Potts model energy function, where N is the number of nearest spatial neighboring cells of cell i, based on the cells’ X and Y coordinates obtained from the image. Each time a neighbor cell j has cell type k, the energy function is increased by one for cell type k.
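  • The counting rule just described (add one to cell type k for every neighbor currently assigned to k) can be sketched in Python as follows. This is only an illustration of the neighbor tally: it uses a brute-force Euclidean search instead of the spdep R package named later in this description, it does not include any weighting parameter, and the function names are hypothetical.

```python
import math
from collections import Counter


def nearest_neighbors(coords, i, n):
    """Indices of the n cells closest to cell i (brute-force search)."""
    xi, yi = coords[i]
    distances = [(math.hypot(x - xi, y - yi), j)
                 for j, (x, y) in enumerate(coords) if j != i]
    distances.sort()
    return [j for _, j in distances[:n]]


def neighbor_type_counts(coords, cell_types, i, n):
    """Tally, per cell type, how many of cell i's n nearest neighbors
    currently carry that type; unassigned neighbors (None) add nothing."""
    counts = Counter()
    for j in nearest_neighbors(coords, i, n):
        if cell_types[j] is not None:
            counts[cell_types[j]] += 1
    return counts


# Hypothetical XY coordinates and current assignments (None = not yet assigned).
coords = [(0, 0), (1, 0), (0, 1), (5, 5), (1, 1)]
cell_types = [None, "tumor", "tumor", "T cell", "endothelial"]
counts = neighbor_type_counts(coords, cell_types, 0, 3)
```

For the toy data above, the three nearest neighbors of cell 0 are cells 1, 2 and 4, so the tally favors “tumor” with two votes over “endothelial” with one.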
  • this probability threshold should be set higher than 0.25. We recommend that this threshold value be no greater than 0.5; otherwise it could result in too many unassigned cells. If the cell type probabilities do not pass the threshold, then no cell type is assigned for that cell in the current iteration and the cell is carried over to the next iteration; as more cell types are assigned, that cell may gain neighborhood information. After each iteration, we update the cell-type-signature matrix, β, and the neighborhood cell types based on the newly assigned cells. The algorithm converges when the percentage of additionally assigned cells is smaller than a user-defined threshold. The default convergence threshold is 1%. After convergence, a cell is assigned to the “unknown” category if it has not been assigned a cell type.
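  • The control flow of this iteration (assign when a threshold is passed, carry the cell over otherwise, stop when too few new cells are assigned, label the rest “unknown”) might look like the Python sketch below. To keep it self-contained, the cell type probability is replaced by a deliberately crude stand-in, namely the fraction of a cell's neighbors already carrying the most common type. The actual method combines marker and spatial scores, so only the loop structure should be read from this example. All names are hypothetical.

```python
def iterate_assignments(neighbors, initial_types, prob_threshold=0.5,
                        convergence_threshold=0.01, max_iteration=10):
    """Iteratively assign unassigned cells (None) from their neighbors.

    neighbors[i] lists the indices of the neighbors of cell i.
    """
    types = list(initial_types)
    total = len(types)
    if total == 0:
        return []
    for _ in range(max_iteration):
        newly_assigned = {}
        for i, current in enumerate(types):
            if current is not None:
                continue
            labelled = [types[j] for j in neighbors[i] if types[j] is not None]
            if not labelled:
                continue  # no neighborhood information yet: carry over
            best = max(sorted(set(labelled)), key=labelled.count)
            probability = labelled.count(best) / len(neighbors[i])
            if probability >= prob_threshold:
                newly_assigned[i] = best
        # Apply all assignments at once so that every cell in this
        # iteration saw the same neighborhood state.
        for i, cell_type in newly_assigned.items():
            types[i] = cell_type
        if len(newly_assigned) / total < convergence_threshold:
            break
    return [t if t is not None else "unknown" for t in types]


# Hypothetical chain of cells with one anchor cell and one isolated cell.
neighbors = [[1], [0, 2], [1, 3], [2], []]
result = iterate_assignments(neighbors, ["tumor", None, None, None, None])
```

In the toy chain, the label spreads by one cell per iteration, which is why a cell skipped early can still be assigned later once its neighbors are resolved; the isolated cell never gains information and ends up as “unknown”.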
  • the remaining tissue was placed on ice and in 50 µl tissue digestion media, DMEM-F12+ with magnesium and calcium (Corning Cellgro, Manassas, VA), 1% FBS (heat inactivated), 10 units/ml Penicillin-10 µg/ml Streptomycin (Gibco, Grand Island, NY), 25 mM HEPES (Gibco, Grand Island, NY).
  • Sections were equilibrated in S2 [61 mM NaH2PO4·7H2O (Sigma), 39 mM NaH2PO4 (Sigma) and 250 mM NaCl (Sigma) in a 1:0.7 v/v solution of S1 and doubly-distilled H2O (ddH2O), with a final pH of 6.8-7.0] for 10 minutes, and placed in blocking buffer for 30 minutes. All steps followed the Akoya CODEX instructions.
  • Each tissue was imaged with a 20x objective in a 7x9 tiled acquisition at 1386x1008 pixels per tile, 396 nm/pixel resolution, and 13 z-planes per tile (axial resolution 1500 nm). Images with the best focus were chosen from the z-planes and were subjected to deconvolution to remove out-of-focus light. Acquired images were pre-processed (alignment and deconvolution with Microvolution software, http://www.microvolution.com/) and segmented (including lateral bleed compensation) using the publicly available CODEX image processing pipeline at github.com/nolanlab/CODEX.
  • CELESTA performance on the HNSCC cohort was assessed manually by mapping CELESTA-assigned cell types onto the original images, using the X and Y coordinates, with the ImageJ plugin from github.com/nolanlab/CODEX (FIG. 12). For each cell type, CELESTA-assigned cells were plotted as yellow crosses on the canonical marker staining images. Marker staining was shown as white signals on a black background. Key marker staining for each cell type is illustrated in FIG. 12. Assessment for each cell was defined by positive canonical marker signals for that cell type.
  • Manual gating of HNSCC cohort
  • the segmented dataset was uploaded onto the Cytobank analysis platform and transformed with an inverse hyperbolic sine (cofactor of 5).
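  • For reference, an inverse hyperbolic sine transform with a cofactor of 5 maps each raw value x to asinh(x / 5). The short Python sketch below shows that arithmetic; the function name is hypothetical and the input values are made up.

```python
import math


def arcsinh_transform(values, cofactor=5.0):
    """Apply asinh(x / cofactor) to every value in the sequence."""
    return [math.asinh(v / cofactor) for v in values]


# Made-up raw intensities spanning several orders of magnitude.
transformed = arcsinh_transform([0.0, 5.0, 500.0])
```

The transform is close to linear near zero and close to logarithmic for large values, which is why it is commonly applied to intensity data before gating.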
  • the gating strategy was as follows: Cells were defined by DRAQ5 nuclear expression and size, followed by endothelial (CD31+) and malignant cells (Cytokeratin+). CD4+ T cells (CD4+ CD8- CD3+ CD31- Cytokeratin-), CD8+ T cells (CD8+ CD4- CD3+ CD31- Cytokeratin-) and T regulatory cells (FOXP3+ CD25+ CD4+ CD8- CD3+ CD31- Cytokeratin-) were defined. To adjust for variability between sample image collections, each gate was tailored to the individual sample.
  • CLQ co-location quotient
  • Tumor tissue was thoroughly minced with a sterile scalpel and placed in a gentleMACS C-tube (Miltenyi Biotec, Sunnyvale, CA) containing 1.5 ml of tissue digestion media. Tissue was mechanically digested on the gentleMACS dissociator five times under the human tumor tissue program h_tumor_01. Tissue was filtered with a 40 µm nylon cell strainer (Falcon, Corning, NY) into a 14 ml tube filled up to 14 ml with tissue digestion media and spun at 4°C for 10 min at 514 RCF. The mechanically digested cell pellet was re-suspended for 2 minutes on ice in 1-4 ml of ACK lysis buffer
  • the solid tissue in the C-tube was incubated at 37°C on a rotator for 1 hour, then filtered with a 40 µm nylon cell strainer (Falcon, Corning, NY) into a 14 ml tube filled up to 14 ml with tissue digestion media and spun at 4°C for 10 min at 514 RCF.
  • the enzymatically digested cell pellet was re-suspended for 2 minutes on ice in 1-4 ml of ACK lysis buffer (Gibco, Grand Island, NY), depending on the pellet size and the number of red blood cells present.
  • FACS buffer [Phosphate Buffered Saline without calcium or magnesium (Corning, Manassas, VA), 2% FBS heat inactivated, 10 units/ml Penicillin-10 µg/ml Streptomycin (Gibco, Grand Island, NY), 1 mM Ultra pure EDTA (Invitrogen, Carlsbad, CA)] and spun at 4°C for 10 min at 514 RCF.
  • Cells were re-suspended in FACS buffer, counted on a hemacytometer and washed one more time with FACS buffer. Cells were kept in FACS buffer on ice until flow cytometry staining. Sorting panel is shown in Table 5.
  • RNA and library preparations were performed according to the 10x Genomics v2.0 handbook. Single cells were obtained from tissue dissociation. Cells were stained with DAPI for live/dead detection and sorted for up to 500,000 live cells on a BD Aria II. Cells were counted after sorting and right before 10x chip preparation. 10x/Abseq (BD Biosciences) followed the same protocol as the 10x Genomics samples except for the addition of FcBlock and Abseq antibody staining according to the manufacturer’s handbook. Reads were aligned using CellRanger. Preprocessing, data normalization and batch correction were done following the Seurat SCTransform integration pipeline. Cells were clustered by shared nearest neighbor modularity optimization. Cell types present were identified with canonical markers.
  • Formalin-fixed paraffin-embedded tissue blocks of HNSCC from 79 patients were pulled from the Stanford Health Care Department of Pathology archives. The area of malignancy was marked by a board-certified pathologist (C.S.K.). A tissue microarray (TMA) was constructed from 1 mm diameter cores punched from the tissue blocks. 4 µm thick sections were stained with hematoxylin and eosin, FOXP3 (clone 236A/E7, 1:100 dilution; Leica BOND epitope retrieval solution 2) and cytokeratin mix (AE1/AE3, 1:75 dilution & CAM5.2, 1:25 dilution; Ventana Ultra; protease retrieval).
  • the slides were digitized using Leica whole slide scanner with 40x magnification. Three samples with unknown nodal status were excluded from analysis.
  • the “ground truth” was defined using the published annotations.
  • the “ground truth” is defined using manual gating.
  • cell types with fewer than 5 cells in a sample region in the annotations were excluded.
  • True positives (TP) is the number of cells assigned by both CELESTA and the ground truth benchmark.
  • False positives (FP) is the number of cells assigned by CELESTA but not by the ground truth benchmark.
  • False negatives (FN) is the number of cells assigned by the benchmark but not by CELESTA.
  • True negatives (TN) is the number of cells assigned by neither CELESTA nor the benchmark.
  • ARI adjusted rand index
  • Precision is defined as TP/(TP+FP).
  • Recall is defined as TP/(TP+FN).
  • F1 score is defined as 2 × (precision × recall)/(precision + recall).
  • Rand index, used to measure accuracy, is defined as (TP+TN)/(TP+TN+FP+FN).
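  • These definitions translate directly into code. The Python sketch below counts TP, FP, FN and TN for one cell type from two label lists and then applies the formulas as stated. The function names and the toy labels are hypothetical, and returning 0.0 when a denominator is zero is a convention chosen here, not one stated in this description.

```python
def confusion_counts(predicted, truth, cell_type):
    """Count TP, FP, FN, TN for one cell type across paired labels."""
    tp = fp = fn = tn = 0
    for p, t in zip(predicted, truth):
        if p == cell_type and t == cell_type:
            tp += 1
        elif p == cell_type:
            fp += 1
        elif t == cell_type:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def scores(tp, fp, fn, tn):
    """Precision, recall, F1 and Rand index from the four counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    total = tp + tn + fp + fn
    rand_index = (tp + tn) / total if total else 0.0
    return {"precision": precision, "recall": recall,
            "f1": f1, "rand_index": rand_index}


# Toy labels for five cells.
predicted = ["tumor", "tumor", "T cell", "unknown", "tumor"]
truth = ["tumor", "T cell", "T cell", "tumor", "tumor"]
tumor_counts = confusion_counts(predicted, truth, "tumor")
tumor_scores = scores(*tumor_counts)
```

Because TN usually dominates when a cell type is rare, the Rand index can stay high even when precision and F1 are modest, which matches the pattern of values reported for rare populations.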
  • Parts of FIGS. 5 and 6 were created using the Biorender online tool (biorender.com). Multi-channel overlay images were created using ImageJ.
  • FoxP3 EGFP mice 48 were acquired from Jackson (006772) and bred at Stanford University. Splenocytes were harvested from tumor-naive female FoxP3 EGFP mice. All studies were performed in female mice between 7 and 9 weeks of age. Mice were housed in facilities maintained at temperatures between 65 and 75 degrees Fahrenheit, with humidity between 40 and 60%, and subjected to 12/12 light/dark cycles (7am to 7pm). Spleens were subjected to mechanical dissociation on 70 µm cell strainers and washed with HBSS supplemented with 2% FBS and 2 mM Ethylenediaminetetraacetic acid (EDTA) (HBSSFE).
  • Tregs were cultured in RPMI-1640 supplemented with 10% FBS, 2 mM L-glutamine, 15 mM HEPES, 14.3 mM 2-mercaptoethanol, 1 mM Sodium Pyruvate, 1× MEM Non-Essential Amino Acids Solution, and 300 IU hIL-2 (Peprotech) for 72 hours.
  • Tumor cell line suspensions were prepared by washing with phosphate buffered saline (PBS) followed by treatment with StemPro Accutase (Thermo, A1110501). 10^5 tumor cells were plated in the bottom chamber of the 24-well transwell plates 24 hours prior to the assay. 5 µm transwell membranes (Costar, 3421) were incubated in complete RPMI for 24 hours prior to the assay. Membranes were transferred to the tumor-containing wells and suspensions of 5×10^4 Tregs were added to the top chambers of the transwells. Cells were cultured for 2 hours at 37°C in 5% CO2, after which the membranes were removed, and cells from the bottom chamber were processed for analysis by flow cytometry.
  • mice housed in our facility at Stanford.
  • B16-F0 or LN6-987AL tumor cells were washed with PBS and dissociated from tissue culture plastic with StemPro Accutase (Thermo, A1110501).
  • Cell suspensions of 2×10^5 cells in phenol-red free DMEM were injected into the subcutaneous region of the left flank of nine-week-old female mice (Jackson, 000664) following removal of fur with surgical clippers. After 15 days of tumor growth, mice were euthanized and their tumors were processed for analysis by flow cytometry.
  • Tumors were weighed, followed by digestion in RPMI-1640 supplemented with 4 mg/mL Collagenase Type 4 (Worthington, LS004188) and 0.1 mg/mL Deoxyribonuclease I (DNAse I, Sigma, DN25) at 37°C for 20 minutes with agitation. Tumors were then dissociated on 70 µm strainers, washed with HBSSFE, and stained for viability using LIVE/DEAD Fixable Blue Dead Cell Stain (Thermo, L34962).
  • LN6-987AL cells were prepared as above and injected into seven-week-old FoxP3 EGFP mice. Mice were treated with AMG487 (R&D Systems, 4487) at 5mg/kg every 48 hrs starting on day one following tumor implantation. After 9 days of tumor growth, mice were euthanized and their tumors were processed for analysis by flow cytometry (BD FACS Diva 8.0.2) as described above.
  • MIBI Multiplexed ion beam imaging
  • Table 1 An example of the initial cell-type signature matrix used in CELESTA based on the CODEX panel used for the colorectal cancer FFPE samples (Schurch et al. 2020).
  • DC dendritic cell.
  • CK cytokeratin.
  • PDPN podoplanin.
  • Table 2 Staging information of head and neck squamous cell carcinoma (HNSCC) samples included in the study.
  • HNSCC head and neck squamous cell carcinoma
  • Table 3 OCT CODEX panel of head and neck squamous cell carcinoma study.
  • Table 4 An example of initial cell-type signature matrix used in CELESTA based on the OCT CODEX panel in the head and neck squamous cell carcinoma study.
  • pDC plasmacytoid dendritic cell.
  • cDC conventional dendritic cell.
  • CK cytokeratin.
  • An exemplary CELESTA program is provided at github.com/plevritis-lab/CELESTA. This version of the CELESTA program currently relies on the following R packages:
    - Rmixmod: for performing Gaussian mixture modeling
    - spdep: for obtaining spatial neighborhood information
    - zeallot: for R code styling; provides a %<-% operator to perform multiple, unpacking, and destructuring assignment in R
    - ggplot2 and reshape2: for plotting
  • The program uses two inputs:
  • Segmented imaging data a dataframe with rows as the cells with (1) two columns named X and Y to define the XY coordinates of the cells and (2) other columns having the protein marker expressions for each cell.
  • the first column has the cell types to be inferred
  • the second column has the lineage information for each cell type.
  • the lineage information has three numbers connected by “_” (underscore).
  • the first number indicates the round. Cell types with the same lineage level are inferred in the same round. An increasing number indicates increased cell-type resolution. For example, immune cells -> CD3+ T cells -> CD4+ T cells.
  • the third number is a number assigned to the cell type, i.e., the cell type number.
  • the middle number tells the previous lineage cell type number for the current cell type. For example, the middle number for CD3+ T cells is 5, because it is a subtype of immune cells which have cell type number assigned to 5.
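  • A small Python sketch of how such a lineage string could be parsed is given below. The strings "1_0_5" and "2_5_6" are invented examples: only the parent number 5 for CD3+ T cells comes from the text above, while the use of 0 as the parent of a first-round cell type and the number 6 are assumptions. The names `Lineage`, `parse_lineage` and `children_of` are hypothetical.

```python
from typing import NamedTuple


class Lineage(NamedTuple):
    round_number: int   # first number: round in which the type is inferred
    parent: int         # middle number: cell type number of the parent lineage
    cell_type: int      # third number: number assigned to this cell type


def parse_lineage(text):
    """Parse a 'round_parent_celltype' string into a Lineage tuple."""
    parts = text.strip().split("_")
    if len(parts) != 3:
        raise ValueError(f"expected three numbers joined by '_': {text!r}")
    round_number, parent, cell_type = (int(p) for p in parts)
    return Lineage(round_number, parent, cell_type)


def children_of(lineages, parent_number):
    """Names of the cell types whose middle number equals parent_number."""
    return sorted(name for name, lin in lineages.items()
                  if lin.parent == parent_number)


# Invented lineage strings for two of the cell types mentioned above.
lineages = {
    "immune cells": parse_lineage("1_0_5"),
    "CD3+ T cells": parse_lineage("2_5_6"),
}
subtypes = children_of(lineages, 5)
```

Reading the middle number as a pointer to the parent makes the round structure explicit: a cell is only considered for "CD3+ T cells" in round 2 if it was resolved as cell type 5 in an earlier round.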
  • each column is a protein marker. If the protein marker is known to be expressed for that cell type, then it is denoted by “1”. If the protein marker is known not to be expressed for a cell type, then it is denoted by “0”. If the protein marker is irrelevant or its expression is uncertain for a cell type, then it is denoted by “NA”.
  • Pre-saved imaging data is taken from reg009 (e.g., using published CODEX data, Schurch et al. (2020) Cell 182(5):1341-1359; herein incorporated by reference in its entirety).
  • cells in which every marker has an expression probability higher than 0.9 are filtered out.
  • cells in which every marker has an expression probability lower than 0.4 are filtered out.
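  • The two filtering rules can be expressed as a single predicate. The Python sketch below assumes each cell is represented as a mapping from marker name to expression probability; the function name `keep_cell` and the example cells are hypothetical, while the cutoffs 0.9 and 0.4 are the ones stated above.

```python
def keep_cell(expr_prob, high_cutoff=0.9, low_cutoff=0.4):
    """Return False for cells where every marker is above high_cutoff
    or every marker is below low_cutoff; True otherwise."""
    values = list(expr_prob.values())
    all_high = all(v > high_cutoff for v in values)
    all_low = all(v < low_cutoff for v in values)
    return not (all_high or all_low)


# Hypothetical cells.
cells = {
    "bright_everywhere": {"CD31": 0.95, "Cytokeratin": 0.97, "CD3": 0.93},
    "dim_everywhere":    {"CD31": 0.10, "Cytokeratin": 0.20, "CD3": 0.05},
    "tumor_like":        {"CD31": 0.10, "Cytokeratin": 0.92, "CD3": 0.15},
}
kept = sorted(name for name, probs in cells.items() if keep_cell(probs))
```

A cell that is bright in every channel is more likely an artifact than a real phenotype, and a cell that is dim in every channel carries no marker evidence, so both are set aside before assignment.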
  • max_iteration is used to define the maximum number of iterations allowed in the EM algorithm per round.
  • cell_change_threshold is a user-defined ending condition for the EM algorithm.
  • 0.01 means that the algorithm will stop when fewer than 1% of the total number of cells change identity.
  • the cell_number_to_use corresponds to the defined numbers in the prior cell-type signature matrix.
  • 1 corresponds to endothelial cell
  • 2 corresponds to tumor cell
  • After running the AssignCells() function, CELESTA will output a .csv file with the cell type assignment for each cell for each round and the final combined cell types.
  • CELESTA can also plot the assigned cells by using the PlotCellsAnyCombination() function.
  • The AssignCells() function requires four vectors to define the high and low thresholds for each cell type. The length of each vector equals the total number of cell types defined in the cell-type signature matrix. Examples of the thresholds are provided under the folder: data.
  • Two vectors are required for defining the “high expression threshold”, one for anchor cells and one for index cells.
  • the thresholds define how high the marker expression probability must be in order for the marker to be considered expressed.
  • the PlotExpProb() function can be applied.
  • Because the segmented data, which are the inputs to CELESTA, may have some compensation in their values, the expression probabilities are calculated based on the segmented data. It is useful to compare the expression probabilities with the CODEX staining for each marker. For example, for endothelial cells, if we plot the expression probabilities of CD31 (left) and compare them with the CD31 staining, approximately 0.9 and 0.8 would be the right thresholds for defining how much the cell should express CD31.
  • anchor cells use a slightly higher threshold than index cells.
  • Two vectors are required for defining the “low marker threshold”, one for anchor cells and one for index cells.
  • the thresholds define how low the marker expression probability must be in order for the marker to be considered not expressed. Normally 1 is assigned to this value unless there are a lot of doublets or co-staining in the data.

Abstract

Methods, systems, and devices, including computer programs encoded on a computer storage medium, for identifying cell types of individual cells within a tissue using cell marker expression and spatial information. In particular, a machine learning algorithm, called “CELESTA”, is provided that automates the identification of cell types in multiplexed in situ images.
PCT/US2022/015819 2021-02-09 2022-02-09 Identification of cell types in multiplexed in situ images by combining expression profiles and spatial information WO2022173828A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202163147596P 2021-02-09 2021-02-09
US63/147,596 2021-02-09

Publications (1)

Publication Number Publication Date
WO2022173828A1 true WO2022173828A1 (fr) 2022-08-18

Family

ID=82837895

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2022/015819 WO2022173828A1 (fr) 2021-02-09 2022-02-09 Identification of cell types in multiplexed in situ images by combining expression profiles and spatial information

Country Status (1)

Country Link
WO (1) WO2022173828A1 (fr)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115346599A (zh) * 2022-10-19 2022-11-15 West China Hospital, Sichuan University H&E image gene and cell heterogeneity prediction method, system and storage medium
WO2024083853A1 (fr) * 2022-10-19 2024-04-25 F. Hoffmann-La Roche Ag Anomaly detection in a sample image

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020160044A1 (fr) * 2019-01-28 2020-08-06 The Broad Institute, Inc. In-situ spatial transcriptomics
US20210279866A1 (en) * 2020-03-06 2021-09-09 Bostongene Corporation Machine learning image processing techniques

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
PETTIT JEAN-BAPTISTE, TOMER RAJU, ACHIM KAIA, RICHARDSON SYLVIA, AZIZI LAMIAE, MARIONI JOHN: "Identifying Cell Types from Spatially Referenced Single-Cell Expression Datasets", PLOS COMPUTATIONAL BIOLOGY, vol. 10, no. 9, 25 September 2014 (2014-09-25), XP055963412, DOI: 10.1371/journal.pcbi.1003824 *
YURY GOLTSEV, NIKOLAY SAMUSIK, JULIA KENNEDY-DARLING, SALIL BHATE, MATTHEW HALE, GUSTAVO VAZQUEZ, SARAH BLACK, GARRY P. NOLAN: "Deep Profiling of Mouse Splenic Architecture with CODEX Multiplexed Imaging", CELL, ELSEVIER, AMSTERDAM NL, vol. 174, no. 4, 1 August 2018 (2018-08-01), Amsterdam NL , pages 968 - 981.e15, XP055668657, ISSN: 0092-8674, DOI: 10.1016/j.cell.2018.07.010 *

Similar Documents

Publication Publication Date Title
Crosby et al. Early detection of cancer
Bergholtz et al. Best practices for spatial profiling for breast cancer research with the GeoMx® digital spatial profiler
Janesick et al. High resolution mapping of the breast cancer tumor microenvironment using integrated single cell, spatial and in situ analysis of FFPE tissue
Gazzaniga et al. Prognostic value of circulating tumor cells in nonmuscle invasive bladder cancer: a CellSearch analysis
Feuchtinger et al. Image analysis of immunohistochemistry is superior to visual scoring as shown for patient outcome of esophageal adenocarcinoma
US9778263B2 (en) Quantitative in situ characterization of biological samples
WO2022173828A1 (en) Identification of cell types in multiplexed in situ images by combining expression profiles and spatial information
Xu et al. Quantum dot-based, quantitative, and multiplexed assay for tissue staining
Godin et al. A novel approach for quantifying cancer cells showing hybrid epithelial/mesenchymal states in large series of tissue samples: towards a new prognostic marker
Nair et al. An observational study of circulating tumor cells and 18F-FDG PET uptake in patients with treatment-naive non-small cell lung cancer
Le Du et al. EpCAM-independent isolation of circulating tumor cells with epithelial-to-mesenchymal transition and cancer stem cell phenotypes using ApoStream® in patients with breast cancer treated with primary systemic therapy
EP3198278B1 (en) Circulating tumor cell-based diagnostics for identifying resistance to androgen receptor targeted therapies
US10539565B2 (en) Methods for determining prognosis of colorectal cancer
Saunders et al. Individual patient oesophageal cancer 3D models for tailored treatment
Rozova et al. Machine learning reveals mesenchymal breast carcinoma cell adaptation in response to matrix stiffness
Mezheyeuski et al. An immune score reflecting pro-and anti-tumoural balance of tumour microenvironment has major prognostic impact and predicts immunotherapy response in solid cancers
Senosain et al. HLA-DR cancer cells expression correlates with T cell infiltration and is enriched in lung adenocarcinoma with indolent behavior
Patkulkar et al. Mapping spatiotemporal heterogeneity in tumor profiles by integrating high-throughput imaging and omics analysis
Camacho et al. Improved demonstration of immunohistochemical prognostic markers for survival in follicular lymphoma cells
Ivanova et al. Empowering renal cancer management with AI and digital pathology: Pathology, diagnostics and prognosis
Yeo et al. Accurate isolation and detection of circulating tumor cells using enrichment-free multiparametric high resolution imaging
Smolkova et al. Liquid biopsy and preclinical tools for advancing diagnosis and treatment of patients with pancreatic neuroendocrine neoplasms
Lee et al. A novel 3D pillar/well array platform using patient-derived head and neck tumor to predict the individual radioresponse
Lin et al. Multi-modal digital pathology for colorectal cancer diagnosis by high-plex immunofluorescence imaging and traditional histology of the same tissue section
Brouwer et al. HER-2 status of circulating tumor cells in a metastatic breast cancer cohort: A comparative study on characterization techniques

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22753265

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 22753265

Country of ref document: EP

Kind code of ref document: A1