WO2020086992A1 - Highly multiplexed fluorescence in situ hybridization (fish) platform for gene copy number evaluation - Google Patents

Info

Publication number
WO2020086992A1
WO2020086992A1 PCT/US2019/058126 US2019058126W WO2020086992A1 WO 2020086992 A1 WO2020086992 A1 WO 2020086992A1 US 2019058126 W US2019058126 W US 2019058126W WO 2020086992 A1 WO2020086992 A1 WO 2020086992A1
Authority
WO
WIPO (PCT)
Prior art keywords
fluorophores
sample
location
gene
image
Prior art date
Application number
PCT/US2019/058126
Other languages
French (fr)
Inventor
Anthony John IAFRATE
Maristela Lika ONOZATO
Hunter Lee ELLIOT
Clarence YAPP
Original Assignee
The General Hospital Corporation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by The General Hospital Corporation
Publication of WO2020086992A1

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6813: Hybridisation assays
    • C12Q1/6841: In situ hybridisation

Definitions

  • The present disclosure relates to methods and reagents for evaluating gene copy number, in particular using fluorescence in situ hybridization (FISH).
  • FISH: fluorescence in situ hybridization
  • CNAs: gene copy number alterations
  • The inventors have developed a clinical grade high-throughput DNA-FISH platform that enables automated single gene CNA analysis of a large panel of genes.
  • This method can generally be used in slides containing cells whose DNA has been morphologically preserved, such as formalin-fixed paraffin embedded (FFPE) tumor biopsy samples and in isolated circulating tumor cells (CTCs).
  • FFPE: formalin-fixed paraffin embedded
  • CTCs: isolated circulating tumor cells
  • The platform may include two parts: (1) custom multiplex fluorescent DNA probes, and (2) custom software that identifies gene copy number from the captured fluorescent images.
  • The inventors have designed bacterial artificial chromosome (BAC) clone and PCR-based or synthetic DNA probes with combinatorial fluorescent labels, such that at least (but not limited to) 35 genes can each be "barcoded" with a unique fluorophore combination using at least (but not limited to) 6 fluorophore colors with 2, 3, or more colors co-labeling per gene.
  • BAC: bacterial artificial chromosome
  • The invention includes a clinical grade high-throughput DNA-FISH platform that enables the automated copy number analysis of a large panel of genes mainly in (but not limited to) formalin-fixed paraffin embedded (FFPE) tumor biopsy samples and in isolated circulating tumor cells (CTCs).
  • This quantitative single-slide assay platform utilizes a library of 35 locus-specific DNA sequence probes labeled with combinations of 6 fluorophores. Each gene is "bar-coded" with a unique combination of fluorophores (multiplexed). Multiplexed slide datasets are captured, the profile for each fluorophore is identified by multispectral analysis with linear unmixing or from datasets acquired with specific fluorescence filters, and the custom-built software quantifies the number of copies of each gene.
  • DNA fluorescence in situ hybridization is the gold standard method to detect copy number alterations, but it is limited by the number of genes one can quantify simultaneously.
  • The inventors disclose herein a fluorescent "barcode" system for the unique labeling of dozens of genes and an automated image analysis algorithm that enables their simultaneous hybridization for the quantification of gene copy numbers. The reliability of this multiplex approach is demonstrated on normal human lymphocytes, metaphase spreads of transformed cell lines, and cultured circulating tumor cells.
  • The invention provides a method for multiplex labeling of a sample and gene copy number evaluation, including: providing a plurality of fluorescently-labeled polynucleotide probes, each of the plurality of fluorescently-labeled polynucleotide probes being directed to a different polynucleotide and being labeled with a distinct combination of fluorophores selected from a plurality of fluorophores; applying the plurality of fluorescently-labeled polynucleotide probes to a sample; obtaining an image of the sample including emissions from the plurality of fluorophores; analyzing the image to identify a location within the sample having a group of fluorophores of the plurality of fluorophores associated with the location; identifying a gene associated with the location based on identifying the location within the sample having the group of fluorophores; and determining a copy number of the identified gene.
  • the invention provides an apparatus for multiplex labeling of a sample and gene copy number evaluation, including: a processor in communication with an imaging system, the processor to: obtain an image of a sample from the imaging system, the image including emissions from a plurality of fluorophores associated with the sample, and the sample including a plurality of fluorescently-labeled polynucleotide probes applied to the sample, each of the plurality of fluorescently-labeled polynucleotide probes being directed to a different polynucleotide and being labeled with a distinct combination of fluorophores selected from the plurality of fluorophores; analyze the image to identify a location within the sample having a group of fluorophores of the plurality of fluorophores associated with the location; identify a gene associated with the location based on identifying the location within the sample having the group of fluorophores; and determine a copy number of the identified gene.
  • FIGS. 1A-1H show combinations of fluorophores used to barcode each gene.
  • FIG. 1A shows multiplex FISH probe mixes that were constructed with the goal of "barcoding" each gene probe with a unique combination of two (left) or three (right) fluorophores.
  • FIG. 1B shows a five-plex probe hybridization assay (PDGFRA, MET, EGFR,
  • FIGS. 1C-1H show efficient fluorophore incorporation and a robust probe specific activity that were demonstrated by high signal-to-noise ratio in testing. Co-localization of fluorophores was evident, and reflected the expected pre-experimental labeling plan.
  • FIGS. 2A-2C show a probe labeling schema or 'Probe Matrix'.
  • FIG. 2A shows a probe matrix for five genes.
  • FIG. 2B shows a probe matrix for ten genes.
  • FIG. 2C shows a probe matrix for fifteen genes. Each column shows the respective color combinations for each gene. Labeling of each gene was carried out by nick translation separately and the products were then combined in a single tube and concentrated to produce the probe mix.
  • The schema is used by the image analysis algorithm to identify colocalized signals that are considered statistically significant and to assign the corresponding gene.
  • FIGS. 3A-3C show a workflow for quantifying gene count.
  • FIG. 3A shows a volumetric rendering of a circulating tumor cell hybridized with 15 genes.
  • FIG. 3B shows nuclear masking, which eliminates false positive point source detections outside of the nucleus, and point source detection based on fitting a 3D Gaussian model. Maxima of each channel are shown.
  • FIG. 3C shows how coincident spots that are located within a predefined radius are matched to genes based on a label matrix.
  • FIGS. 4A-4G show gene quantification of a 10 gene probe mix.
  • FIG. 4A shows a confocal image of circulating tumor cell hybridized with 10 gene probe mix.
  • FIG. 4B shows a representative copy number analysis of normal lymphocytes.
  • FIG. 4C shows a representative copy number analysis of cultured circulating tumor cell BRX-7.
  • FIG. 4D shows a representative copy number analysis of cultured circulating tumor cell BRX-42.
  • FIG. 4E shows a representative copy number analysis of cultured circulating tumor cell BRX-61.
  • FIG. 4F shows a representative copy number analysis of tumor cell line H1975.
  • FIG. 4G shows a representative copy number analysis of tumor cell line GBM18.
  • FIGS. 5A-5C show gene quantification of a 15 gene probe mix.
  • FIG. 5A shows a confocal image of a circulating tumor cell (BRX-68) hybridized with the 15 gene probe mix.
  • FIG. 5B shows a representative image of copy number analysis of cultured circulating tumor cells for BRX-68.
  • FIG. 5C shows a representative image of copy number analysis of cultured circulating tumor cells for BRX-82.
  • Cells were hybridized with a probe mix containing 15 genes bar-coded with two fluorophores. Presence of noise due to non-homogeneous nature of the nucleus resulted in false negative calls for NMYC and CDK4 in (FIG. 5B) BRX-68 and (FIG.
  • FIGS. 6A-6D show parameters collected with the confocal microscope that helped to build the image analysis platform and that will allow image acquisition to be shifted to a widefield optical system in order to decrease turnaround time.
  • FIG. 6A shows a side projection of serial optical sections that were acquired by confocal microscopy and visualized as three-dimensional renderings to aid in the development of the image analysis algorithm. Side projections were used to determine a suitable axial step size to adequately sample neighboring spots in widefield microscopy.
  • FIG. 6B shows an upper view of the serial optical sections of FIG. 6A.
  • FIG. 6D shows an upper view of the serial optical sections of FIG. 6C. The higher throughput of the widefield scope is more suited for a clinical setting.
  • FIGS. 7A-7B show color shift correction.
  • FIG. 7A shows color shift correction in the lateral dimension.
  • FIG. 7B shows color shift correction in the axial directions.
  • Channels showed significant shift in the (FIG. 7A) lateral and (FIG. 7B) axial dimensions.
  • Each fluorophore was registered to the reference channel, Aqua 431. Transformations were
  • FIG. 8 shows highly amplified gene quantification. Volume rendering of a H1975 cell with spots and large blobs (indicating high amplification of the MYC gene) coinciding in the Green 496 and Red 650 channel.
  • The detection algorithm is able to detect single copies of genes from spots and highly amplified genes in the form of blobs, each marked and visualized with spheres. Single spots are marked with a single sphere, while blobs have multiple overlaid spheres. The number of spheres for each blob corresponds to the ratio of the blob volume to the volume of an average spot, as an estimate of the extent to which that gene is amplified.
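The volume-ratio estimate described above can be sketched as follows. This is a minimal illustration, not the inventors' implementation: the function name, its parameters, the rounding to the nearest integer, and the floor of one copy are all assumptions made for the example.

```python
def estimate_blob_copies(blob_volume, mean_spot_volume):
    """Estimate copies in a blob as blob volume / average single-spot volume.

    Both volumes must use the same unit (for example cubic micrometres).
    Rounding and the minimum of one copy are illustrative assumptions.
    """
    if mean_spot_volume <= 0:
        raise ValueError("mean_spot_volume must be positive")
    return max(1, round(blob_volume / mean_spot_volume))


# A blob eight times the volume of an average spot is drawn with 8 spheres.
spheres = estimate_blob_copies(blob_volume=12.0, mean_spot_volume=1.5)
print(spheres)
```

A detected object smaller than an average spot still counts as one copy under this sketch, which mirrors the single sphere used for single spots.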
  • FIG. 9A shows multicolor fluorescence in situ hybridization (multiplex-FISH or M-FISH) automated quantification and reference single FISH analysis of 10 genes in the H1975 cell line. Dots indicate the copy number derived from aCGH, as an average of probes spanning the relevant genes.
  • FIG. 9B shows genome copy number summary for aCGH analysis of H1975.
  • The horizontal axis represents the linear position along the chromosome, whereas the vertical axis represents the measured log2 signal ratio (-2 to 2). Relevant gene names are placed in the appropriate genomic positions.
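For comparison with FISH counts, a log2 signal ratio can be converted to an approximate absolute copy number. The sketch below assumes a diploid (two-copy) reference and a sample without normal-cell admixture; neither assumption is stated in the source, and the function name and `reference_ploidy` parameter are illustrative.

```python
def copy_number_from_log2_ratio(log2_ratio, reference_ploidy=2):
    """Convert an aCGH log2(sample/reference) ratio to an approximate copy number.

    Assumes the reference carries `reference_ploidy` copies of the locus and
    that the sample is not diluted by normal cells.
    """
    return reference_ploidy * 2 ** log2_ratio


# The plotted range of -2 to 2 spans roughly 0.5 to 8 copies under these assumptions.
for ratio in (-2, -1, 0, 1, 2):
    print(ratio, copy_number_from_log2_ratio(ratio))
```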
  • FIG. 9C shows a zoomed-in view of chromosome 8, depicting high-level copy number gain for MYC.
  • FIG. 9D shows an M-FISH automated quantification and reference single FISH analysis of 10 genes in the 293T cell line, with dots indicating the copy number derived from aCGH.
  • FIG. 9E shows a genome copy number summary for aCGH analysis of 293T.
  • FIG. 10A shows multiplex FISH quantification and comparison to single FISH and aCGH (dots) for BUML.
  • FIG. 10B shows aCGH genomic profiles corresponding to FIG. 10A, shown as log2 ratios, for BUML.
  • FIG. 10C shows multiplex FISH quantification and comparison to single FISH and aCGH (dots) for UACC62.
  • FIG. 10D shows aCGH genomic profiles corresponding to FIG. 10C, shown as log2 ratios, for UACC62.
  • FIG. 10E shows multiplex FISH quantification and comparison to single FISH and aCGH (dots) for MCF10A.
  • FIG. 10F shows aCGH genomic profiles corresponding to FIG. 10E, shown as log2 ratios, for MCF10A.
  • FIG. 11A shows multiplex FISH quantification for 10 genes in GBM18 and comparison to single FISH and aCGH (dots).
  • FIG. 11B shows aCGH genomic profiles corresponding to FIG. 11A, shown as log2 ratios, for GBM18.
  • FIG. 11C shows multiplex FISH quantification for 15 genes in PC3 cell lines and comparison to single FISH and aCGH (dots).
  • FIG. 11D shows aCGH genomic profiles corresponding to FIG. 11C, shown as log2 ratios, for PC3.
  • FIG. 12 shows reproducibility analysis of multiplex fluorescence in situ hybridization (FISH). Box-and-whisker plots are shown for mean and interquartile range for copy number counts for each replicate experiment.
  • FIG. 13 is a scatterplot of average gene automated copy number counts in two independent hybridizations using the 10-plex probe mix for H1975 (circles) and 293T (triangles) cell lines.
  • FIG. 14A shows tested sequence specific Tail-PCR probe in cultured circulating tumor cells (CTCs).
  • FIG. 14B shows tested sequence specific formalin-fixed paraffin-embedded glioblastoma multiforme (GBM) cancer samples.
  • FIG. 14C shows a table of 10 genes contained in the probe of FIG. 10A.
  • FIG. 15A shows FFPE breast cancer samples prepared with a robotic slide processor.
  • FIG. 15B shows FFPE GBM cancer samples prepared with a robotic slide processor.
  • FIG. 16A shows a formalin fixed paraffin embedded (FFPE) specimen imaged with a Vectra multispectral imaging system.
  • FIG. 16B shows an FFPE specimen imaged with a Pannoramic confocal whole slide image scanner.
  • FIG. 16C shows an FFPE specimen imaged with a CytoVision® platform.
  • FIG. 17A shows a formalin fixed paraffin embedded (FFPE) specimen of a glioblastoma multiforme (GBM) case hybridized with a multiplex probe recognizing 10 genes and submitted to signal quantification.
  • FIG. 17B is a graph quantifying the signals sensed with the probe of FIG. 17A.
  • FIG. 18A shows an FFPE specimen of a GBM case.
  • FIG. 18B shows an FFPE specimen of another GBM case.
  • FIG. 19A shows efficiency of nick translation labeling performed at time points of 8 and 16 hours.
  • FIGS. 19B-19C show efficiency of nick translation labeling performed at time points of 2.5, 3, and 4 hours.
  • FIG. 20A shows a 3-hour nick translation 2% agarose gel showing unlabeled
  • FIG. 20B shows a gel with DNA on the left and DNA labeled with two and three fluorophores.
  • FIG. 21 shows the same data shown as in FIG. 12, but with a fixed y-axis range of
  • FIG. 22A shows accuracy displayed as a deviation plot of mean copy number for the lymphocyte data set.
  • FIG. 22B shows detailed MDM4 data showing copy number counts for each individual replicate, with each dot representing a single-cell count and with mean and SD indicated.
  • FIG. 23A shows a first portion of a table showing reference gene copy number values obtained by single FISH (mean ± SD).
  • FIG. 23B shows a second portion of the table showing reference gene copy number values obtained by single FISH (mean ± SD).
  • FIG. 23C shows a third portion of the table showing reference gene copy number values obtained by single FISH (mean ± SD).
  • FIG. 23D shows a fourth portion of the table showing reference gene copy number values obtained by single FISH (mean ± SD).
  • FIG. 24 shows an example of a system for multiplex labeling of a sample and gene copy number evaluation in accordance with some embodiments of the disclosed subject matter.
  • FIG. 25 shows an example of hardware that can be used to implement a computing device and server in accordance with some embodiments of the disclosed subject matter.
  • Fluorescence in situ hybridization is the gold standard technique for the detection of gene copy number changes including amplifications and deletions critical in the diagnosis and management of cancer.
  • the utility of FISH has been limited by the number of genes that can be evaluated at a time, and its use has diminished in the era of genome-wide approaches such as array comparative genomic hybridization and next generation sequencing. Nonetheless, it is still extremely powerful as it allows absolute copy number quantification at the single cell level in the context of tumor section architecture.
  • the inventors have sought to increase its multiplicity, so one could more rapidly determine copy number alterations among genes believed to play a role in oncogenesis.
  • a multiplex FISH assay might also be expected to have direct clinical applicability in simultaneously assaying many actionable copy number changes in tumor samples. The approach may be especially powerful when samples have a limited number of cells, such as isolated circulating tumor cells. In these settings multiplex FISH would be faster and less expensive than other options such as NGS.
  • the development of a multiplex, locus-specific FISH method employing combinatorial labeling of 15 genes with 2 or 3 fluorophores each is reported herein, although in other embodiments a larger number of genes and fluorophores may be used.
  • Hybridized cells were imaged with a laser-scanning confocal microscope, and the spectral signature of each fluorophore was identified by linear unmixing.
  • the inventors developed a custom automated image analysis pipeline, allowing rapid and accurate gene quantification. The approach was validated by measuring the gene copy number of normal lymphocytes, cultured tumor cell lines, and circulating tumor cells, and comparing these results to known copy numbers obtained via traditional methods.
  • a multiplex FISH (M-FISH) assay can be expected to have direct clinical applicability in simultaneously assaying many actionable copy number changes in tumor samples.
  • the approach may be especially powerful when samples have a limited number of cells, such as isolated circulating tumor cells (CTCs).
  • multiplex FISH would be faster and less expensive than other options, such as NGS.
  • Bacterial artificial chromosome (BAC) clone and PCR-based DNA probes may be provided which have combinatorial fluorescent labels, such that at least (but not limited to) 35 genes can each be "barcoded" with a unique fluorophore combination using at least (but not limited to) 6 fluorophore colors with 2, 3, or more colors co-labeling per gene.
  • a particular polynucleotide probe may be labeled with several (2, 3, 4, or more) different colors/fluorophores. This may be achieved by attaching multiple different fluorophores to the same polynucleotide chain or by labeling separate subgroups of the probe each with a different single fluorophore, mixing the subgroups together (e.g. in generally equal amounts), and applying the mixture to the sample for hybridization and labeling and subsequent imaging and analysis.
  • Possible fluorophores (generally coupled to dUTP) that may be used include aqua 431 (7-Diethylaminocoumarin-3-carboxylic acid, DEAC), green 496 (5-Fluorescein), green 500 (5-Carboxyrhodamine 110), Alexa Fluor 488, cyanine 3, gold 525 (5(6)-Carboxyrhodamine 6G), gold 550 (Cyanine-3E), orange 552 (5-TAMRA), Alexa Fluor 568, red 580 (5-ROX), red 594, Alexa Fluor 594, red 598, cyanine 5, red 650 (Cyanine-5E), Alexa Fluor 647, and far red 673, although other fluorophores may also be used.
  • the fluorophores that are employed for a particular labeling set are selected so that no two fluorophores associated with a given sample have overlapping emission spectra.
  • a sample may include single or multiple cells, e.g. adhered to glass such as a slide, or may include tissue sections.
  • The sample may be fixed and fluorescently-labeled polynucleotide probes may be applied to the sample under conditions which promote hybridization of the probes with one or more genes in the sample, where the genes are generally located in nuclei of the sample.
  • the labeled sample may then be imaged, for example using widefield
  • imaging may be performed by collecting several separate images each using a different combination of excitation and emission wavelengths (e.g. using filter sets tuned for particular fluorophores). In other embodiments, imaging may be performed by obtaining spectral scans of the sample, e.g.
  • a wavelength range such as 400-700 nm, 450-650 nm, etc.
  • separating out the signals from each individual fluorophore for example computationally using a technique such as linear unmixing, which may be determined relative to one or more reference spectra for each respective fluorophore.
  • Imaging of the sample generates one or more images which include fluorescence emissions from the fluorophores associated with the probes that were applied to the sample.
  • the fluorescence emission is in the form of localized spots (e.g. see FIG. 1A).
  • The image is analyzed to identify the localized spots and to determine which of the spots are colocalized in order to determine which genes have been identified and the copy number of the genes; in certain embodiments the gene copy number may be determined based on the relative brightness of the spots, as discussed further below.
  • For example, if a particular group of spots that have been determined to be sufficiently close together to be colocalized includes an orange spot and a red spot, this location is determined to be associated with the RET gene (see FIG. 2A), because the fluorescently-labeled polynucleotide probe (or mix of probes) associated with the RET gene has been labeled with fluorophores that emit orange and red light.
  • Using a set of six different fluorophores permits the establishment of 5-plex, 10-plex, or 15-plex labeling schemes to uniquely identify 5, 10, or 15 genes within the same sample (see FIGS. 2A-2C) by associating two labels with each gene.
  • Associating 3 labels with each gene may permit up to 20 genes to be uniquely identified.
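The limits of 15 and 20 genes follow from counting unordered combinations of six fluorophores: C(6,2) = 15 and C(6,3) = 20, which together give 35 distinct barcodes. The sketch below enumerates them. The fluorophore names come from the text, while the helper names and the order in which barcodes are handed to genes are illustrative assumptions and do not reproduce the probe matrices of FIGS. 2A-2C.

```python
from itertools import combinations

FLUOROPHORES = [
    "aqua 431", "green 496", "gold 525",
    "orange 552", "red 580", "red 650",
]


def barcodes(fluorophores, labels_per_gene):
    """Return every distinct set of `labels_per_gene` fluorophores."""
    return [frozenset(c) for c in combinations(fluorophores, labels_per_gene)]


def assign_barcodes(genes, fluorophores, labels_per_gene):
    """Give each gene a unique barcode (assignment order is arbitrary here)."""
    codes = barcodes(fluorophores, labels_per_gene)
    if len(genes) > len(codes):
        raise ValueError("not enough distinct barcodes for this many genes")
    return dict(zip(genes, codes))


two_color = barcodes(FLUOROPHORES, 2)
three_color = barcodes(FLUOROPHORES, 3)
print(len(two_color), len(three_color), len(two_color) + len(three_color))

# Five genes named in the text, each given a two-color barcode.
five_plex = assign_barcodes(
    ["PDGFRA", "MET", "EGFR", "MYC", "RET"], FLUOROPHORES, 2
)
```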
  • The cells used for various experiments disclosed herein were grown according to standard protocols specified for each cell line.
  • Established tumor cell lines were obtained from ATCC (Manassas, VA) (H1975, MCF10A, PC3, H460, UACC62, HCC1954, 293T, LM2, Sk-Mel, BL209, BUML, and LNCaP) and by collaboration with Dr. Hiroaki Wakimoto (Department of Neurosurgery, Massachusetts General Hospital, Boston, MA) (GBM18 and GBM29); and they were grown according to standard protocols specified for each cell line.
  • Patient-derived CTC lines (BRX-07, BRX-42, BRX-50, BRX-61, BRX-68, BRX-82, and BRX-142) were previously described. Normal lymphocytes were obtained from blood draws from five healthy donors (three males and two females). These cells are listed in Table 1 and described further below.
  • the multiplex-FISH assay may include three principal steps: 1) probe construction and hybridization, 2) image acquisition and 3) automated image analysis.
  • FISH probes were derived from BAC clones purchased from Children’s Hospital
  • E. coli transformed with individual BAC clones were cultured using Luria-Bertani (LB) media (SIGMA, St. Louis, MO) containing 12.5 µg/ml chloramphenicol (Teknova, Hollister, CA). Overnight cultures were extracted using the Qiagen Midiprep Kit (Qiagen, Valencia, CA) following the manufacturer’s protocol (Suppl. Materials and Methods). Extracted BAC DNA was then amplified by multiple displacement amplification with the Qiagen Repli-G midi kit (Qiagen), following the manufacturer’s protocol before proceeding to labeling by nick translation.
  • DNAs were labeled by nick translation kit (Abbott Molecular Inc., Des Plaines,
  • fluorophore-conjugated dUTPs (Enzo Life Sciences Inc., Farmingdale, NY):
  • aqua 431-dUTP (Ex. 431 nm / Em. 480 nm)
  • green 496-dUTP (Ex. 496 nm / Em. 520 nm)
  • gold 525-dUTP (Ex. 525 nm / Em. 551 nm)
  • orange 552-dUTP (Ex. 552 nm / Em. 576 nm)
  • red 580-dUTP (Ex. 580 nm / Em. 603 nm)
  • red 650 (Cy5)-dUTP (Ex. 650 nm / Em. 662 nm)
  • Two- or three-fluorophore combinations were chosen to minimize spectral overlap (FIG. 1A).
  • Optimal fluorophore-conjugated dUTP mixture concentrations were empirically-determined following probe visualization under the microscope (see below).
  • Ethanol-precipitated nick translation reactions were resuspended in hybridization buffer (see below).
  • FIGS. 1C-1H are each shown with a bar 104 representing a scale of 10 µm.
  • The microscope was configured with four lasers (Diode 440 nm, Argon 488 nm, DPSS (Diode-Pumped Solid-State) 561 nm, and HeNe 633 nm), a high efficiency triple band pass beam splitter MBS 488/561/633, and a single line MBS 445.
  • Two objective lenses were used in this study: a Plan-Apochromat 100x/1.46 NA and a Plan-Apochromat 63x/1.40 NA.
  • Chromatic aberrations were measured and corrected for using tetra-spec fluorescent beads (Life Technologies) and a Y-chromosome FISH slide labeled with all six fluorophores.
  • Images were collected with the 100x objective lens with a bit depth of 16-bit, a frame size of 512 x 512 pixels, and an image pixel size of 0.06 µm. Multiple z stacks were collected at an interval of 0.10 µm. Analysis of the punctae sizes in the images captured with the 100x objective lens allowed us to broaden pixel size and intervals and adopt the 63x objective lens. Images acquired with the 63x objective lens had a bit depth of 16-bit, a frame size of 256 x 256 pixels, an image pixel size of 0.12 µm, and z intervals of 0.20 µm.
  • Fluorescent probe identities were detected using lambda mode and spectral unmixing with a 34-channel photomultiplier tube for high-resolution spectral image acquisition. The reference spectra for each fluorophore were determined by obtaining their profiles from slides hybridized with each gene labeled as a single color. The resulting spectra were stored in the Spectra Database of the microscope. Linear unmixing separated mixed signals pixel by pixel, using the algorithm in Zen Black 2011, the Zeiss proprietary software. This algorithm
  • The algorithm for detection and localization of fluorescent point sources included the following sequential operations: 1) image registration; 2) nuclear segmentation and delimiting the region of interest; 3) identification and precise localization of locations with sufficient probability of containing a point-source signal; and 4) identification of colocalized spots and comparison with the gene panel matrix.
  • the multiplex FISH profiles were validated in two ways. First, in order to confirm the multiplex FISH results, the baseline copy number of each gene was verified in each cell line by traditional single-FISH manual quantification, which is shown in Table 2 included in FIG.
  • Table 2 shows reference gene copy number values obtained by single FISH (mean ± SD) and is spread across four pages due to the size of the table.
  • The corresponding centromere enumeration probe was used as an internal reference control. Normal cutoff values were established by scoring 200 interphase nuclei for each tumor cell line, 100 interphase nuclei for each cultured circulating tumor cell line, and 200 interphase nuclei of normal peripheral blood. The mean copy number value and ranges were obtained and used as reference to establish the scoring criteria and acceptable values for the multiplex assay.
  • options for multiplexing locus-specific DNA FISH probes are limited by the number of available spectrally-distinct fluorophores and matched filter sets. Practically, that means 4 probes are the maximum in common clinical FISH applications.
  • The inventors sought to use combinatorial labeling approaches, previously used in SKY, M-FISH, and some locus-specific applications, to increase the number of probes using standard fluorophores. The probe sets were tested and validated using a scanning laser confocal microscope combined with a custom analysis pipeline, as detailed below.
  • Before developing a multiplex assay, single BAC probes labeled with two or three fluorophores hybridized to control cells were sampled (FIG. 1A).
  • FIG. 1A is shown with a bar 100 and a bar 102 each representing a scale of 2 µm. This step allowed us to set initial parameters for image capture, such as laser power, and to correct for chromatic aberrations in the z axis.
  • A prototype two-color multiplex labeling schema using mixtures of six fluorophores was designed for five important cancer genes, first utilizing BACs overlapping the gene coding regions (PDGFRA, MET, EGFR, MYC, and RET genes) (FIG. 1B).
  • fluorophores and serial optical z-sections, required for separation of axially-adjacent punctae, were created and used to build the image analysis algorithm.
  • Standard clinical FISH imaging systems generally use maximum intensity projections, which would result in a high false positive rate since spots from different z-planes would erroneously appear to overlap.
  • Reference spectra were obtained for each of the six fluorophores individually from normal lymphocytes hybridized with BAC probes labeled with only the fluorophore of interest. The reference spectra libraries were used to evaluate the contributions of mixed fluorophore signals in the pooled probe hybridization.
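Linear unmixing treats the spectrum measured at each pixel as a weighted sum of the reference spectra and solves for the weights. The sketch below illustrates the idea with an ordinary least-squares fit followed by clipping of negative weights. It is not the proprietary Zen algorithm named in the text; the function names and the toy four-channel spectra are invented for the example, and a production implementation would more likely use a constrained solver.

```python
def dot(u, v):
    """Dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def solve(a, b):
    """Solve the square system a.x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("reference spectra are linearly dependent")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def unmix_pixel(measured, references):
    """Least-squares abundance of each reference spectrum in one pixel.

    Solves the normal equations (R R^T) a = R m, then clips negative
    abundances to zero (a simplification, not true non-negative fitting).
    """
    gram = [[dot(r1, r2) for r2 in references] for r1 in references]
    rhs = [dot(r, measured) for r in references]
    return [max(0.0, value) for value in solve(gram, rhs)]


# Toy reference spectra over four detector channels (invented values).
references = [[1.0, 0.5, 0.0, 0.0], [0.0, 0.5, 1.0, 0.2]]
# Pixel containing 2 units of the first fluorophore and 3 of the second.
measured = [2 * a + 3 * b for a, b in zip(*references)]
abundances = unmix_pixel(measured, references)
print(abundances)
```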
  • FIGS. 3A-3C are each shown with a bar 300 indicating a scale size of 10 µm. This algorithm was applied to 3D volumes including serial optical sections of fluorescent signals (FIG. 3A). Analysis steps included: 1) nuclear segmentation for delimitation of the region of interest; 2) defining spots with a high probability of containing a point-source signal (including the estimation of the spot signal intensity and spot 3D sub-pixel localization) (FIG. 3B); and 3) matching of colocalized fluorescent spots with the gene panel matrix to identify and quantify each gene (FIG. 3C).
  • This approach relies on statistical comparison of intensity of local fluorescence maxima with a model of the microscope point-spread function (PSF) to rigorously detect low signal-to-noise ratio (SNR) spots in a manner which is adaptive to local variations in the signal and background, without requiring specification of arbitrary thresholds.
  • The automated gene quantification algorithm established that spots in different channels colocalized if the centers of the spots were within 0.24 µm, a value which was empirically determined to provide accurate results in cells with known copy numbers. In various embodiments two or more spots may be considered to be colocalized if they are within 0.1 µm, 0.2 µm, 0.3 µm, 0.4 µm, 0.5 µm or other suitable distances. Since each gene should be represented by a number of spots that match the established labeling schema, the gene copy number count is determined by the number of colocalized spot sets. The definition and elimination of non-noise disruptive features, and accurate identification of probe barcodes in non-uniform background spectra, are the major challenges that the inventors are addressing to improve automated counting.
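The colocalization rule can be illustrated with a small matching routine. This is a simplified sketch rather than the inventors' algorithm: it measures distances from one anchor channel only, matches greedily to the nearest unused spot, and uses each spot at most once. The function name, data layout, coordinates, and the gene `GENE_X` are hypothetical; the orange and red barcode for RET and the 0.24 µm radius come from the text.

```python
from collections import defaultdict
from math import dist

COLOCALIZATION_RADIUS_UM = 0.24  # empirically determined value quoted in the text


def count_gene_copies(spots, label_matrix, radius=COLOCALIZATION_RADIUS_UM):
    """Count colocalized spot sets per gene.

    spots: {channel: [(x, y, z), ...]} with spot centers in micrometres.
    label_matrix: {gene: set of channels forming its barcode}.
    """
    used = defaultdict(set)  # spot indices already assigned, per channel
    counts = {}
    for gene, channels in label_matrix.items():
        anchor, *others = sorted(channels)
        copies = 0
        for i, p in enumerate(spots.get(anchor, [])):
            if i in used[anchor]:
                continue
            match = {anchor: i}
            for ch in others:
                near = [
                    (dist(p, q), j)
                    for j, q in enumerate(spots.get(ch, []))
                    if j not in used[ch] and dist(p, q) <= radius
                ]
                if not near:
                    break
                match[ch] = min(near)[1]  # nearest unused spot in this channel
            if len(match) == len(channels):
                for ch, j in match.items():
                    used[ch].add(j)
                copies += 1
        counts[gene] = copies
    return counts


spots = {
    "orange 552": [(1.0, 1.0, 1.0), (5.0, 5.0, 5.0)],
    "red 650": [(1.1, 1.0, 1.0), (5.0, 5.1, 5.0), (9.0, 9.0, 9.0)],
    "green 496": [(9.05, 9.0, 9.0)],
}
label_matrix = {
    "RET": {"orange 552", "red 650"},
    "GENE_X": {"green 496", "red 650"},  # hypothetical second gene
}
counts = count_gene_copies(spots, label_matrix)
print(counts)
```

Because the matching is greedy, the result can depend on the order of genes and spots when signals are crowded; a global assignment would be more robust in that case.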
  • Single FISH Single-probe FISH
  • M-FISH Multiplex FISH
  • FIG. 9 shows array comparative genomic hybridization (aCGH) analysis of the 293T and H1975 cell lines.
  • FIG. 9A shows multicolor fluorescence in situ hybridization [multiplex-FISH (M-FISH)] automated quantification and reference single FISH analysis of 10 genes in the H1975 cell line.
  • FIG. 9B shows genome copy number summary for aCGH analysis of H1975.
  • the horizontal axis represents the linear position along the chromosome, whereas the vertical axis represents the measured log2 signal ratio (-2 to 2).
  • Relevant gene names are placed in the appropriate genomic positions.
  • FIG. 9C shows a zoomed-in view of chromosome 8, depicting high-level copy number gain for MYC.
  • FIG. 9D shows an M-FISH automated quantification and reference single FISH analysis of 10 genes in the 293T cell line, with dots indicating the copy number derived from aCGH.
  • FIG. 9E shows a genome copy number summary for aCGH analysis of 293T. Data are expressed as means ± SD (FIG. 9A and FIG. 9D). Chr, chromosome.
  • FIG. 10 shows array comparative genomic hybridization (aCGH) and fluorescence in situ hybridization (FISH) validation for cell lines. Multiplex FISH quantification and comparison to single FISH and aCGH (dots) for BUML are shown in FIG. 10A, for UACC62 in FIG. 10C, and for MCF10A in FIG. 10E. Corresponding aCGH genomic profiles, shown as log2 ratios, for BEIML are shown in FIG. 10B, for UACC62 are shown in FIG. 10D, and for MCF10A are shown in FIG. 10F. Data are expressed as means ± SD (FIGS. 10A, 10C, 10E). Chr, chromosome; M-FISH, multiplex-FISH.
  • FIG. 11 shows array comparative genomic hybridization (aCGH) and fluor
  • the definition and elimination of non-noise disruptive features and accurate identification of probe bar codes in non-uniform background spectra are the major challenges that were addressed to improve quantification in the multiplex assay.
  • FIG. 12 shows reproducibility analysis of multiplex fluorescence in situ hybridization (FISH). Box-and-whisker plots are shown for mean and interquartile range for copy number counts for each replicate experiment. Panels show grouped replicates for 293T (left column), H1975 (middle column), and normal lymphocyte preparations (right column), with a different gene in each row.
  • the y axis is scaled to optimize the ability to visualize the full range of copies per gene across the three lines.
  • Displayed P values are for analysis of variance performed within each group of replicates. Above each replicate is indicated either significant (triangles) or nonsignificant (circles) deviation from gold standard single FISH in the same cell lines. n = 12 replicates (left column); n = 9 (middle column); n = 25 (right column). Analysis of variance was performed to compare across replicates, and the P values for the analysis of variance are listed in each panel. For the lymphocyte data set, 7 of 10 genes showed no statistically significant variation across the 25 replicates. For the three genes with an analysis of variance P < 0.01, the mean copy number values of each replicate are actually near the expected value of two copies.
  • FIG. 21 shows the same data shown as in FIG. 12, but with a fixed y-axis range of 0 to 40 copies, to optimally visualize the difference between amplified (MYC) and nonamplified genes.
  • FIG. 13 is a scatterplot of average gene automated copy number counts in two independent hybridizations using the 10-plex probe mix for H1975 (circles) and 293T (triangles) cell lines.
  • the scatterplot shows reproducibility analyzed by duplicate multiplex fluorescence in situ hybridization analysis of H1975 and 293T cells. Correlation coefficient for the replicates is shown. Dotted line indicates linear regression line. Error bars indicate 1 SD. To help visualize assay accuracy, above each replicate in FIG. 12, it is indicated whether that individual replicate copy number mean deviates significantly from the gold standard single FISH score.
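  • The replicate comparison described for FIG. 13 (a correlation coefficient and a linear regression line across two hybridizations) can be sketched as follows. The function name and the copy number values are hypothetical and are not data from the study; the arithmetic is the standard Pearson correlation and least-squares fit.

```python
import math

def pearson_and_fit(x, y):
    """Return (r, slope, intercept) for paired replicate measurements."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    r = sxy / math.sqrt(sxx * syy)   # Pearson correlation coefficient
    slope = sxy / sxx                # least-squares regression slope
    intercept = mean_y - slope * mean_x
    return r, slope, intercept

# Hypothetical mean copy numbers per gene in two independent hybridizations.
hyb_1 = [2.1, 2.9, 3.8, 2.0, 18.5]
hyb_2 = [2.0, 3.1, 4.0, 2.2, 17.9]

r, slope, intercept = pearson_and_fit(hyb_1, hyb_2)
print(f"r={r:.3f} slope={slope:.3f} intercept={intercept:.3f}")
```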
  • FIG. 22A shows accuracy displayed as a deviation plot of mean copy number for the lymphocyte data set. The y axis shows the absolute copy number deviation of automated multiplex fluorescence in situ hybridization (FISH) replicates (R) from the expected two copies. Of 10 probes, 9 have a mean deviation of <0.5, with MET being undercalled by multiplex FISH slightly.
  • FIG. 22B shows detailed MDM4 data showing copy number counts for each individual replicate, with each dot representing a single-cell count and with mean and SD indicated. A small bias to slightly more than two copies is demonstrated.
  • Deviation for each of the 10 genes from the expected is shown in detail for the lymphocyte data, revealing that 9 of 10 probes have a mean deviation of <0.5 copies, with only MET being just beyond a -0.5 deviation.
  • FIG. 5A is shown with a bar 500 representing a scale of 10 µm. Currently, spot combinations that do not match the label matrix are ignored by the quantification algorithm.
  • FIGS. 6A and 6B include a bar 600 representing a scale of 2 µm. Therefore, moving to a widefield platform will be essential for clinical use.
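  • The accuracy check described above (deviation of the mean copy number from the expected two copies in normal lymphocytes, judged against a 0.5-copy margin) can be sketched as below. The replicate means are invented for illustration; only the expected value of two copies and the 0.5 margin come from the text.

```python
from statistics import mean

EXPECTED_COPIES = 2.0  # diploid expectation for normal lymphocytes
TOLERANCE = 0.5        # deviation margin quoted in the text

def deviation_report(replicate_means_by_gene,
                     expected=EXPECTED_COPIES, tol=TOLERANCE):
    """Map each gene to (signed mean deviation, within_tolerance flag)."""
    report = {}
    for gene, means in replicate_means_by_gene.items():
        deviation = mean(means) - expected
        report[gene] = (deviation, abs(deviation) < tol)
    return report

# Hypothetical replicate means; not values from the study.
example = {
    "MDM4": [2.1, 2.2, 2.0],
    "MET": [1.4, 1.5, 1.3],
}
report = deviation_report(example)
for gene, (deviation, ok) in report.items():
    print(gene, round(deviation, 2), "within" if ok else "outside")
```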
  • the inventors piloted the multiplex assay on the Zeiss Cell Observer widefield system.
  • FIGS. 6C and 6D include a bar 604 representing a scale of 20 µm.
  • a multiplex FISH assay, such as the one developed here, has the potential to be clinically useful and therapeutically informative.
  • analyses of one sample of cells across two slides could render copy number quantification of up to 30 genes.
  • multiplex FISH has a unique role.
  • Two prominent technologies that compete with multiplex FISH are aCGH and NGS, both of which still face difficulties with accurate somatic copy number assessment in cancer. Since both of these techniques utilize DNA extracted from whole tissue sections, there is always substantial dilution of the tumor cell contribution by normal stromal or inflammatory cells. The degree of dilution depends on the tumor cell fraction and the actual true copy state of the genes in tumor cells. Thus, analysis of a gene like ERBB2, for which a copy number ratio for a positive call is 2.0 by FISH, will show signals that do not exceed noise using aCGH or NGS, and will not be called as positive.
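  • The dilution effect described above can be made concrete with a small worked calculation. It assumes a simple linear mixing model in which normal cells carry two copies and the bulk signal is the fraction-weighted average of tumor and normal copy numbers; this model, the function name, and the example tumor fractions are illustrative assumptions rather than part of the disclosure.

```python
import math

def bulk_log2_ratio(tumor_copies, tumor_fraction, normal_copies=2.0):
    """Expected log2 ratio in bulk DNA under a linear mixing model.

    tumor_fraction is the fraction of cells in the extracted tissue that
    are tumor cells; the remainder are assumed to be diploid normal cells.
    """
    mixed = (tumor_fraction * tumor_copies
             + (1.0 - tumor_fraction) * normal_copies)
    return math.log2(mixed / normal_copies)

# A gene present at 4 copies per tumor cell (ratio 2.0 against 2 copies).
for fraction in (1.0, 0.5, 0.2):
    print(fraction, round(bulk_log2_ratio(4.0, fraction), 3))
```

  • Under this model a pure tumor sample gives a log2 ratio of 1.0, whereas a 20% tumor fraction gives roughly 0.26, which illustrates how a gain that is clear per cell by FISH can shrink toward the noise level of a bulk assay.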
  • the multiplex FISH assay described here is envisioned as a possible stand-alone clinical assay, but more likely as an adjunct performed side-by-side with NGS mutational panels in comprehensive genotyping laboratories. Since only one or two slides will be required, an efficient, automated, and cost-effective work flow can be established. This assay may be especially powerful in small samples as well as those with very low tumor fraction. The optimal sample for this technology may in fact be CTCs for which one often obtains only a few cells per blood draw. One would be able to render an accurate copy number assessment even with 5-10 cells. There will be numerous research uses of multiplex FISH, especially the critical study of copy number genetic heterogeneity in cancer.
  • Lymphocyte spreads were further processed by preheating with 2X saline sodium citrate (SSC) with 0.25% Triton X-100 (Sigma-Aldrich, St. Louis, MO) until boiling. Slides were then immersed for 2 minutes, followed by rinsing in 2X SSC at room temperature;
  • Bacterial artificial chromosome (BAC) derived probes
  • BAC clone searches were performed using the University of California Santa Cruz (UCSC) Genome Browser (http://genome.ucsc.edu/) mapped to Feb. 2009 (GRCh37/hg19) and Dec. 2013 (GRCh38/hg38) Human Genome Assemblies.
  • BACs were purchased from Children’s Hospital Oakland Research Institute (CHORI, Oakland, CA; http://bacpac.chori.org/).
  • Table 3 specifies BAC clones included in this study. Specificity of the clones was checked in metaphase spreads.
  • the supernatant was discarded and the pellet was resuspended in 4 ml resuspension buffer P1 followed by addition of 4 ml of lysis buffer P2 to the bacterial suspension.
  • the tube was inverted vigorously to mix the content. It was then incubated for 5 min at room temperature and 4 ml of prechilled buffer P3 was added.
  • the tube was inverted vigorously to mix the content and incubated on ice for 15 min.
  • the bacterial lysate was centrifuged at 20,000 × g for 30 min at 4°C and supernatant was collected to be applied in Qiagen-tip column. Column was equilibrated with 4 ml of Buffer QBT and after the column emptied by gravity flow, supernatant was applied and allowed to fill the column resin by gravity flow.
  • Extracted BAC DNA was amplified by multiple displacement amplification with Qiagen REPLI-g midi kit.
  • the extracted BAC DNA (2 µL) underwent gentle alkaline denaturation by adding DB reagent and incubation for 2 min. Reaction was neutralized with the stop solution. After neutralization, REPLI-g master mix was added and the sample incubated at 30°C for 16 h in a thermocycler for isothermal amplification. DNA polymerase was inactivated at 65°C for 3 min at the end of cycle. DNA was then cleaned (de-salted) by ethanol precipitation by adding 1/10 volume of 3M sodium acetate (pH 5.2) and 2-3 volumes of 100% ethanol.
  • each dye should be equivalent but this was not observed in practice, requiring adjustments in concentration during probe preparation.
  • Table 4a specifies the amount of each fluorophore and respective dTTP. For instance, if green and orange were chosen to label a particular gene, 3 µL of green dUTP and 4.5 µL of dTTP should be pipetted in the mix.
  • Reaction was carried out at 15°C for 3 h and terminated by heating to 70°C for 10 min. Samples were then placed on ice and protected from light. Efficiency of labeling was primarily checked by agarose gel. The efficiency of the nick translation reaction was confirmed by analyzing the product size distribution using agarose gel electrophoresis (FIGS. 19 and 20). Gel electrophoresis was performed with 2% agarose diluted in Tris base, acetic acid, and EDTA buffer; 10,000× GelRed was then added (1× final concentration; Phenix Research Products, Candler, NC). Each reaction (5 µL) was run at 120 V for 30 to 40 minutes.
  • FIG. 19A shows efficiency of nick translation labeling performed at time points of 8 and 16 hours, resulting in shorter fragments (200 to 100 bp); however, the quality of hybridization is lower, with speckles in the background and weaker-intensity signals.
  • FIGS. 19B-19C show efficiency of nick translation labeling performed at time points of 2.5, 3, and 4 hours, resulting in fragments of predominantly 100 to 400 bp and higher quality of specimen hybridization. Aq, aqua; Gd/Gld, gold; Gr, green; Or, orange; R/Rd, red.
  • FIG. 20 shows three-hour nick translation product gels.
  • FIG. 20A shows a 3-hour nick translation 2% agarose gel showing unlabeled DNA bands on the left and DNA labeled with combinations of two fluorophores and size range of 100 to 400 bp.
  • FIG. 20B shows a gel showing DNA on the left and DNA labeled with two and three fluorophores.
  • A/Aq aqua; Cy, Cy5; Gd, gold; Gn, green; Or, orange; Rd, red.
  • nick translation reactions for each gene were pooled at equal volumes, and a 1.5× amount of Cot-1 human DNA (Life Technologies, Carlsbad, CA) was added (>1.5× began to suppress fluorescent signals), followed by the addition of a 1:10 total volume of 3 mol/L sodium acetate (pH 5.2) and 2 to 3 volumes of 100% ethanol.
  • the reaction would contain a total of 5 µg of various BAC DNAs, plus 7.5 µg of Cot-1.
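  • The proportions stated above for the precipitation step can be turned into a small calculator. This is a sketch under stated assumptions: the function and parameter names are hypothetical, masses are kept in whatever unit the caller uses, and the sodium acetate and ethanol volumes are both computed relative to the pooled volume before acetate is added, which is one reading of the text.

```python
def precipitation_mix(bac_dna_mass, pooled_volume_ul,
                      cot1_factor=1.5, ethanol_volumes=2.5):
    """Return reagent amounts for ethanol precipitation of the probe pool.

    bac_dna_mass: total labeled BAC DNA, in any mass unit.
    pooled_volume_ul: volume of the pooled reactions plus Cot-1, in µL.
    cot1_factor: Cot-1 mass relative to BAC DNA mass (1.5x in the text).
    ethanol_volumes: 2 to 3 volumes of 100% ethanol (midpoint by default).
    """
    return {
        "cot1_mass": cot1_factor * bac_dna_mass,         # same unit as input
        "sodium_acetate_ul": pooled_volume_ul / 10.0,    # 1:10 of the volume
        "ethanol_ul": ethanol_volumes * pooled_volume_ul,
    }

# 5 mass units of BAC DNA call for 7.5 mass units of Cot-1, as in the text.
print(precipitation_mix(5.0, 100.0))
```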
  • the mixture was centrifuged at 18,000 × g for 20 minutes at 4°C to pellet the labeled DNA.
  • the supernatant was discarded, and the pellet was washed with 70% ethanol and centrifuged at 18,000 × g for 10 minutes. Supernatants were discarded, and the pellet was air dried in the dark for 5 to 10 minutes. The probe was re-suspended with nuclease-free water, and five volumes of hybridization buffer were added and mixed well. The mixture was denatured by heating at 72°C and immediately placed on ice. Probes were stored at -20°C until use.
  • Hybridization buffer was composed of 50% v/v deionized formamide, 2x SSC, 50 mM potassium dihydrogen phosphate/disodium hydrogen phosphate buffer (KH2PO4/Na2HPO4, pH 7.0), 1 mM EDTA, 5-10% v/v dextran sulfate.
  • the final pH of the solution should be adjusted to pH 7.6 with hydrochloric acid. No counterstain such as DAPI (4',6-diamidino-2-phenylindole) was added in order to minimize unwanted background fluorescence.
  • Chromatic aberration, the vertical shift in apparent position of objects, is a concern when imaging samples with multiple fluorophores. Blue wavelengths are focused closer to the lens than red wavelengths. Without corrective measures, two spots of different wavelengths appear as significantly separated and risk being undetected as a pair, despite actually overlapping.
  • When the search radius for each spot was increased to compensate for the shift, this caused several false-positive problems in certain genes, especially between pairs of spots originating from neighboring wavelengths, where the shift is not as large.
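  • One possible corrective measure, offered only as an illustration and not confirmed as the approach used in this work, is to subtract a per-channel axial offset from each spot before pairing, so that the search radius need not be enlarged. The offset values, channel names, and function name below are hypothetical; in practice such offsets would have to be measured on the instrument.

```python
# Hypothetical axial offsets (µm) of each channel relative to a reference
# channel; the numbers are placeholders, not measured values.
Z_OFFSET_UM = {"aqua": -0.30, "green": -0.10, "orange": 0.00, "red": 0.15}

def correct_z(spot, channel, offsets=Z_OFFSET_UM):
    """Return the spot with its z coordinate shifted into the reference frame.

    spot is an (x, y, z) tuple in micrometers.
    """
    x, y, z = spot
    return (x, y, z - offsets[channel])

aqua_spot = correct_z((1.0, 2.0, 2.70), "aqua")
red_spot = correct_z((1.0, 2.0, 3.15), "red")
print(aqua_spot, red_spot)
```

  • After correction the two example spots share nearly the same z position, so a tight colocalization radius can pair them without admitting neighbors from adjacent wavelengths.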
  • the nuclear mask was obtained by low-pass filtering and then thresholding a channel with high background - in this case channel 6 - using an Otsu method-derived threshold value (FIG. 2B).
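  • The nuclear masking step above (low-pass filtering followed by an Otsu-derived threshold) can be sketched in plain Python on a single 2D plane. The 3x3 mean filter, the histogram bin count, and the toy image are assumptions chosen for brevity; the filter and parameters actually used are not specified in the text.

```python
def box_blur(img):
    """3x3 mean filter, averaging over the neighbors inside the image."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            vals = [img[rr][cc]
                    for rr in range(max(0, r - 1), min(h, r + 2))
                    for cc in range(max(0, c - 1), min(w, c + 2))]
            out[r][c] = sum(vals) / len(vals)
    return out

def otsu_threshold(values, bins=64):
    """Threshold that maximizes the between-class variance of a histogram."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return lo
    width = (hi - lo) / bins
    hist = [0] * bins
    for v in values:
        hist[min(int((v - lo) / width), bins - 1)] += 1
    centers = [lo + (i + 0.5) * width for i in range(bins)]
    total = len(values)
    sum_all = sum(n * c for n, c in zip(hist, centers))
    best_var, best_t = -1.0, lo
    w0, sum0 = 0, 0.0
    for i in range(bins - 1):
        w0 += hist[i]
        sum0 += hist[i] * centers[i]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        var_between = w0 * w1 * (sum0 / w0 - (sum_all - sum0) / w1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, lo + (i + 1) * width
    return best_t

def nuclear_mask(img):
    """Return (boolean mask, threshold) for a high-background channel."""
    smooth = box_blur(img)
    t = otsu_threshold([v for row in smooth for v in row])
    return [[v > t for v in row] for row in smooth], t

# Toy image: dim background with a brighter square standing in for a nucleus.
image = [[10.0] * 8 for _ in range(8)]
for r in range(2, 6):
    for c in range(2, 6):
        image[r][c] = 100.0
mask, threshold = nuclear_mask(image)
print(round(threshold, 2), mask[3][3], mask[0][0])
```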
  • Fluorescence spots are composed of point sources as well as extended sources.
  • the inventors applied point source detection algorithms to detect the probes in each channel.
  • This approach relies on statistical comparison of image intensity local maxima with a model of the microscope point-spread function (PSF) as a 3D Gaussian function to detect low signal-to-noise ratio (SNR) spots in a manner which is adaptive to local variations in the signal and background, without requiring specification of arbitrary thresholds (FIGS. 3A-C).
  • the sigma factors for the Laplacian of the Gaussian function in the lateral and axial directions were measured and averaged over several datasets for each magnification. For each detected local maximum, the amplitude intensity and the variation of noise in the background were extracted to approximate the SNR, as well as the calculated position in x, y, and z. Local maxima that fell below a threshold set by an alpha value were eliminated. This threshold was optimized for each channel so that there were no false negatives but with a low amount of false positives as visualized in Imaris (Bitplane, USA).
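  • The filtering of local maxima by an alpha value can be sketched with a simplified significance test. The sketch assumes each candidate already has a fitted amplitude above background and a local noise standard deviation, and it substitutes a one-sided z-test for the PSF-model comparison described above; the field names and the default alpha are hypothetical.

```python
import math

def detection_p_value(amplitude, noise_sd):
    """One-sided p-value that the amplitude exceeds zero given Gaussian noise."""
    z = amplitude / noise_sd
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def filter_maxima(maxima, alpha=0.05):
    """Keep local maxima whose p-value is below alpha and attach their SNR."""
    kept = []
    for m in maxima:
        if detection_p_value(m["amplitude"], m["noise_sd"]) < alpha:
            kept.append({**m, "snr": m["amplitude"] / m["noise_sd"]})
    return kept

# Hypothetical candidates: amplitude above background, local noise SD, position.
candidates = [
    {"amplitude": 50.0, "noise_sd": 10.0, "xyz": (12.3, 40.1, 5.2)},
    {"amplitude": 5.0, "noise_sd": 10.0, "xyz": (30.7, 18.9, 4.0)},
]
spots = filter_maxima(candidates)
print(len(spots), spots[0]["snr"])
```

  • A smaller alpha makes detection stricter; tuning it per channel corresponds to the per-channel optimization described above.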
  • genes such as MYC in the lung cell line H1975 could be highly amplified to such an extent that the captured image manifested large blob-like shapes as opposed to spots.
  • the blobs were segmented by first excluding areas of the nucleus with spots, as determined by filtering with a Laplacian of Gaussian filter. Once the spots had been removed, the remaining nucleus region (i.e., background) intensity was sampled, and a threshold for segmenting blobs was robustly set as 3 SDs above the median intensity of this background. Noise was approximated by measuring the variance of the background from an annulus around the blob.
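  • The blob threshold described above (3 SDs above the median of the spot-free nuclear background) can be written as a short sketch. It assumes the background pixels have already been collected into a flat list and uses the population standard deviation; both choices, and the function names, are illustrative.

```python
from statistics import median, pstdev

def blob_threshold(background_pixels, n_sd=3.0):
    """Median of the background plus n_sd standard deviations."""
    return median(background_pixels) + n_sd * pstdev(background_pixels)

def segment_blobs(pixels, background_pixels, n_sd=3.0):
    """Flag pixels brighter than the background-derived threshold."""
    t = blob_threshold(background_pixels, n_sd)
    return [p > t for p in pixels]

# Hypothetical intensities: background samples, then pixels to classify.
background = [8.0, 10.0, 12.0]
flags = segment_blobs([9.0, 14.0, 40.0], background)
print(round(blob_threshold(background), 3), flags)
```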
  • FIG. 8 shows an example of a nucleus with a mixture of spots and blobs.
  • FIG. 8 includes a bar 800 representing a scale of 0.8 µm.
  • the analysis algorithm requires a label matrix in the form of a .csv file that contains a list of pairs of channels for each gene. Briefly, coincident points for all possible combinations of fluorophores were identified as being spots within four pixels of each other. Any pairs that did not exist in the label matrix were immediately eliminated. In cases where there were more than 2 spots located at a position, the most likely combination was determined on the basis of brightest intensity.
  • the number of coincident spot sets for each gene and signal-to-noise ratio for each channel were stored as .csv files.
  • the algorithm can also be run in a batch analysis mode, which loops through multiple datasets and reports a summary of the average, median, and standard deviation of the gene count.
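  • The label matrix lookup and batch summary described above can be sketched as follows. The CSV column names, the gene-to-channel assignments, and the use of summed intensity to pick the "brightest" combination are assumptions made for illustration; the actual file layout and tie-breaking rule are not given in the text.

```python
import csv
import io
from itertools import combinations
from statistics import mean, median, pstdev

# Hypothetical label matrix: one pair of channels per gene.
LABEL_CSV = """gene,channel_a,channel_b
MYC,green,orange
MET,red,gold
"""

def load_label_matrix(text):
    """Map each unordered channel pair to its gene."""
    reader = csv.DictReader(io.StringIO(text))
    return {frozenset((row["channel_a"], row["channel_b"])): row["gene"]
            for row in reader}

def best_pair(spots_at_position, matrix):
    """Pick the valid channel pair with the highest summed intensity.

    spots_at_position is a list of (channel, intensity) tuples.
    """
    valid = [(ia + ib, ca, cb)
             for (ca, ia), (cb, ib) in combinations(spots_at_position, 2)
             if frozenset((ca, cb)) in matrix]
    if not valid:
        return None
    _, ca, cb = max(valid)
    return ca, cb

def count_genes(coincident_pairs, matrix):
    """Count spot sets per gene, ignoring pairs absent from the matrix."""
    counts = {gene: 0 for gene in matrix.values()}
    for ch_a, ch_b in coincident_pairs:
        gene = matrix.get(frozenset((ch_a, ch_b)))
        if gene is not None:
            counts[gene] += 1
    return counts

def batch_summary(per_cell_counts):
    """Average, median, and standard deviation of a gene count across cells."""
    return {"mean": mean(per_cell_counts),
            "median": median(per_cell_counts),
            "sd": pstdev(per_cell_counts)}

matrix = load_label_matrix(LABEL_CSV)
pairs = [("green", "orange"), ("orange", "green"),
         ("red", "gold"), ("aqua", "red")]
print(count_genes(pairs, matrix))
print(best_pair([("green", 100), ("orange", 90), ("red", 20)], matrix))
print(batch_summary([2, 4, 6]))
```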
  • genomic DNA was extracted from tumor tissue using the QIAamp Blood Mini Kit using a modified protocol incorporating deparaffinization and protease digestion (Qiagen).
  • Agilent SurePrint 4x180K CGH SNP microarrays (Agilent Technologies, Santa Clara, CA) containing approximately 180,000 copy number probes, covering both coding and noncoding human sequences, were used. Briefly, 1.0 µg of human reference DNA, male genomic control DNA (Coriell Institute, Camden, NJ), and 1.0 µg of tumor DNA were digested with AluI and RsaI, and then heat treated at 95°C for 5 minutes.
  • Control and tumor DNAs were labeled by random priming with CY3-dUTP and CY5-dUTP dyes, respectively, using the Agilent SureTag Complete DNA Labeling Kit.
  • the labeled DNAs were purified with the SureTag Reaction Purification Column and mixed in equal proportion for hybridization to the array in the presence of Cot-1 DNA (Invitrogen, Carlsbad, CA) using the Agilent
  • the inventors have developed a robust and quantitative single-slide hybridization assay utilizing a library of at least 50 locus-specific DNA sequence probes.
  • the inventors have constructed probes having 5, 10, 15 and 20 genes obtained by BAC clone derived DNAs and probes having 5 and 10 genes obtained by PCR.
  • the inventors produced a standard recipe for optimally mixing fluorophore labels with double and triple bar-codes for single BAC clones/PCR product. Labeling of each gene was carried out by nick translation separately and the products were then combined in a single tube, the probe mix. Combinatorial-labeled DNAs hybridization mixture, the probe mix, was tested in tumor cell lines and in cultured circulating tumor cells and hybridization conditions were established for these cells.
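  • The capacity of double and triple bar-codes can be checked by enumerating fluorophore combinations. The sketch assumes the six fluorophores named in the figure legends (aqua, gold, green, orange, red, Cy5) and treats a bar-code as an unordered set of distinct fluorophores; the variable and function names are illustrative.

```python
from itertools import combinations

# Six fluorophores as named in the gel figure legends.
FLUOROPHORES = ["aqua", "gold", "green", "orange", "red", "Cy5"]

def barcodes(fluorophores, size):
    """All unordered combinations of `size` distinct fluorophores."""
    return list(combinations(fluorophores, size))

doubles = barcodes(FLUOROPHORES, 2)
triples = barcodes(FLUOROPHORES, 3)
print(len(doubles), len(triples))  # 15 20
```

  • Six fluorophores therefore allow 15 distinct two-color codes and 20 distinct three-color codes before any practical constraints on spectral separation are applied.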
  • the inventors further have optimized PCR based probe construction by adding a sequence of additional nucleotides (a tail) at the 5’ ends of amplimers at the first part of PCR reaction. This tail prevents formation of primer dimers and allows carrying out massive reactions with multiple amplimers simultaneously.
  • the inventors set a library of 10 genes and hybridized cultured tumor cells as well as formalin fixed paraffin embedded (FFPE) samples (FIG. 14).
  • FIG. 14A shows tested sequence specific Tail-PCR probe in cultured circulating tumor cells (CTCs).
  • FIG. 14B shows tested sequence specific formalin-fixed paraffin-embedded glioblastoma multiforme (GBM) cancer samples.
  • the probe contained 10 genes listed in the table shown in FIG. 14C.
  • the inventors detected similar count distributions for the ten genes tested with the PCR approach compared to BAC, demonstrating the sensitivity of the method and efficiency of hybridization.
  • the inventors have used the Zeiss-Elyra (Zeiss, Oberkochen, Germany), a laser confocal microscope, to capture volumetric renditions of the samples and to acquire multiplex image data that was further used to build the image analysis algorithm. Volumetric renderings of confocal images allowed us to determine minimal pixel size, rule out the possibility of overlap of spots in z plane and to determine the minimal distance necessary for image acquisition. The inventors have observed that a minimal z distance of 0.2 µm, within the limit of a widefield microscope, is sufficient to individualize the signals in z axis therefore supporting a shift towards a widefield based platform.
  • FIG. 15A shows an FFPE prepared with robotic slide processor for breast cancer samples.
  • FIG. 15B shows an FFPE prepared with robotic slide processor for GBM cancer samples. It was important to check the hybridization efficiency of formalin fixed paraffin embedded surgical biopsy specimens processed with the VP2000 and hybridized with a PCR-based FISH probe mix recognizing 10 genes, in order to support the feasibility of automation and high volume workload platforms. Preparation of slides as well as the probe showed efficiency comparable with manual slide processing and BAC derived probe. Sample was imaged with a widefield microscope and each wavelength was recognized with filter cubes.
  • the whole slide image scanner was able to get serial sections of large areas for each channel (each wavelength).
  • the scanner uses light-emitting diodes (LEDs) as the fluorescence light source.
  • FIG. 16A shows a formalin fixed paraffin embedded (FFPE) specimen imaged with the Vectra multispectral imaging system.
  • FIG. 16B shows an FFPE specimen imaged with the Pannoramic confocal whole slide image scanner.
  • FIG. 16C shows an FFPE specimen imaged with the CytoVision® platform.
  • the Vectra image cannot collect multiple z planes precluding the evaluation of potential overlapping gene signals.
  • the whole slide image scanner provided multiple z planes of large fields of view, features that are appealing in a clinical setting.
  • the CytoVision® platform can provide good signal-to-noise and resolution within one field of view.
  • FIG. 17A shows formalin fixed paraffin embedded (FFPE) of glioblastoma multiforme (GBM) case hybridized with a multiplex probe recognizing 10 genes and submitted to signal quantification. The signals are quantified in the graph shown in FIG. 17B.
  • FIGS. 18A and 18B show FFPE of GBM cases illustrating the need to improve nuclear segmentation to focus the analysis on genomic content and away from other spurious features in FFPE samples.
  • the inventors started working with FFPE specimens of glioblastoma multiforme cases. As mentioned above, the inventors observed highly variable spot intensities and noise levels; therefore the inventors will compare the present multi-channel spot detection, with sensitivity tuned to match the six channels, with a single-detection and spectral matching algorithm, and select the approach with best performance and most consistent results.
  • In FIG. 24, an example 2400 of a system for multiplex labeling of a sample and gene copy number evaluation is shown in accordance with some embodiments of the disclosed subject matter.
  • a computing device 2410 can receive information regarding an image of a sample to which a plurality of fluorescently-labeled polynucleotide probes has been applied from a database and/or user interface 2402.
  • computing device 2410 can execute at least a portion of a system for multiplex labeling of a sample and gene copy number evaluation 2404 to identify a gene and determine a copy number of the identified gene based on data received from the database and/or user interface 2402.
  • computing device 2410 can communicate information about the image data received from the database and/or user interface 2402 to a server 2420 over a communication network 2406, which can execute at least a portion of system for multiplex labeling of a sample and gene copy number evaluation 2404 to identify a gene and determine a copy number of the identified gene.
  • server 2420 can return information to computing device 2410 (and/or any other suitable computing device) indicative of an output of system for multiplex labeling of a sample and gene copy number evaluation 2404, such as a signal obtained from a sample that is imaged by the imaging system.
  • This information may be transmitted and/or presented to a user (e.g. a researcher, an operator, a clinician, etc.) and/or may be stored (e.g. as part of a research database or a medical record associated with a subject).
  • computing device 2410 and/or server 2420 can be any suitable computing device or combination of devices, such as a desktop computer, a laptop computer, a smartphone, a tablet computer, a wearable computer, a server computer, a virtual machine being executed by a physical computing device, etc.
  • system for multiplex labeling of a sample and gene copy number evaluation 2404 can present information about an identified gene, a copy number of the gene, and/or another output of system for multiplex labeling of a sample and gene copy number evaluation 2404, such as an image obtained from a sample by the imaging system to a user (e.g., researcher and/or physician).
  • the imaging system can be any imaging system that is suitable for obtaining images for a system for multiplex labeling of a sample and gene copy number evaluation 2404.
  • the imaging system may be local to computing device 2410.
  • the imaging system may be integrated with computing device 2410 (e.g., computing device 2410 can be configured as part of a device for multiplex labeling of a sample and gene copy number evaluation).
  • the imaging system may be connected to computing device 2410 by a cable, a direct wireless link, etc. so that computing device 2410 can control the imaging system remotely.
  • the imaging system can be located locally and/or remotely from computing device 2410, and can be in communication with computing device 2410 (and/or server 2420) via a communication network (e.g., communication network 2406).
  • communication network 2406 can be any suitable communication network or combination of communication networks. For example, communication network 2406 can include a Wi-Fi network (which can include one or more wireless routers, one or more switches, etc.), a peer-to-peer network (e.g., a Bluetooth network), a cellular network (e.g., a 3G network, a 4G network, etc., complying with any suitable standard, such as CDMA, GSM, LTE, LTE Advanced, WiMAX, etc.), a wired network, etc.
  • communication network 2406 can be a local area network, a wide area network, a public network (e.g., the Internet), a private or semi-private network (e.g., a corporate or university intranet), any other suitable type of network, or any suitable combination of networks.
  • Communications links shown in FIG. 24 can each be any suitable communications link or combination of communications links, such as wired links, fiber optic links, Wi-Fi links, Bluetooth links, cellular links, etc.
  • FIG. 25 shows an example 2500 of hardware that can be used to implement computing device 2410 and server 2420 in accordance with some embodiments of the disclosed subject matter.
  • computing device 2410 can include a processor 2502, a display 2504, one or more inputs 2506, one or more communication systems 2508, and/or memory 2510.
  • processor 2502 can be any suitable hardware processor or combination of processors, such as a central processing unit, a graphics processing unit, etc.
  • display 2504 can include any suitable display devices, such as a computer monitor, a touchscreen, a television, etc.
  • inputs 2506 can include any suitable input devices and/or sensors that can be used to receive user input, such as a keyboard, a mouse, a touchscreen, a microphone, etc.
  • communications systems 2508 can include any suitable hardware, firmware, and/or software for communicating information over communication network 2406 and/or any other suitable communication networks.
  • communications systems 2508 can include one or more transceivers, one or more communication chips and/or chip sets, etc.
  • communications systems 2508 can include hardware, firmware and/or software that can be used to establish a Wi-Fi connection, a Bluetooth connection, a cellular connection, an Ethernet connection, etc.
  • memory 2510 can include any suitable storage device or devices that can be used to store instructions, values, etc., that can be used, for example, by processor 2502 to present content using display 2504, to communicate with server 2420 via communications system(s) 2508, etc.
  • Memory 2510 can include any suitable volatile memory, non-volatile memory, storage, or any suitable combination thereof.
  • memory 2510 can include RAM, ROM, EEPROM, one or more flash drives, one or more hard disks, one or more solid state drives, one or more optical drives, etc.
  • memory 2510 can have encoded thereon a computer program for controlling operation of computing device 2410.
  • processor 2502 can execute at least a portion of the computer program to present content (e.g., images, user interfaces, graphics, tables, etc.), receive content from server 2420, transmit information to server 2420, etc.
  • server 2420 can include a processor 2512, a display 2514, one or more inputs 2516, one or more communications systems 2518, and/or memory 2520.
  • processor 2512 can be any suitable hardware processor or combination of processors, such as a central processing unit, a graphics processing unit, etc.
  • display 2514 can include any suitable display devices, such as a computer monitor, a touchscreen, a television, etc.
  • inputs 2516 can include any suitable input devices and/or sensors that can be used to receive user input, such as a keyboard, a mouse, a touchscreen, a microphone, etc.
  • communications systems 2518 can include any suitable hardware, firmware, and/or software for communicating information over communication network 2406 and/or any other suitable communication networks.
  • communications systems 2518 can include one or more transceivers, one or more communication chips and/or chip sets, etc.
  • communications systems 2518 can include hardware, firmware and/or software that can be used to establish a Wi-Fi connection, a Bluetooth connection, a cellular connection, an Ethernet connection, etc.
  • memory 2520 can include any suitable storage device or devices that can be used to store instructions, values, etc., that can be used, for example, by processor 2512 to present content using display 2514, to communicate with one or more computing devices 2410, etc.
  • Memory 2520 can include any suitable volatile memory, non-volatile memory, storage, or any suitable combination thereof.
  • memory 2520 can include RAM, ROM, EEPROM, one or more flash drives, one or more hard disks, one or more solid state drives, one or more optical drives, etc.
  • memory 2520 can have encoded thereon a server program for controlling operation of server 2420. In such
  • processor 2512 can execute at least a portion of the server program to transmit information and/or content (e.g., information regarding the virtual lens, the desired intensity pattern, the modified hologram, any data collected from a sample that is illuminated, a user interface, etc.) to one or more computing devices 2410, receive information and/or content from one or more computing devices 2410, receive instructions from one or more devices (e.g., a personal computer, a laptop computer, a tablet computer, a smartphone, etc.), etc.
  • any suitable computer readable media can be used for storing instructions for performing the functions and/or processes described herein.
  • computer readable media can be transitory or non-transitory.
  • non-transitory computer readable media can include media such as magnetic media (such as hard disks, floppy disks, etc.), optical media (such as compact discs, digital video discs, Blu-ray discs, etc.), semiconductor media (such as RAM, Flash memory, electrically
  • transitory computer readable media can include signals on networks, in wires, conductors, optical fibers, circuits, or any suitable media that is fleeting and devoid of any semblance of permanence during transmission, and/or any suitable intangible media.
  • the optical signals are detected by photodiodes.
  • any opto-electronic conversion device including but not limited to photo detectors, photodiodes, line-scan and two-dimensional cameras, and photodiode arrays can be used to perform this detection function.
  • the term mechanism can encompass hardware, software, firmware, or any suitable combination thereof.

Landscapes

  • Chemical & Material Sciences (AREA)
  • Organic Chemistry (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Microbiology (AREA)
  • Immunology (AREA)
  • Physics & Mathematics (AREA)
  • Molecular Biology (AREA)
  • Biotechnology (AREA)
  • Biophysics (AREA)
  • Analytical Chemistry (AREA)
  • Biochemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

A method for multiplex labeling of a sample and gene copy number evaluation, including: providing a plurality of fluorescently-labeled polynucleotide probes, each of the plurality of fluorescently-labeled polynucleotide probes being directed to a different polynucleotide and being labeled with a distinct combination of fluorophores selected from a plurality of fluorophores; applying the plurality of fluorescently-labeled polynucleotide probes to a sample; obtaining an image of the sample including emissions from the plurality of fluorophores; analyzing the image to identify a location within the sample having a group of fluorophores of the plurality of fluorophores associated with the location; identifying a gene associated with the location based on identifying the location within the sample having the group of fluorophores; and determining a copy number of the identified gene.

Description

HIGHLY MULTIPLEXED FLUORESCENCE IN SITU HYBRIDIZATION (FISH) PLATFORM FOR GENE COPY NUMBER EVALUATION
FEDERAL FUNDING NOTICE
[0001] This invention was made with government support under grant number 1R21CA183686-01A1 awarded by the National Institutes of Health. The government has certain rights in the invention.
FIELD OF THE DISCLOSURE
[0002] The present disclosure relates to methods and reagents for evaluating gene copy number, in particular using fluorescence in situ hybridization (FISH).
BACKGROUND INFORMATION
[0003] Quantification of gene copy number alterations (CNAs) is important in clinical management of cancer, and is standard for ERBB2 in breast cancer and esophagogastric tumors. DNA fluorescence in situ hybridization (FISH) is the gold standard method for detecting CNAs in cancer, but the number of genes that can be evaluated at once is limited by the number of available spectrally-distinct fluorophores and filters.
[0004] Other methods of detecting CNAs exist, but they are limited to certain types of samples, are costly, or lack sufficient sensitivity.
[0005] Accordingly, there is a need for improved reagents and methods for labeling and identifying gene copy number alterations.
SUMMARY OF THE INVENTION
[0006] The inventors have developed a clinical grade high-throughput DNA-FISH platform that enables automated single gene CNA analysis of a large panel of genes. This method can generally be used in slides containing cells whose DNA has been morphologically preserved, such as formalin-fixed paraffin embedded (FFPE) tumor biopsy samples and in isolated circulating tumor cells (CTCs). The platform may include two parts: (1) custom multiplex fluorescent DNA probes, and (2) custom software that identifies gene copy number from the captured fluorescent images. The inventors have designed bacterial artificial chromosome (BAC) clone and PCR-based or synthetic DNA probes with combinatorial fluorescent labels, such that at least (but not limited to) 35 genes can each be "barcoded" with a unique fluorophore combination using at least (but not limited to) 6 fluorophore colors with 2, 3, or more color co-labeling per gene.
[0007] Initial tests have been performed by labeling 15 genes with co-labeling of each gene with 2 or 3 colors. Slides containing fixed cells of FFPE samples are hybridized with the multiplex probes, an image of the slide is acquired, and the profile for each fluorophore is identified using multispectral analysis with linear unmixing or fluorescence filter sets. In various embodiments (e.g. for clinical analysis), non-spectral filters or single and multibandpass filters may be used. The inventors have also developed custom software that uses these results to quantify the copy number of each probed gene.
[0008] In certain embodiments the invention includes a clinical grade high-throughput DNA-FISH platform that enables the automated copy number analysis of a large panel of genes mainly in (but not limited to) formalin-fixed paraffin embedded (FFPE) tumor biopsy samples and in isolated circulating tumor cells (CTCs). This quantitative single-slide assay platform utilizes a library of 35 locus-specific DNA sequence probes for a combination of 6 fluorophores. Each gene is "bar-coded" with a unique combination of fluorophores (multiplexed). Multiplexed slide datasets are captured and the profile for each fluorophore is identified by multispectral analysis with linear unmixing or datasets from specific fluorescence filters and the custom built software quantifies the number of copies of each gene.
[0009] The quantification of changes in gene copy number is important for the understanding of tumor biology and for the clinical management of cancer patients. DNA fluorescence in situ hybridization is the gold standard method to detect copy number alterations, but it is limited by the number of genes one can quantify simultaneously. To increase the throughput of this informative technique, the inventors disclose herein a fluorescent "barcode" system for the unique labeling of dozens of genes and an automated image analysis algorithm that enabled their simultaneous hybridization for the quantification of gene copy numbers. The reliability of this multiplex approach is demonstrated on normal human lymphocytes, metaphase spreads of transformed cell lines, and cultured circulating tumor cells. This novel approach to in situ hybridization opens the door to the development of gene panels for more comprehensive analysis of copy number changes in tissue including the study of heterogeneity and of high throughput clinical assays that could provide rapid quantification of gene copy numbers in samples with limited cellularity such as circulating tumor cells.
[0010] In one embodiment the invention provides a method for multiplex labeling of a sample and gene copy number evaluation, including: providing a plurality of fluorescently-labeled polynucleotide probes, each of the plurality of fluorescently-labeled polynucleotide probes being directed to a different polynucleotide and being labeled with a distinct combination of fluorophores selected from a plurality of fluorophores; applying the plurality of fluorescently-labeled polynucleotide probes to a sample; obtaining an image of the sample including emissions from the plurality of fluorophores; analyzing the image to identify a location within the sample having a group of fluorophores of the plurality of fluorophores associated with the location; identifying a gene associated with the location based on identifying the location within the sample having the group of fluorophores; and determining a copy number of the identified gene.
[0011] In another embodiment the invention provides an apparatus for multiplex labeling of a sample and gene copy number evaluation, including: a processor in communication with an imaging system, the processor to: obtain an image of a sample from the imaging system, the image including emissions from a plurality of fluorophores associated with the sample, and the sample including a plurality of fluorescently-labeled polynucleotide probes applied to the sample, each of the plurality of fluorescently-labeled polynucleotide probes being directed to a different polynucleotide and being labeled with a distinct combination of fluorophores selected from the plurality of fluorophores; analyze the image to identify a location within the sample having a group of fluorophores of the plurality of fluorophores associated with the location; identify a gene associated with the location based on identifying the location within the sample having the group of fluorophores; and determine a copy number of the identified gene.
[0012] The foregoing and other aspects and advantages of the invention will appear from the following description. In the description, reference is made to the accompanying drawings which form a part hereof, and in which there is shown by way of illustration preferred embodiments of the invention. Such embodiments do not necessarily represent the full scope of the invention, however, and reference is made therefore to the claims herein for interpreting the scope of the invention.
BRIEF DESCRIPTION OF THE DRAWINGS
[0013] Further objects, features and advantages of the present disclosure will become apparent from the following detailed description taken in conjunction with the accompanying figures showing illustrative embodiments of the present disclosure, in which:
[0014] FIGS. 1A-1H show combinations of fluorophores used to barcode each gene. FIG. 1A shows multiplex FISH probe mixes that were constructed with the goal of "barcoding" each gene probe with a unique combination of two (left) or three (right) fluorophores.
[0015] FIG. 1B shows a five-plex probe hybridization assay (PDGFRA, MET, EGFR, MYC, and RET genes) that was tested with two-fluorophore barcode.
[0016] FIGS. 1C-1H show efficient fluorophore incorporation and a robust probe specific activity that were demonstrated by high signal-to-noise ratio in testing. Co-localization of fluorophores was evident, and reflected the expected pre-experimental labeling plan.
[0017] FIGS. 2A-2C show a probe labeling schema or 'Probe Matrix'. FIG. 2A shows a probe matrix for five genes. FIG. 2B shows a probe matrix for ten genes. FIG. 2C shows a probe matrix for fifteen genes. Each column shows the respective color combinations for each gene. Labeling of each gene was carried out by nick translation separately and the products were then combined in a single tube and concentrated to produce the probe mix. The schema is used by the image analysis algorithm to identify colocalized signals that are considered statistically significant and to assign the corresponding gene.
[0018] FIGS. 3A-3C show a workflow for quantifying gene count. FIG. 3A shows a volumetric rendering of a circulating tumor cell hybridized with 15 genes. FIG. 3B shows nuclear masking to eliminate false positive point source detection outside of the nucleus, and point source detection based on fitting a 3D Gaussian model. Maxima of each channel are shown. FIG. 3C shows coincident spots, located within a predefined radius, that are matched to genes based on a label matrix.
[0019] FIGS. 4A-4G show gene quantification of a 10 gene probe mix. FIG. 4A shows a confocal image of circulating tumor cell hybridized with 10 gene probe mix. FIG. 4B shows a representative copy number analysis of normal lymphocytes. FIG. 4C shows a representative copy number analysis of cultured circulating tumor cell BRX-7. FIG. 4D shows a representative copy number analysis of cultured circulating tumor cell BRX-42. FIG. 4E shows a representative copy number analysis of cultured circulating tumor cell BRX-61. FIG. 4F shows a representative copy number analysis of tumor cell line H1975. FIG. 4G shows a representative copy number analysis of tumor cell line GBM18. A 10-plex probe mix was used to fine-tune the software detection sensitivity. Presence of noise due to non-homogeneous nature of the nucleus resulted in false positive calls increasing the detected number. Black bars represent copy number values obtained by automated quantification (M-FISH), white bars represent copy number values obtained by manual standard-single-FISH evaluation (Single-FISH), the 'gold standard'. In FIG. 4F, H1975 cell line showed amplification of MYC that was captured by the automated quantification and supported by traditional analysis.
[0020] FIGS. 5A-5C show gene quantification of a 15 gene probe mix. FIG. 5A shows a confocal image of circulating tumor cell (BRX-68) hybridized with 15 gene probe mix. FIG. 5B shows a representative image of copy number analysis of cultured circulating tumor cells for BRX-68. FIG. 5C shows a representative image of copy number analysis of cultured circulating tumor cells for BRX-82. Cells were hybridized with a probe mix containing 15 genes bar-coded with two fluorophores. Presence of noise due to non-homogeneous nature of the nucleus resulted in false negative calls for NMYC and CDK4 in (FIG. 5B) BRX-68 and (FIG. 5C) BRX-82. Copy number values were obtained by automated quantification (M-FISH) and by manual standard-single-FISH evaluation (Single-FISH).
[0021] FIGS. 6A-6D show that parameters collected with the confocal microscope helped to build the image analysis platform and will allow image acquisition to be shifted to a widefield optical system in order to decrease turnaround time. FIG. 6A shows a side projection of serial optical sections that were acquired by confocal microscopy and visualized as three-dimensional renderings to aid in the development of the image analysis algorithm. Side projections were used to determine a suitable axial step size to adequately sample neighboring spots in widefield microscopy. FIG. 6B shows an upper view of the serial optical sections of FIG. 6A. The time required to acquire each image makes confocal microscopy impractical in a clinical laboratory setting. FIG. 6C shows the same cells as FIG. 6A captured with the Zeiss Cell Observer widefield fluorescence microscope. Images were captured with a 63x objective lens (NA = 1.40). The widefield microscope provided larger fields of view, better photon collection efficiency giving rise to brighter nuclei (inset), and faster image acquisition time. Substantial compromise of z axis resolution was not observed. FIG. 6D shows an upper view of the serial optical sections of FIG. 6C. The higher throughput of the widefield scope is more suited for a clinical setting.
[0022] FIGS. 7A-7B show color shift correction. FIG. 7A shows color shift correction in the lateral dimension. FIG. 7B shows color shift correction in the axial directions. Channels showed significant shift in the (FIG. 7A) lateral and (FIG. 7B) axial dimensions. Each fluorophore was registered to the reference channel, Aqua 431. Transformations were constrained to translations in x, y and z.
[0023] FIG. 8 shows highly amplified gene quantification. Volume rendering of a H1975 cell with spots and large blobs (indicating high amplification of the MYC gene) coinciding in the Green 496 and Red 650 channel. The detection algorithm is able to detect single copies of genes from spots and highly amplified genes in the form of blobs, each marked and visualized with spheres. Single spots are marked with a single sphere, while blobs have multiple overlaid spheres. The number of spheres for each blob corresponds to the ratio of the blob volume to an average spot as an estimation of the extent to which that gene is amplified.
[0024] FIG. 9A shows multicolor fluorescence in situ hybridization (multiplex-FISH or M-FISH) automated quantification and reference single FISH analysis of 10 genes in the H1975 cell line. Dots indicate the copy number derived from aCGH, as an average of probes spanning the relevant genes.
[0025] FIG. 9B shows genome copy number summary for aCGH analysis of H1975. The horizontal axis represents the linear position along the chromosome, whereas the vertical axis represents the measured log2 signal ratio (-2 to 2). Relevant gene names are placed in the appropriate genomic positions.
[0026] FIG. 9C shows a zoomed-in view of chromosome 8, depicting high-level copy number gain for MYC.
[0027] FIG. 9D shows an M-FISH automated quantification and reference single FISH analysis of 10 genes in the 293T cell line, with dots indicating the copy number derived from aCGH.
[0028] FIG. 9E shows a genome copy number summary for aCGH analysis of 293T.
[0029] FIG. 10A shows multiplex FISH quantification and comparison to single FISH and aCGH (dots) for BUML.
[0030] FIG. 10B shows aCGH genomic profiles corresponding to FIG. 10A, shown as log2 ratios, for BUML.
[0031] FIG. 10C shows multiplex FISH quantification and comparison to single FISH and aCGH (dots) for UACC62.
[0032] FIG. 10D shows aCGH genomic profiles corresponding to FIG. 10C, shown as log2 ratios, for UACC62.
[0033] FIG. 10E shows multiplex FISH quantification and comparison to single FISH and aCGH (dots) for MCF10A.
[0034] FIG. 10F shows aCGH genomic profiles corresponding to FIG. 10E, shown as log2 ratios, for MCF10A.
[0035] FIG. 11A shows multiplex FISH quantification for 10 genes in GBM18 and comparison to single FISH and aCGH (dots).
[0036] FIG. 11B shows aCGH genomic profiles corresponding to FIG. 11A shown as log2 ratios, for GBM18.
[0037] FIG. 11C shows multiplex FISH quantification for 15 genes in PC3 cell lines and comparison to single FISH and aCGH (dots).
[0038] FIG. 11D shows aCGH genomic profiles corresponding to FIG. 11C shown as log2 ratios, for PC3.
[0039] FIG. 12 shows reproducibility analysis of multiplex fluorescence in situ hybridization (FISH). Box-and-whisker plots are shown for mean and interquartile range for copy number counts for each replicate experiment.
[0040] FIG. 13 is a scatterplot of average gene automated copy number counts in two independent hybridizations using the 10-plex probe mix for H1975 (circles) and 293T (triangles) cell lines.
[0041] FIG. 14A shows tested sequence specific Tail-PCR probe in cultured circulating tumor cells (CTCs).
[0042] FIG. 14B shows tested sequence specific formalin-fixed paraffin-embedded glioblastoma multiforme (GBM) cancer samples.
[0043] FIG. 14C shows a table of 10 genes contained in the probe of FIG. 10A.
[0044] FIG. 15A shows an FFPE prepared with robotic slide processor for breast cancer samples.
[0045] FIG. 15B shows an FFPE prepared with robotic slide processor for GBM cancer samples.
[0046] FIG. 16A shows a formalin fixed paraffin embedded (FFPE) specimen imaged with a Vectra multispectral imaging system.
[0047] FIG. 16B shows an FFPE specimen imaged with a Pannoramic confocal whole slide image scanner.
[0048] FIG. 16C shows an FFPE specimen imaged with a CytoVision® platform.
[0049] FIG. 17A shows formalin fixed paraffin embedded (FFPE) of glioblastoma multiforme (GBM) case hybridized with a multiplex probe recognizing 10 genes and submitted to signal quantification.
[0050] FIG. 17B is a graph quantifying the signals sensed with the probe of FIG. 17A.
[0051] FIG. 18A shows FFPE of a GBM case.
[0052] FIG. 18B shows FFPE of another GBM case.
[0053] FIG. 19A shows efficiency of nick translation labeling performed at time points of 8 and 16 hours.
[0054] FIGS. 19B-19C show efficiency of nick translation labeling performed at time points of 2.5, 3, and 4 hours.
[0055] FIG. 20A shows a 3-hour nick translation 2% agarose gel showing unlabeled DNA bands on the left and DNA labeled with combinations of two fluorophores and size range of 100 to 400 bp.
[0056] FIG. 20B shows a gel showing DNA on the left and DNA labeled with two and three fluorophores.
[0057] FIG. 21 shows the same data as in FIG. 12, but with a fixed y-axis range of 0 to 40 copies, to optimally visualize the difference between amplified (MYC) and nonamplified genes.
[0058] FIG. 22A shows accuracy displayed as a deviation plot of mean copy number for the lymphocyte data set.
[0059] FIG. 22B shows detailed MDM4 data showing copy number counts for each individual replicate, with each dot representing a single-cell count and with mean and SD indicated.
[0060] FIG. 23A shows a first portion of a table showing reference gene copy number values obtained by single FISH (mean ± SD).
[0061] FIG. 23B shows a second portion of the table showing reference gene copy number values obtained by single FISH (mean ± SD).
[0062] FIG. 23C shows a third portion of the table showing reference gene copy number values obtained by single FISH (mean ± SD).
[0063] FIG. 23D shows a fourth portion of the table showing reference gene copy number values obtained by single FISH (mean ± SD).
[0064] FIG. 24 shows an example of a system for multiplex labeling of a sample and gene copy number evaluation in accordance with some embodiments of the disclosed subject matter.
[0065] FIG. 25 shows an example of hardware that can be used to implement a computing device and server in accordance with some embodiments of the disclosed subject matter.
DETAILED DESCRIPTION OF EXEMPLARY EMBODIMENTS
[0066] Fluorescence in situ hybridization (FISH) is the gold standard technique for the detection of gene copy number changes including amplifications and deletions critical in the diagnosis and management of cancer. The utility of FISH has been limited by the number of genes that can be evaluated at a time, and its use has diminished in the era of genome-wide approaches such as array comparative genomic hybridization and next generation sequencing. Nonetheless, it is still extremely powerful as it allows absolute copy number quantification at the single cell level in the context of tumor section architecture. To enhance the utility of FISH in the genomic era, the inventors have sought to increase its multiplicity, so one could more rapidly determine copy number alterations among genes believed to play a role in oncogenesis.
[0067] The simultaneous use of different color combinations in FISH has been employed with chromosome paint probes used in spectral karyotyping (SKY) or multicolor FISH (M-FISH) for the analysis of numerical and structural abnormalities of whole human chromosomes. Such techniques are only applicable to metaphase preparations, and so have a limited role for the analysis of interphase nuclei and fixed tissues. The feasibility of similar combinatorial labeling approaches for gene locus-specific regions has been previously explored, but recent advances in fluorescence digital imaging, data processing technologies, and more stable fluorophores have improved the ability to simultaneously examine multiple genes. Single cell next generation sequencing is another powerful approach, but the accuracy of copy number calling is still in its early stages, and the cost is prohibitive for routine use in the clinic.
[0068] Highly-multiplexed gene FISH probes could have important research and clinical applications. The inventors believe the study of genetic heterogeneity in cancerous cells could be greatly enhanced by the availability of a reliable multiplex FISH assay. Earlier work in human glioblastoma multiforme (GBM) showed a remarkable degree of copy number heterogeneity: three-color FISH revealed tumors with intermingled tumor cell populations containing high-level amplifications of EGFR, MET, or PDGFRA in different cells. It is possible that additional examples of such copy number "mosaicism" in GBM will be discovered if more than three genes can be interrogated in each hybridization procedure. Importantly, mosaicism was only detected by FISH, as array comparative genomic hybridization (aCGH) and next generation sequencing (NGS) did not have the sensitivity to detect minor amplified subpopulations. Clinically, it is suspected that this heterogeneity may in part explain the numerous failures of clinical trials of targeted therapies against kinases in GBM. A multiplex FISH assay might also be expected to have direct clinical applicability in simultaneously assaying many actionable copy number changes in tumor samples. The approach may be especially powerful when samples have a limited number of cells, such as isolated circulating tumor cells. In these settings multiplex FISH would be faster and less expensive than other options such as NGS.
[0069] In certain embodiments, the development of a multiplex, locus-specific FISH method employing combinatorial labeling of 15 genes with 2 or 3 fluorophores each is reported herein, although in other embodiments a larger number of genes and fluorophores may be used. Hybridized cells were imaged with a laser-scanning confocal microscope, and the spectral signature of each fluorophore was identified by linear unmixing. The inventors developed a custom automated image analysis pipeline, allowing rapid and accurate gene quantification. The approach was validated by measuring the gene copy number of normal lymphocytes, cultured tumor cell lines, and circulating tumor cells, and comparing these results to known copy numbers obtained via traditional methods. A multiplex FISH (M-FISH) assay can be expected to have direct clinical applicability in simultaneously assaying many actionable copy number changes in tumor samples. The approach may be especially powerful when samples have a limited number of cells, such as isolated circulating tumor cells (CTCs). In these settings, multiplex FISH would be faster and less expensive than other options, such as NGS.
[0070] In various embodiments, bacterial artificial chromosome (BAC) clone and PCR-based DNA probes may be provided which have combinatorial fluorescent labels, such that at least (but not limited to) 35 genes can each be "barcoded" with a unique fluorophore combination using at least (but not limited to) 6 fluorophore colors with 2, 3, or more colors co-labeling per gene.
[0071] In certain embodiments, a particular polynucleotide probe may be labeled with several (2, 3, 4, or more) different colors/fluorophores. This may be achieved by attaching multiple different fluorophores to the same polynucleotide chain or by labeling separate subgroups of the probe each with a different single fluorophore, mixing the subgroups together (e.g. in generally equal amounts), and applying the mixture to the sample for hybridization and labeling and subsequent imaging and analysis. Possible fluorophores (generally coupled to dUTP) that may be used include aqua 431 (7-Diethylaminocoumarin-3-carboxylic acid, DEAC), green 496 (5-Fluorescein), green 500 (5-Carboxyrhodamine 110), Alexa Fluor 488, cyanine 3, gold 525 (5(6)-Carboxyrhodamine 6G), gold 550 (Cyanine-3E), orange 552 (5-TAMRA), Alexa Fluor 568, red 580 (5-ROX), red 594, Alexa Fluor 594, red 598, cyanine 5, red 650 (Cyanine-5E), Alexa Fluor 647, and far red 673, although other fluorophores may also be used. In various embodiments the fluorophores that are employed for a particular labeling set are selected so that no two fluorophores associated with a given sample have overlapping emission spectra.
[0072] A sample may include single or multiple cells, e.g. adhered to glass such as a slide, or may include tissue sections. The sample may be fixed and fluorescently-labeled polynucleotide probes may be applied to the sample under conditions which promote hybridization of the probes with one or more genes in the sample, where the genes are generally located in nuclei of the sample.
[0073] The labeled sample may then be imaged, for example using widefield microscopy, whole slide image scanner, confocal microscopy (e.g. laser scanning, spinning disk, programmable array microscopy (PAM), or other suitable types), or light sheet fluorescence microscopy. In some embodiments, imaging may be performed by collecting several separate images each using a different combination of excitation and emission wavelengths (e.g. using filter sets tuned for particular fluorophores). In other embodiments, imaging may be performed by obtaining spectral scans of the sample, e.g. for all or part of a wavelength range such as 400-700 nm, 450-650 nm, etc., and subsequently separating out the signals from each individual fluorophore, for example computationally using a technique such as linear unmixing, which may be determined relative to one or more reference spectra for each respective fluorophore.
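As an illustration of linear unmixing against reference spectra, the following is a minimal sketch only. It assumes that the spectrum measured at one pixel is a weighted sum of the reference spectra and recovers the weights by ordinary least squares. The function name `unmix`, the four-band spectra, and the example values are hypothetical and are not taken from the disclosure; a practical implementation would typically also constrain the weights to be non-negative.

```python
def unmix(measured, references):
    """Least-squares weights a such that sum_k a[k] * references[k] ~ measured.

    measured:   list of intensities, one per spectral band (hypothetical data)
    references: list of reference spectra, one list per fluorophore
    """
    n = len(references)
    # Normal equations (R R^T) a = R m, with the reference spectra stored as rows of R.
    gram = [[sum(x * y for x, y in zip(references[i], references[j]))
             for j in range(n)] for i in range(n)]
    rhs = [sum(x * y for x, y in zip(references[i], measured)) for i in range(n)]

    # Gaussian elimination with partial pivoting on the augmented matrix.
    aug = [row + [b] for row, b in zip(gram, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("reference spectra are linearly dependent")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]

    # Back substitution.
    weights = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * weights[j] for j in range(i + 1, n))
        weights[i] = (aug[i][n] - tail) / aug[i][i]
    return weights


# Made-up four-band reference spectra for two fluorophores (illustrative values only).
ref_a = [1.0, 0.5, 0.1, 0.0]
ref_b = [0.0, 0.2, 1.0, 0.4]
# A pixel containing 2 parts of fluorophore A and 3 parts of fluorophore B.
pixel = [2 * a + 3 * b for a, b in zip(ref_a, ref_b)]
weights = unmix(pixel, [ref_a, ref_b])
```

Applied per pixel, the recovered weights form one image per fluorophore, which can then be passed to spot detection.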
[0074] Imaging of the sample generates one or more images which include fluorescence emissions from the fluorophores associated with the probes that were applied to the sample. In general the fluorescence emission is in the form of localized spots (e.g. see FIG. 1A). The image is analyzed to identify the localized spots and to determine which of the spots are colocalized in order to determine which genes have been identified and the copy number of the genes; in certain embodiments the gene copy number may be determined based on the relative brightness of the spots, as discussed further below.
[0075] In one example, if a particular group of spots that have been determined to be sufficiently close together to be colocalized (e.g. are within 0.24 µm or less of one another) includes an orange spot and a red spot, then this location is determined to be associated with the RET gene (see FIG. 2A), since the fluorescently-labeled polynucleotide probe (or mix of probes) associated with the RET gene has been labeled with fluorophores that emit orange and red light. In some embodiments, using a set of six different fluorophores permits the establishment of 5-plex, 10-plex, or 15-plex labeling schemes to uniquely identify 5, 10, or 15 genes within the same sample (see FIGS. 2A-2C) by associating two labels with each gene. In other embodiments, associating 3 labels with each gene may permit up to 20 genes to be uniquely identified.
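The colocalization and gene assignment described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the inventors' software: spots from different channels are linked when they lie within the colocalization radius, each linked group is reduced to its set of channels, and that set is looked up in a label matrix. Only the RET entry (orange plus red) comes from the text; the `GENE_X` entry, the function name `count_genes`, and the example coordinates are hypothetical.

```python
from collections import Counter
from math import dist

RADIUS_UM = 0.24  # colocalization radius quoted in the text, in micrometres

# Label matrix: set of channels -> gene. Only RET is taken from the text;
# GENE_X is a hypothetical placeholder entry.
LABEL_MATRIX = {
    frozenset({"orange", "red"}): "RET",
    frozenset({"aqua", "green"}): "GENE_X",
}


def count_genes(spots, radius=RADIUS_UM, label_matrix=LABEL_MATRIX):
    """Count gene copies from detected spots.

    spots: list of (channel, (x, y, z)) with coordinates in micrometres.
    Returns a Counter mapping gene name -> number of colocalized groups.
    """
    parent = list(range(len(spots)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Link spots from different channels that lie within the radius.
    for i in range(len(spots)):
        for j in range(i + 1, len(spots)):
            (ch_i, p_i), (ch_j, p_j) = spots[i], spots[j]
            if ch_i != ch_j and dist(p_i, p_j) <= radius:
                parent[find(i)] = find(j)

    # Collect the set of channels present in each linked group.
    groups = {}
    for i, (channel, _) in enumerate(spots):
        groups.setdefault(find(i), set()).add(channel)

    # Groups whose channel set is not a known barcode are ignored.
    counts = Counter()
    for channels in groups.values():
        gene = label_matrix.get(frozenset(channels))
        if gene is not None:
            counts[gene] += 1
    return counts


example_spots = [
    ("orange", (1.00, 1.00, 0.50)),
    ("red", (1.10, 1.05, 0.50)),     # about 0.11 um from the first orange spot
    ("orange", (5.00, 5.00, 1.00)),
    ("red", (5.05, 5.00, 1.10)),     # about 0.11 um from the second orange spot
    ("red", (9.00, 9.00, 0.50)),     # isolated spot, matches no barcode
]
copies = count_genes(example_spots)
```

In this example the two orange/red pairs are counted as two copies of RET, and the isolated red spot is discarded because a single channel does not match any barcode.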
[0076] The following is a description of methods, apparatus, and/or systems for multiplex labeling of samples and gene copy number evaluation according to embodiments of the invention.
[0077] Cells and Cell Preparation
[0078] The cells used for various experiments disclosed herein were grown according to standard protocols specified for each cell line. Established tumor cell lines were obtained from ATCC (Manassas, VA) (H1975, MCF10A, PC3, H460, UACC62, HCC1954, 293T, LM2, Sk-Mel, BL209, BUML, and LNCaP) and by collaboration with Dr. Hiroaki Wakimoto (Department of Neurosurgery, Massachusetts General Hospital, Boston, MA) (GBM18 and GBM29). Patient-derived CTC lines (BRX-07, BRX-42, BRX-50, BRX-61, BRX-68, BRX-82, and BRX-142) were previously described. Normal lymphocytes were obtained from blood draws from five healthy donors (three males and two females). These cells are listed in Table 1 and described further below.
Table 1
[Table 1 is provided as an image in the original publication.]
[0079] Multiplex FISH Analysis
[0080] In various embodiments, the multiplex-FISH assay may include three principal steps: 1) probe construction and hybridization, 2) image acquisition and 3) automated image analysis.
[0081] Probe construction and hybridization
[0082] a) Bacterial artificial chromosome (BAC) probes
[0083] FISH probes were derived from BAC clones purchased from Children’s Hospital Oakland Research Institute (CHORI, Oakland, CA; http://bacpac.chori.org/) and listed in Table 3. E. coli transformed with individual BAC clones were cultured using Luria-Bertani (LB) media (SIGMA, St. Louis, MO) containing 12.5 µg/ml chloramphenicol (Teknova, Hollister, CA). Overnight cultures were extracted using the Qiagen Midiprep Kit (Qiagen, Valencia, CA) following the manufacturer’s protocol (Suppl. Materials and Methods). Extracted BAC DNA was then amplified by multiple displacement amplification with the Qiagen Repli-G midi kit (Qiagen), following the manufacturer’s protocol before proceeding to labeling by nick translation.
[0084] DNAs were labeled using a nick translation kit (Abbott Molecular Inc., Des Plaines, IL) with fluorophore-conjugated dUTPs (Enzo Life Sciences Inc., Farmingdale, NY), including: aqua 431-dUTP (Excitation 431/Emission 480), green 496-dUTP (Ex. 496/Em. 520), gold 525-dUTP (Ex. 525/Em. 551), orange 552-dUTP (Ex. 552/Em. 576), red 580-dUTP (Ex. 580/Em. 603), and red 650 (Cy5)-dUTP (Ex. 650/Em. 662). For combinatorial labeling, two- or three-fluorophore combinations were chosen to minimize spectral overlap (FIG. 1A). Optimal fluorophore-conjugated dUTP mixture concentrations were empirically determined following probe visualization under the microscope (see below). The number of possible combinations C is a function of the number of fluorophores n and the number of fluorophores k used to encode each gene: C(n, k) = n!/(k!(n-k)!). Therefore, with six fluorophores and two-color encoding, one could theoretically generate 6!/(2!(6-2)!), or 15, unique combinations; for three-color encoding, the number is 6!/(3!(6-3)!), or 20, unique combinations. Ethanol-precipitated nick translation reactions were resuspended in hybridization buffer (see below).
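The combination counts above can be checked by direct enumeration. The following is a minimal sketch in Python; the short fluorophore labels are illustrative names for the six dUTP conjugates listed in the text, and the helper `barcodes` is hypothetical rather than part of the described method.

```python
from itertools import combinations
from math import comb

# Illustrative labels for the six fluorophore-conjugated dUTPs named in the text.
FLUOROPHORES = ["aqua431", "green496", "gold525", "orange552", "red580", "red650"]


def barcodes(fluorophores, k):
    """Return every unordered k-fluorophore combination as a frozenset."""
    return [frozenset(combo) for combo in combinations(fluorophores, k)]


two_color = barcodes(FLUOROPHORES, 2)
three_color = barcodes(FLUOROPHORES, 3)

# The enumerated counts agree with C(n, k) = n!/(k!(n-k)!).
print(len(two_color), comb(len(FLUOROPHORES), 2))    # 15 15
print(len(three_color), comb(len(FLUOROPHORES), 3))  # 20 20
```

Each frozenset can serve as a barcode key, since the order in which the two or three colors are detected does not matter.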
[0085] As proof of principle, a probe mix was initially constructed which included 5 genes (FIG. 1B), and the efficiency of "barcoding" (fluorophore incorporation) was assessed by imaging gene spot intensity and the colocalization of fluorophores (FIGS. 1C-1H). FIGS. 1C-1H are each shown with a bar 104 representing a scale of 10 µm.
[0086] b) Hybridization
[0087] Fixed cells were dropped onto microscope glass slides, air dried, and treated with Digest-all (Thermo Fisher, Waltham, MA) at 37°C for 3 min, washed with 2x SSC and dehydrated in ethanol. Cells were co-denatured with FISH probe mixes using a Hybrite slide processor (Abbott Molecular, Des Plaines, IL) at 75°C for 5 minutes, followed by incubation for 36 hours at 40°C. For formalin-fixed, paraffin-embedded (FFPE) samples, denaturation was performed at 85°C for 5 minutes. After hybridization, slides were washed two times in 2x SSC/0.1% Nonidet-40 for 3 min each at 72°C. After washing, slides were mounted in a glycerol-based solution with n-propyl-gallate without DAPI (Suppl. Materials and Methods).
[0088] Image Acquisition
[0089] Images were acquired with a Zeiss LSM 710 (Zeiss, Germany) laser confocal microscope due to its ability to produce spectrally-resolved optical sections (stacks) that are amenable to volume-rendered 3D reconstructions of the specimens with reasonable accuracy.
The microscope was configured with four lasers (Diode 440 nm, Argon 488 nm, DPSS (Diode-Pumped Solid-State) 561 nm, and HeNe 633 nm), a high-efficiency triple band pass beam splitter MBS 488/561/633, and a single-line MBS 445. Two objective lenses were used in this study: a Plan-Apochromat 100x/1.46 NA and a Plan-Apochromat 63x/1.40 NA. Chromatic aberrations were measured and corrected for using tetra-spec fluorescent beads (Life Technologies) and a Y-chromosome FISH slide labeled with all six fluorophores. Images were collected with the 100x objective lens at a bit depth of 16-bit and a frame size of 512 x 512 pixels, with an image pixel size of 0.06 µm. Multiple z stacks were collected at an interval of 0.10 µm. Analysis of the punctae sizes in the images captured with the 100x objective lens allowed us to broaden pixel size and intervals and adopt the 63x objective lens. Images acquired with the 63x objective lens had a bit depth of 16-bit and a frame size of 256 x 256 pixels, with an image pixel size of 0.12 µm and z intervals of 0.20 µm.
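The effect of moving from the 100x to the 63x sampling settings can be illustrated with simple arithmetic on the values quoted above. This is a sketch only; the `Acquisition` class and its method names are hypothetical helpers, not part of the described system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Acquisition:
    """Sampling settings quoted in the text for one objective lens."""
    frame_px: int     # frame is frame_px x frame_px pixels
    pixel_um: float   # lateral pixel size in micrometers
    z_step_um: float  # spacing between optical sections in micrometers

    def field_of_view_um(self) -> float:
        """Lateral extent of one frame in micrometers."""
        return self.frame_px * self.pixel_um

    def voxels_per_um_depth(self) -> float:
        """Voxels acquired per micrometer of specimen depth."""
        return self.frame_px ** 2 / self.z_step_um


OBJ_100X = Acquisition(frame_px=512, pixel_um=0.06, z_step_um=0.10)
OBJ_63X = Acquisition(frame_px=256, pixel_um=0.12, z_step_um=0.20)

voxel_ratio = OBJ_100X.voxels_per_um_depth() / OBJ_63X.voxels_per_um_depth()
print(OBJ_100X.field_of_view_um(), OBJ_63X.field_of_view_um())
print(voxel_ratio)
```

Under these settings both configurations cover a field of about 30.72 µm per side, while the 63x configuration samples roughly eight times fewer voxels per unit depth.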
[0090] Fluorescent probe identities were detected using lambda mode and spectral unmixing with a 34-channel photomultiplier tube for high-resolution spectral image acquisition. The reference spectra for each fluorophore were determined by obtaining their profiles from slides hybridized with each gene labeled as a single color. The resulting spectra were stored in the Spectra Database of the microscope. Linear unmixing separated mixed signals pixel by pixel, using the algorithm in Zen Black 2011, the Zeiss proprietary software. This algorithm encompasses the entire emission spectrum of each of the fluorescent markers in the sample. All image data were further used to build the image analysis algorithm.
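The principle of per-pixel linear unmixing can be sketched as an ordinary least-squares fit of the measured spectrum to the reference spectra. The code below is an assumption-laden illustration in pure Python; it is not the proprietary Zen Black algorithm, the four-channel reference spectra are toy values, and `solve_linear` and `unmix_pixel` are hypothetical helper names.

```python
def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("reference spectra are linearly dependent")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def unmix_pixel(reference_spectra, measured):
    """Least-squares fluorophore abundances for one pixel via the normal equations."""
    k = len(reference_spectra)
    gram = [[sum(p * q for p, q in zip(reference_spectra[i], reference_spectra[j]))
             for j in range(k)] for i in range(k)]
    rhs = [sum(p * q for p, q in zip(reference_spectra[i], measured)) for i in range(k)]
    return solve_linear(gram, rhs)


# Toy reference spectra over four detection channels (illustrative values only).
ORANGE = [0.1, 0.6, 0.3, 0.0]
RED = [0.0, 0.2, 0.5, 0.3]

# A pixel containing two units of "orange" signal and one unit of "red" signal.
measured = [2 * o + 1 * r for o, r in zip(ORANGE, RED)]
abundances = unmix_pixel([ORANGE, RED], measured)
print([round(a, 6) for a in abundances])
```

A production implementation would typically also constrain abundances to be non-negative and account for detector noise, which this sketch omits.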
[0091] Image Analysis
[0092] The algorithm for detection and localization of fluorescent point sources included the following sequential operations: 1) image registration; 2) nuclear segmentation and delimiting the region of interest; 3) identification and precise localization of locations with sufficient probability of containing a point-source signal; and 4) identification of colocalized spots and comparison with the gene panel matrix.
[0093] In brief, multiplex image datasets captured in multiple z stacks were submitted to the custom-built combinatorial probe detection algorithm. Each imaging session also involved acquisition of calibration images to correct for chromatic aberration. These were composed of a Y-chromosome labelled with all 6 colors. Transformations inferred from these Y-chromosome images were used to first register experimental FISH images. Nuclear segmentation was performed via low-pass filtering and Otsu thresholding followed by a watershed transform to separate contacting cells. A point source detection algorithm was applied to detect the spots in each channel. Spots that were within a specified distance of one another were considered coincident, and these label combinations were then compared to a label matrix to identify each gene. Results were displayed as median copy number per cell, and positive/negative calls for amplification and deletion were based on standard criteria.
[0094] Validation
[0095] The multiplex FISH profiles were validated in two ways. First, in order to confirm the multiplex FISH results, the baseline copy number of each gene was verified in each cell line by traditional single-FISH manual quantification, which is shown in Table 2 included in FIGS. 23A-D. Table 2 shows reference gene copy number values obtained by single FISH (mean ± SD) and is spread across four pages due to the size of the table. Second, in addition to the locus-specific probe, the corresponding centromere enumeration probe was used as an internal reference control. Normal cutoff values were established by scoring 200 interphase nuclei for each tumor cell line, 100 interphase nuclei for each cultured circulating tumor cell line, and 200 interphase nuclei of normal peripheral blood. The mean copy number value and ranges were obtained and used as reference to establish the scoring criteria and acceptable values for the multiplex assay.
[0096] Probe Development
[0097] In certain embodiments, options for multiplexing locus-specific DNA FISH probes are limited by the number of available spectrally-distinct fluorophores and matched filter sets. Practically, that means 4 probes are the maximum in common clinical FISH applications. The inventors sought to use combinatorial labeling approaches, previously used in SKY, MFISH and in some locus-specific applications, to increase the number of probes using standard fluorophores. The probe sets were tested and validated using a scanning laser confocal microscope combined with a custom analysis pipeline, as detailed below.
[0098] Before developing a multiplex assay, single BAC probes labeled with two or three fluorophores hybridized to control cells were sampled (FIG. 1A). FIG. 1A is shown with a bar 100 and a bar 102 each representing a scale of 2 µm. This step allowed us to set initial parameters for image capture, such as laser power, and to correct for chromatic aberrations in the z axis. A prototype two-color multiplex labeling schema using mixtures of six fluorophores was designed for five important cancer genes, first utilizing BACs overlapping the gene coding regions (PDGFRA, MET, EGFR, MYC, and RET genes) (FIG. 1B). While the BAC labeling reactions were performed with standard nick translation protocols, significant troubleshooting was required to determine which fluorophores could be combined together and in what ratios. Given different emission and excitation spectra, different quantum yields, different incorporation efficiencies, and unpredictable quenching effects, empiric testing of combinations was required. With optimized fluorophore labeling combinations, signal intensity and signal-to-noise of the pool of 5 probes were evaluated in a cell line using a Zeiss LSM 710 laser scanning confocal microscope (FIGS. 1C-1H). Images reveal discrete nuclear spots with minimal background noise, reflecting efficient fluorophore incorporation and a robust probe specific activity. Individual channel images were evaluated to ensure co-localization of fluorophores according to the expected labeling schema. Based on this preliminary work, a standard recipe for optimally combining fluorophores for two-color or three-color encoding was established. Since two-color "barcoding" produces larger and brighter spots (FIG. 1A), subsequent microscopy settings and image analysis algorithm development utilized this approach. In total, four working probe pools of different plex levels were designed and tested (FIGS. 2A-2C). The labeling schemas are detailed for the 5-plex mix (FIG. 2A), a larger 10-plex mix (FIG. 2B), and two 15-plex mixes (FIG. 2C).
[0099] Imaging
[00100] Two major imaging challenges were: (1) what imaging parameters allowed for the most sensitive detection and localization of spots, and (2) what parameters allowed closely spaced, co-localized spots to be optimally resolved. Images were acquired with a Zeiss multicolor confocal microscope utilizing simultaneous excitation with four lasers, using a 100x objective, and capturing approximately 60 z-planes. Confocal microscopy was used to reduce out-of-focus blur and, therefore, obtain optimal resolution in the z-axis; multispectral imaging combined with linear unmixing was used in order to obtain optimal identification of fluorophores; and serial optical z-sections, required for separation of axially-adjacent punctae, were created and used to build the image analysis algorithm. Standard clinical FISH imaging systems generally use maximum intensity projections, which would result in a high false positive rate since spots from different z-planes would erroneously appear to overlap. Reference spectra were obtained for each of the six fluorophores individually from normal lymphocytes hybridized with BAC probes labeled with only the fluorophore of interest. The reference spectra libraries were used to evaluate the contributions of mixed fluorophore signals in the pooled probe hybridization.
[00101] Analysis
[00102] Automated algorithms were developed to detect and localize barcoded probe signals and to quantify their copy number per nucleus (FIG. 3). FIGS. 3A-C are each shown with a bar 300 indicating a scale size of 10 µm. This algorithm was applied to 3D volumes including serial optical sections of fluorescent signals (FIG. 3A). Analysis steps included: 1) nuclear segmentation for delimitation of the region of interest; 2) defining spots with a high probability of containing a point-source signal (including the estimation of the spot signal intensity and spot 3D sub-pixel localization) (FIG. 3B); and 3) matching of colocalized fluorescent spots with the gene panel matrix to identify and quantify each gene (FIG. 3C). This approach relies on statistical comparison of the intensity of local fluorescence maxima with a model of the microscope point-spread function (PSF) to rigorously detect low signal-to-noise ratio (SNR) spots in a manner which is adaptive to local variations in the signal and background, without requiring specification of arbitrary thresholds. With modifications to handle dense punctae from highly-amplified genes, this approach allowed us to accurately measure FISH fluorescence signals, taking into account the uncertainties of the fitted amplitude and local background when individually testing for the significance of each candidate signal.
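The nuclear segmentation step was described earlier as low-pass filtering followed by Otsu thresholding and a watershed transform. As an illustration of the thresholding part only, the following pure-Python sketch computes an Otsu threshold from a flat list of pixel intensities; the function name, the bin count, and the sample intensities are assumptions for illustration, and filtering and watershed separation are not shown.

```python
def otsu_threshold(values, bins=256):
    """Return the intensity threshold that maximizes between-class variance."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return lo  # flat image: no meaningful split
    width = (hi - lo) / bins
    hist = [0] * bins
    for v in values:
        hist[min(int((v - lo) / width), bins - 1)] += 1

    total = len(values)
    sum_all = sum(i * h for i, h in enumerate(hist))
    weight_bg, sum_bg = 0, 0.0
    best_var, best_bin = -1.0, 0
    for i, h in enumerate(hist):
        weight_bg += h
        if weight_bg == 0:
            continue  # no background pixels yet
        weight_fg = total - weight_bg
        if weight_fg == 0:
            break  # all pixels assigned to background
        sum_bg += i * h
        mean_bg = sum_bg / weight_bg
        mean_fg = (sum_all - sum_bg) / weight_fg
        var_between = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_bin = var_between, i
    # Pixels strictly above the returned value are treated as foreground.
    return lo + (best_bin + 1) * width


# Illustrative intensities: dim background pixels and bright nuclear pixels.
background = [10, 12, 11, 9, 10, 13]
nucleus = [200, 198, 205, 202]
threshold = otsu_threshold(background + nucleus)
mask = [v > threshold for v in background + nucleus]
print(threshold, mask)
```

In practice this would be applied to the low-pass-filtered nuclear image, and touching nuclei in the resulting mask would still need to be separated.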
[00103] The automated gene quantification algorithm established that spots in different channels colocalized if the centers of the spots were within 0.24 µm, a value which was empirically determined to provide accurate results in cells with known copy numbers. In various embodiments two or more spots may be considered to be colocalized if they are within 0.1 µm, 0.2 µm, 0.3 µm, 0.4 µm, 0.5 µm or other suitable distances. Since each gene should be represented by a number of spots that match the established labeling schema, the gene copy number count is determined by the number of colocalized spot sets. The definition and elimination of non-noise disruptive features, and accurate identification of probe barcodes in non-uniform background spectra, are the major challenges that the inventors are addressing to improve automated counting.
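The colocalization and barcode-matching logic can be sketched as follows for two-color encoding. This is a simplified illustration, not the inventors' implementation: the greedy pairing strategy, the function name `count_genes`, and the `GENE_X` entry in the label matrix are assumptions. Only the orange plus red barcode for RET and the 0.24 µm radius come from the text.

```python
from itertools import combinations
from math import dist

COLOC_RADIUS_UM = 0.24  # colocalization distance quoted in the text

# Hypothetical label matrix; only RET = orange + red is stated in the text.
LABEL_MATRIX = {
    frozenset({"orange", "red"}): "RET",
    frozenset({"aqua", "green"}): "GENE_X",  # placeholder entry for illustration
}


def count_genes(spots, label_matrix, radius=COLOC_RADIUS_UM):
    """Count gene copies from detected spots.

    spots: list of (channel, (x, y, z)) with coordinates in micrometers.
    Pairs of spots from different channels within `radius` are looked up in
    the label matrix; each spot is used at most once (greedy pairing).
    """
    counts = {}
    used = set()
    indexed = list(enumerate(spots))
    for (i, (ch_a, pos_a)), (j, (ch_b, pos_b)) in combinations(indexed, 2):
        if i in used or j in used or ch_a == ch_b:
            continue
        if dist(pos_a, pos_b) <= radius:
            gene = label_matrix.get(frozenset({ch_a, ch_b}))
            if gene is not None:  # combinations not in the matrix are ignored
                counts[gene] = counts.get(gene, 0) + 1
                used.update((i, j))
    return counts


spots = [
    ("orange", (1.00, 1.00, 0.40)), ("red", (1.10, 1.05, 0.40)),  # RET copy 1
    ("orange", (5.00, 2.00, 1.00)), ("red", (5.05, 2.00, 1.20)),  # RET copy 2
    ("red", (9.00, 9.00, 3.00)),                                  # isolated spot
    ("aqua", (3.00, 3.00, 0.60)), ("gold", (3.05, 3.00, 0.60)),   # not in matrix
]
gene_counts = count_genes(spots, LABEL_MATRIX)
print(gene_counts)
```

Greedy pairing is adequate for well-separated loci; dense punctae from highly amplified genes would likely need a more careful assignment step.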
[00104] Validation
[00105] Twenty-one cell lines were selected for validation (Table 1). To establish the true gene copy number in each line, single color FISH was performed for each gene locus and the spots were scored manually by eye, the current gold standard in clinical FISH assays. Normal gene counts were established by manually scoring 200 (tumor cell line) or 100 (CTC) interphase nuclei for each cell preparation. The mean copy number value and ranges were obtained and used as reference to establish the accuracy of the automated count for the multiplex assay. Multiplex FISH was performed first using the 10-plex version of the assay, showing strong and discrete signals in cells in three dimensions (FIG. 4A). FIG. 4A is shown with a bar 400 representing a scale of 10 µm. Since most tumor lines are aneuploid, the inventors first tested the accuracy of automated scoring in interphase primary lymphocytes, for which the inventors were certain of the copy number state, and compared the scores to single gene manual scoring (FIG. 4B). Manual scoring of single-probe FISH (bars labeled "Single FISH") confirmed that the cells possessed two signals for each probe, confirming the normal diploid state of the lymphocytes. Multiplex FISH (bars labeled "M-FISH") with 10 nuclei captured followed by automated counting revealed accurate scoring, with the caveat that the probes for MET, EGFR and MYC averaged lower than 2 copies. Error bars did, however, encompass 2 copies for all genes (FIG. 4B).
[00106] With the first hurdle of accurate copy number assessment in lymphocytes overcome, the same 10 gene panel was applied to transformed cancer cell lines and breast cancer circulating tumor cells (FIGS. 4C-4G). The automated copy number calls were accurate across all lines tested, with high-level amplification of the MYC gene detected as expected in lung cancer line H1975 (FIG. 4F). Additionally, high-level ERBB2 amplification was detected in the breast cancer transformed line HCC1954, as has been previously described. TP53 was under-scored in a glioblastoma cell line (FIG. 4G) using the multiplex assay, likely reflecting challenges with the overlapping spectra of the dual labels used (Red and Orange), or challenges with signal to noise in those specific experiments.
[00107] Cell lines were also investigated by aCGH, and results were plotted against single and multicolor FISH. aCGH showed overall good agreement with multicolor FISH across multiple cell lines (FIGS. 9, 10, and 11) and confirmed amplification of MYC observed in H1975 (FIGS. 9A-C). However, single FISH showed a stronger agreement with aCGH, indicating some improvement in the multiplex analysis is still needed (FIGS. 9D-E). FIG. 9 shows array comparative genomic hybridization (aCGH) analysis of the 293T and H1975 cell lines. FIG. 9A shows multicolor fluorescence in situ hybridization [multiplex-FISH (M-FISH)] automated quantification and reference single FISH analysis of 10 genes in the H1975 cell line. Dots indicate the copy number derived from aCGH, as an average of probes spanning the relevant genes. FIG. 9B shows a genome copy number summary for aCGH analysis of H1975. The horizontal axis represents the linear position along the chromosome, whereas the vertical axis represents the measured log2 signal ratio (-2 to 2). Relevant gene names are placed in the appropriate genomic positions. FIG. 9C shows a zoomed-in view of chromosome 8, depicting high-level copy number gain for MYC. FIG. 9D shows an M-FISH automated quantification and reference single FISH analysis of 10 genes in the 293T cell line, with dots indicating the copy number derived from aCGH. FIG. 9E shows a genome copy number summary for aCGH analysis of 293T. Data are expressed as means ± SD (FIG. 9A and FIG. 9D). Chr, chromosome. FIG. 10 shows array comparative genomic hybridization (aCGH) and fluorescence in situ hybridization (FISH) validation for cell lines. Multiplex FISH quantification and comparison to single FISH and aCGH (dots) are shown for BUML in FIG. 10A, for UACC62 in FIG. 10C, and for MCF10A in FIG. 10E. Corresponding aCGH genomic profiles, shown as log2 ratios, are shown for BUML in FIG. 10B, for UACC62 in FIG. 10D, and for MCF10A in FIG. 10F. Data are expressed as means ± SD (FIGS. 10A, 10C, 10E). Chr, chromosome; M-FISH, multiplex-FISH. FIG. 11 shows array comparative genomic hybridization (aCGH) and fluorescence in situ hybridization (FISH) validation for cell lines. Multiplex FISH quantification and comparison to single FISH and aCGH (dots) are shown for 10 genes in GBM18 in FIG. 11A and for 15 genes in PC3 in FIG. 11C. Corresponding aCGH genomic profiles, shown as log2 ratios, are shown for GBM18 in FIG. 11B and for PC3 in FIG. 11D. Data are expressed as means ± SD (FIGS. 11A and 11C). Chr, chromosome; M-FISH, multiplex-FISH.
[00108] Some genes, particularly CDKN2A, showed poor performance in the multiplex probe mix compared with aCGH, as seen in BUML (FIGS. 10A-B), UACC62 (FIGS. 10C-D), and MCF10A (FIGS. 10E-F). TP53 underperformed in GBM18 (FIGS. 11A-B), and MDM4 and ERBB2 underperformed in PC3 (FIGS. 11C-D). The definition and elimination of non-noise disruptive features and accurate identification of probe bar codes in non-uniform background spectra are the major challenges that were addressed to improve quantification in the multiplex assay.
[00109] To assess multiplex FISH reproducibility, multiple replicates of the assay were performed in 293T cells, H1975 cells, and the normal lymphocyte preparation. These data are summarized in FIG. 12, as box-and-whisker plots for each gene replicate, and include >5000 single-probe data points. FIG. 12 shows reproducibility analysis of multiplex fluorescence in situ hybridization (FISH). Box-and-whisker plots are shown for mean and interquartile range for copy number counts for each replicate experiment. Panels show grouped replicates for 293T (left column), H1975 (middle column), and normal lymphocyte preparations (right column), with a different gene in each row. The y axis is scaled to optimize the ability to visualize the full range of copies per gene across the three lines. Displayed P values are for analysis of variance performed within each group of replicates. Above each replicate is indicated either significant (triangles) or nonsignificant (circles) deviation from gold standard single FISH in the same cell lines. n = 12 replicates (left column); n = 9 (middle column); n = 25 (right column). Analysis of variance was performed to compare across replicates, and the P values for the analysis of variance are listed in each panel. For the lymphocyte data set, 7 of 10 genes showed no statistically significant variation across the 25 replicates. For the three genes with an analysis of variance P < 0.01, the mean copy number values of each replicate are actually near the expected value of two copies. 293T and H1975 have higher variability than lymphocytes based on the number of genes with significant P values by analysis of variance. This variability correlates with the degree of aneuploidy, which perhaps increases the assay noise and generates challenges for spot detection. Even with the variability seen in H1975, only one gene, MYC, would have been called amplified in H1975. This is most clearly seen when the same data are visualized without autoscaling, but with a fixed y scale up to 40 copies, in FIG. 21. FIG. 21 shows the same data as in FIG. 12, but with a fixed y-axis range of 0 to 40 copies, to optimally visualize the difference between amplified (MYC) and nonamplified genes. A scatterplot of duplicate hybridization experiments of 293T and H1975 showed good correlation, with an R² of 0.93 (FIG. 13). FIG. 13 is a scatterplot of average gene automated copy number counts in two independent hybridizations using the 10-plex probe mix for H1975 (circles) and 293T (triangles) cell lines. The scatterplot shows reproducibility analyzed by duplicate multiplex fluorescence in situ hybridization analysis of H1975 and 293T cells. The correlation coefficient for the replicates is shown. The dotted line indicates the linear regression line. Error bars indicate 1 SD.
[00110] To help visualize assay accuracy, above each replicate in FIG. 12, it is indicated whether that individual replicate copy number mean deviates significantly from the gold standard single FISH score. Some genes, such as FGFR3, show little variation from the expected values across all replicates. Others, such as MDM4, show a systematic variation from the expected values. This is not necessarily clinically problematic because, in general, the variations from the gold standard are systematic but minor, as shown for the lymphocyte MDM4 panel. This is shown in detail for the same MDM4 data in FIG. 22. FIG. 22A shows accuracy displayed as a deviation plot of mean copy number for the lymphocyte data set. The y axis shows the absolute copy number deviation of automated multiplex fluorescence in situ hybridization (FISH) replicates (R) from the expected two copies. Of 10 probes, 9 have a mean deviation of <0.5, with MET being slightly undercalled by multiplex FISH. Mean (red lines) and SD (blue lines) are indicated. FIG. 22B shows detailed MDM4 data showing copy number counts for each individual replicate, with each dot representing a single-cell count and with mean and SD indicated. A small bias to slightly more than two copies is demonstrated.
[00111] For this probe, there is a small bias to detect more than the expected two copies.
Deviation for each of the 10 genes from the expected is shown in detail for the lymphocyte data, revealing that 9 of 10 probes have a mean deviation of <0.5 copies, with only MET being just beyond a -0.5 deviation.
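The deviation analysis described above amounts to comparing the mean automated count per gene against the expected two copies in normal lymphocytes. The sketch below shows one way to tabulate this; the per-cell counts are invented for illustration and are not data from this study, and the function name and the use of 0.5 copies as a tolerance (taken from the "<0.5" figure in the text) are assumptions.

```python
from statistics import mean, stdev

EXPECTED_COPIES = 2  # normal diploid lymphocytes


def deviation_summary(counts_by_gene, expected=EXPECTED_COPIES, tolerance=0.5):
    """Summarize mean, SD, and deviation from the expected copy number per gene."""
    summary = {}
    for gene, counts in counts_by_gene.items():
        gene_mean = mean(counts)
        deviation = gene_mean - expected
        summary[gene] = {
            "mean": gene_mean,
            "sd": stdev(counts),
            "deviation": deviation,
            "within_tolerance": abs(deviation) < tolerance,
        }
    return summary


# Hypothetical per-cell automated counts (illustrative only).
example_counts = {
    "MDM4": [2, 2, 3, 2, 2, 3, 2, 2, 2, 2],  # slight bias above two copies
    "MET": [1, 2, 1, 1, 2, 2, 1, 1, 2, 1],   # undercalled
}
summary = deviation_summary(example_counts)
for gene, stats in summary.items():
    print(gene, round(stats["deviation"], 2), stats["within_tolerance"])
```

A positive deviation indicates overcounting relative to the diploid expectation and a negative deviation indicates undercalling.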
[00112] Hybridizations with a higher plex level of 15 genes (FIG. 5A) demonstrated that the biggest hurdle is to minimize false negatives that can lead to some of the genes being underrepresented during quantification (FIGS. 5B, 5C). FIG. 5A is shown with a bar 500 representing a scale of 10 µm. Currently, spot combinations that do not match the label matrix are ignored by the quantification algorithm.
[00113] While the current imaging pipeline using a laser scanning confocal microscope with spectral imaging produces images with high contrast, detail, and reduced crosstalk between fluorophores (FIGS. 6A, 6B), it is impractical for clinical applications since it has a throughput of only approximately 10 nuclei per hour. FIGS. 6A and 6B include a bar 600 representing a scale of 2 µm. Therefore, moving to a widefield platform will be essential for clinical use. To explore the use of a more standard and higher-throughput widefield fluorescence microscopy system, the inventors piloted the multiplex assay on the Zeiss Cell Observer widefield system. For confocal images, the data suggested that a sampling rate of 0.2 µm in the axial direction is sufficient to resolve spots in the z axis. Since the axial resolution on a confocal microscope is only marginally better, the inventors retained this step size for widefield microscopy. Bright, high-quality probe signals were easily detected, with good signal-to-noise (FIGS. 6C, 6D), by using a 63x objective and a sensitive sCMOS camera. Importantly, the 63x objective along with the large camera chip allowed for rapid capture time, increasing the system throughput to >100 nuclei per hour. FIGS. 6C and 6D include a bar 604 representing a scale of 20 µm.
[00114] The assessment of gene copy number is important in the management of cancer patients, and is standard of care for the ERBB2 gene in breast and esophagogastric tumors. As the number of targeted therapies grows, gene amplifications and deletions are likely to be of growing importance as predictive biomarkers. A growing unmet need for such tests has been seen in clinical diagnostics operations and in clinical trial support, with trials focused on copy number changes in ERBB2, EGFR, MET, PDGFRA, KIT, FGFR1, FGFR2, MYC, NMYC, CDK4, CDK6, and CDKN2A. It is estimated that there are 20-30 genes involved in well-annotated amplification and deletion events in cancer. There is currently no simple and accurate method to assay this many genes simultaneously in larger panels for more comprehensive analysis. Therefore, a multiplex FISH assay, such as the one developed here, has the potential to be clinically useful and therapeutically informative. With the current limitation of a 15-plex FISH capability, analyses of one sample of cells across two slides could render copy number quantification of up to 30 genes.
[00115] While there are a number of other technologies to assess copy number at genome-wide scale, multiplex FISH has a unique role. Two prominent technologies competing with multiplex FISH are aCGH and NGS, both of which still face difficulties with accurate somatic copy number assessment in cancer. Since both of these techniques utilize DNA extracted from whole tissue sections, there is always substantial dilution of the tumor cell contribution by normal stromal or inflammatory cells. The degree of dilution depends on the tumor cell fraction and the actual true copy state of the genes in tumor cells. Thus, analysis of a gene like ERBB2, for which a copy number ratio for a positive call is 2.0 by FISH, will show signals that do not exceed noise using aCGH or NGS, and will not be called as positive. FISH, which by definition is an in situ assay, has the major advantage of determining absolute gene spot counts within the tumor cell population only. Single cell sequencing is a very real and exciting alternative technology, but currently is very expensive, and DNA copy number has not yet been accurately proven at the single gene level. Genomic in situ assays such as Multiplexed Ion Beam Imaging, using multiple probes labeled with different mass metals and localized with mass spectrometry, or fluorescent in situ RNA sequencing (FISSEQ), which is essentially NGS with sequencing directly on nucleic acid molecules fixed in situ in tissue, do not currently have the gene-locus resolution required for copy number assessment.
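The dilution effect described above can be illustrated with a simple linear mixing model. This model is an assumption made here for illustration (the text does not specify one): bulk DNA is treated as a mixture of tumor cells carrying `tumor_copies` of a gene and normal cells carrying two copies, weighted by the tumor cell fraction. The function name and example values are hypothetical.

```python
from math import log2


def bulk_copy_ratio(tumor_copies, tumor_fraction, normal_copies=2):
    """Expected copy number ratio in bulk DNA under a linear mixing model.

    tumor_fraction is the fraction of cells in the sample that are tumor cells.
    """
    if not 0.0 <= tumor_fraction <= 1.0:
        raise ValueError("tumor_fraction must be between 0 and 1")
    mixed = tumor_fraction * tumor_copies + (1.0 - tumor_fraction) * normal_copies
    return mixed / normal_copies


# A gene present at 4 copies per tumor cell (ratio 2.0 in pure tumor)
# measured in samples with decreasing tumor content.
for fraction in (1.0, 0.5, 0.2):
    ratio = bulk_copy_ratio(tumor_copies=4, tumor_fraction=fraction)
    print(fraction, round(ratio, 2), round(log2(ratio), 3))
```

Under this model a ratio of 2.0 in pure tumor falls to 1.2 at a 20% tumor fraction, which shows how a low-level gain can become hard to distinguish from noise in bulk assays while remaining countable per cell by FISH.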
[00116] The multiplex FISH assay described here is envisioned as a possible stand-alone clinical assay, but more likely as an adjunct performed side-by-side with NGS mutational panels in comprehensive genotyping laboratories. Since only one or two slides will be required, an efficient, automated, and cost-effective work flow can be established. This assay may be especially powerful in small samples as well as those with very low tumor fraction. The optimal sample for this technology may in fact be CTCs, for which one often obtains only a few cells per blood draw. One would be able to render an accurate copy number assessment even with 5-10 cells. There will be numerous research uses of multiplex FISH, especially the critical study of copy number genetic heterogeneity in cancer. Single cell sequencing is allowing great strides in these studies of heterogeneity; however, accurate copy number calls are possible only for large segment gains and losses, not for single genes. Copy number heterogeneity is clinically understudied, and likely underlies poor clinical response to targeted therapies, for example with EGFR amplification in GBM and the failure of multiple anti-EGFR targeted trials.
[00117] Several challenges need to be overcome for this assay to be of general use in clinical or research laboratories. For instance, a more pragmatic approach is needed for image capture. Confocal microscopy is impractical in a clinical setting, as its handling requires more expertise and its maintenance is expensive; instead, a customized widefield microscope system, such as that used routinely for M-FISH, would be needed. A Zeiss-based system equipped with sensitive optics that favors detection of even dim spots was piloted; however, in various embodiments additional filter sets will be needed to allow for the full 6-fluorophore assay.
Without spectral imaging, a filter cube-based approach will have challenges with fluorophore overlap. Tissue autofluorescence will also be a challenge for formalin-fixed paraffin embedded tissue sections. Nonetheless, a cost-effective system with higher throughput is practical.
[00118] In summary, the use of combinatorial probe labeling to allow highly-multiplexed FISH copy number analysis of up to 15 genes simultaneously has been demonstrated. This approach was validated with cells and cell lines of known copy number, and offers an accurate and practical method for absolute gene copy number assessment in cancer.
[00119] Cells and Cell Preparation
[00120] Established tumor cell lines were obtained from American Type Culture Collection (ATCC, Manassas, VA) (H1975, MCF10A, PC3, H460, UACC62, HCC1954, 293T, LM2, Sk-Mel, BL209, BUML) and by collaboration with Dr. Hiro Wakimoto in the Department of Neurosurgery at Massachusetts General Hospital (MGH) (GBM18, GBM29), and grown according to standard protocols specified for each cell line. Patient-derived circulating tumor cell lines (BRX07, BRX42, BRX50, BRX61, BRX68, BRX82, BRX142) were previously described. Normal lymphocytes were obtained from blood draws from five healthy donors (three males and two females). The cells used in this study are listed in Table 1.
[00121] Cell suspension
[00122] Cell suspensions were fixed and processed for FISH following previously described methods with minor changes.
[00123] Grown cells were dissociated either mechanically or by trypsin treatment, washed with phosphate-buffered saline, and treated with a hypotonic solution of 0.075 mol/L potassium chloride to promote osmotic swelling for 20 to 40 minutes at 37°C, followed by three fixative washes with methanol/acetic acid solution (3:1). Interphase cell suspensions were kept at -20°C until use. Cell suspension slides were prepared by dropping the cell suspension on a clean, uncoated slide, which was then heated at 65°C for 3 minutes.
[00124] Lymphocyte spreads were further processed by preheating 2X saline sodium citrate (SSC) with 0.25% Triton X-100 (Sigma-Aldrich, St. Louis, MO) until boiling. Slides were then immersed for 2 minutes, followed by rinsing in 2X SSC at room temperature; dehydrated in 95% ethanol, followed by three changes of 100% ethanol; air dried; and heated at 65°C for 3 minutes.
[00125] Bacterial artificial chromosome (BAC) derived probes
[00126] Forty locus-specific DNA sequence probes derived from BAC clones were used. BAC clone searches were performed using the University of California Santa Cruz (UCSC) Genome Browser (http://genome.ucsc.edu/) mapped to the Feb. 2009 (GRCh37/hg19) and Dec. 2013 (GRCh38/hg38) Human Genome Assemblies. BACs were purchased from Children’s Hospital Oakland Research Institute (CHORI, Oakland, CA; http://bacpac.chori.org/). Table 3 specifies the BAC clones included in this study. Specificity of the clones was checked in metaphase spreads.
[00127] Table 3 BAC Clones (All obtained from CHORI)
GENE    BAC CLONES    CHROMOSOMAL BAND    POSITION    LENGTH
Figure imgf000030_0001
Figure imgf000031_0001
Figure imgf000032_0001
Figure imgf000033_0001
Figure imgf000034_0001
[00128] BAC clone DNA extraction
[00129] Individual clones were cultured using Luria-Bertani (LB) media plates with 12.5 µg/ml chloramphenicol (Teknova, Hollister, CA) for selection. Bacterial stocks were plated and cultured at 37°C overnight. A single colony was chosen from the selective plate and inoculated in 3 ml starter LB culture (SIGMA, St. Louis, MO). The starter culture was placed in an incubator shaker set to 250 rpm at 37°C for 6-8 hours, followed by re-inoculation in 6 ml growing culture (1:1000) for 12-16 hours. The bacterial culture was transferred into a 14 ml tube and centrifuged at 6000 x g at 4°C for 15 min. The supernatant was discarded and the pellet was resuspended in 4 ml resuspension buffer P1, followed by addition of 4 ml of lysis buffer P2 to the bacterial suspension. The tube was inverted vigorously to mix the content. It was then incubated for 5 min at room temperature and 4 ml of prechilled buffer P3 was added. The tube was inverted vigorously to mix the content and incubated on ice for 15 min. The bacterial lysate was centrifuged at 20,000 x g for 30 min at 4°C and the supernatant was collected to be applied to a Qiagen-tip column. The column was equilibrated with 4 ml of Buffer QBT and, after the column emptied by gravity flow, the supernatant was applied and allowed to enter the column resin by gravity flow. The column was washed two times with 10 ml of buffer QC. DNA was eluted with 15 ml of elution buffer QF warmed to 65°C. DNA was precipitated by adding 3.5 ml of isopropanol and centrifuged at 18,000 x g for 30 min at 4°C, and the supernatant was decanted. The pellet was washed with 70% ethanol and centrifuged at 18,000 x g for 10 min. The supernatant was decanted and the pellet was air-dried for 5 min. DNA was redissolved in TE buffer. After measuring the DNA concentration, the samples were stored at -20°C.
[00130] Multiple Displacement Amplification - REPLI-g
[00131] Extracted BAC DNA was amplified by multiple displacement amplification with the Qiagen REPLI-g midi kit. First, the extracted BAC DNA (2 µl) underwent gentle alkaline denaturation by adding DB reagent and incubating for 2 min. The reaction was neutralized with the stop solution. After neutralization, REPLI-g master mix was added and the sample was incubated at 30°C for 16 h in a thermocycler for isothermal amplification. DNA polymerase was inactivated at 65°C for 3 min at the end of the cycle. DNA was then cleaned (de-salted) by ethanol precipitation by adding 1/10 volume of 3 M sodium acetate (pH 5.2) and 2-3 volumes of 100% ethanol. The sample was centrifuged at full speed in a microcentrifuge for 10 min and the supernatant was discarded. The pellet was washed by adding 1 ml of chilled 70% ethanol and centrifuging at full speed for 10 min. The supernatant was decanted and the pellet air dried. After drying, a suitable amount of water (80-100 µl) was added to dissolve the pellet. DNA was incubated at 55°C for one hour and then rested overnight at 4°C before proceeding to labelling by nick translation.
[00132] Nick Translation - DNA labelling
[00133] First, the labelling schema was decided, that is, which color combination would be applied to each gene region. It was important that each color combination be unique to a particular gene region, representing a specific 'bar-code'. The reactions were carried out in 50 µl volumes in duplicate.
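The uniqueness requirement on color combinations can be illustrated with a short sketch. It assumes the six fluorophore names that appear in the gel legends of this section and the double and triple bar-codes mentioned elsewhere in the specification; the function names and the gene list are hypothetical and are not part of the disclosed method.

```python
from itertools import combinations

# Fluorophore names as they appear in the gel legends; the list is illustrative.
FLUOROPHORES = ["aqua", "gold", "green", "orange", "red", "Cy5"]


def barcodes(fluorophores, sizes=(2, 3)):
    """Return every unordered color combination of the requested sizes."""
    codes = []
    for k in sizes:
        codes.extend(combinations(sorted(fluorophores), k))
    return codes


def assign_barcodes(genes, fluorophores, sizes=(2, 3)):
    """Give each gene its own color combination (hypothetical assignment order)."""
    codes = barcodes(fluorophores, sizes)
    if len(genes) > len(codes):
        raise ValueError("not enough unique color combinations for all genes")
    return dict(zip(genes, codes))


all_codes = barcodes(FLUOROPHORES)
schema = assign_barcodes(["MYC", "EGFR", "TP53"], FLUOROPHORES)
print(len(all_codes))  # 15 pairs + 20 triples
```

With six fluorophores there are 15 possible pairs and 20 possible triples, which bounds how many gene regions can receive a distinct bar-code under these assumptions.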
[00134] All solutions, except the enzyme, were kept at room temperature and thoroughly mixed before use. Working dilutions were prepared by mixing 0.3 mM dATP, 0.3 mM dGTP and 0.3 mM dCTP to obtain 0.1 mM dNTP. Nuclease-free water was added to 0.3 mM dTTP to obtain a 0.1 mM working solution. dUTPs were reconstituted to 0.2 mM working solutions. The inventors first prepared the dUTP/dTTP/dNTP mix, one tube for each color combination, by first adding the following amounts of dUTPs and dTTPs (Table 4a).
[00135] Table 4a Fluorophore labeled dUTP and dTTP concentrations
Figure imgf000036_0001
[00136] Ideally the fluorescence intensity of each dye should be equivalent, but this was not observed in practice, requiring adjustments in concentration during probe preparation. Table 4a specifies the amount of each fluorophore and respective dTTP. For instance, if green and orange were chosen to label a particular gene, 3 µl of green dUTP and 4.5 µl of dTTP should be pipetted in the mix.
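The working dilutions described in paragraph [00134] follow from the usual relation C1·V1 = C2·V2. The sketch below only illustrates that arithmetic; the function names and the 10 µl stock volume are assumptions chosen for the example and do not come from the specification.

```python
def mixed_concentration(stock_mM, stock_volume_ul, total_volume_ul):
    """Concentration of one component after dilution into a total volume."""
    if total_volume_ul <= 0 or stock_volume_ul > total_volume_ul:
        raise ValueError("total volume must be positive and at least the stock volume")
    return stock_mM * stock_volume_ul / total_volume_ul


def diluent_volume(stock_mM, target_mM, stock_volume_ul):
    """Volume of water to add to a stock to reach the target concentration."""
    if target_mM <= 0 or target_mM > stock_mM:
        raise ValueError("target must be positive and no higher than the stock")
    return stock_volume_ul * (stock_mM / target_mM - 1)


# Three 0.3 mM nucleotide stocks mixed in equal volumes (10 µl each, hypothetical):
each_nucleotide_mM = mixed_concentration(0.3, 10, 30)

# Water needed to bring 10 µl of a 0.3 mM stock to 0.1 mM:
water_ul = diluent_volume(0.3, 0.1, 10)
```

Mixing three stocks in equal volumes dilutes each one threefold, which is why 0.3 mM stocks give 0.1 mM of each nucleotide; the single-nucleotide stock needs two volumes of water for the same result.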
[00137] Table 4b. MIX
Figure imgf000036_0002
[00138] *As specified in Table 4a
[00139] Nick translation was carried out in duplicate 50 µl reactions. Reagents were added to each tube following Table 4c.
[00140] Table 4c
Figure imgf000037_0001
[00141] The reaction was carried out at 15°C for 3 h and terminated by heating to 70°C for 10 min. Samples were then placed on ice and protected from light. The efficiency of the nick translation reaction was checked by analyzing the product size distribution using agarose gel electrophoresis (FIGS. 19 and 20). Gel electrophoresis was performed with 2% agarose diluted in Tris base, acetic acid, and EDTA buffer; 10,000x GelRed was then added (1x final concentration; Phenix Research Products, Candler, NC). Each reaction (5 µL) was run at 120 V for 30 to 40 minutes. Gel images were captured using a Gel Doc XR+ Imager (Bio-Rad, Hercules, CA). FIG. 19A shows efficiency of nick translation labeling performed at time points of 8 and 16 hours, resulting in shorter fragments (200 to 100 bp); however, the quality of hybridization is lower, with speckles in the background and weaker-intensity signals. FIGS. 19B-19C show efficiency of nick translation labeling performed at time points of 2.5, 3, and 4 hours, resulting in fragments of predominantly 100 to 400 bp and higher quality of specimen hybridization. Aq, aqua; Gd/Gld, gold; Gr, green; Or, orange; R/Rd, red. FIG. 20 shows three-hour nick translation product gels. Nick translations of 3 and 4 hours show comparable results, so 3 hours was adopted as the standard nick translation time. FIG. 20A shows a 3-hour nick translation 2% agarose gel with unlabeled DNA bands on the left and DNA labeled with combinations of two fluorophores, with a size range of 100 to 400 bp. FIG. 20B shows a gel with unlabeled DNA on the left and DNA labeled with two and three fluorophores. A/Aq, aqua; Cy, Cy5; Gd, gold; Gn, green; Or, orange; Rd, red.
[00142] Cleaning up of nick translated probes
[00143] The nick translation reactions for each gene were pooled at equal volumes, and a 1.5x amount of Cot-1 human DNA (Life Technologies, Carlsbad, CA) was added (amounts above 1.5x began to suppress fluorescent signals), followed by the addition of a 1:10 total volume of 3 mol/L sodium acetate (pH 5.2) and 2 to 3 volumes of 100% ethanol. For a 5-plex assay, the reaction would contain a total of 5 µg of various BAC DNAs, plus 7.5 µg of Cot-1. The mixture was centrifuged at 18,000 x g for 20 minutes at 4°C to pellet the labeled DNA. The supernatant was discarded, and the pellet was washed with 70% ethanol and centrifuged at 18,000 x g for 10 minutes. Supernatants were discarded, and the pellet was air dried in the dark for 5 to 10 minutes. The probe was re-suspended with nuclease-free water, and five volumes of hybridization buffer were added and mixed well. The mixture was denatured by heating at 72°C and immediately placed on ice. Probes were stored at -20°C until use.
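The proportions in the clean-up step (Cot-1 at 1.5 times the probe DNA mass, one tenth volume of sodium acetate, 2 to 3 volumes of ethanol) can be written out as a small calculation. This is a sketch under stated assumptions: the function name, the 100 µl pooled volume, the choice of 2.5 ethanol volumes, and the decision to compute ethanol relative to the volume after sodium acetate addition are all illustrative and not specified in the text.

```python
def precipitation_plan(probe_dna_mass, pooled_volume_ul,
                       cot1_ratio=1.5, ethanol_volumes=2.5):
    """Return the Cot-1 mass (same unit as probe_dna_mass) and reagent volumes in µl.

    pooled_volume_ul is assumed to be the total liquid volume after Cot-1 addition.
    """
    if probe_dna_mass <= 0 or pooled_volume_ul <= 0:
        raise ValueError("mass and volume must be positive")
    acetate_ul = pooled_volume_ul / 10  # 1:10 of the total volume
    return {
        "cot1_mass": probe_dna_mass * cot1_ratio,
        "sodium_acetate_ul": acetate_ul,
        # Assumption: ethanol volumes are taken relative to sample plus acetate.
        "ethanol_ul": ethanol_volumes * (pooled_volume_ul + acetate_ul),
    }


# 5-plex example from the text: 5 mass units of BAC DNA give 7.5 units of Cot-1.
plan = precipitation_plan(5, 100)
```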
[00144] Hybridization Buffer
[00145] Hybridization buffer was composed of 50% v/v deionized formamide, 2x SSC, 50 mM potassium dihydrogen phosphate/disodium hydrogen phosphate buffer (KH2PO4/Na2HPO4, pH 7.0), 1 mM EDTA, and 5-10% v/v dextran sulfate.
[00146] Mounting Media
[00147] It has been observed that anti-fade chemicals contained in mounting media, as well as pH, may interfere with the stability of some fluorophores. The inventors tested commercially available options (Vectashield, ProLong Gold and Diamond) as well as self-produced media (DABCO, Mowiol, TDE). The inventors obtained the best fluorophore stability across the entire utilized spectrum (from aqua to far red) with the self-produced glycerol-based medium, following the protocol from Dr. J. Waters (http://nic.med.harvard.edu/resources/media/). The glycerol-based medium was composed of 0.5% w/v n-propyl gallate, 90% glycerol, and 20 mM Tris, pH 8. The final pH of the solution should be adjusted to pH 7.6 with hydrochloric acid. No counterstain such as DAPI (4',6-Diamidine-2'-phenylindole) was added, in order to minimize unwanted background fluorescence.
[00148] Image Analysis Pipeline
[00149] Image Registration
[00150] Chromatic aberration, the vertical shift in the apparent position of objects, is a concern when imaging samples with multiple fluorophores. Blue wavelengths are focused closer to the lens than red wavelengths. Without corrective measures, two spots of different wavelengths appear significantly separated and risk not being detected as a pair, despite actually overlapping. Increasing the search radius for each spot to compensate for the shift caused false positives for certain genes, especially between pairs of spots originating from neighboring wavelengths, where the shift is not as large.
[00151] To correct for chromatic aberration, separate samples were prepared which included Y-chromosomes of cells stained with all 6 fluorophores (FIG. 7). Since it was observed that the extent of the shift differed from day to day, new samples were prepared and imaged for each experiment. Z-stacks from three randomly selected cells were acquired per image acquisition session. The resulting transformation matrix from registering the Y-chromosome data was then applied to the actual datasets stained with FISH probes. Channels 2-6 were registered to the reference channel, channel 1. Optical aberrations were observed in the axial and lateral directions, and to correct for this, only translational corrections in x, y, and z were implemented. Since optical aberrations are less expected to cause rotational and scaling differences between channels, such transformations were not used, in order to decrease computation time and improve transform inference robustness. The registered z-stack for each corrected dataset was then saved in OME-TIFF (.ome.tiff) format, which could be read using a Bio-Formats reader.
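A translation-only correction of the kind described above can be sketched as follows. The specification does not state how the transformation was estimated, so this example substitutes a simple intensity-weighted centroid comparison on a calibration stack; all function names and the toy data are hypothetical.

```python
def centroid(stack):
    """Intensity-weighted centroid (z, y, x) of a 3D stack given as nested lists."""
    total = cz = cy = cx = 0.0
    for z, plane in enumerate(stack):
        for y, row in enumerate(plane):
            for x, value in enumerate(row):
                total += value
                cz += z * value
                cy += y * value
                cx += x * value
    if total == 0:
        raise ValueError("stack has no signal")
    return (cz / total, cy / total, cx / total)


def translation_to_reference(reference, moving):
    """Shift (dz, dy, dx) that moves the 'moving' channel onto the reference."""
    return tuple(r - m for r, m in zip(centroid(reference), centroid(moving)))


def apply_translation(points, shift):
    """Apply a translation to a list of (z, y, x) spot coordinates."""
    return [tuple(p + s for p, s in zip(point, shift)) for point in points]


def single_spot_stack(shape, position, intensity=100.0):
    """Toy calibration stack containing one bright voxel."""
    nz, ny, nx = shape
    stack = [[[0.0] * nx for _ in range(ny)] for _ in range(nz)]
    z, y, x = position
    stack[z][y][x] = intensity
    return stack


# Calibration: the same object seen in channel 1 (reference) and channel 2.
channel1 = single_spot_stack((4, 5, 5), (1, 2, 2))
channel2 = single_spot_stack((4, 5, 5), (2, 2, 3))
shift = translation_to_reference(channel1, channel2)

# The shift estimated on the calibration sample is applied to FISH spot positions.
corrected = apply_translation([(2.0, 2.0, 3.0)], shift)
```

Restricting the model to three translation parameters per channel keeps the estimate stable with only a few calibration cells, in line with the robustness argument given in the paragraph above.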
[00152] Nuclear Segmentation
[00153] The nuclear mask was obtained by low-pass filtering and then thresholding a channel with high background - in this case channel 6 - using an Otsu method-derived threshold value (FIG. 2B). With a 100x/1.46 N.A. objective lens, cells were limited to just one cell per field of view. However, as the image acquisition setup changed to using a 63x/1.4 N.A. objective lens, it was possible to fit more cells into a field of view to increase throughput. The watershed algorithm was implemented to split touching cells where present, and each nucleus was analyzed separately.
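The low-pass filter followed by an Otsu threshold can be sketched in a few lines. This is an illustrative 2D version only: the 3x3 mean filter stands in for whatever low-pass filter was actually used, the watershed splitting step is omitted, and the function names and toy image are assumptions.

```python
def box_blur(image):
    """3x3 mean filter as a simple low-pass filter; edges use available neighbours."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc, n = 0.0, 0
            for yy in range(max(0, y - 1), min(h, y + 2)):
                for xx in range(max(0, x - 1), min(w, x + 2)):
                    acc += image[yy][xx]
                    n += 1
            out[y][x] = acc / n
    return out


def otsu_threshold(values, bins=256):
    """Threshold that maximizes the between-class variance of a histogram."""
    lo, hi = min(values), max(values)
    if lo == hi:
        return lo
    width = (hi - lo) / bins
    hist = [0] * bins
    for v in values:
        hist[min(int((v - lo) / width), bins - 1)] += 1
    total = len(values)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_var, best_bin = -1.0, 0
    w0, sum0 = 0, 0.0
    for i in range(bins):
        w0 += hist[i]
        sum0 += i * hist[i]
        w1 = total - w0
        if w0 == 0:
            continue
        if w1 == 0:
            break
        between = w0 * w1 * (sum0 / w0 - (sum_all - sum0) / w1) ** 2
        if between > best_var:
            best_var, best_bin = between, i
    return lo + (best_bin + 1) * width  # upper edge of the best background bin


def nuclear_mask(image):
    """Boolean mask: True where the smoothed intensity exceeds the Otsu threshold."""
    smooth = box_blur(image)
    t = otsu_threshold([v for row in smooth for v in row])
    return [[v > t for v in row] for row in smooth]


# Toy image: dark field with one bright 4x4 "nucleus".
image = [[200.0 if 2 <= y <= 5 and 2 <= x <= 5 else 10.0 for x in range(8)]
         for y in range(8)]
mask = nuclear_mask(image)
```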
[00154] Spot Detection
[00155] Fluorescence spots are composed of point sources as well as extended sources. The inventors applied point source detection algorithms to detect the probes in each channel. This approach relies on statistical comparison of image intensity local maxima with a model of the microscope point-spread function (PSF) as a 3D Gaussian function to detect low signal-to-noise ratio (SNR) spots in a manner which is adaptive to local variations in the signal and background, without requiring specification of arbitrary thresholds (FIGS. 3A-C). This approach allowed accurate measurement of diffraction-limited fluorescence signals, taking into account the uncertainties of the fitted amplitude and local background when individually testing the significance of each candidate signal. The sigma factor for the Laplacian of the Gaussian function in the lateral and axial directions was measured and averaged over several datasets for each magnification. For each detected local maximum, the amplitude intensity and the variation of noise in the background were extracted to approximate the SNR, along with the calculated position in x, y, and z. Local maxima that fell below a threshold set by an alpha value were eliminated. This threshold was optimized for each channel so that there were no false negatives and only a small number of false positives, as visualized in Imaris (Bitplane, USA).
[00156] During spot detection, genes such as MYC in the lung cell line H1975 could be highly amplified to such an extent that the captured image manifested large blob-like shapes as opposed to spots. For each channel, the blobs were segmented by first excluding areas of the nucleus with spots, as determined by filtering with a Laplacian of Gaussian filter. Once the spots had been removed, the remaining nucleus region (i.e., background) intensity was sampled, and a threshold for segmenting blobs was robustly set as 3 SDs above the median intensity of this background. Noise was approximated by measuring the variance of the background from an annulus around the blob. The number of copies of highly amplified genes (like MYC in H1975) was estimated by taking the volume of each blob and dividing it by the approximate volume of one MYC spot. FIG. 8 shows an example of a nucleus with a mixture of spots and blobs. FIG. 8 includes a bar 800 representing a scale of 0.8 µm.
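The blob threshold and the volume-based copy estimate described above reduce to two short formulas. The sketch below is illustrative: the use of a population standard deviation, rounding to the nearest whole copy, and the sample numbers are assumptions, and the function names are hypothetical.

```python
from statistics import median, pstdev


def blob_threshold(background_intensities, n_sd=3.0):
    """Threshold set n_sd standard deviations above the median background."""
    return median(background_intensities) + n_sd * pstdev(background_intensities)


def estimate_copies(blob_volume, single_spot_volume):
    """Approximate copy number as blob volume divided by one spot's volume."""
    if single_spot_volume <= 0:
        raise ValueError("single spot volume must be positive")
    # Assumption: round to the nearest whole copy, never reporting fewer than one.
    return max(1, round(blob_volume / single_spot_volume))


# Hypothetical background samples from a nucleus after spot removal.
threshold = blob_threshold([8.0, 10.0, 12.0])

# Hypothetical volumes in voxels: one blob of 120 voxels, one spot of about 8 voxels.
copies = estimate_copies(120.0, 8.0)
```

Using the median rather than the mean keeps the threshold from being pulled upward by residual bright pixels, which matches the text's description of the threshold as robustly set.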
[00157] Coincident Spot Detection
[00158] The analysis algorithm requires a label matrix in the form of a .csv file that contains a list of pairs of channels for each gene. Briefly, coincident points for all possible combinations of fluorophores were identified as being spots within four pixels of each other. Any pairs that did not exist in the label matrix were immediately eliminated. In cases where there were more than 2 spots located at a position, the most likely combination was determined on the basis of brightest intensity.
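The pairing logic described above can be sketched as follows. The column layout of the label matrix (gene, channel, channel), the spot record format, and the greedy resolution by summed intensity are assumptions made for illustration; the text only states that pairs must lie within four pixels, must appear in the label matrix, and that ambiguous cases are resolved by brightest intensity.

```python
import csv
import io
from itertools import combinations
from math import dist


def load_label_matrix(csv_text):
    """Parse rows of 'gene,channel_a,channel_b' into {frozenset({a, b}): gene}."""
    labels = {}
    for row in csv.reader(io.StringIO(csv_text)):
        if not row:
            continue
        gene, a, b = row
        labels[frozenset((int(a), int(b)))] = gene
    return labels


def coincident_spots(spots, labels, radius=4.0):
    """Count coincident spot pairs per gene.

    spots: list of dicts with 'channel', 'pos' (x, y, z in pixels), 'intensity'.
    """
    candidates = []
    for i, j in combinations(range(len(spots)), 2):
        a, b = spots[i], spots[j]
        key = frozenset((a["channel"], b["channel"]))
        if a["channel"] == b["channel"] or key not in labels:
            continue  # pair not present in the label matrix
        if dist(a["pos"], b["pos"]) <= radius:
            candidates.append((a["intensity"] + b["intensity"], i, j, labels[key]))
    counts, used = {}, set()
    # Brightest candidate pairs claim their spots first.
    for _, i, j, gene in sorted(candidates, reverse=True):
        if i in used or j in used:
            continue
        used.update((i, j))
        counts[gene] = counts.get(gene, 0) + 1
    return counts


labels = load_label_matrix("MYC,1,2\nEGFR,2,3\n")
spots = [
    {"channel": 1, "pos": (10, 10, 5), "intensity": 100},
    {"channel": 2, "pos": (11, 10, 5), "intensity": 90},
    {"channel": 3, "pos": (12, 11, 5), "intensity": 50},
    {"channel": 1, "pos": (40, 40, 5), "intensity": 80},
    {"channel": 2, "pos": (41, 41, 6), "intensity": 70},
]
counts = coincident_spots(spots, labels)
```

In this toy nucleus the channel 2 spot near (11, 10, 5) could pair with either its channel 1 or channel 3 neighbour; the brighter channel 1/channel 2 pairing wins, so two MYC copies are reported and no EGFR.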
[00159] Gene Count Reporting
[00160] For each nucleus, the number of coincident spot sets for each gene and the signal-to-noise ratio for each channel were stored as .csv files. The algorithm can also be run in a batch analysis mode, which loops through multiple datasets and reports a summary of the average, median, and standard deviation of the gene count.
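A batch summary of the kind described above might look like the following sketch. The per-nucleus record format, the treatment of a gene missing from a nucleus as a count of zero, the use of the sample standard deviation, and the CSV column names are all assumptions for illustration.

```python
import csv
import io
from statistics import mean, median, stdev


def summarize_counts(per_nucleus_counts):
    """per_nucleus_counts: list of {gene: count} dicts, one per nucleus."""
    genes = sorted({gene for counts in per_nucleus_counts for gene in counts})
    summary = {}
    for gene in genes:
        # Assumption: a gene absent from a nucleus record counts as zero copies.
        values = [counts.get(gene, 0) for counts in per_nucleus_counts]
        summary[gene] = {
            "mean": mean(values),
            "median": median(values),
            "sd": stdev(values) if len(values) > 1 else 0.0,
        }
    return summary


def summary_to_csv(summary):
    """Serialize the summary as CSV text with one row per gene."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["gene", "mean", "median", "sd"])
    for gene, stats in summary.items():
        writer.writerow([gene, stats["mean"], stats["median"], stats["sd"]])
    return buf.getvalue()


nuclei = [{"MYC": 2, "EGFR": 2}, {"MYC": 4, "EGFR": 2}, {"MYC": 3}]
summary = summarize_counts(nuclei)
report = summary_to_csv(summary)
```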
[00161] Algorithm Validation
[00162] Automated algorithms were developed to isolate bar-coded probe signals and to quantify their copy number per nucleus. The automated quantification algorithm was tested on cell lines, with the expectation that each gene's count would match the normative database obtained by manual quantification using single-probe/centromere control FISH. For each slide, ten randomly selected nuclei were imaged, and automated copy number counts were obtained for each color combination representing each gene, with the results displayed as median copy number per cell. Automated analysis captured amplification of MYC in H1975 (FIG. 4F) and ERBB2 in HCC1954. The inventors observed that the difficulty of detecting dim FISH spots under low-signal, high-noise conditions resulted in some spots being missed during automated quantification, such as EGFR in FIGS. 4D and 4F, TP53 in FIG. 4G, and CDK4 in FIGS. 5B-C. The definition and elimination of non-noise disruptive features, and accurate identification of probe barcodes in non-uniform background spectra, are the major challenges that are being addressed to improve count accuracy.
[00163] aCGH Analysis
[00164] For aCGH studies, genomic DNA was extracted from tumor tissue using the QIAamp Blood Mini Kit (Qiagen) with a modified protocol incorporating deparaffinization and protease digestion. Agilent SurePrint 4x180K CGH SNP microarrays (Agilent Technologies, Santa Clara, CA) containing approximately 180,000 copy number probes, covering both coding and noncoding human sequences, were used. Briefly, 1.0 µg of human reference DNA, male genomic control DNA (Coriell Institute, Camden, NJ), and 1.0 µg of tumor DNA were digested with AluI and RsaI, and then heat treated at 95°C for 5 minutes. Control and tumor DNAs were labeled by random priming with CY3-dUTP and CY5-dUTP dyes, respectively, using the Agilent SureTag Complete DNA Labeling Kit. The labeled DNAs were purified with the SureTag Reaction Purification Column and mixed in equal proportion for hybridization to the array in the presence of Cot-1 DNA (Invitrogen, Carlsbad, CA) using the Agilent Oligo aCGH Hybridization Kit.
[00165] Hybridization steps included 3 minutes of denaturation at 95°C, prehybridization for 30 minutes at 37°C, and hybridization for 40 hours at 65°C. After hybridization, the slides were washed with Agilent Oligo Array CGH Wash Buffer 1 and Buffer 2. Washed slides were scanned using the Agilent G2565CA Microarray Scanner. Microarray TIFF (.tif) images were processed and analyzed with Agilent CytoGenomics software version 2.7.
[00166] Analysis of Reproducibility and Accuracy
[00167] Statistical guidance was obtained from the Massachusetts General Hospital Biostatistics Center, and analyses were performed independently. Analysis of variance was conducted on independent repeated hybridizations of normal lymphocytes (25 replicates) and cell lines 293T (12 replicates) and H1975 (9 replicates) to establish whether there was significant variation between the scoring distributions of each replicate. P < 0.05 was considered statistically significant. GraphPad Prism 5.0 software (GraphPad Software, San Diego, CA) and R-Studio 3.4.3 software (RStudio, Boston, MA) were used.
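The comparison between replicate hybridizations rests on a one-way analysis of variance. The sketch below computes only the F statistic and its degrees of freedom from first principles; it is not the GraphPad or R procedure used in the study, the replicate values are invented for illustration, and obtaining the P value would additionally require the F distribution (for example from a statistics package).

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    if k < 2 or n <= k:
        raise ValueError("need at least two groups and more observations than groups")
    grand_mean = sum(sum(g) for g in groups) / n
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    if ss_within == 0:
        raise ValueError("no within-group variation; F is undefined")
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical per-nucleus copy counts for one gene in three replicate slides.
replicates = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
f_stat, df_b, df_w = one_way_anova_f(replicates)
```

A large F relative to the F distribution with (df_between, df_within) degrees of freedom would indicate that the replicates differ more than expected from within-slide variation alone.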
[00168] Highly Multiplex-FISH for In-situ Genomics
[00169] In various embodiments the inventors have developed a robust and quantitative single-slide hybridization assay utilizing a library of at least 50 locus-specific DNA sequence probes. As discussed above, the inventors have constructed probes having 5, 10, 15 and 20 genes obtained from BAC clone-derived DNAs and probes having 5 and 10 genes obtained by PCR. The inventors produced a standard recipe for optimally mixing fluorophore labels with double and triple bar-codes for single BAC clones/PCR products. Labeling of each gene was carried out by nick translation separately, and the products were then combined in a single tube, the probe mix. The combinatorially labeled DNA hybridization mixture (the probe mix) was tested in tumor cell lines and in cultured circulating tumor cells, and hybridization conditions were established for these cells.
[00170] The inventors have further optimized PCR-based probe construction by adding a sequence of additional nucleotides (a tail) at the 5' ends of amplimers in the first part of the PCR reaction. This tail prevents formation of primer dimers and allows massive reactions with multiple amplimers to be carried out simultaneously. The inventors set up a library of 10 genes and hybridized cultured tumor cells as well as formalin-fixed paraffin-embedded (FFPE) samples (FIG. 14). FIG. 14A shows the sequence-specific Tail-PCR probe tested in cultured circulating tumor cells (CTCs). FIG. 14B shows the sequence-specific probe tested in formalin-fixed paraffin-embedded glioblastoma multiforme (GBM) cancer samples. The probe contained the 10 genes listed in the table shown in FIG. 14C. The inventors detected similar count distributions for the ten genes tested with the PCR approach compared to BAC, demonstrating the sensitivity of the method and efficiency of hybridization.
[00171] The inventors have used the Zeiss-Elyra (Zeiss, Oberkochen, Germany), a laser confocal microscope, to capture volumetric renditions of the samples and to acquire multiplex image data that was further used to build the image analysis algorithm. Volumetric renderings of confocal images made it possible to determine the minimal pixel size, to rule out the possibility of overlap of spots in the z plane, and to determine the minimal distance necessary for image acquisition. The inventors have observed that a minimal z distance of 0.2 µm, within the limit of a widefield microscope, is sufficient to individualize the signals in the z axis, therefore supporting a shift towards a widefield-based platform.
[00172] The inventors have also experimented with shifting the imaging platform towards higher throughput slide processing and image acquisition platforms. For this purpose, slide preparation was carried out by a programmable walk-away robotic instrument, the VP2000 from Abbott Molecular (Des Plaines, IL), in order to decrease labor and laboratory costs. Automated tissue preparation showed the expected outcome with efficient hybridization (FIG. 15). FIG. 15A shows an FFPE specimen prepared with the robotic slide processor for breast cancer samples. FIG. 15B shows an FFPE specimen prepared with the robotic slide processor for GBM cancer samples. It was important to check the hybridization efficiency of formalin-fixed paraffin-embedded surgical biopsy specimens processed with the VP2000 and hybridized with the PCR-based FISH probe mix recognizing 10 genes, in order to support the feasibility of automation and high-volume workload platforms. Preparation of the slides as well as the probe showed efficiency comparable with manual slide processing and the BAC-derived probe. The sample was imaged with a widefield microscope and each wavelength was recognized with filter cubes.
[00173] Concerning higher throughput imaging platforms, images have been acquired with a laser confocal microscope with spectral imaging combined with linear unmixing, and the inventors have tested widefield optical systems in order to improve turnaround time. Widefield systems also bring the advantage of making more photons available for collection, which makes the signal of each dot brighter, though at the cost of resolution, especially in the z plane. The inventors foresaw that a perfect system that fits all needs does not exist on the market; therefore, the inventors tested systems that could provide an 'adjustable best fit'. The inventors tested multispectral platforms represented by the Vectra system from PerkinElmer (Waltham, MA) and filter cube-based systems represented by a whole slide image (WSI) scanner, the Pannoramic Confocal from 3DHistech (Budapest, Hungary), and the stand-alone capture station CytoVision® from Leica Biosystems (Buffalo Grove, IL). One unanticipated challenge with the multispectral Vectra system was the z-stacking limitation. Though the system was able to acquire the spectral profiles that are vital to segregate each fluorescent signal, it could not collect serial optical sections (z-stack), from which three-dimensional renderings are created and which allow identification of overlapping signals.
[00174] On the other hand, the whole slide image scanner was able to get serial sections of large areas for each channel (each wavelength). The scanner uses light-emitting diodes (LEDs) as the fluorescence light source. In theory, precise light intensity control and uniform illumination could be achieved, but in practice these parameters still need to be adjusted in order to get the best signal-to-noise ratio.
[00175] The CytoVision®, the stand-alone platform, was able to acquire images with suitable resolution and contrast, though it had a limited field of view compared to the WSI scanner (FIG. 16). Specifically, FIG. 16A shows a formalin-fixed paraffin-embedded (FFPE) specimen imaged with the Vectra multispectral imaging system, FIG. 16B shows an FFPE specimen imaged with the Pannoramic Confocal whole slide image scanner, and FIG. 16C shows an FFPE specimen imaged with the CytoVision® platform. The Vectra system cannot collect multiple z planes, precluding the evaluation of potentially overlapping gene signals. On the other hand, it provided a more precise distinction of each fluorescence channel. The whole slide image scanner provided multiple z planes of large fields of view, features that are appealing in a clinical setting. The CytoVision® platform can provide good signal-to-noise and resolution within one field of view.
[00176] FIG. 17A shows a formalin-fixed paraffin-embedded (FFPE) glioblastoma multiforme (GBM) case hybridized with a multiplex probe recognizing 10 genes and submitted to signal quantification. The signals are quantified in the graph shown in FIG. 17B.
[00177] The inventors continue applying point source detection algorithms (multi-channel spot detection), with sensitivity tuned to match the six channels, to evaluate FFPE samples. The inventors selected fields of tumor cells and submitted them to automated copy number counts for each color combination, representing each gene, with the results displayed as median copy number per cell. FFPE specimens showed a need for more efficient nuclear segmentation algorithms. FFPE specimens also presented highly variable spot intensities and noise levels, crosstalk between image channels, and variable background spectra and intensity (FIG. 18). FIGS. 18A and 18B show FFPE specimens of GBM cases illustrating the need to improve nuclear segmentation to focus the analysis on genomic content and away from other spurious features in FFPE samples.
[00178] The inventors started working with FFPE specimens of glioblastoma multiforme cases. As mentioned above, the inventors observed highly variable spot intensities and noise levels; therefore, the inventors will compare the present multi-channel spot detection, with sensitivity tuned to match the six channels, against a single-detection and spectral matching algorithm and select the approach with the best performance and most consistent results.
[00179] Turning to FIG. 24, an example 2400 of a system for multiplex labeling of a sample and gene copy number evaluation is shown in accordance with some embodiments of the disclosed subject matter. As shown in FIG. 24, a computing device 2410 can receive information regarding an image of a sample to which a plurality of fluorescently-labeled polynucleotide probes has been applied from a database and/or user interface 2402. In some embodiments, computing device 2410 can execute at least a portion of a system for multiplex labeling of a sample and gene copy number evaluation 2404 to identify a gene and determine a copy number of the identified gene based on data received from the database and/or user interface 2402. Additionally or alternatively, in some embodiments, computing device 2410 can communicate information about the image data received from the database and/or user interface 2402 to a server 2420 over a communication network 2406, which can execute at least a portion of system for multiplex labeling of a sample and gene copy number evaluation 2404 to identify a gene and determine a copy number of the identified gene. In some such embodiments, server 2420 can return information to computing device 2410 (and/or any other suitable computing device) indicative of an output of system for multiplex labeling of a sample and gene copy number evaluation 2404, such as a signal obtained from a sample that is imaged by the imaging system. This information may be transmitted and/or presented to a user (e.g. a researcher, an operator, a clinician, etc.) and/or may be stored (e.g. as part of a research database or a medical record associated with a subject).
[00180] In some embodiments, computing device 2410 and/or server 2420 can be any suitable computing device or combination of devices, such as a desktop computer, a laptop computer, a smartphone, a tablet computer, a wearable computer, a server computer, a virtual machine being executed by a physical computing device, etc. As described herein, system for multiplex labeling of a sample and gene copy number evaluation 2404 can present information about an identified gene, a copy number of the gene, and/or another output of system for multiplex labeling of a sample and gene copy number evaluation 2404, such as an image obtained from a sample by the imaging system to a user (e.g., researcher and/or physician).
[00181] In some embodiments, the imaging system can be any imaging system that is suitable for obtaining images for a system for multiplex labeling of a sample and gene copy number evaluation 2404. In some embodiments, the imaging system may be local to computing device 2410. For example, the imaging system may be integrated with computing device 2410 (e.g., computing device 2410 can be configured as part of a device for multiplex labeling of a sample and gene copy number evaluation). As another example, the imaging system may be connected to computing device 2410 by a cable, a direct wireless link, etc. so that computing device 2410 can control the imaging system remotely. Additionally or alternatively, in some embodiments, the imaging system can be located locally and/or remotely from computing device 2410, and can be in communication with computing device 2410 (and/or server 2420) via a communication network (e.g., communication network 2406).
[00182] In some embodiments, communication network 2406 can be any suitable communication network or combination of communication networks. For example, communication network 2406 can include a Wi-Fi network (which can include one or more wireless routers, one or more switches, etc.), a peer-to-peer network (e.g., a Bluetooth network), a cellular network (e.g., a 3G network, a 4G network, etc., complying with any suitable standard, such as CDMA, GSM, LTE, LTE Advanced, WiMAX, etc.), a wired network, etc. In some embodiments, communication network 2406 can be a local area network, a wide area network, a public network (e.g., the Internet), a private or semi-private network (e.g., a corporate or university intranet), any other suitable type of network, or any suitable combination of networks. Communications links shown in FIG. 24 can each be any suitable communications link or combination of communications links, such as wired links, fiber optic links, Wi-Fi links, Bluetooth links, cellular links, etc.
[00183] FIG. 25 shows an example 2500 of hardware that can be used to implement computing device 2410 and server 2420 in accordance with some embodiments of the disclosed subject matter. As shown in FIG. 25, in some embodiments, computing device 2410 can include a processor 2502, a display 2504, one or more inputs 2506, one or more communication systems 2508, and/or memory 2510. In some embodiments, processor 2502 can be any suitable hardware processor or combination of processors, such as a central processing unit, a graphics processing unit, etc. In some embodiments, display 2504 can include any suitable display devices, such as a computer monitor, a touchscreen, a television, etc. In some embodiments, inputs 2506 can include any suitable input devices and/or sensors that can be used to receive user input, such as a keyboard, a mouse, a touchscreen, a microphone, etc.
[00184] In some embodiments, communications systems 2508 can include any suitable hardware, firmware, and/or software for communicating information over communication network 2406 and/or any other suitable communication networks. For example, communications systems 2508 can include one or more transceivers, one or more communication chips and/or chip sets, etc. In a more particular example, communications systems 2508 can include hardware, firmware and/or software that can be used to establish a Wi-Fi connection, a Bluetooth connection, a cellular connection, an Ethernet connection, etc.
[00185] In some embodiments, memory 2510 can include any suitable storage device or devices that can be used to store instructions, values, etc., that can be used, for example, by processor 2502 to present content using display 2504, to communicate with server 2420 via communications system(s) 2508, etc. Memory 2510 can include any suitable volatile memory, non-volatile memory, storage, or any suitable combination thereof. For example, memory 2510 can include RAM, ROM, EEPROM, one or more flash drives, one or more hard disks, one or more solid state drives, one or more optical drives, etc. In some embodiments, memory 2510 can have encoded thereon a computer program for controlling operation of computing device 2410. In such embodiments, processor 2502 can execute at least a portion of the computer program to present content (e.g., images, user interfaces, graphics, tables, etc.), receive content from server 2420, transmit information to server 2420, etc.
[00186] In some embodiments, server 2420 can include a processor 2512, a display 2514, one or more inputs 2516, one or more communications systems 2518, and/or memory 2520. In some embodiments, processor 2512 can be any suitable hardware processor or combination of processors, such as a central processing unit, a graphics processing unit, etc. In some embodiments, display 2514 can include any suitable display devices, such as a computer monitor, a touchscreen, a television, etc. In some embodiments, inputs 2516 can include any suitable input devices and/or sensors that can be used to receive user input, such as a keyboard, a mouse, a touchscreen, a microphone, etc.
[00187] In some embodiments, communications systems 2518 can include any suitable hardware, firmware, and/or software for communicating information over communication network 2406 and/or any other suitable communication networks. For example, communications systems 2518 can include one or more transceivers, one or more communication chips and/or chip sets, etc. In a more particular example, communications systems 2518 can include hardware, firmware and/or software that can be used to establish a Wi-Fi connection, a Bluetooth connection, a cellular connection, an Ethernet connection, etc.
[00188] In some embodiments, memory 2520 can include any suitable storage device or devices that can be used to store instructions, values, etc., that can be used, for example, by processor 2512 to present content using display 2514, to communicate with one or more computing devices 2410, etc. Memory 2520 can include any suitable volatile memory, non-volatile memory, storage, or any suitable combination thereof. For example, memory 2520 can include RAM, ROM, EEPROM, one or more flash drives, one or more hard disks, one or more solid state drives, one or more optical drives, etc. In some embodiments, memory 2520 can have encoded thereon a server program for controlling operation of server 2420. In such embodiments, processor 2512 can execute at least a portion of the server program to transmit information and/or content (e.g., information regarding the virtual lens, the desired intensity pattern, the modified hologram, any data collected from a sample that is illuminated, a user interface, etc.) to one or more computing devices 2410, receive information and/or content from one or more computing devices 2410, receive instructions from one or more devices (e.g., a personal computer, a laptop computer, a tablet computer, a smartphone, etc.), etc.
[00189] In some embodiments, any suitable computer readable media can be used for storing instructions for performing the functions and/or processes described herein. For example, in some embodiments, computer readable media can be transitory or non-transitory. For example, non-transitory computer readable media can include media such as magnetic media (such as hard disks, floppy disks, etc.), optical media (such as compact discs, digital video discs, Blu-ray discs, etc.), semiconductor media (such as RAM, Flash memory, electrically programmable read only memory (EPROM), electrically erasable programmable read only memory (EEPROM), etc.), any suitable media that is not fleeting or devoid of any semblance of permanence during transmission, and/or any suitable tangible media. As another example, transitory computer readable media can include signals on networks, in wires, conductors, optical fibers, circuits, or any suitable media that is fleeting and devoid of any semblance of permanence during transmission, and/or any suitable intangible media.
[00190] In some embodiments, the optical signals are detected by photodiodes. It should be recognized that any opto-electronic conversion device, including but not limited to photodetectors, photodiodes, line-scan and two-dimensional cameras, and photodiode arrays, can be used to perform this detection function. It should be noted that, as used herein, the term mechanism can encompass hardware, software, firmware, or any suitable combination thereof.
[00191] It will be apparent to those skilled in the art that numerous changes and modifications can be made in the specific embodiments of the invention described above without departing from the scope of the invention. Accordingly, the whole of the foregoing description is to be interpreted in an illustrative and not in a limitative sense.

Claims

What is claimed is:
1. A method for multiplex labeling of a sample and gene copy number evaluation, comprising:
providing a plurality of fluorescently-labeled polynucleotide probes, each of the plurality of fluorescently-labeled polynucleotide probes being directed to a different polynucleotide and being labeled with a distinct combination of fluorophores selected from a plurality of fluorophores;
applying the plurality of fluorescently-labeled polynucleotide probes to a sample;
obtaining an image of the sample comprising emissions from the plurality of fluorophores;
analyzing the image to identify a location within the sample having a group of fluorophores of the plurality of fluorophores associated with the location;
identifying a gene associated with the location based on identifying the location within the sample having the group of fluorophores; and
determining a copy number of the identified gene.
2. The method of claim 1, wherein determining the copy number of the identified gene further comprises:
determining the copy number of the identified gene based on determining an intensity level of the group of fluorophores associated with the location.
3. The method of any one of claims 1 or 2, wherein the image comprises a plurality of nuclei associated with a respective plurality of cells in the sample.
4. The method of any one of claims 1-3, wherein obtaining the image of the sample further comprises:
obtaining the image of the sample using widefield microscopy.
5. The method of any one of claims 1-3, wherein obtaining the image of the sample further comprises:
obtaining the image of the sample using confocal microscopy.
6. The method of any one of claims 1-5, wherein analyzing the image further comprises:
analyzing the image using linear unmixing.
7. The method of claim 6, wherein linear unmixing is performed relative to a reference spectrum.
8. The method of any one of claims 1-7, wherein analyzing the image to identify a location within the sample having a group of fluorophores associated with the location further comprises:
identifying at least two fluorophores of the group of fluorophores in the image, and
determining that the at least two fluorophores are colocalized if the at least two fluorophores are no more than a particular spatial distance apart.
9. The method of claim 8, wherein the particular spatial distance is 0.24 µm.
10. The method of any one of claims 1-9, wherein applying the plurality of fluorescently-labeled polynucleotide probes to the sample further comprises:
applying the plurality of fluorescently-labeled polynucleotide probes to the sample using fluorescence in situ hybridization (FISH).
11. The method of any one of claims 1-10, wherein each of the plurality of fluorescently-labeled polynucleotide probes comprises at least two different fluorophores.
12. The method of any one of claims 1-11, wherein a particular fluorescently-labeled polynucleotide probe of the plurality of fluorescently-labeled polynucleotide probes comprises a plurality of copies of the particular fluorescently-labeled polynucleotide probe directed to a particular gene,
wherein a first portion of the plurality of copies of the particular fluorescently-labeled polynucleotide probe are labeled with a first fluorophore of the plurality of fluorophores, and
wherein a second portion of the plurality of copies of the particular fluorescently-labeled polynucleotide probe different from the first portion are labeled with a second fluorophore of the plurality of fluorophores different from the first fluorophore, and
wherein applying the plurality of fluorescently-labeled polynucleotide probes to a sample further comprises:
applying the first portion and the second portion of the plurality of copies of the particular fluorescently-labeled polynucleotide probe to the sample,
wherein analyzing the image to identify a location within the sample having a group of fluorophores of the plurality of fluorophores associated with the location further comprises:
analyzing the image to identify the location within the sample having the first fluorophore and the second fluorophore, and
wherein identifying a gene associated with the location based on identifying the location within the sample having the group of fluorophores further comprises:
identifying the particular gene based on identifying the location within the sample having the first fluorophore and the second fluorophore.
13. The method of any one of claims 1-12, wherein the plurality of fluorescently-labeled polynucleotide probes comprises a first probe directed to a first gene and a second probe directed to a second gene different from the first gene,
wherein the first probe comprises a first group of fluorophores of the plurality of fluorophores and the second probe comprises a second group of fluorophores of the plurality of fluorophores different from the first group of fluorophores.
14. The method of claim 13, wherein analyzing the image to identify a location within the sample having a group of fluorophores of the plurality of fluorophores associated with the location further comprises:
analyzing the image to identify the first group of fluorophores associated with a first location,
analyzing the image to identify the second group of fluorophores associated with a second location different from the first location, and
wherein identifying a gene associated with the location based on identifying the location within the sample having the group of fluorophores further comprises:
identifying the first gene associated with the first location based on the first location including the first group of fluorophores, and
identifying the second gene associated with the second location based on the second location including the second group of fluorophores.
15. The method of any one of claims 1-14, wherein the plurality of fluorescently-labeled polynucleotide probes comprise at least one of bacterial artificial chromosome (BAC) clones, PCR-generated DNA, or synthetic DNA.
16. An apparatus for multiplex labeling of a sample and gene copy number evaluation, comprising:
a processor in communication with an imaging system, the processor to:
obtain an image of a sample from the imaging system,
the image comprising emissions from a plurality of fluorophores associated with the sample, and
the sample comprising a plurality of fluorescently-labeled polynucleotide probes applied to the sample, each of the plurality of fluorescently-labeled polynucleotide probes being directed to a different polynucleotide and being labeled with a distinct combination of fluorophores selected from the plurality of fluorophores,
analyze the image to identify a location within the sample having a group of fluorophores of the plurality of fluorophores associated with the location,
identify a gene associated with the location based on identifying the location within the sample having the group of fluorophores, and
determine a copy number of the identified gene.
17. The apparatus of claim 16, wherein the processor, when determining the copy number of the identified gene, is further to:
determine the copy number of the identified gene based on determining an intensity level of the group of fluorophores associated with the location.
18. The apparatus of any one of claims 16 or 17, wherein the image comprises a plurality of nuclei associated with a respective plurality of cells in the sample.
19. The apparatus of any one of claims 16-18, wherein the imaging system is a widefield microscopy system, and
wherein the processor, when obtaining the image of the sample from the imaging system, is further to:
obtain the image of the sample from the widefield microscopy system.
20. The apparatus of any one of claims 16-19, wherein the imaging system is a confocal microscopy system, and
wherein the processor, when obtaining the image of the sample from the imaging system, is further to:
obtain the image of the sample from the confocal microscopy system.
21. The apparatus of any one of claims 16-20, wherein the processor, when analyzing the image, is further to:
analyze the image using linear unmixing.
22. The apparatus of claim 21, wherein linear unmixing is performed relative to a reference spectrum.
23. The apparatus of any one of claims 16-22, wherein the processor, when analyzing the image to identify a location within the sample having a group of fluorophores associated with the location, is further to:
identify at least two fluorophores of the group of fluorophores in the image, and
determine that the at least two fluorophores are colocalized if the at least two fluorophores are no more than a particular spatial distance apart.
24. The apparatus of claim 23, wherein the particular spatial distance is 0.24 µm.
25. The apparatus of any one of claims 16-24, wherein the plurality of fluorescently-labeled polynucleotide probes are applied to the sample using fluorescence in situ hybridization (FISH).
26. The apparatus of any one of claims 16-25, wherein each of the plurality of fluorescently-labeled polynucleotide probes comprises at least two different fluorophores.
27. The apparatus of any one of claims 16-26, wherein a particular fluorescently-labeled polynucleotide probe of the plurality of fluorescently-labeled polynucleotide probes comprises a plurality of copies of the particular fluorescently-labeled polynucleotide probe directed to a particular gene,
wherein a first portion of the plurality of copies of the particular fluorescently-labeled polynucleotide probe are labeled with a first fluorophore of the plurality of fluorophores, and
wherein a second portion of the plurality of copies of the particular fluorescently-labeled polynucleotide probe different from the first portion are labeled with a second fluorophore of the plurality of fluorophores different from the first fluorophore,
wherein the first portion and the second portion of the plurality of copies of the particular fluorescently-labeled polynucleotide probe are applied to the sample, and
wherein the processor, when analyzing the image to identify a location within the sample having a group of fluorophores of the plurality of fluorophores associated with the location, is further to:
analyze the image to identify the location within the sample having the first fluorophore and the second fluorophore, and
wherein the processor, when identifying a gene associated with the location based on identifying the location within the sample having the group of fluorophores, is further to:
identify the particular gene based on identifying the location within the sample having the first fluorophore and the second fluorophore.
28. The apparatus of any one of claims 16-27, wherein the plurality of fluorescently-labeled polynucleotide probes comprises a first probe directed to a first gene and a second probe directed to a second gene different from the first gene,
wherein the first probe comprises a first group of fluorophores of the plurality of fluorophores and the second probe comprises a second group of fluorophores of the plurality of fluorophores different from the first group of fluorophores.
29. The apparatus of claim 28, wherein the processor, when analyzing the image to identify a location within the sample having a group of fluorophores of the plurality of fluorophores associated with the location, is further to:
analyze the image to identify the first group of fluorophores associated with a first location,
analyze the image to identify the second group of fluorophores associated with a second location different from the first location, and
wherein the processor, when identifying a gene associated with the location based on identifying the location within the sample having the group of fluorophores, is further to:
identify the first gene associated with the first location based on the first location including the first group of fluorophores, and
identify the second gene associated with the second location based on the second location including the second group of fluorophores.
30. The apparatus of any one of claims 16-29, wherein the plurality of fluorescently-labeled polynucleotide probes comprise at least one of bacterial artificial chromosome (BAC) clones, PCR-generated DNA, or synthetic DNA.
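By way of illustration only, the analysis recited in the claims (colocalizing fluorophores within a particular spatial distance, identifying a gene from the resulting group of fluorophores, and determining a copy number) might be sketched as follows. This is a minimal sketch under stated assumptions and is not the disclosed implementation: the fluorophore names, the gene names, the `CODEBOOK` mapping, and the functions `colocalize` and `copy_numbers` are hypothetical; spots are assumed to have already been detected in a single nucleus and reported as (fluorophore, x, y) in micrometers; each probe is assumed to carry exactly two fluorophores; and copy number is estimated by counting colocalized groups rather than from an intensity level. The 0.24 µm default is taken as the distance recited in claims 9 and 24.

```python
from collections import Counter
from itertools import combinations
from math import dist

# Hypothetical codebook: each gene is assigned a distinct pair of fluorophores.
CODEBOOK = {
    frozenset({"A488", "A594"}): "GENE_A",
    frozenset({"A488", "A647"}): "GENE_B",
    frozenset({"A594", "A647"}): "GENE_C",
}


def colocalize(spots, max_distance_um=0.24):
    """Pair spots of different fluorophores lying within max_distance_um.

    spots: list of (fluorophore, x_um, y_um) tuples.
    Returns a list of (fluorophore_combination, (x_um, y_um)) groups.
    Pairing is greedy, closest pairs first, and each spot is used at most once.
    """
    candidates = sorted(
        (dist(a[1:], b[1:]), i, j)
        for (i, a), (j, b) in combinations(enumerate(spots), 2)
        if a[0] != b[0] and dist(a[1:], b[1:]) <= max_distance_um
    )
    used, groups = set(), []
    for _, i, j in candidates:
        if i in used or j in used:
            continue
        used.update((i, j))
        a, b = spots[i], spots[j]
        # Report the midpoint of the two spots as the group's location.
        location = ((a[1] + b[1]) / 2, (a[2] + b[2]) / 2)
        groups.append((frozenset({a[0], b[0]}), location))
    return groups


def copy_numbers(spots, codebook=CODEBOOK, max_distance_um=0.24):
    """Count colocalized groups per gene; unknown combinations are ignored."""
    counts = Counter()
    for combo, _location in colocalize(spots, max_distance_um):
        gene = codebook.get(combo)
        if gene is not None:
            counts[gene] += 1
    return dict(counts)
```

In this sketch, a nucleus containing two A488/A594 groups and one A488/A647 group would be reported as two copies of the hypothetical GENE_A and one copy of GENE_B, while an isolated single-channel spot contributes nothing. A greedy closest-first pairing is used so that a spot shared between two nearby candidates is assigned only once; other assignment strategies are equally possible.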
PCT/US2019/058126 2018-10-25 2019-10-25 Highly multiplexed fluorescence in situ hybridization (fish) platform for gene copy number evaluation WO2020086992A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201862750685P 2018-10-25 2018-10-25
US62/750,685 2018-10-25

Publications (1)

Publication Number Publication Date
WO2020086992A1 true WO2020086992A1 (en) 2020-04-30

Family

ID=70331679

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2019/058126 WO2020086992A1 (en) 2018-10-25 2019-10-25 Highly multiplexed fluorescence in situ hybridization (fish) platform for gene copy number evaluation

Country Status (1)

Country Link
WO (1) WO2020086992A1 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100113289A1 (en) * 2008-10-30 2010-05-06 Bluegnome Limited Method and system for non-competitive copy number determination by genomic hybridization DGH
US20140031243A1 (en) * 2010-03-08 2014-01-30 California Institute Of Technology Multiplex detection of molecular species in cells by super-resolution imaging and combinatorial labeling

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
HUBER ET AL.: "Rapid micro fluorescence in situ hybridization in tissue sections", BIOMICROFLUIDICS, vol. 12, no. 4, 30 May 2018 (2018-05-30), pages 042212, XP081238585 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116542978A (en) * 2023-07-06 2023-08-04 珠海圣美生物诊断技术有限公司 Quality detection method and device for FISH probe
CN116542978B (en) * 2023-07-06 2023-10-20 珠海圣美生物诊断技术有限公司 Quality detection method and device for FISH probe

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19875588

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19875588

Country of ref document: EP

Kind code of ref document: A1