WO2012041333A1 - Automated imaging, detection and grading of objects in cytological samples - Google Patents


Info

Publication number
WO2012041333A1
Authority
WO
WIPO (PCT)
Prior art keywords
threshold
chromaticity
pixels
minimum threshold
maximum threshold
Prior art date
Application number
PCT/DK2011/050374
Other languages
French (fr)
Inventor
Niels Taekker Foged
Michael Friis Lippert
Johan DORÉ HANSEN
Steen FROST TOFTHØJ
Thomas Ebstrup
Kristian Almstrup
Original Assignee
Visiopharm A/S
Rigshospitalet
Priority date
Filing date
Publication date
Application filed by Visiopharm A/S and Rigshospitalet
Publication of WO2012041333A1

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 - Image analysis
    • G06T7/0002 - Inspection of images, e.g. flaw detection
    • G06T7/0012 - Biomedical image inspection
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 - Image analysis
    • G06T7/40 - Analysis of texture
    • G06T7/41 - Analysis of texture based on statistical description of texture
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 - Scenes; Scene-specific elements
    • G06V20/60 - Type of objects
    • G06V20/69 - Microscopic objects, e.g. biological cells or cellular parts
    • G06V20/695 - Preprocessing, e.g. image segmentation
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/10 - Image acquisition modality
    • G06T2207/10056 - Microscopic image
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/20 - Special algorithmic details
    • G06T2207/20036 - Morphological image processing
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/20 - Special algorithmic details
    • G06T2207/20084 - Artificial neural networks [ANN]
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/30 - Subject of image; Context of image processing
    • G06T2207/30004 - Biomedical image processing
    • G06T2207/30024 - Cell structures in vitro; Tissue sections in vitro

Definitions

  • the present invention relates to methods for the imaging, detection, and grading of objects in cytological samples or digital representations thereof, as well as a microscope-, slide scanner- or flow cell-based system and a computer readable medium therefor.
  • Cytopathology is the study of cellular disease and the use of cellular changes for the diagnosis, prognosis, screening and/or treatment prediction of disease.
  • Microscopy is a central and traditional method for cytology by which objects of interest in cell biology or cytopathology can be studied through magnification and improved visual representation of the objects.
  • the objects are by definition individual cells or fractions or clusters thereof, but could also include e.g. connective tissue, viruses, or even non-biological constituents.
  • Visual examination in cytological microscopy is traditionally done by the operator at the microscope, but the recent availability of slide scanners ("virtual microscopes") has allowed the observations to be done off-line, i.e. at a location away from the microscope, on digital representations of the magnified specimen presented on a computer monitor.
  • Histology is a special type of cytology, in which microscopy is used to investigate a thin slice of a tissue isolated by biopsy, and preserved and handled by methods such as fixation, embedding, and sectioning.
  • the tissue section is typically mounted on a glass or plastic slide and inspected by transmitted light microscopy either unstained or after a staining procedure emphasizing particular microanatomic and molecular features of interest.
  • the cytological sample is a suspension of cells originating from e.g., a body fluid, a bodily surface or tissue from which cells are segregated or aspirated and then suspended, or a cell culture.
  • transmitted light microscopy is a central method for investigation, and the cell sample is typically prepared by making a smear, a cytospin or an imprint on the surface of a glass or plastic slide, before typically being subjected to fixation and staining similar to the methods used for histology.
  • the time required for high-magnification microscopy of large surface areas of samples is so long that the procedure becomes costly and inefficient.
  • CIS is an abbreviation for carcinoma-in-situ.
  • the present invention relates to new image analysis methods for use in the processing of digital images acquired by slide microscopy or slide scanning of cytological samples, wherein the new methods are capable of automated detection of objects resembling particular cells, structures or fractions thereof, and of automated grading, scoring and/or ranking of this resemblance.
  • the present methods provide tools for enhancing and detecting objects in digital representations of stained tissue samples or cell samples, thereby facilitating subsequent processing of the representation, including quantification of the staining and grading of the enhanced objects.
  • the present methods also provide tools for arranging the images of objects in a cytological sample in a systematic order for presentation, or for selecting the most appropriate coordinates of the sample for further image acquisition and analysis, or for manual review, contributing to documentation of the status of the sample.
  • the invention relates to a method for assisting the diagnosing, prognostication or monitoring of testicular carcinoma, and/or for assisting the prediction of the outcome of treatment of testicular carcinoma in an individual, based on a cytological sample from said individual, which has been tagged with one or more tags capable of tagging one or more distinctive cell types characteristic of testicular carcinoma, said method comprising the steps of:
  • the present invention allows for an automatic method for assisting the diagnosing, prognostication or monitoring of testicular carcinoma, and thereby the present invention allows for the first time a realistic method for screening for testicular carcinoma in groups of men; accordingly, the present invention also relates to the use of the method for screening a population for testicular carcinoma.
  • the present document can only reproduce the images as black-and-white (B/W) copies or greyscale representations, not in colour.
  • the original digital colour images are reproduced as B/W-copies, and therefore will not provide the reader with all the information available in the original digital colour images.
  • the segmented images are typically greyscale representations or binary representations.
  • Figure 1a shows a B/W photocopy of an original digital colour representation of a microscopically magnified area of a semen sample on a glass slide after the double staining procedure.
  • Figure 1b shows a greyscale representation obtained by transforming the RGB colours of each pixel in the original digital colour image represented in Figure 1a into a red chromaticity value, so that pixels with high red chromaticity values are shown in white/light grey and pixels with low red chromaticity values are shown in black/dark grey.
  • Figure 1c shows a greyscale representation obtained by transforming the RGB colours of each pixel in the original digital colour image represented in Figure 1a into a blue chromaticity value, so that pixels with high blue chromaticity values are shown in white/light grey and pixels with low blue chromaticity values are shown in black/dark grey.
  • Figure 1d is a binomial representation of Figure 1b, made by converting each pixel with a red chromaticity value above a defined threshold, T_R, to a black pixel, and all other pixels to white pixels. Subsequently, black pixels were converted to white if the intensity of the corresponding pixel in the original digital colour image represented in Figure 1a was below a defined intensity threshold, T_I.
  • Figure 1e is a binomial representation of Figure 1c, made by converting each pixel with a blue chromaticity value above a defined threshold, T_B, to a black pixel, and all other pixels to a white pixel. Subsequently, black pixels were converted to white if the intensity of the corresponding pixel in the original digital colour image represented in Figure 1a was below a defined intensity threshold, T_I.
  • Figure 1f is a binomial representation combining data from Figures 1b and 1c, made to identify purplish pixels by converting each pixel with a red chromaticity value above a defined threshold, T_R2, and a blue chromaticity value above a defined threshold, T_B2, to a black pixel, and all other pixels to a white pixel. Subsequently, black pixels were converted to white if the intensity of the corresponding pixel in the original digital colour image represented in Figure 1a was below a defined intensity threshold, T_I.
  • Figure 1g outlines the Four Neighbour Connectivity rule, where the "X" marks the pixel currently being investigated, and the four grey pixels (marked "+") are the pixels that are defined to be connected to this particular pixel. Though many other definitions are possible, a cluster of pixels is often defined according to the Four Neighbour Connectivity rule.
  • Figure 1h is a binomial representation based on data from Figure 1d, in which pixels are converted from black to white if they belong to a cluster of black pixels with an area below a defined minimum reddish profile area-threshold, A_R,min, or an area above a defined maximum reddish profile area-threshold, A_R,max.
  • the remaining black pixels represent clusters of reddish pixels in the original digital colour image with an admissible profile area. In this example, there are four reddish clusters, numbered 1 to 4.
  • Figure 1i is a binomial representation based on combining data from Figures 1e and 1f, so that pixels which are black in either or both of these figures become black, and only pixels which are white in both of these figures become white.
  • the black pixels represent pixels which are bluish and/or purplish in the original digital colour image represented in Figure 1a.
  • Figure 1j is a binomial representation based on data from Figure 1i, in which pixels are converted from black to white if they belong to a cluster of black pixels with an area below a defined minimum bluish/purplish profile area-threshold, A_B/P,min, or an area above a defined maximum bluish/purplish profile area-threshold, A_B/P,max.
  • the remaining black pixels represent clusters of bluish and/or purplish pixels in the original digital colour image with an admissible profile area.
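  • The colour segmentation behind Figures 1d-1f can be sketched in a few lines of NumPy; a minimal sketch, where the function name and all threshold values are illustrative placeholders rather than values taken from the patent:

```python
import numpy as np

def colour_masks(rgb, t_r=0.50, t_b=0.40, t_r2=0.35, t_b2=0.34, t_i=150):
    """Mirror the logic of Figures 1d-1f: reddish, bluish and purplish masks
    from chromaticity thresholds, with dark pixels suppressed by an intensity
    gate. All threshold defaults here are illustrative placeholders."""
    rgb = rgb.astype(float)
    intensity = rgb.sum(axis=2)                         # I = R + G + B
    chrom = rgb / np.maximum(intensity, 1)[..., None]   # chromaticity per channel
    c_red, c_blue = chrom[..., 0], chrom[..., 2]
    bright = intensity >= t_i                           # intensity threshold T_I
    reddish = (c_red > t_r) & bright                    # Figure 1d
    bluish = (c_blue > t_b) & bright                    # Figure 1e
    purplish = (c_red > t_r2) & (c_blue > t_b2) & bright  # Figure 1f
    return reddish, bluish, purplish
```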
  • Figure 2 is an overview corresponding to Table 1, including B/W photocopies of exemplifying colour images of typical specific parameters measured for the relevant pixel clusters and individual pixels representing objects in the original digital image.
  • the majority of these specific parameters relate to distributions of pixel colours and positions, but they could be any descriptive parameter, such as circumference, circularity and other shape factors.
  • Figure 3 corresponds to Table 2 and shows B/W photocopies of colour image examples corresponding to manual grading distributions for the 2,437 relevant objects selected by expert assessors in 506 whole slide semen cytospin scanning images from men in routine subfertility work-up. These objects were used for developing and optimizing the classification tree (Table 3), which ensures the automated grading of CIS cell-like objects.
  • Figure 4 shows B/W photocopies of eight examples of colour images of the archetypical CIS testis cell as it may appear after being released into the seminal fluid, isolated by cytospinning, and stained for intrinsic phosphatase activity (in this example: blue cytoplasm) and by immunocytochemistry for AP-2γ (in this example: red nucleus).
  • Diagnosing means the process of recognizing a disease or condition by its signs or symptoms in an individual. Typically, the individual being diagnosed will not have had a similar diagnosis previously, but in some cases a similar or identical diagnosis will be made after the patient was treated for the first outbreak of the disease or condition. For example, a patient diagnosed with unilateral CIS of the testis may be treated and appear healthy for a period, and then subsequently be diagnosed with CIS in the other testicle.
  • Prognosticating means the process of describing the likely progression and outcome of the disease or condition in a patient. Typically, the described outcome would assume no treatment or standard treatment of the patient, and be based on medical experience from individuals with a similar general condition.
  • predicting the outcome of treatment means the process of foreseeing the consequences of treating a patient in a specific way before commencing the treatment. Typically, the prediction will be based on analyses indicating the individual patient's response to the relevant medicaments, including chances of beneficial action and risk of side effects. For example, analysis for resistance to a particular chemotherapy of isolated cancer cells may be useful before choosing the medication for a patient.
  • monitoring the effect of a treatment means the process of regularly checking a patient during and after treatment for possible remission.
  • the monitoring will include sensitive analyses for the disease or condition being treated, since even an early, minor and normally rather unspecific sign may be a warning of an insufficient treatment result.
  • a patient who is, or has been, undergoing treatment for CIS testis may be followed by regular analyses for presence of CIS cells in ejaculates.
  • population screening means the process of systematically analyzing a group of individuals for the presence of a disease. Typically, the analyses will be able to identify individuals with a non-symptomatic stage of the disease.
  • the screening may be offered to individuals belonging to groups having an increased risk of the disease, or to even wider groups merely defined by their age, race and/or gender. For example, men experiencing infertility have a higher risk of CIS testis, and since they will be prone to analysis of semen quality anyway, they would be likely candidates for a screening for CIS cells in their ejaculate.
  • Cell does not deviate from the meaning understood by a person skilled in the art, but it is important that it is not limited to intact cells, but also includes fractions thereof still having a morphology resembling a cell.
  • Cytological material can undergo degradation, such as enzymatic lysis, either when still in the body or after isolation. Also cytospinning and subsequent handling, such as staining, may lead to a partial degradation of cells.
  • the CIS cell can be degraded due to the harsh environment in the semen fluid, and in particular its cytoplasm may not appear intact when inspected by microscopy.
  • nucleus does not deviate from the meaning understood by a person skilled in the art i.e., an organelle containing almost all of the cell's genetic material.
  • the focus in the current context is on various staining techniques used for tissue sections or cytological samples, which may lead to staining of the nucleus either generally for eukaryotic cells, e.g. by hematoxylin, or specifically for the nucleus of certain cells, e.g. by immunohistochemistry for a nuclear antigen, such as AP-2γ in the nucleus of CIS testis cells.
  • Cytoplasm does not deviate from the meaning understood by a person skilled in the art, i.e. the entire contents of a eukaryotic cell excluding the nucleus, bounded by the plasma membrane.
  • the focus in the current context is on various staining techniques used for tissue sections or cytological samples, which may lead to staining of the cytoplasm either generally for eukaryotic cells, e.g. by eosin, or specifically for the cytoplasm of certain cells, e.g. by staining for intrinsic enzyme activity, such as phosphatase activity in the cytoplasm of CIS testis cells.
  • object in relation to a cytological sample typically means a structure being a cell or resembling a cell. More generally, as used herein objects are the structures, which are to be automatically detected and graded, when represented in digital images. Beyond cells they could be e.g., organelles, cell fragments, matrix or crystals.
  • Tagging in relation to a cytological sample means to perform one or more chemical reaction steps in order to bind or associate molecules to objects in the sample, in such a way that these molecules can be indirectly or directly identified by microscopy or other detection methods.
  • the term tagging is used herein interchangeably with the term "staining".
  • the tagging can be by a specific antibody raised against a target antigen characteristic for the object to be identified.
  • the tags must be indirectly labeled by a detection reaction typically including either a fluorophore or an extrinsic enzyme which is able to convert a soluble substrate to a precipitated coloured product, as is the case for fluorescence and chromogenic immunocytochemistry, respectively.
  • the tagging can be by a substrate for an intrinsically active target enzyme characteristic for the object to be identified.
  • the substrate tag directly labels the object when it is converted into a precipitated coloured product.
  • It should be noted that tagging is typically an equilibrium reaction, and that the tagging therefore can be almost complete or just partial, depending on the efficiency of tagging the particular target.
  • Digital image As used herein, the expression "digital image” does not deviate from the meaning understood by a person skilled in the art, and it includes both very large images typically derived from slide scanners, and smaller images acquired by digital cameras positioned on microscopes.
  • Pixel The term pixel is generally used for the smallest single component of a digital image.
  • the term includes "printed pixels", pixels carried by electronic signals, or represented by digital values, or pixels on a display device, or pixels in a digital camera (photosensor elements). Synonyms of "pixels" are picture element, pel, sample, byte, bit, dot, spot. The person skilled in the art will recognize that the list of synonyms of pixel is not complete and that other terms for pixel are also covered by the invention.
  • Cluster in relation to pixels means a group of pixels belonging to the same segment (or perhaps one of several joined segments), inter-positioned according to a predefined neighbour relationship, such as the Four Neighbour Connectivity rule (see Figure 1g).
  • clusters of pixels are identified by segmentation of the digital image according to spectral, spatial, contextual, and morphological information, and subsequently graded for their degree of resemblance to digital representations of objects of interest, such as CIS cells.
  • Transformation Given a digital image as input, a transformation outputs an image based on a per pixel arithmetic operation. Thus, each pixel in the output image is calculated by performing a given arithmetic operation on the corresponding pixel in the input image.
  • An example of a transformation could be a basic intensity calculation, given a three band image with each pixel containing a red, green and blue value as input. Each pixel value in the transformed image is calculated as the sum of the red, green and blue values in the corresponding input pixel divided by three (the number of colour values).
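  • As an illustration, the intensity transformation just described might be written as follows (a sketch; the array layout, an H x W x 3 RGB image, is an assumption):

```python
import numpy as np

def intensity_transform(rgb):
    """Per-pixel transformation: each output pixel is (R + G + B) / 3."""
    return rgb.astype(float).mean(axis=2)
```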
  • Spectral information includes any quantitative colour and intensity information, and values derived from multiple colour information, such as the hue calculated from the red, green and blue channels of individual or clusters of digital image pixels.
  • Spatial, contextual and morphological information includes any quantitative information derived from the positions, neighbourhood relationships and shapes of individual or clusters of digital image pixels.
  • a normalization changes the range of pixel values in an image and is a special case of a transformation.
  • An example of a normalization could be a basic stretching of pixel values from the minimum pixel value in the image to the maximum pixel value in the image.
  • each pixel value in the transformed image is calculated by subtracting the minimum value from the corresponding input pixel value and dividing this by the maximum value minus the minimum value.
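  • A minimal sketch of this min-max stretch, assuming a NumPy array input:

```python
import numpy as np

def minmax_normalize(img):
    """Normalization: stretch pixel values so the image minimum maps to 0
    and the image maximum maps to 1."""
    img = img.astype(float)
    lo, hi = img.min(), img.max()
    return (img - lo) / (hi - lo) if hi > lo else np.zeros_like(img)
```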
  • Filtering Given a digital image as input, a filtering outputs an image based on a given per-pixel calculation. Each pixel in the output image is calculated by performing an arithmetic operation on a given number of pixels in the input image. These pixels are defined by a given filter, which can be envisioned as a small image denoting which pixels are to be included in the calculation, relative to the pixel the calculation is performed for.
  • An example of a filtering could be a mean filter, using a 3 by 3 pixel filter centred on the pixel the calculation is performed for.
  • the value of each pixel in the output image is calculated by summing up the values in the input image under the filter and dividing by the number of pixels under the filter (for example 9).
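  • The 3 by 3 mean filter described above could, for instance, be expressed with SciPy's uniform_filter (a sketch, not the patent's own implementation; edge handling is an assumption):

```python
from scipy.ndimage import uniform_filter

def mean_filter_3x3(img):
    """Filtering: each output pixel is the mean of the 3x3 neighbourhood
    (9 pixels) centred on the corresponding input pixel."""
    return uniform_filter(img.astype(float), size=3)
```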
  • Segmentation When applied to digital images, the actual goal of a segmentation is to divide images, pixel clusters or individual pixels into segments. This segmentation is performed by a given algorithm and is based on a given number of descriptive parameters per image, pixel cluster or individual pixel. As used herein, "classification" is not used interchangeably with "segmentation", since "segmentation" is a term used for image transformation, whereas "classification" is used only for grading images, pixel clusters, pixels and objects.
  • Thresholding is a subset of segmentation. For example, it could be a basic threshold on the green band, given a threshold value and a three-band image with each pixel containing a red, green and blue value as input.
  • a preferred segmentation criterion for the invention is: each pixel in the output image is set to 1 if the corresponding pixel in the input image has a green value higher than the threshold value; otherwise it is set to 0.
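  • A sketch of this green-band criterion (assuming an H x W x 3 NumPy array):

```python
import numpy as np

def green_threshold(rgb, threshold):
    """Segmentation by thresholding: 1 where the green value exceeds the
    threshold, 0 elsewhere."""
    return (rgb[..., 1] > threshold).astype(np.uint8)
```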
  • Alkaline phosphatase Placental-like alkaline phosphatase is used herein interchangeably with alkaline phosphatase and ALPPL2. The shorter term phosphatase is also used, particularly when describing staining methods for the intrinsic enzyme activity in the cytoplasm of CIS testis cells.
  • Distinctive cell is a cell which is capable of being tagged by one or more tags as mentioned herein.
  • a distinctive cell type can further be a cell type with characteristics of a testicular carcinoma. Among distinctive cells are for example archetypical distinctive cells.
  • Archetypical distinctive cells are selected examples of cells showing the typical features of that distinctive cell type. The selection is typically done by experienced experts in the relevant field. E.g., certain experts in the field of testicular cancer will know how an archetypical CIS testis cell looks when it has been released into the seminal fluid, isolated by cytospinning and stained for AP-2γ and intrinsic phosphatase activity.
  • Chromaticity An objective specification of the quality of a color regardless of its luminance, that is, as determined by its hue and colorfulness.
  • CIS Abbreviation for 'carcinoma in situ'. An example is carcinoma in situ of the testis.
  • Classification tree A classification tree is used to predict/classify membership of cases or objects in the classes of a categorical dependent variable from their measurements on one or more predictor variables (parameters). The synonyms regression tree and decision tree are used interchangeably with classification tree in this document.
  • Cross validation is the practice of partitioning the available data into multiple subsamples such that one or more of the subsamples is used to fit (train) a model (herein a classification tree), with the remaining subsamples being used to test how well the classification tree performs.
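  • A minimal sketch of such a train/test partitioning, here with scikit-learn's KFold and a classification tree; the feature matrix X and grade labels y are hypothetical:

```python
import numpy as np
from sklearn.model_selection import KFold
from sklearn.tree import DecisionTreeClassifier

def cross_validate_tree(X, y, n_splits=5):
    """Fit a classification tree on each training partition and score it on
    the held-out partition, as described above."""
    scores = []
    for train_idx, test_idx in KFold(n_splits=n_splits, shuffle=True).split(X):
        tree = DecisionTreeClassifier().fit(X[train_idx], y[train_idx])
        scores.append(tree.score(X[test_idx], y[test_idx]))
    return np.mean(scores)
```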
  • Cytological sample A sample or suspension of cells originating from e.g., a body fluid, a bodily surface or tissue.
  • a cytological sample can be obtained by a biopsy, a fine needle biopsy, a tissue brushing, a scraping of cells, or the collection of a body fluid. Cells of a cytological sample can be segregated or aspirated, and then suspended, from a biopsy or from a cell culture.
  • Cytospin A collected sample, typically including cells, fixed onto a microscope slide by centrifugation.
  • Ejaculate The fluid emitted from the penis at ejaculation. It may or may not contain sperm cells (spermatozoa), and is herein used interchangeably with the expressions "semen sample" or "seminal fluid".
  • Fiducial marker A fiduciary marker or fiducial point is an object used in the field of view of an imaging system, which appears in the image produced, for use as a point of reference or a measure.
  • Grade and "grading" are used herein to describe the result and the process of assigning a score to an object.
  • the assigned score may be on a continuous scale or discrete.
  • the grading in image analysis could be based on a single pixel, but typically involves one or more clusters of pixels.
  • the grading of one or more objects in a patient sample may contribute to the assessment of a diagnosis, prognosis, and/or prediction of treatment.
  • Immunohistochemistry The process of localizing proteins in cells of a tissue section by employing antibodies specific for the proteins of interest and using a method that allows the antibodies to bind to the proteins while in a relatively native setting in a biological tissue section.
  • Marker An indicator signaling an event or condition in a biological system or sample giving a measure of status, exposure, susceptibility and more of the biological system, dependent on the marker.
  • a marker is herein the presence of a gene or product(s) hereof, the presence or relative level of which alone or in combination with other markers may indicate a neoplastic and/or cancerous state.
  • Membrane A membrane of a cell or an organelle in said cell, for example a cell membrane or a nuclear membrane.
  • a membrane protein or marker is attached to, or associated with, or interacting with a membrane.
  • Risk group Certain groups of individuals having common relevant traits will have a different risk than the rest of the population not having those traits, of acquiring particular disease(s). Individuals belonging to a high risk group for a certain disease may benefit from particular actions to avoid getting exposed to the disease or to regularly monitor if they have acquired the disease.
  • men who experience fertility problems, men who were affected by cryptorchidism in childhood, and men who were already treated for unilateral testis cancer belong to (high) risk groups for CIS testis, and may opt for more frequent or more thorough examinations for this disease than men who have not experienced fertility problems and have not had cryptorchidism or unilateral testis cancer.
  • Virtual slide or whole slide image is a digital image file typically acquired by a slide scanner (also called a virtual microscope) characterized by covering a microscope glass slide in its entirety or a substantial part thereof in (very) high resolution, e.g. one billion pixels. Virtual slides allow rare cell event detection to be performed away from the microscope by use of a computer and an image analysis algorithm.
  • the present invention provides a tool for assisting the diagnosing, prognostication or monitoring of testicular carcinoma through image analysis assistance in the grading of the potential carcinoma or carcinoma in situ (CIS) cells.
  • the above mentioned tool can be applied as an automated screening method for CIS cells in semen samples as described herein, and is particularly useful since it can lead to detection of asymptomatic changes of normal cells to diseased cells. Detection of such changes before a disease matures into clinical symptoms and overt disease improves the possibilities for early intervention.
  • Automated methods of the present invention may further have the advantage that screening can be done for a larger group of patients than by using the conventional methods, since the method can be less costly and involve less inconvenience or pain to the patients due to the analysis of semen samples instead of testicular biopsies.
  • the methods according to the invention are particularly suitable for screening purpose.
  • the invention may be used in general screening of testicular carcinoma in a population of individuals.
  • the invention may of course also be used when diagnosing individuals at risk of acquiring testicular carcinoma, or individuals suspected of having acquired testicular carcinoma, such as a male examined for infertility.
  • the result is presented to a medical professional before a final diagnosis is made.
  • the result is presented as a corresponding digital representation and/or grade, to a medical professional for assessment of the screening result of each of said individuals.
  • the result may be presented as the digital image having markings or labels in areas representing suspected cells.
  • the distinctive cell type may be any cell type that is indicative of the testicular carcinoma or CIS, such as a cancer cell or a precursor of a cancer cell.
  • the cancer cell or the precursor of the cancer cell is a CIS cell, seminoma or non- seminoma.
  • the sample may in principle be any type of sample suitable for obtaining the cell type; it is, however, mostly preferred that the cytological sample is a semen sample, and most preferred that the cell is a CIS cell and the sample is a semen sample.
  • in a first aspect the present invention relates to a method for assisting the diagnosing, prognostication or monitoring of testicular carcinoma, and/or for assisting the prediction of the outcome of treatment of testicular carcinoma in an individual, based on a cytological sample from said individual, which has been tagged with one or more tags capable of tagging one or more distinctive cell types characteristic of testicular carcinoma, said method comprising the steps of:
  • Testicular carcinoma cells or CIS cells are, first of all, present in a very low concentration in the samples, and second, the enzymatic conditions of the seminal fluid tend to degrade the cells. Testicular carcinoma cells or CIS cells therefore represent an inhomogeneous cell population, which poses a serious problem when performing image analysis.
  • In image analysis the normal approach is to identify the common features of the object to be inspected; however, testicular carcinoma cells or CIS cells vary with respect to size, stainability, presence of a cell nucleus, and presence of a cell membrane, whether intact or fragmented. Therefore, it has not been possible to find a conventional image analysis common denominator for the cells, and the invention relates to a combination of variables in order to identify all relevant cells.
  • the segmentation may be achieved by clustering, Bayesian segmentation, and/or thresholding the image with respect to colour or fluorescence of the tag.
  • the detection algorithm of the segmentation preferably segments the image according to spectral, spatial, contextual and/or morphological information.
  • the spectral information is preferably suitable for discriminating pixels resembling pixels in digital representations of tagged distinctive cells from pixels not resembling pixels in digital representations of tagged distinctive cells, and the spectral information is obtained by tagging the nucleus and/or the cytoplasm with one or more chromogens and/or fluorophores, thereby providing a characteristic colour and/or fluorescence to the nucleus and/or cytoplasm of the distinctive cell.
  • the spatial, contextual and/or morphological information is preferably suitable for discriminating pixels resembling pixels in digital representations of tagged distinctive cells from pixels not resembling pixels in digital representations of tagged distinctive cells. This may be achieved by emphasizing the spatial, contextual and/or morphological information of the pixels potentially representing the one or more distinctive cell types.
  • the method may include, during the segmentation in step b), classification of the variation of intensity of pixels potentially representing the one or more distinctive cell types. It has been found that the variation of the intensity is, either on its own or in combination with the form of the clusters, a characteristic measure for the cells during image analysis. In particular, clusters of pixels in which the variation is substantial have been found to correlate with the cells to be identified, whereas some artefacts tend to have little or no intensity variation, probably because they are not of biological origin, such as crystals of chromogen.
  • Textural information includes any quantitative information derived from the spatial arrangement of spectral information of individual or clusters of digital image pixels.
  • There are several approaches to quantifying textural information, including but not limited to statistical, structural and spectral approaches.
  • the detection algorithm in the segmentation step comprises parameters relating to spectral information and textural information. Accordingly, in one embodiment the algorithm comprises at least the two following parameters:
  • Spectral information parameter based on chromaticity, i.e. the mean difference between the green and the blue chromaticity, calculated from the following Formula 1 wherein the tag is a red stain:
  • (1/n) · Σ (C_green − C_blue) (Formula 1); wherein
  • C_green is the green chromaticity,
  • C_blue is the blue chromaticity, and
  • n is the number of pixels in the coloured object, the sum being taken over those pixels.
  • Textural information parameter calculated according to Formula 2 and based on intensity and spatial information: var(I) − var(I_filtered) (Formula 2); wherein I is the intensity values for pixels from the coloured object, so that var(I) is the variance in the intensity values of the coloured object.
  • I_filtered is, like I, the intensity values for the coloured object, with the exception that I_filtered has additionally been passed through a mean filter with four-neighbour connectivity on the intensity image; that is, in I_filtered each intensity value represents the mean of the intensity of the pixel itself and its four closest neighbouring pixels.
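  • A sketch of this textural parameter; the cross-shaped kernel expresses the four-neighbour mean described above, and the binary object mask is an assumed input:

```python
import numpy as np
from scipy.ndimage import correlate

# Cross-shaped kernel: the pixel itself plus its four orthogonal neighbours.
FOUR_NEIGHBOUR_MEAN = np.array([[0, 1, 0],
                                [1, 1, 1],
                                [0, 1, 0]], dtype=float) / 5.0

def textural_parameter(intensity, mask):
    """Formula 2: var(I) - var(I_filtered) over the pixels of one object."""
    filtered = correlate(intensity.astype(float), FOUR_NEIGHBOUR_MEAN,
                         mode='nearest')
    return intensity[mask].var() - filtered[mask].var()
```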
  • In one embodiment two or more tags are used, each tag marking a different part of the cell type.
  • the method may include classification of variation of intensity and/or classification of texture of pixels representing all different tags.
  • the segmentation in step b) includes thresholding the image with respect to colour or fluorescence of the tag.
  • this relates to thresholding with respect to intensity of the staining.
  • a clustering of the pixels may be performed by clustering of neighbouring pixels before the segmentation in order to establish clusters potentially representing the cells.
  • the pixels potentially representing the one or more distinctive cell types, such as the clusters, are sorted with respect to cluster size before segmentation. In a preferred embodiment they are sorted in relation to both an upper limit and a lower limit.

Grading

  • the grading of the pixels or clusters of pixels is then performed. Grading may be done manually on the samples identified to potentially include one or more distinctive cell types, but it is preferred that the grading is performed automatically as well.
  • the grading is performed by evaluating the clusters of pixels or individual pixels belonging to the pixel classes potentially representing the one or more distinctive cell types according to their degree of resemblance to a corresponding tagged archetypical distinctive cell.
  • the grading step assigns a grade to each cell identified by the method discussed above. In another embodiment the grading step assigns a grade to an image or a slide depending on the cells identified in said image or slide. In a most basic embodiment the grading step includes two grades: either positive, i.e. cells have been positively identified as representing the one or more distinctive cell types, or negative, i.e. no cells have been identified as representing the one or more distinctive cell types. In another embodiment the grading step includes at least three grades: positive or negative as defined above, or borderline, wherein borderline defines cells between clearly negative and clearly positive.
  • the grading step may also include several grades, such as the following: 1. Artifact and/or staining precipitate. The object is definitely not a CIS cell.
  • 2. The object has a high probability of not being a CIS cell, typically due to: a. the object is too small to resemble a typical CIS cell nucleus, or b. the object has a dubious or wrong morphology compared to a typical CIS cell nucleus.
  • an aspect of the invention relates to the use of an automated method for classification or grading of cells.
  • the automated method can be any machine learning method, such as a decision tree, an artificial neural network or a Bayesian network, which has optimized parameters that allow for grading or classifying cells.
  • the machine learning method is a decision tree or classification tree.
  • the parameters optimized for grading or classifying cells depend on the outcome of the actual staining method(s), including but not limited to the range of morphologies of the target cell, e.g. size range, shapes, and degradation profile; the cellular localization of the tagged marker(s), e.g. nucleus, cytoplasm or membrane; the detection principle, e.g. chromogenic or fluorogenic, and the colour(s) of the tag(s).
  • the invention also relates to other parameters, such as spectral parameters and/or spatial and/or contextual and/or textural and/or morphological parameters.
  • the parameters used for the machine learning method can be one or more of the parameters mentioned in Table 1.
  • the parameters may include low and high values of each of these parameters, as shown in Table 1.
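  • A hedged sketch of how such a tree could be fitted to per-object parameter vectors; the arrays and column layout are hypothetical, not the patent's training data:

```python
from sklearn.tree import DecisionTreeClassifier

def train_grading_tree(X, y):
    """Fit a classification tree that maps per-object parameter vectors
    (columns as in Table 1, e.g. mean red chromaticity, profile area,
    distance-to-centre variance, textural parameter) to grades."""
    return DecisionTreeClassifier().fit(X, y)

# Usage with hypothetical data:
# tree = train_grading_tree(X_train, y_train)  # y holds "P"/"B"/"N" grades
# grades = tree.predict(X_new)
```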
  • Figure 2 is a tabular overview with corresponding B/W photocopies of exemplifying images of typical specific parameters measured for the relevant pixel clusters and individual pixels representing objects in the original digital image. The majority of these specific parameters relate to distributions of pixel colours and positions, but they could be any descriptive parameter, such as circumference, circularity and other shape factors.
  • μ denotes the mean chromaticity
  • σ² denotes the variance
  • R denotes a red pixel value
  • G denotes a green pixel value
  • B denotes a blue pixel value
  • Distance Variance, where d denotes that the parameter relates to the pixel distance from the object centre; thus σ_d² denotes the distance variance
  • x and y denote pixel coordinates in the image, and
  • x_c and y_c denote the coordinates of the centre of mass
  • Distance Skewness, where γ denotes skewness
  • Circularity, where f_circ denotes circularity.
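  • The distance-based parameters and circularity above can be computed from a cluster's pixel coordinates; a sketch, where ys and xs are the row and column coordinates of the cluster's pixels, and where the circularity formula 4·π·area/perimeter² is a common definition assumed here rather than taken from the patent:

```python
import numpy as np

def distance_parameters(ys, xs):
    """Distance variance and distance skewness of a pixel cluster about its
    centre of mass (x_c, y_c), as in Table 1."""
    yc, xc = ys.mean(), xs.mean()
    d = np.hypot(xs - xc, ys - yc)       # distance of each pixel to the centre
    skew = ((d - d.mean()) ** 3).mean() / max(d.std() ** 3, 1e-12)
    return d.var(), skew

def circularity(area, perimeter):
    """An assumed circularity definition: 4*pi*area / perimeter**2,
    which equals 1 for a perfect circle."""
    return 4.0 * np.pi * area / perimeter ** 2
```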
  • the optimized parameters are one or more of the following:
  • chromaticity such as red chromaticity, blue chromaticity or green chromaticity, and/or derivatives or combinations thereof and/or colour of the pixel cluster such as 1 for red, 2 for blue or 3 for green, and/or the profile area, and/or the mean chromaticity value and/or the variance of the chromaticity and/or the skewness of the chromaticity values and/or the kurtosis of the chromaticity and/or the variance of the distances to the centre of a cluster and/or skewness of the distances to the centre of a cluster and/or kurtosis of the distances to the centre of a cluster and/or textural parameters such as textural information based on spatial and intensity information.
  • the machine learning method can be trained on any set of digital cell images.
  • An aspect of the invention relates to the use of methods such as transformation, normalization, filtering and segmentation for performing operations on digital information or pixels as described above.
  • the data of the digital image is transformed, e.g. normalized, as part of the grading or classification method.
  • cross-validation is used to optimize the machine learning method.
  • the machine learning method is a decision tree or a classification tree which is optimized by cross-validation.
  • step c) involves the use of a machine learning method with optimized parameters for grading or classifying cells, wherein said parameters are one or more parameters selected from the textural information, the mean difference between green chromaticity and blue chromaticity, the profile area, the mean red chromaticity, the mean 30% highest intensity, the blue chromaticity 30%, the red chromaticity 30%, the red chromaticity variance, the circularity, the extrinsic red chromaticity, the distance to center variance, and the combined circularity and distance to center skewness.
  • step c) involves the use of a machine learning method with optimized parameters for grading or classifying cells, wherein said parameters comprise one or more of the parameters selected from textural information, the mean red chromaticity and the distance to center variance.
  • the parameters can comprise one of the parameters selected from the textural information, or the mean red chromaticity, or the distance to center variance, or the parameters can comprise a combination of the textural information and mean red chromaticity, or a combination of the textural information and distance to center variance, or a combination of the textural information, mean red chromaticity and distance to center variance, or a combination of the mean red chromaticity and distance to center variance.
  • step c) involves the use of a machine learning method with optimized parameters for grading or classifying cells, wherein said parameters comprise at least the textural information, the mean difference between green chromaticity and blue chromaticity, the profile area, the mean red chromaticity, the mean 30% highest intensity, the blue chromaticity 30%, the red chromaticity 30%, the red chromaticity variance, the circularity, the extrinsic red chromaticity, the distance to center variance, and the combined circularity and distance to center skewness.
  • step c) involves the use of a decision tree with optimized parameters for grading or classifying cells, with a structure as defined in Table 3. Note that Inf is an abbreviation for infinity towards higher/positive values and -Inf is an abbreviation for infinity towards lower/negative values.
  • the cells are tagged by staining of cells using markers.
  • the markers may be any type of cellular markers, such as nuclear markers, cytoplasmatic markers and membrane markers.
  • the tag or one of the tags may be an in situ staining marker, such as an in situ staining marker identifying a target selected from the group consisting of transcription factors AP-2γ, OCT3/4, NANOG, GATA-4, GATA-6 and FOG-2, alkaline phosphatase, epidermal growth factor receptor (EGFR), and Ki67.
  • One or more of the tags may be an in situ staining marker, such as an in situ staining marker identifying a target selected from the group consisting of AP-2γ, OCT3/4, NANOG, GATA-4, GATA-6 and FOG-2, alkaline phosphatase, epidermal growth factor receptor (EGFR), SOX2, SOX15, SOX17, E2F1, IFI16, TEAD4, TLE1, TATDN2, NFIB, LMO2, MECP2, HHEX, XBP1, RRS1, MYCN, ETV4, ETV5, MYCL1, HIST1H1C, WDHD1, RCC2, TP53, MDC1, ALPL, DPPA4, TCL1A, CDH1, GLDC, TCL1A, DPPA4, CDK5, CD14, FGD1, NEURL, HLA-DOA, DYSF, MTHFD1, ENAH, ZDHHC9, NME1,
  • the in situ staining marker identifies the transcription factor AP-2γ. In another more preferred embodiment the in situ staining marker identifies intrinsic enzyme activity of alkaline phosphatase.
  • two or more in situ staining markers are used, wherein said two or more in situ staining markers are capable of identifying two or more targets selected from the group consisting of AP-2γ, OCT3/4, NANOG, GATA-4, GATA-6 and FOG-2, alkaline phosphatase, epidermal growth factor receptor (EGFR), SOX2, SOX15, SOX17, E2F1, IFI16, TEAD4, TLE1, TATDN2, NFIB, LMO2, MECP2, HHEX, XBP1, RRS1, MYCN, ETV4, ETV5, MYCL1, HIST1H1C, WDHD1, RCC2, TP53, MDC1, ALPL, DPPA4, TCL1A, CDH1, GLDC, TCL1A, DPPA4, CDK5, CD14, FGD1, NEURL, HLA-DOA, DYSF, MTHFD1, ENAH, ZDH
  • A double staining, i.e. staining the cell to be identified with two different tags, may be used in order to increase the likelihood of identifying the correct cells.
  • the double staining may be achieved by using two in situ staining markers selected from the group above, such as markers selected for identifying the transcription factor AP-2γ and the intrinsic enzyme activity of alkaline phosphatase, respectively.
  • the in situ staining marker may be a chromogenic enzyme substrate, such as a chromogenic enzyme substrate selected from the group consisting of DAB, BCIP, Fast Red and AEC.
  • the in situ staining marker is a fluorophore selected from the group consisting of FITC, TRITC and Texas Red.
  • the staining of the cells for classification or grading can be any type of staining method known in the prior art, such as for example immunostaining or enzymatic staining.
  • the staining is immunostaining of AP-2γ and enzymatic staining using the activity of alkaline phosphatase.
  • the sample or part of the sample is preferably dispersed on, or contained in, an appropriate medium for performing the tagging procedure of the one or more distinctive cell types.
  • the medium may be a reactor useful for incubation or flow of cells.
  • the medium is a dispersing surface useful for microscopy or scanning, such as wherein the medium is a glass slide or a compact disc.
  • Glass slides may include fiducial lines or fiducial points for easing the autofocusing in microscopy or scanning of samples with limited intrinsic and tagging contrast.
  • the digital representation of the tagged sample discussed above is acquired by adequate optics, such as a microscope, and an image acquisition device.
  • the image acquisition device is preferably a digital camera.
  • the digital representation of the tagged sample is in the form of a virtual slide, typically acquired by whole slide scanning.

Automated system

  • In one embodiment the present invention further encompasses an automated or semi-automated system suitable for carrying out one or more of the methods disclosed herein, said automated or semi-automated system comprising, in combination:
  • a database capable of including a plurality of digital images of the samples;
  • a software module for analyzing a plurality of pixels from a digital image of the samples; and
  • a control module comprising instructions for carrying out said method(s).
  • Said automated or semi-automated system can also further comprise one or more of: a slide loader, a barcode reader, a microscope (preferably motorized), and a stage (preferably motorized).
  • the system includes a slide scanner whereby each slide is scanned for production of digital images of the slides.
  • the slide scanner provides digital slides at high resolution, i.e. virtual slides, from each slide.
  • the system preferably includes an image processor and digitizer, and a general processor with peripherals for printing, storage, etc.
  • the system can additionally provide an opportunity for a user to provide guidance during the process. For example, the user can specify a particular area of interest by selecting it on the screen.
  • the system can also provide a way to eliminate a specified area or areas.
  • the present invention further encompasses a computer readable medium comprising instructions for carrying out one or more of the methods disclosed herein.
  • Suitable computer-readable media can for example be a hard disk to provide storage of data, data structures, computer-executable instructions, and the like.
  • Other types of media which are readable by a computer such as removable magnetic disks, CDs, magnetic cassettes, flash memory cards, digital video disks, and the like, may also be used.
  • the following examples (1-3) illustrate how a digital image of a cytological sample can be processed by algorithmic steps of the invention, for the purpose of automated detection of objects in digital images representing or resembling a target cell of interest.
  • This initial part of the process relies mainly on colour and profile area information sufficient for image segmentation and detection of pixel clusters (Example 1), which must then be further analyzed for various more advanced descriptive parameters (see Example 2), and subsequently automatically graded for their individual degree of resemblance to an archetypical target cell, or fractions thereof, by use of a classification tree (Example 3).
  • the details of the automated detection procedure will depend closely on the expected outcome of the actual staining method(s), including but not limited to the range of morphologies of the target cell, e.g. size range, shapes, and degradation profile; the cellular localization of the tagged marker(s), e.g. nucleus, cytoplasm or membrane; the detection principle, e.g. chromogenic or fluorogenic, and the colour(s) of the tag(s).
  • the target cell is CIS cells in cytospins of semen samples.
  • the CIS cell typically will include a cell nucleus and some remains of cytoplasm, but may be somewhat degraded due to the lytic effects of the seminal fluid.
  • Two markers of the CIS cell are tagged: the transcription factor AP-2γ, typically located in its nucleus, and the enzyme alkaline phosphatase, typically located in its cytoplasm. Their tagging is secured by a double staining method described by Nielsen et al (3), which includes immunostaining for AP-2γ and active enzyme-staining for alkaline phosphatase.
  • the preferred colour of the AP-2γ tag is reddish due to the choice of the colourless substrate 3-Amino-9-Ethylcarbazole (AEC), which in the presence of peroxide is converted into a red precipitate by the indirect peroxidase label.
  • the preferred colour of the alkaline phosphatase tag is bluish due to the choice of the colourless substrate 5-Bromo-4-chloro-3-indolyl phosphate (BCIP), which in the presence of the oxidant nitro blue tetrazolium chloride (NBT) is converted by the intrinsic enzyme activity into a blue precipitate.
  • Figure 1a shows a B/W photocopy of an original digital colour representation of a microscopically magnified area of a semen sample on a glass slide after the double staining procedure.
  • the Red-Green-Blue (RGB) colours of the original digital image were transformed into the corresponding chromaticities according to the following definitions:
  • I = R + G + B (Formula 3); where I is the intensity, and R, G and B are the RGB colour values for red, green and blue, respectively.
  • the RGB values were represented on an 8-bit scale with integer values ranging from 0 to 255, and thereby I could vary from 0 to 765.
  • Other RGB scales could have been used.
  • C_R = R/I, C_G = G/I and C_B = B/I (Formula 4); where C_R, C_G and C_B represent the red, green and blue chromaticities.
  • the red and blue chromaticities of each pixel in the original digital colour image are presented in Figures 1b and 1c, respectively, according to a greyscale representation, where the chromaticity values were normalized.
  • pixels with a blue chromaticity value above T_B2 = 0.341 were converted to black pixels, and all other pixels were converted to white pixels. As before, the black pixels were subsequently converted to white if the intensity of the corresponding pixel in the original digital colour image was below T_I = 150.
  • the automated detection of relevant objects resembling CIS-cells requires that the analysis can be done on clusters of pixels, and not only on individual pixels.
  • the analysis of clusters of pixels is useful for the detection of stained nuclei of a target cell represented in a digital image.
  • Pixels belonging to the same cluster can be defined by various measures of connectivity. In the current example, the most restrictive connectivity rule was applied, the Four Neighbour Connectivity. As shown in Figure 1g, any pixel in an image (except for those on the edge) is touched by 8 neighbour pixels, of which 4 pixels (in white) are diagonal and more distant than the 4 pixels (in grey), which are orthogonal and closest to the pixel being analysed for cluster-association (marked with "X" in Figure 1g).
  • a cluster of pixels is defined as all pixels linked directly or indirectly via orthogonal neighbour pixels, i.e. according to the Four Neighbour Connectivity rule.
  • a pixel belonging to a given segmentation class can either be individual or be included in a cluster, but never in more than one cluster.
  • clusters are typically discarded if they are too small or too large to fit expectations of an admissible profile area.
  • In this example, a total of five (5) admissible clusters were detected: four reddish clusters and one bluish/purplish cluster. They may alone or together be part of an object of interest, and this is further evaluated in Examples 2 and 3.
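  • The detection steps of Example 1, thresholding followed by four-neighbour labelling and profile-area gating, can be sketched end to end as follows; the area bounds are illustrative placeholders:

```python
import numpy as np
from scipy.ndimage import label

# Four Neighbour Connectivity (Figure 1g): a pixel connects only to its
# four orthogonal neighbours.
FOUR_CONNECTIVITY = np.array([[0, 1, 0],
                              [1, 1, 1],
                              [0, 1, 0]])

def admissible_clusters(mask, a_min, a_max):
    """Label a binary mask with four-neighbour connectivity and keep only
    the clusters whose profile area lies within [a_min, a_max]."""
    labels, n = label(mask, structure=FOUR_CONNECTIVITY)
    areas = np.bincount(labels.ravel())      # areas[i] = pixels in cluster i
    keep = [i for i in range(1, n + 1) if a_min <= areas[i] <= a_max]
    return labels, keep
```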
  • Example 2
  • a number of relevant parameters can be calculated, in order to automatically grade the resemblance of the cluster to the target cell of interest.
  • the selection of relevant parameters will depend closely on the expected most characteristic features of the stained target cell and of other likely resembling but irrelevant objects in the digital images of the cytological samples, since the selected parameters should be suitable for distinguishing pixel clusters representing the target cell from clusters representing irrelevant objects.
  • the target cell is CIS cells in cytospins of semen samples double stained for the AP-2γ antigen (reddish nucleus) and for intrinsic alkaline phosphatase activity (bluish cytoplasm).
  • examples of other relevant auto-parameters are the mean distance to the centre, the perimeter of the pixel cluster, the skewness and kurtosis of the chromaticity, the kurtosis of the distance to the centre and the ratio between height and width.
  • examples of other relevant extrinsic or combined intrinsic and extrinsic parameters are the mean, variance, skewness and kurtosis of the chromaticity of nearby* clusters, the mean, variance, skewness and kurtosis of the distance to the centre of the cluster of interest of pixels of nearby* clusters, and the perimeter of the nearby* clusters.
  • *) a nearby cluster is defined as a cluster with one or more pixels within a distance of 5 pixels from a pixel in the cluster of interest.
  • the values of the extrinsic parameters are based on the values of all pixels contained in one or more nearby clusters.
  • a nearby cluster will be derived from the output of another segmentation than the one which led to the definition of the cluster of interest, but it may also be a second cluster derived from the output of the same segmentation. A sketch for finding nearby clusters is given below.
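One plausible way to implement the nearby-cluster rule is a distance transform around the cluster of interest: every labelled cluster with at least one pixel inside the 5-pixel zone qualifies. The Euclidean metric is an assumption, since the text does not state how the 5-pixel distance is measured.

```python
import numpy as np
from scipy import ndimage

def nearby_cluster_ids(cluster_mask, other_labels, distance=5):
    """IDs of labelled clusters with >= 1 pixel within `distance` pixels
    of the cluster of interest (given as a boolean mask)."""
    # Distance from every background pixel to the cluster of interest;
    # pixels of the cluster itself get distance 0.
    dist = ndimage.distance_transform_edt(~cluster_mask)
    ids = np.unique(other_labels[dist <= distance])
    return ids[ids != 0]  # drop the background label 0
```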
  • the nearby cluster (#5) was derived from the blue/purple segmentation in Figure 1j.
  • the cluster of interest (#3) was derived from the red segmentation in Figure 1h.
  • Example 3: When an admissible cluster of pixels has been detected (see Example 1), and its values for the selected relevant intrinsic and extrinsic parameters have been determined (see Example 2), it can be automatically graded for its resemblance to the target cell of interest by use of an appropriate classification tree.
  • the design and construction of the classification tree will depend closely on the parameters selected to distinguish the features of the stained target cell from those of other resembling but irrelevant objects in the digital images of the cytological samples. In the present example, as was also the case in Examples 1 and 2, the target cell is CIS cells in cytospins of semen samples double stained for the ΑΡ-2γ antigen (reddish nucleus), and for intrinsic alkaline phosphatase activity (bluish cytoplasm).
  • the actual example is based on digital representations of microscopically magnified areas of cytospins of double stained semen samples collected by expert pathologists.
  • the digital representations were acquired due to the presence of an object assumed to be relevant to the preparation of a grading algorithm for clusters of pixels representing objects resembling CIS cells.
  • a classification tree was constructed (Table 3). The objects were individually graded on a well-established manual classification scale of [1, 2, 2½, 3, 3½, 4], covering from artefacts (1) to archetypical CIS cells (4); see Table 2 and Figure 3.
  • Table 2 shows manual grading distributions for the 2,437 relevant objects selected by expert assessors in 506 whole slide semen cytospin scanning images from men in routine subfertility work-up.
  • Figure 3 shows corresponding B/W photocopies of colour images of exemplifying objects. All 2,437 objects were used for developing and optimizing the classification tree, which ensures the automated grading of CIS cell like objects.
  • the object has a high probability of not being a CIS cell, typically due to:
  • the red-stained object is too small to resemble a typical CIS cell
  • the red-stained object has a dubious or wrong morphology compared to a typical CIS cell nucleus.
  • the classification tree, however, works with a less complex grade structure than the one detailed above, namely grading the objects as Positive, Borderline or Negative (P/B/N).
  • the Positive category includes the grades 3½ and 4
  • the Borderline category includes the grades 2, 2½ and 3
  • the Negative category includes the objects graded 1.
  • Table 3 is a tabular overview of a typical classification tree. Each node in the tree has been assigned an index number. Furthermore, each node is either a split or an end node where a grade is assigned. Splits divide objects based on the given parameter and boundaries, and pass them on to other nodes based on the indices given in the left child/right child columns.
  • the root of the tree is node 1, where the classification is initiated for all pixel clusters. If a cluster has a value for the given parameter of a node which is inside the boundaries of that node, the pixel cluster is sent to the left child. Otherwise it is sent to the right child. If the pixel cluster reaches a node which only contains a grade (a leaf node), it is assigned that grade, and the classification of the corresponding object is completed. A sketch of this traversal is given below.
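The tabular tree of Table 3 maps naturally onto a dictionary of indexed nodes: a split node carries a parameter name, its boundaries and two child indices, while an end node carries only a grade. The following minimal sketch implements that traversal; the Node layout and the toy tree are illustrative, not the actual Table 3.

```python
from dataclasses import dataclass
from typing import Dict, Optional

@dataclass
class Node:
    parameter: Optional[str] = None    # None for leaf (end) nodes
    minimum: float = float("-inf")
    maximum: float = float("inf")
    left: Optional[int] = None         # index of left child node
    right: Optional[int] = None        # index of right child node
    grade: Optional[str] = None        # "P", "B" or "N" at leaf nodes

def classify(tree: Dict[int, Node], features: Dict[str, float]) -> str:
    """Start at node 1; inside the boundaries -> left child, else right."""
    node = tree[1]
    while node.grade is None:
        value = features[node.parameter]
        inside = node.minimum <= value <= node.maximum
        node = tree[node.left if inside else node.right]
    return node.grade

# Toy two-level tree echoing the root split of the worked example:
tree = {
    1: Node("textural_information", 10, 800, left=2, right=3),
    2: Node(grade="B"),
    3: Node(grade="N"),
}
print(classify(tree, {"textural_information": 120}))  # -> "B"
```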
  • a classification tree was constructed by the basic method. To get a classification tree which was more independent of data variation, cross validation was used. This method splits the data multiple (typically 10) times into two sets: a training set and a test set. Thus multiple trees can be built, and all these trees can be validated and their success at classifying correctly can be compared using the respective test sets (see the cross-validation sketch below). Table 3 shows an optimized, cross validated classification tree based on the same data as the basic classification tree.
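The cross-validation procedure can be sketched with scikit-learn as a stand-in for the tree construction actually used; the feature matrix and grades below are random placeholders whose shape merely echoes the 2,437 graded objects.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.random((2437, 13))                   # placeholder parameter values
y = rng.choice(["P", "B", "N"], size=2437)   # placeholder P/B/N grades

clf = DecisionTreeClassifier(random_state=0)
# 10-fold split: each fold serves once as the test set for a tree
# trained on the remaining nine folds.
scores = cross_val_score(clf, X, y, cv=10)
print(f"mean accuracy {scores.mean():.3f} +/- {scores.std():.3f}")
```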
  • the detected clusters can be classified.
  • cluster #3 (see Figure 1h)
  • the classification takes the following path:
  • 1. At the root node [#1] the first question is: is the textural information of the pixel cluster between 10 and 800? As it is, the left child node is visited.
  • 2. At this node [#2] the question is: is the mean of the difference between the green chromaticity values and the blue chromaticity values between -infinity and 0.06? As it is, the left child node is visited.
  • 5. At this node [#5] the question is: is the profile area of the pixel cluster between -infinity and 387.56? As it is outside these boundaries, the right child node is visited.
  • 6. At this node [#6] the question is: is the mean of the 30% highest intensity values between 385 and 535? As it is inside this boundary, the left child node is visited.
  • Table 4 can be modified according to the practical clinical consequences of the manual inspection of objects, and of the automated image analysis of objects, and thereby used for estimation of specificity and sensitivity, as well as the positive and negative predictive values at the individual object level, as shown in Table 5.
  • the major purpose of the automated screening is to reduce the workload of manual inspection, thereby allowing the screening of semen samples in a routine setting without causing an overwhelming manual workload. This is achieved both by the automated object identification in general, and by reducing the number of relevant objects for review more than 10-fold by the automated grading. Still, the method can only be used if no or very few CIS cell like objects are missed by the image analysis. This seems to be achieved, since all 26 positive objects (grade 3½ or 4 according to manual inspection) were identified by the automated grading as borderline or positive, i.e. as candidates for manual review.
  • Samples: 1,175 ejaculates, ranging from 1-5 ejaculates per man.
  • the ejaculation abstinence period was 3.5 days, the median sperm concentration 2.5 × 10⁶ spermatozoa/ml, and the median pH 8.
  • the median ejaculate volume was 3.9 ml, but the volume available for cytospin on glass slides and subsequent screening for CIS cells varied between 100 and 600 µl, and was 400 µl for more than 90% of the samples. 13% of the samples were classified as azoospermic, including 5% obstructive azoospermic.
  • cytospin samples were double-stained for intrinsic alkaline phosphatase activity and for the antigenic presence of ΑΡ-2γ, before image acquisition.
  • Image acquisition: Slide scanning was done using a NanoZoomer HT version 2.0 (Hamamatsu Photonics, Japan). The areas of stained cytospins were scanned in 3 Z-stacks (+1 and +2 microns from the autofocus layer) and with settings corresponding to a 40X objective in traditional microscopy. For each slide, the corresponding digital image included 2 regions of interest (ROIs), each of which was approx. 57 mm² in size, and the file (approx. 1.5 GB) was saved on a server until manual digital reading on a computer monitor and automated image analysis.
  • Clusters were subsequently defined by the 4-pixel neighborhood connectivity rule. Only red pixel clusters with a relevant profile area were selected for further analysis and grading according to their CIS cell resemblance. For pixel clusters that met the inclusion criteria, a number of relevant parameters were subsequently calculated in order to automatically grade their resemblance to archetypical CIS cells. Some parameters were determined by contributions from pixels belonging to the red cluster only (intrinsic parameters), whereas others included contributions from pixels not belonging to the cluster, but typically located near the cluster of interest, e.g., blue pixels representing stained cytoplasm (extrinsic parameters). The final algorithms are included in version 4.0 of the VIS software (Visiopharm A/S, Hoersholm, Denmark) to facilitate easy import and analysis of scanned cytospins. A compressed sketch of this pipeline is given below.
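A compressed, self-contained sketch of the detection part of this pipeline (red segmentation, 4-neighbour clustering, profile-area filter), under the same assumptions as the earlier snippets; names and default thresholds are illustrative and not the actual VIS implementation.

```python
import numpy as np
from scipy import ndimage

def detect_red_clusters(rgb, t_red=0.375, t_int=150, area_min=100, area_max=800):
    """Return pixel coordinates of admissible red clusters in an RGB image."""
    rgb = rgb.astype(np.float64)
    total = rgb.sum(axis=2)
    red_chroma = rgb[..., 0] / np.maximum(total, 1e-9)
    mask = (red_chroma > t_red) & (total >= t_int)
    labels, n = ndimage.label(mask)          # 4-connectivity by default
    clusters = []
    for i in range(1, n + 1):
        ys, xs = np.nonzero(labels == i)
        if area_min <= ys.size <= area_max:  # profile area = pixel count
            clusters.append((ys, xs))
    return clusters
```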
  • Testicular biopsies have been taken from four of these five men, and CIS was detected by histological analysis in three of them (Cases 2, 3 and 4). In contrast, the hitherto small testicular biopsies collected in Case 1 were negative, and a new biopsy has not been collected. For Case 5, no testicular biopsy is yet available. For one of these five men (Case 3), the only preceding symptom to the CIS diagnosis was couple infertility, since his semen quality and ultrasound result were normal, and no history of cryptorchidism or testicular disease was recorded.
  • the four other CIS-positive men all had poor semen quality, and either a history of cryptorchidism (Case 1 ), treated unilateral teratoma (Case 2), or treated unilateral non-seminoma (Cases 4 and 5).
  • Table 6 shows that in the ejaculates from 18 of the 765 men (2.4%), objects graded as borderline were identified. One of these men only provided this borderline sample, and for 15 men a second ejaculate was graded negative, while for two men also their second sample was classified as borderline. For the remaining 742 men (97%), CIS cell like objects were not detected in their ejaculates.
  • Borderline Samples were second samples from Borderline Men, and 2 Borderline Samples were from Positive Men.
  • the 95% confidence interval for the expected approx. 2% life-time risk of testicular cancer among men experiencing couple infertility is 1.0-3.1%.
  • A method for assisting the diagnosing, prognostication or monitoring of testicular carcinoma and/or for assisting the prediction of the outcome of treatment of testicular carcinoma in an individual, based on a cytological sample from said individual, which has been tagged with one or more tags capable of tagging one or more distinctive cell types characteristic of testicular carcinoma, said method comprising the steps of:
  • the distinctive cell type is a cancer cell or a precursor of a cancer cell.
  • the cancer cell or the precursor of the cancer cell is a CIS cell, seminoma or non-seminoma.
  • the cytological sample is a semen sample and the distinctive cell type is a CIS cell.
  • the tags are in situ staining markers.
  • the one or more tags are in situ staining markers capable of identifying one or more targets selected from the group consisting of ΑΡ-2γ, OCT3/4, NANOG, alkaline phosphatase, SOX2, SOX15, SOX17, E2F1, IFI16, TEAD4, TLE1, TATDN2, NFIB, LMO2, MECP2, HHEX, XBP1, RRS1, MYCN, ETV4, ETV5, MYCL1, HIST1H1C, WDHD1, RCC2, TP53, MDC1, ALPL, DPPA4, TCL1A, MTHFD1, ENAH, ZDHHC9, NME1, SDCBP, SLC25A16, ATP6AP2, PODXL, PDK4, PCDH8, RAB15, EVI2B, LRP4, B4GALT4, CHST2, FCGR3A, CD53, CD38, PIGL, CKMT1B, RAB3B, NRCAM, KIT, ALK2, PDPN, HRASLS3, and TRA-1-60.
  • the one or more tags are one or more in situ staining markers capable of identifying one or more targets selected from the group consisting of the transcription factors ΑΡ-2γ, OCT3/4, NANOG, GATA-4, GATA-6 and FOG-2, alkaline phosphatase, epidermal growth factor receptor (EGFR), and Ki67.
  • markers are capable of identifying the transcription factor ΑΡ-2γ. 11) The method according to items 7 to 10, wherein one or more in situ staining markers are capable of identifying intrinsic enzyme activity of alkaline phosphatase.
  • the method according to any of the preceding items includes thresholding the image with respect to colour or fluorescence of the tag.
  • the method according to any of the preceding items, where the segmentation discriminates pixels in the digital image resembling pixels in digital representations of tagged distinctive cells from pixels not resembling pixels in digital representations of tagged distinctive cells, by use of an image analysis detection algorithm segmenting the image according to spectral, spatial, contextual and/or morphological information.
  • the spectral information of the nucleus and/or the cytoplasm is obtained by tagging the nucleus and/or the cytoplasm with one or more chromogens and/or fluorophores, thereby providing a characteristic colour and/or fluorescence to the nucleus and/or cytoplasm of the distinctive cell.
  • the segmentation in step b) includes classification of the form of the clusters of pixels potentially representing the one or more distinctive cell types, such as area and/or periphery of the clusters of pixels.
  • the segmentation includes classification of the variation of intensity in clusters of pixels potentially representing the one or more distinctive cell types.
  • the segmentation includes classification of the texture of clusters of pixels potentially representing the one or more distinctive cell types.
  • the method according to any of the preceding items including tagging using at least two different stains, each stain tagging a different part of the cell.
  • the method according to any of the preceding items including classification of variation of intensity of pixels representing at least one other tag.
  • the method according to any of the preceding items including classification of texture of pixels representing at least one other tag.
  • the medium is a glass slide or a compact disc. 41) The method according to item 40, wherein the glass slide includes fiducial lines or fiducial points for easing the autofocusing in microscopy or scanning of samples with limited intrinsic and tagging contrast.
  • The method according to the previous items, wherein the one or more clusters of pixels or individual pixels which have obtained a critical grade in step c) are presented as a corresponding digital representation and/or grade to a medical professional for assessment of the screening result of each of said individuals.
  • step c) involves the use of an automated method.
  • step c) involves the use of a machine learning method with optimized parameters for grading or classifying cells, wherein said machine learning method is selected from the group consisting of a decision tree, an artificial neural network and a Bayesian network.
  • step c) involves the use of a machine learning method with optimized parameters for grading or classifying cells, wherein said machine learning method is a decision tree or a classification tree.
  • step c) involves a transformation of the data of the digital representation.
  • step c) involves the use of a machine learning method with optimized parameters for grading or classifying cells, wherein said parameters comprise one or more parameters selected from the textural information, the difference between green chromaticity and blue chromaticity mean, the profile area, the mean red chromaticity, the mean 30% highest intensity, the blue chromaticity 30%, the red chromaticity 30%, the red chromaticity variance, the circularity, the extrinsic red chromaticity, the distance to center variance, and the combined circularity and distance to center skewness.
  • step c) involves the use of a machine learning method with optimized parameters for grading or classifying cells, wherein said parameters comprise one or more of the parameters selected from textural information, the mean red chromaticity and the distance to center variance.
  • step c) involves the use of a machine learning method with optimized parameters for grading or classifying cells, wherein said parameters comprise textural information.
  • step c) involves the use of a machine learning method with optimized parameters for grading or classifying cells, wherein said parameters comprise the mean red chromaticity.
  • step c) involves the use of a machine learning method with optimized parameters for grading or classifying cells, wherein said parameters comprise the distance to center variance.
  • step c) involves the use of a machine learning method with optimized parameters for grading or classifying cells, wherein said parameters comprise at least the textural information, the difference between green chromaticity and blue chromaticity mean, the profile area, the mean red chromaticity, mean 30% highest intensity, the blue chromaticity 30%, the red chromaticity 30%, the red chromaticity variance, the circularity, the extrinsic red chromaticity, distance to center variance, combined circularity and distance to center skewness.
  • step c) involves the use of a decision tree having one or more of the following parameters and corresponding thresholds:
  • Textural Information: minimum threshold 10.000, maximum threshold 800.000, and/or
  • Mean green chromaticity - blue chromaticity: minimum threshold -inf, maximum threshold 0.060, and/or
  • Profile Area: minimum threshold 100.000, maximum threshold 800.000, and/or
  • Mean Red Chromaticity: minimum threshold 0.375, maximum threshold inf, and/or
  • Textural Information: minimum threshold 10.000, maximum threshold 508.522, and/or
  • Blue Chromaticity 30%: minimum threshold 0.309, maximum threshold 0.332, and/or
  • Red Chromaticity 30%: minimum threshold 0.395, maximum threshold 0.417, and/or
  • Circularity: minimum threshold 44.601, maximum threshold inf, and/or
  • Red Chromaticity 30%: minimum threshold 0.351, maximum threshold inf, and/or
  • Red Chromaticity 30%: minimum threshold 0.388, maximum threshold 0.410, and/or
  • Red Chromaticity Variance: minimum threshold 569.270, maximum threshold inf, and/or
  • Textural Information: minimum threshold 797.579, maximum threshold 800.000, and/or
  • Profile Area: minimum threshold -inf, maximum threshold 800.000, and/or
  • Mean Red Chromaticity: minimum threshold 0.393, maximum threshold 0.429, and/or
  • Mean Red Chromaticity: minimum threshold 0.378, maximum threshold inf, and/or
  • Mean green chromaticity - blue chromaticity: minimum threshold 0.012, maximum threshold 0.035, and/or
  • Textural Information: minimum threshold 74.938, maximum threshold 508.522, and/or
  • Mean green chromaticity - blue chromaticity: minimum threshold 0.016, maximum threshold 0.025, and/or
  • Mean Red Chromaticity: minimum threshold 0.366, maximum threshold inf, and/or
  • Mean Red Chromaticity: minimum threshold 0.369, maximum threshold inf, and/or
  • Profile Area: minimum threshold 99.160, maximum threshold 800.000, and/or
  • Mean green chromaticity - blue chromaticity: minimum threshold 0.030, maximum threshold 0.060, and/or
  • Red Chromaticity Variance: minimum threshold 531.349, maximum threshold inf, and/or
  • Distance to Center Variance: minimum threshold 3.577, maximum threshold inf, and/or
  • Profile Area: minimum threshold 99.160, maximum threshold 800.000, and/or
  • Textural Information: minimum threshold 74.938, maximum threshold 800.000, and/or
  • Red Chromaticity 30%: minimum threshold 0.373, maximum threshold inf, and/or
  • Red Chromaticity 30%: minimum threshold 0.380, maximum threshold 0.388, and/or
  • Circularity: minimum threshold 28.229, maximum threshold 44.601, and/or
  • Mean green chromaticity - blue chromaticity: minimum threshold -0.007, maximum threshold 0.060, and/or
  • Profile Area: minimum threshold 251.800, maximum threshold 442.600, and/or
  • Profile Area: minimum threshold 99.160, maximum threshold 800.000, and/or
  • Mean 30% highest intensity: minimum threshold 364.438, maximum threshold 605.723, and/or
  • Mean green chromaticity - blue chromaticity: minimum threshold 0.012, maximum threshold 0.025.
  • a control module comprising instructions for carrying out said method.
  • a computer readable medium comprising instructions for carrying out a method as defined by items 1 to 53.


Abstract

The present invention relates to methods for assisting the diagnosing, prognostication or monitoring of testicular carcinoma, and/or for assisting the prediction of the outcome of treatment of testicular carcinoma in an individual, based on a cytological sample from said individual. Said method involves steps for the imaging, detection, and grading of objects in cytological samples or digital representations thereof, as well as a microscope-, slide scanner- or flow cell-based system and a computer readable medium therefore.

Description

Automated imaging, detection and grading of objects in cytological samples

Field of invention
The present invention relates to methods for the imaging, detection, and grading of objects in cytological samples or digital representations thereof, as well as a microscope-, slide scanner- or flow cell-based system and a computer readable medium therefore.
Background of invention
The two major purposes of cytology are:
- Cell biology: the study of cellular anatomy, function and chemistry, typically either under normal healthy anatomical conditions, or under artificial cell culture conditions;
- Cytopathology: the study of cellular disease and the use of cellular changes for the diagnosis, prognosis, screening and/or treatment prediction of disease.
Microscopy is a central and traditional method for cytology by which objects of interest in cell biology or cytopathology can be studied through magnification and improved visual representation of the objects. The objects by definition are individual cells or fractions or clusters thereof, but could also include e.g., connective tissue, viruses, or even non-biological constituents.
Visual examination in cytological microscopy is traditionally done by the operator at the microscope, but the recent availability of slide scanners ("virtual microscopes") has allowed the observations to be done off-line, i.e. at a location away from the microscope, on digital representations of the magnified specimen presented on a computer monitor.
Histology is a special type of cytology, in which microscopy is used to investigate a thin slice of a tissue isolated by biopsy, and preserved and handled by methods such as fixation, embedding, and sectioning. The tissue section is typically mounted on a glass or plastic slide and inspected by transmitted light microscopy either unstained or after a staining procedure emphasizing particular microanatomic and molecular features of interest.
In other situations, the cytological sample is a suspension of cells originating from e.g., a body fluid, a bodily surface or tissue from which cells are segregated or aspirated and then suspended, or a cell culture. Also in this situation, transmitted light microscopy is a central method for investigation, and the cell sample is typically prepared by making a smear, a cytospin or an imprint on the surface of a glass or plastic slide, before most likely being exposed to fixation, and staining similarly to the methods for histology.
Other types of inspection of the cytological sample than transmitted light microscopy could also be used either alone or in combination with transmitted light microscopy. For example, widefield or confocal fluorescent light microscopy, where fluorophores are used as indirect markers of microanatomic or molecular features of interest.
Classically, the objects of major relevance in a cytological sample have been detected by the microscope operator according to particular characteristics being featured by the object. However, the involvement of the operator for at least the detection and imaging of the relevant objects, and possibly also for the analysis, has severely restricted the practical possibilities of applying thorough cytological microscopy to a large number and/or a large volume of cytological samples. This is particularly a problem when the object of relevance is rare and small, since the time spent searching at high magnification in large surface areas of samples is so long that the procedure becomes costly and inefficient.
A highly relevant example of this situation is a recently invented method for detection of carcinoma-in-situ (CIS) cells in human semen samples. Virtually all testicular cancers originate from this precursor cell. If left untreated, CIS will invariably progress into testicular cancer. Unfortunately, it has hitherto been practically almost impossible to diagnose CIS, since it has required a testicular biopsy at this asymptomatic stage of the progressing disease.
However, various specific and characteristic molecular markers, including the transcription factor ΑΡ-2γ, were recently identified in CIS cells (1). By the use of immunocytochemical staining for ΑΡ-2γ, it was possible to detect CIS cells released into the seminal fluid in men with early stages of testicular cancer (2).
Due to the low concentration of CIS cells in positive samples (typically in the lower end of 0-100 cells per ml of seminal fluid), in combination with the relatively large volume of a semen sample (typically 1-5 ml), economical constraints have severely restricted the possibility to convert this important finding into routine diagnosis, since the checking of a semen sample for CIS cells can easily require several hours of manual microscopy. Obviously, the labour intensity of this method therefore rules out the possibility of screening apparently healthy men for the presence of CIS cells, even though it is known that approx. 1% of all men will develop testicular cancer.
Thus, there is an unmet need for automating cytological microscopy in general, and a specific need for an automated process and a device, which can reduce the labour intensiveness of imaging, detecting, and grading of objects in semen samples, which resemble CIS cells.
Summary of invention
The present invention relates to new image analysis methods for use in the processing of digital images acquired by slide microscopy or slide scanning of cytological samples, wherein the new methods are capable of automated detection of objects resembling particular cells, structures or fractions thereof, and of automated grading, scoring and/or ranking of this resemblance. The present methods provide tools for enhancing and detecting objects in digital representations of stained tissue samples or cell samples, thereby facilitating subsequent processing of the representation, including quantification of the staining and grading of the enhanced objects. The present methods also provide tools for arranging the images of objects in a cytological sample in a systematic order for presentation, or for selecting the most appropriate coordinates of the sample for further image acquisition and analysis, or for manual review, contributing to documentation of the status of the sample.
Accordingly, in a first aspect the invention relates to a method for assisting the diagnosing, prognostication or monitoring of testicular carcinoma, and/or for assisting the prediction of the outcome of treatment of testicular carcinoma in an individual, based on a cytological sample from said individual, which has been tagged with one or more tags capable of tagging one or more distinctive cell types characteristic of testicular carcinoma, said method comprising the steps of:
a) obtaining a digital representation of the tagged sample,
b) segmenting the digital representation according to an image analysis detection algorithm into one or more pixel segments potentially representing the one or more distinctive cell types, and one or more pixel segments not representing said one or more distinctive cell types,
c) grading images and/or clusters of pixels or individual pixels belonging to the pixel classes potentially representing the one or more distinctive cell types according to their degree of resemblance to a corresponding tagged archetypical distinctive cell.
The present invention allows for an automatic method for assisting the diagnosing, prognostication or monitoring of testicular carcinoma, and thereby the present invention allows, for the first time, a realistic method for screening for testicular carcinoma in groups of men; accordingly, the present invention also relates to the use of the method for screening a population for testicular carcinoma.
Description of Drawings
The present document can only reproduce the images as black-and-white (B/W) copies or greyscale representations, not in colour. The original digital colour images are reproduced as B/W-copies, and therefore will not provide the reader with all the information available in the original digital colour images. The segmented images are typically greyscale representations or binary representations.
Figure 1a shows a B/W photocopy of an original digital colour representation of a microscopically magnified area of a semen sample on a glass slide after immunostaining for ΑΡ-2γ (reddish in original digital colour image) and active enzyme-staining for alkaline phosphatase (bluish in original digital colour image).
Figure 1b shows a greyscale representation obtained by transforming the RGB colours of each pixel in the original digital colour image represented in Figure 1a into a red chromaticity value, so that pixels with high red chromaticity values are shown in white/light grey and pixels with low red chromaticity values are shown in black/dark grey.

Figure 1c shows a greyscale representation obtained by transforming the RGB colours of each pixel in the original digital colour image represented in Figure 1a into a blue chromaticity value, so that pixels with high blue chromaticity values are shown in white/light grey and pixels with low blue chromaticity values are shown in black/dark grey.
Figure 1d is a binomial representation of Figure 1b, which is made by converting each pixel with a red chromaticity value above a defined threshold, T_R, to a black pixel, and all other pixels to white pixels. Subsequently, black pixels were converted to white, if the intensity of the corresponding pixel in the original digital colour image represented in Figure 1a was below a defined intensity threshold, T_I.
Figure 1e is a binomial representation of Figure 1c, which is made by converting each pixel with a blue chromaticity value above a defined threshold, T_B, to a black pixel, and all other pixels to a white pixel. Subsequently, black pixels were converted to white, if the intensity of the corresponding pixel in the original digital colour image represented in Figure 1a was below a defined intensity threshold, T_I.

Figure 1f is a binomial representation combining data from Figures 1b and 1c, which is made to identify purplish pixels by converting each pixel with a red chromaticity value above a defined threshold, T_R2, and a blue chromaticity value above a defined threshold, T_B2, to a black pixel and all other pixels to a white pixel. Subsequently, black pixels were converted to white, if the intensity of the corresponding pixel in the original digital colour image represented in Figure 1a was below a defined intensity threshold, T_I.
Figure 1g outlines the Four Neighbour Connectivity rule, where the "X" marks the pixel currently being investigated, and the four grey pixels (marked "+") are the pixels that are defined to be connected to this particular pixel. Though many other definitions are possible, a cluster of pixels is often defined according to the Four Neighbour Connectivity rule.
Figure 1h is a binomial representation based on data from Figure 1d, for which pixels are converted from black to white, if they belong to a cluster of black pixels which has an area below a defined minimum reddish profile area-threshold, A_R,min, or an area above a defined maximum reddish profile area-threshold, A_R,max. The remaining black pixels represent clusters of reddish pixels in the original digital colour image with an admissible profile area. In this example, there are four reddish clusters, numbered 1 to 4.
Figure 1i is a binomial representation based on combining data from Figures 1e and 1f, so that pixels which are black in either or both of these figures become black, and only pixels which are white in both of these figures become white. The black pixels represent pixels which are bluish and/or purplish in the original digital colour image represented in Figure 1a.
Figure 1j is a binomial representation based on data from Figure 1i, for which pixels are converted from black to white, if they belong to a cluster of black pixels which has an area below a defined minimum bluish/purplish profile area-threshold, A_B/P,min, or an area above a defined maximum bluish/purplish profile area-threshold, A_B/P,max. The remaining black pixels represent clusters of bluish and/or purplish pixels in the original digital colour image with an admissible profile area. In this example, there is one bluish/purplish cluster, numbered 5. Note that this presentation of the data helps to filter objects which resemble CIS cells from objects which are less likely to represent CIS cells.
Figure 2 is an overview corresponding to Table 1 including B/W photocopies of exemplifying colour images of typical specific parameters measured for the relevant pixel clusters and individual pixels representing objects in the original digital image. The majority of these specific parameters relate to distributions of pixel colors and positions, but they could be any descriptive parameter, such as e.g. circumference, circularity and other shape factors.
Figure 3 corresponds to Table 2 and shows B/W photocopies of colour image examples corresponding to manual grading distributions for the 2,437 relevant objects selected by expert assessors in 506 whole slide semen cytospin scanning images from men in routine subfertility work-up. These objects were used for developing and optimizing the classification tree (Table 3), which ensures the automated grading of CIS cell like objects.

Figure 4 shows B/W photocopies of eight examples of colour images of the archetypical CIS testis cell, as it may appear after being released into the seminal fluid, isolated by cytospinning, and stained for intrinsic phosphatase activity (in this example: blue cytoplasm) and by immunocytochemistry for ΑΡ-2γ (in this example: red nucleus).
Definition
For the purposes of interpreting this specification the following definitions shall apply and whenever appropriate, terms used in the singular shall also include the plural and vice versa:
Diagnosing: As used herein, "diagnosing" means the process of recognizing a disease or condition by its signs or symptoms in an individual. Typically, the individual being diagnosed will not have had a similar diagnosis previously, but in some cases a similar or identical diagnosis will be made after the patient was treated for the first outbreak of the disease or condition. For example, a patient diagnosed with unilateral CIS of the testis may be treated and appear healthy for a period, and then subsequently be diagnosed with CIS in the other testicle.
Prognosticating: As used herein, "prognosticating" means the process of describing the likely progression and outcome of the disease or condition in a patient. Typically, the described outcome would assume no treatment or standard treatment of the patient, and be based on medical experiences from individuals with similar general characteristics (e.g., gender, age, physical condition, etc.) and with similar results of analyses of prognostic indicators. For example, the identification of dissemination of a testicular cancer will influence the prognosis of the patient.

Predicting outcome of treatment: As used herein, "predicting the outcome of treatment" means the process of foreseeing the consequences of treating a patient in a specific way before commencing the treatment. Typically, the prediction will be based on analyses indicating the individual patient's response to the relevant medicaments, including chances of beneficial action and risk of side effects. For example, analysis for resistance to a particular chemotherapy of isolated cancer cells may be useful before choosing the medication for a patient.
Monitoring: As used herein, "monitoring the effect of a treatment" means the process of regularly checking a patient during and after treatment for possible remission. Typically, the monitoring will include sensitive analyses for the disease or condition being treated, since even an early, minor and normally rather unspecific sign may be a warning of an insufficient treatment result. For example, a patient who is, or has been, undergoing treatment for CIS testis may be followed by regular analyses for presence of CIS cells in ejaculates.
Population screening: As used herein, "population screening for a disease" means the process of systematically analyzing a group of individuals for the presence of a disease. Typically, the analyses will be able to identify individuals with a non- symptomatic stage of the disease. The screening may be offered to individuals belonging to groups having an increased risk of the disease, or to even wider groups merely defined by their age, race and/or gender. For example, men experiencing infertility have a higher risk of CIS testis, and since they will be prone to analysis of semen quality anyway, they would be likely candidates for a screening for CIS cells in their ejaculate.
Cell: As used herein, the expression "cell" does not deviate from the meaning understood by a person skilled in the art, but it is important that it is not limited to intact cells, but also includes fractions thereof still having a morphology resembling a cell. Cytological material can undergo degradation, such as enzymatic lysis, either when still in the body or after isolation. Also cytospinning and subsequent handling, such as staining, may lead to a partial degradation of cells. For example, the CIS cell can be degraded due to the harsh environment in the semen fluid, and in particular its cytoplasm may not appear intact when inspected by microscopy.
Nucleus: As used herein, the expression "nucleus" does not deviate from the meaning understood by a person skilled in the art i.e., an organelle containing almost all of the cell's genetic material. However, the focus in the current context is on various staining techniques used for tissue sections or cytological samples, which may lead to staining of the nucleus either generally for eukaryotic cells, e.g., by hematoxylin, or specifically for the nucleus of certain cells, e.g. by immunohistochemistry for a nuclear antigen, such as ΑΡ-2γ in the nucleus of CIS testis cells.
Cytoplasm: As used herein, the expression "cytoplasm" does not deviate from the meaning understood by a person skilled in the art i.e., entire contents of a eukaryotic cell excluding the nucleus, and bounded by the plasma membrane. However, the focus in the current context is on various staining techniques used for tissue sections or cytological samples, which may lead to staining of the cytoplasm either generally for eukaryotic cells, e.g., by eosin, or specifically for the cytoplasm of certain cells, e.g. by staining for intrinsic enzyme activity, such as phosphatase activity in the cytoplasm of CIS testis cells.
Object: As used herein, the expression "object" in relation to a cytological sample typically means a structure being a cell or resembling a cell. More generally, as used herein objects are the structures, which are to be automatically detected and graded, when represented in digital images. Beyond cells they could be e.g., organelles, cell fragments, matrix or crystals.
Tagging: As used herein, the expression "tagging" in relation to a cytological sample means to perform one or more chemical reaction steps in order to bind or associate molecules to objects in the sample, in such a way that these molecules can be indirectly or directly identified by microscopy or other detection methods. The term tagging is used herein interchangeably with the term "staining". Typically, the tagging can be by a specific antibody raised against a target antigen characteristic for the object to be identified. In such cases, the tags must be indirectly labeled by a detection reaction typically including either a fluorophor or an extrinsic enzyme which is able to convert a soluble substrate to a precipitated colored product, as is the case for fluorescence and chromogenic immunocytochemistry, respectively. In more rare cases, the tagging can be by a substrate for an intrinsically active target enzyme characteristic for the object to be identified. In such cases, the substrate tag directly labels the object when it is converted into a precipitated colored product. Also, notice that tagging is typically an equilibrium reaction and that the tagging therefore can be almost complete or just partial depending on the efficiency of tagging the particular target.

Digital image: As used herein, the expression "digital image" does not deviate from the meaning understood by a person skilled in the art, and it includes both very large images typically derived from slide scanners, and smaller images acquired by digital cameras positioned on microscopes.
Pixel: The term pixel is generally used for the smallest single component of a digital image. The term includes "printed pixels", pixels carried by electronic signals, or represented by digital values, or pixels on a display device, or pixels in a digital camera (photosensor elements). Synonyms of "pixels" are picture element, pel, sample, byte, bit, dot, spot. The person skilled in the art will recognize that the list of synonyms of pixel is not complete and that other terms for pixel are also covered by the invention.
Cluster: As used herein, the expression "cluster" in relation to pixels means a group of pixels belonging to the same segment (or perhaps one of several joined segments), and inter-positioned according to a predefined neighbor relationship, such as the Four Neighbour Connectivity rule (see Fig. 1 g). For the present invention, clusters of pixels are identified by segmentation of the digital image according to spectral, spatial, contextual, and morphological information, and subsequently graded for their degree of resemblance to digital representations of objects of interest, such as CIS cells.
For a given digital image, there are a number of methods of manipulating and extracting information in order to get a specific result. Below is an overview of expressions from the field of image analysis, which are used herein:

Transformation: Given a digital image as input, a transformation outputs an image based on a per pixel arithmetic operation. Thus, each pixel in the output image is calculated by performing a given arithmetic operation on the corresponding pixel in the input image. An example of a transformation could be a basic intensity calculation, given a three band image with each pixel containing a red, green and blue value as input. Each pixel value in the transformed image is calculated as the sum of the red, green and blue values in the corresponding input pixel divided by three (the number of colour values).
Spectral information: As used herein, the expression "spectral Information" includes any quantitative color and intensity information, and values derived from multiple color information such as e.g., the hue calculated from the red, green and blue channel of individual or clusters of digital image pixels.
Spatial, contextual and morphological Information: As used herein, the expression "spatial, contextual and morphological Information" includes any quantitative
information derived from the location of, relationship between, and form or shape of individual or clusters of digital image pixels.
Normalization: A normalization changes the range of pixel values in an image and is a special case of a transformation. An example of a normalization could be a basic stretching of pixel values from the minimum pixel value in the image to the maximum pixel value in the image. In a single band image, each pixel value in the transformed image is calculated by subtracting the minimum value from the corresponding input pixel value and dividing this by the maximum value minus the minimum value.
Filtering: Given a digital image as input, a filtering outputs an image based on a given per pixel calculation. Each pixel in the output image is calculated by performing an arithmetic operation on a given number of pixels in the input image. These pixels are defined by a given filter, which can be envisioned as a small image denoting which pixels are to be included in the calculation relative to the pixel the calculation is performed for. An example of a filtering could be a mean filter, using a 3 by 3 pixel filter centred on the pixel the calculation is performed for. The value of each pixel in the output image is calculated by summing up the values in the input image under the filter and dividing by the number of pixels under the filter (for example 9).
Segmentation: When applied to digital images, the actual goal of a segmentation is to divide images, pixel clusters or individual pixels into segments. This segmentation is performed by a given algorithm and is based on a given number of descriptive parameters per image, pixel cluster or individual pixel. As used herein, "classification" is not used interchangeably with "segmentation", since "segmentation" is a term used for image transformation, whereas "classification" is used only for grading images, pixel clusters, pixels and objects.
Thresholding: As used herein, thresholding is a subset of segmentation. For example, it could be a basic threshold on the green band given a threshold value and given a three band image with each pixel containing a red, green and blue value as input. A preferred segmentation criterion for the invention is; each pixel in the output image is set to 1 if the corresponding pixel in the input image has a green value that is higher than the threshold value, otherwise it is set to 0.
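The four operations defined above compose as follows; a sketch using the document's own worked examples (intensity as the mean of R, G and B; min-max normalization; a 3-by-3 mean filter; a fixed threshold on the green band). The input image and threshold value are placeholders.

```python
import numpy as np
from scipy import ndimage

rgb = np.random.randint(0, 256, size=(64, 64, 3))  # stand-in input image

# Transformation: per-pixel intensity = (R + G + B) / 3.
intensity = rgb.sum(axis=2) / 3.0

# Normalization: stretch values to [0, 1] via (v - min) / (max - min).
norm = (intensity - intensity.min()) / (intensity.max() - intensity.min())

# Filtering: 3 x 3 mean filter centred on each pixel (9 pixels per sum).
smoothed = ndimage.uniform_filter(norm, size=3)

# Thresholding: 1 where the green band exceeds the threshold, else 0.
green_threshold = 128
segmented = (rgb[..., 1] > green_threshold).astype(np.uint8)
```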
Other expressions used herein requiring specific definitions are (in alphabetical order):

AP-2γ: AP-2gamma, ΑΡ2γ, used herein interchangeably with TFAP2C.
Alkaline phosphatase: Placenta-like alkaline phosphatase is used herein interchangeably with "alkaline phosphatase" and ALPPL2. Also the more general expression "phosphatase" is used, particularly when describing staining methods for the intrinsic enzyme activity in the cytoplasm of CIS testis cells.
Distinctive cell: is a cell which is capable of being tagged by one or more tags as mentioned herein. A distinctive cell type can further be a cell type with characteristics of a testicular carcinoma. Among distinctive cells are for example archetypical distinctive cells.
Archetypical distinctive cell: Selected examples of cells showing the typical features of that distinctive cell. The selection is typically done by experienced experts in the relevant field. E.g., certain experts in the field of testicular cancer will know how an archetypical CIS testis cell looks when it has been released into the semen fluid, isolated by cytospinning and stained for ΑΡ-2γ and intrinsic phosphatase activity. Examples of such isolated and stained archetypical CIS testis cells are shown in Figure 4, which clearly shows that some variation in appearance is accepted for archetypical CIS testis cells.
Chromaticity: An objective specification of the quality of a color regardless of its luminance, that is, as determined by its hue and colorfulness.
CIS: Abbreviation for 'carcinoma in situ'. An example is carcinoma in situ of the testis.

Classification tree: A classification tree is used to predict/classify membership of cases or objects in the classes of a categorical dependent variable from their measurements on one or more predictor variables (parameters). The synonyms regression tree and decision tree are used interchangeably with classification tree in this document.
Cross validation: Cross validation is the practice of partitioning the available data into multiple subsamples such that one or more of the subsamples is used to fit (train) a model (herein a classification tree), with the remaining subsamples being used to test how well the classification tree performs.
Cytological sample: A sample or suspension of cells originating from e.g., a body fluid, a bodily surface or tissue. A cytological sample can be obtained by biopsy, a fine needle biopsy, a tissue brushing, scraping of cells, or collecting a body fluid. Cells of a cytological sample can be segregated or aspirated and then suspended, from a biopsy or a cell culture.
Cytospin: A collected sample, typically including cells, is fixed onto a microscope slide by centrifugation.
Ejaculate: The fluid emitted from the male penis that contains, among other things, sperm. It may or may not contain sperm cells (spermatozoa) and is herein used interchangeably with the expressions "semen sample" or "seminal fluid".

Fiducial marker: A fiduciary marker or fiducial point is an object used in the field of view of an imaging system, which appears in the image produced, for use as a point of reference or a measure.
Grade: "Grade" and "grading" are used herein to describe the result and process of assigning a score to an object. The assigned score may be on a continuous scale or distinct. The grading in image analysis could be based on a single pixel, but typically involves one or more clusters of pixels. The grading of one or more objects in a patient sample may contribute to the assessment of a diagnosis, prognosis, and/or prediction of treatment.

Immunohistochemistry: The process of localizing proteins in cells of a tissue section by employing antibodies specific for the proteins of interest and using a method that allows the antibodies to bind to the proteins while in a relatively native setting in a biological tissue section.

Marker: An indicator signaling an event or condition in a biological system or sample, giving a measure of status, exposure, susceptibility and more of the biological system, dependent on the marker. A marker is herein the presence of a gene or product(s) hereof, the presence or relative level of which, alone or in combination with other markers, may indicate a neoplastic and/or cancerous state.
Membrane: A membrane of a cell or an organelle in said cell, for example a cell membrane or a nuclear membrane. A membrane protein or marker is attached to, associated with, or interacting with a membrane.

Risk group: Certain groups of individuals having common relevant traits will have a different risk than the rest of the population not having those traits of acquiring particular disease(s). Individuals belonging to a high risk group for a certain disease may benefit from particular actions to avoid getting exposed to the disease or to regularly monitor whether they have acquired the disease. For example, men who experience fertility problems, men who were affected by cryptorchidism in childhood, and men who were already treated for unilateral testis cancer will belong to (high) risk groups for CIS testis, and may choose more frequent or more thorough examinations for this disease than will men not having experienced fertility problems, and not having had cryptorchidism or unilateral testis cancer.
Virtual slide: or whole slide image is a digital image file typically acquired by a slide scanner (also called a virtual microscope) characterized by covering a microscope glass slide in its entirety or a substantial part thereof in (very) high resolution, e.g. one billion pixels. Virtual slides allow rare cell event detection to be performed away from the microscope by use of a computer and an image analysis algorithm.
Detailed description of the invention
Many diseases, such as microbial infections and various cancers, are characterized by being initiated by asymptomatic changes of normal cells to diseased cells. Detection of such changes before a disease matures into clinical symptoms and disease progression can be essential for the opportunity to treat the disease, and may also reduce side effects and disease management costs. The present invention provides a tool for assisting the diagnosing, prognostication or monitoring of testicular carcinoma through image analysis assistance in the grading of the potential carcinoma or carcinoma in situ (CIS) cells.
The above mentioned tool can be applied as an automated screening method for CIS cells in semen samples as described herein, and is particularly useful since it can lead to detection of asymptomatic changes of normal cells to diseased cells. Detection of such changes before a disease matures into clinical symptoms and disease progression can be essential for the opportunity to treat the disease, and may also reduce side effects and disease management costs. Automated methods of the present invention may further have the advantage that screening can be done for a larger group of patients than by using the conventional methods, since the method can be less costly and involve less inconvenience or pain to the patients, due to the analysis of semen samples instead of testicular biopsies.
Screening
Due to the fact that the invention described herein provides methods capable of identifying and grading potential carcinoma or carcinoma in situ (CIS) cells automatically, the methods according to the invention are particularly suitable for screening purposes. The invention may be used in general screening for testicular carcinoma in a population of individuals. However, the invention may of course also be used when diagnosing individuals at risk of acquiring testicular carcinoma, or individuals suspected of having acquired testicular carcinoma, such as when the individual is a male examined for infertility.
In a preferred embodiment, the result is presented to a medical professional before a final diagnosis is made. In particular, where one or more clusters of pixels or individual pixels have obtained a critical grade in step c), the result is presented as a corresponding digital representation and/or grade to a medical professional for assessment of the screening result of each of said individuals. The result may be presented as the digital image having markings or labels in areas representing suspected cells.
Cells
The distinctive cell type may be any cell type that is indicative of the testicular carcinoma or CIS, such as a cancer cell or a precursor of a cancer cell. Preferably, the cancer cell or the precursor of the cancer cell is a CIS cell, seminoma or non-seminoma.
The sample may in principle be any type of sample suitable for obtaining the cell type; it is however most preferred that the cytological sample is a semen sample, and most preferred that the cell is a CIS cell and the sample is a semen sample.
Image analysis
The automated method according to the invention includes, in a first aspect, a method for assisting the diagnosing, prognostication or monitoring of testicular carcinoma, and/or for assisting the prediction of the outcome of treatment of testicular carcinoma in an individual, based on a cytological sample from said individual, which has been tagged with one or more tags capable of tagging one or more distinctive cell types characteristic of testicular carcinoma, said method comprising the steps of:
a) obtaining a digital representation of the tagged sample,
b) segmenting the digital representation according to an image analysis detection algorithm into one or more pixel segments potentially representing the one or more distinctive cell types, and one or more pixel segments not representing said one or more distinctive cell types,
c) grading the image and/or clusters of pixels or individual pixels belonging to the pixel classes potentially representing the one or more distinctive cell types according to their degree of resemblance to a corresponding tagged archetypical distinctive cell.
Segmentation

Due to the fact that testicular carcinoma cells or CIS cells are, first of all, present in a very low concentration in the samples, and, second, that the enzymatic conditions of the seminal fluid tend to degrade the cells, testicular carcinoma cells or CIS cells represent an inhomogeneous cell population, which poses a serious problem when performing image analysis. Using image analysis, the normal approach is to identify the common features of the object to be inspected; however, testicular carcinoma cells or CIS cells vary with respect to size, stainability, presence of a cell nucleus, as well as presence of a cell membrane, whether intact or fragmented. Therefore, it has not been possible to find a conventional image analysis common denominator for the cells, and the invention relates to a combination of variables in order to identify all relevant cells.
The segmentation may be achieved by clustering, Bayesian segmentation, and/or thresholding the image with respect to colour or fluorescence of the tag.
Spectral, spatial, contextual and/or morphological information
The detection algorithm of the segmentation preferably segments the image according to spectral, spatial, contextual and/or morphological information.
The spectral information is preferably suitable for discriminating pixels resembling pixels in digital representations of tagged distinctive cells from pixels not resembling pixels in digital representations of tagged distinctive cells, and the spectral information is obtained by tagging the nucleus and/or the cytoplasm with one or more chromogens and/or fluorophores, thereby providing a characteristic colour and/or fluorescence to the nucleus and/or cytoplasm of the distinctive cell.
The spatial, contextual and/or morphological information is preferably suitable for discriminating pixels resembling pixels in digital representations of tagged distinctive cells from pixels not resembling such pixels. This may be achieved by emphasizing the spatial, contextual and/or morphological information by tagging the nucleus and/or the cytoplasm with one or more chromogens or fluorophores, thereby facilitating the identification of structures contributing to the location of, relationship between, and form or shape of the nucleus and/or cytoplasm.

In order to discriminate by morphology, the method may include, during the segmentation in step b), classification of the form of the clusters of pixels potentially representing the one or more distinctive cell types, such as the area and/or periphery of the clusters of pixels.
Furthermore, the method may include, during the segmentation in step b), classification of the variation of intensity of pixels potentially representing the one or more distinctive cell types. It has been found that the variation of the intensity is, either on its own or in combination with the form of the clusters, a characteristic measure for the cells during image analysis. In particular, clusters of pixels wherein the variation is substantial have been found to correlate with the cells to be identified, whereas some artefacts tend to have little or no intensity variation, probably because they are not of biological origin, such as crystals of chromogen.
Even more preferred is a method where the segmentation includes classification of the texture of pixels potentially representing the one or more distinctive cell types. Textural information includes any quantitative information derived from the spatial arrangement of spectral information of individual or clusters of digital image pixels. There are many approaches to quantifying textural information, including but not limited to statistical, structural and spectral approaches.
In a most preferred embodiment it has been found that the detection algorithm in the segmentation step comprises parameters relating to spectral information and textural information. Accordingly, in one embodiment the algorithm comprises at least the two following parameters:
Spectral information parameter, based on chromaticity, and calculated from the following Formula 1, wherein the tag is a red stain:

mu_(G-B) = (1/n) * SUM_(i=1..n) (C_green,i - C_blue,i)   (Formula 1)
wherein C_green is the green chromaticity, C_blue is the blue chromaticity and n is the number of pixels in the coloured object.

Textural information parameter, calculated according to Formula 2 and based on intensity and spatial information:

var(I) - var(I_filtered)   (Formula 2)

wherein I is the intensity values for pixels from the coloured object, so that var(I) is the variance in the intensity values of the coloured object. I_filtered is, like I, intensity values for the coloured object, with the exception that I_filtered is further filtered through a mean filter with neighbourhood connectivity on the intensity image; that is, for I_filtered each intensity value represents the mean of the intensity of its own and the four closest neighbouring pixels.

As described below, it is in some embodiments preferred that two or more different tags are used, each tag marking a different part of the cell type. When using more than one tag, the method may include classification of variation of intensity and/or classification of texture of pixels representing all different tags.
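Both parameters can be computed directly from an RGB image and a mask of the coloured object. The following Python sketch illustrates Formulas 1 and 2 under stated assumptions: the image is a float H x W x 3 array, the object mask is boolean, and the function name and the zero-division guard are illustrative additions, not part of the invention.

```python
import numpy as np
from scipy import ndimage

def spectral_and_textural_parameters(rgb, mask):
    """Formula 1 and Formula 2 for one coloured object.

    rgb:  H x W x 3 float array of R, G, B values.
    mask: H x W boolean array marking the n pixels of the object.
    """
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    intensity = r + g + b
    safe = np.where(intensity == 0, 1.0, intensity)  # avoid division by zero
    c_green = g / safe                               # green chromaticity
    c_blue = b / safe                                # blue chromaticity

    # Formula 1: mean difference between green and blue chromaticity
    # over the n pixels of the coloured object.
    spectral = float(np.mean(c_green[mask] - c_blue[mask]))

    # Formula 2: var(I) - var(I_filtered), where I_filtered replaces each
    # intensity by the mean of itself and its four orthogonal neighbours.
    cross = np.array([[0, 1, 0],
                      [1, 1, 1],
                      [0, 1, 0]], dtype=float) / 5.0
    filtered = ndimage.convolve(intensity, cross, mode="nearest")
    textural = float(np.var(intensity[mask]) - np.var(filtered[mask]))
    return spectral, textural
```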
Prior to segmentation

It is preferred that a series of steps are performed before the segmentation takes place. Thus, in one embodiment the segmentation in step b) includes thresholding the image with respect to colour or fluorescence of the tag. In particular, this relates to thresholding with respect to intensity of the staining.
When pixels potentially representing the one or more distinctive cell types are identified in the image, a clustering of the pixels may be performed by clustering of neighbouring pixels before the segmentation in order to establish clusters potentially representing the cells.
It is also preferred that the pixels potentially representing the one or more distinctive cell types, such as the clusters, are sorted with respect to size of clusters before segmentation. In a preferred embodiment they are sorted in relation to both an upper limit and a lower limit.

Grading
After the detection in the segmentation step has been performed, and pixels and/or clusters of pixels potentially representing the one or more distinctive cell types have been identified, the grading of the pixels or clusters of pixels is performed. Grading may be done manually on the samples identified as potentially including one or more distinctive cell types, but it is preferred that the grading is performed automatically as well.
The grading is performed by evaluating the clusters of pixels or individual pixels belonging to the pixel classes potentially representing the one or more distinctive cell types according to their degree of resemblance to a corresponding tagged archetypical distinctive cell.
In one embodiment the grading step assigns a grade to each cell identified by the method discussed above. In another embodiment the grading step assigns a grade to an image or a slide depending on the cells identified in said image or slide. In a most basic embodiment the grading step includes two grades: either positive, i.e. the cells are positively identified as representing the one or more distinctive cell types, or negative, i.e. no cells have been identified as representing the one or more distinctive cell types. In another embodiment the grading step includes at least three grades: positive or negative as defined above, or borderline, wherein borderline defines cells between clearly negative and clearly positive.
In yet another embodiment the grading step includes several grades, such as the following:

1. Artifact and/or staining precipitate. The object is definitely not a CIS cell.

2. The object has a high probability of not being a CIS cell, typically due to:
a. The object is too small to resemble a typical CIS cell nucleus, or
b. The object has a dubious or wrong morphology compared to a typical CIS cell nucleus.

2½. More CIS cell like than a grade 2, and less CIS cell like than a grade 3.

3. Borderline positive. The object could be a CIS cell, but uncertainty is typically due to:
a. A partially stained nucleus with an acceptable morphology, or
b. A low staining intensity, but correct morphology.

3½. More CIS cell like than a grade 3, but still not a clear grade 4.

4. A definite CIS cell with stained nucleus and correct morphology, and possibly also stained cytoplasm.
Machine learning method
An aspect of the invention relates to the use of an automated method for classification or grading of cells. Thus, in one embodiment of the invention, the automated method can be any machine learning method, such as a decision tree, an artificial neural network or a Bayesian network, which has optimized parameters that allow for grading or classifying cells.
In one preferred embodiment of the present invention, the machine learning method is a decision tree or classification tree. The parameters optimized for grading or classifying cells depend on the outcome of the actual staining method(s), including but not limited to the range of morphologies of the target cell, e.g. size range, shapes, and degradation profile; the cellular localization of the tagged marker(s), e.g. nucleus, cytoplasm or membrane; the detection principle, e.g. chromogenic or fluorogenic, and the colour(s) of the tag(s). The invention also relates to other parameters, such as spectral parameters and/or spatial and/or contextual and/or textural and/or morphological parameters.
In one embodiment the parameters used for the machine learning method can be one or more of the parameters mentioned in Table 1. The parameters may include low and high values of each of these parameters, as shown in Table 1. Figure 2 is a tabular overview, with corresponding B/W photocopies of exemplifying images, of typical specific parameters measured for the relevant pixel clusters and individual pixels representing objects in the original digital image. The majority of these specific parameters relate to distributions of pixel colours and positions, but they could be any descriptive parameter, such as e.g. circumference, circularity and other shape factors. In the notation of Table 1: mu_C denotes mean chromaticity; sigma^2 denotes variance; R, G and B denote red, green and blue pixel values, respectively; I denotes intensity (used in Mean 30% highest intensity); C denotes chromaticity, with the subscript indicating the colour (used in Green Chromaticity - Blue Chromaticity); d denotes pixel distance from the object centre, so that sigma_d^2 denotes distance variance; x_n and y_n denote pixel coordinates in the image and x_c and y_c denote the coordinates of the centre of mass; gamma denotes skewness (used in Distance Skewness); and f_circ denotes circularity.
Table 1

Profile area (Intrinsic)
- Example of small values: profile area 112 (see image, Figure 2A)
- Example of large values: profile area 1,931 (see image, Figure 2B)
- Illustration: Figure 2C
- Formula: A = N, where N is the number of pixels in the cluster

Mean chromaticity (Intrinsic)
- Example of small values: mean chromaticity 0.3615 (see image, Figure 2D)
- Example of large values: mean chromaticity 0.5301 (see image, Figure 2E)
- Formula: mu_c = (1/N) SUM_(n=1..N) R_n / (R_n + G_n + B_n), or the corresponding expression for another colour channel

Chromaticity variance (Intrinsic)
- Example of small values: chromaticity variance 7.07 x 10^-7
- Example of large values: chromaticity variance 0.0467
- Formula: sigma_c^2 = (1/N) SUM_(n=1..N) (C_n - mu_c)^2

[The rows for mean 30% highest intensity, green chromaticity - blue chromaticity and distance variance are not legible in the source reproduction.]

Distance skewness (Intrinsic)
- Example of small values: see image, Figure 2O
- Example of large values: see image, Figure 2P
- Illustration: Figure 2Q
- Formula: gamma_d = (1/(N sigma_d^3)) SUM_(n=1..N) (sqrt((x_n - x_c)^2 + (y_n - y_c)^2) - mu_d)^3

Circularity (Intrinsic)
- Example of small values: circularity 21 (see image, Figure 2R)
- Example of large values: circularity 104 (see image, Figure 2S)
- Formula: f_circ = Circumference^2 / A

Mean 30% highest red chromaticity (Intrinsic)
- Example of small values: 0.361 (see image, Figure 2T)
- Example of large values: 0.658 (see image, Figure 2U)
- Formula: mu_c = (1/N) SUM_(n=1..N) R_n / (R_n + G_n + B_n), computed over the 30% of pixels with the highest red chromaticity

Combined circularity (Intrinsic and Extrinsic)
- Example of small values: 18.1 (see image, Figure 2V)
- Example of large values: 93.6 (see image, Figure 2W)
- Formula: f_circ = Circumference^2 / A, computed for the combined cluster

Mean 30% highest blue chromaticity (Extrinsic)
- Example of small values: 0.291 (see image, Figure 2X)
- Example of large values: 0.395 (see image, Figure 2Y)
- Formula: mu_c = (1/N) SUM_(n=1..N) B_n / (R_n + G_n + B_n)

Combined profile area (Intrinsic and Extrinsic)
- Example of small values: 106 (see image, Figure 2Z)
- Example of large values: 3,785 (see image, Figure 2ZZ)
- Formula: A = N
In one embodiment, the optimized parameters are one or more of the following:
- chromaticity, such as red chromaticity, blue chromaticity or green chromaticity, and/or derivatives or combinations thereof,
- the colour of the pixel cluster, such as 1 for red, 2 for blue or 3 for green,
- the profile area,
- the mean chromaticity value,
- the variance of the chromaticity,
- the skewness of the chromaticity values,
- the kurtosis of the chromaticity,
- the variance of the distances to the centre of a cluster,
- the skewness of the distances to the centre of a cluster,
- the kurtosis of the distances to the centre of a cluster, and/or
- textural parameters, such as textural information based on spatial and intensity information.

The machine learning method can be trained on any set of digital cell images.
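As an illustration of how the descriptors of Table 1 might be derived from a detected cluster, the sketch below computes a handful of the intrinsic parameters in Python. It assumes a boolean cluster mask and a per-pixel red chromaticity image; the circularity definition (circumference squared over profile area) follows the Table 1 formula, and scikit-image's perimeter estimate stands in for the circumference measurement, which is not specified here.

```python
import numpy as np
from skimage.measure import perimeter

def intrinsic_parameters(mask, red_chromaticity):
    """A few Table 1 descriptors for one pixel cluster.

    mask:             boolean image of the cluster.
    red_chromaticity: per-pixel red chromaticity image.
    """
    ys, xs = np.nonzero(mask)
    n = xs.size
    area = n                                      # profile area, A = N
    c = red_chromaticity[ys, xs]
    mean_chrom = c.mean()                         # mean chromaticity
    var_chrom = c.var()                           # chromaticity variance
    top30 = np.sort(c)[-max(1, int(round(0.3 * n))):]
    mean_top30 = top30.mean()                     # mean 30% highest chromaticity

    # Distance-based shape descriptors, relative to the centre of mass.
    d = np.hypot(xs - xs.mean(), ys - ys.mean())
    var_d = d.var()                               # distance variance
    skew_d = np.mean((d - d.mean()) ** 3) / d.std() ** 3  # distance skewness

    circ = perimeter(mask, 4) ** 2 / area         # circularity, f_circ
    return area, mean_chrom, var_chrom, mean_top30, var_d, skew_d, circ
```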
An aspect of the invention relates to the use of methods such as transformation, normalization, filtering and segmentation for performing operations on digital information or pixels as described above.
In a preferred embodiment of the invention, the data of the digital image is transformed, e.g. normalized, as part of the grading or classification method. In a preferred embodiment, cross-validation is used to optimize the machine learning method. Thus, in an even more preferred embodiment, the machine learning method is a decision tree or a classification tree which is optimized by cross-validation.
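A minimal sketch of such an optimization in Python, using scikit-learn's decision tree and 10-fold cross-validation; the file names and hyperparameters are hypothetical placeholders, since the actual training data and tree-growing settings are not specified here.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# X: one row of parameter values per detected cluster (e.g. textural
# information, mean red chromaticity, distance-to-centre variance, ...);
# y: the expert grade of each cluster (Negative / borderline / Positive).
X = np.load("cluster_parameters.npy")            # hypothetical file names
y = np.load("cluster_grades.npy", allow_pickle=True)

tree = DecisionTreeClassifier(max_depth=10, random_state=0)
scores = cross_val_score(tree, X, y, cv=10)      # 10-fold cross-validation
print("mean cross-validated accuracy:", scores.mean())

tree.fit(X, y)                                   # final tree on all data
```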
In another preferred embodiment of the method of the present invention, step c) involves the use of a machine learning method with optimized parameters for grading or classifying cells, wherein said parameters are one or more parameters selected from the textural information, the mean difference between green chromaticity and blue chromaticity, the profile area, the mean red chromaticity, the mean 30% highest intensity, the blue chromaticity 30%, the red chromaticity 30%, the red chromaticity variance, the circularity, the extrinsic red chromaticity, the distance to center variance, and the combined circularity and distance to center skewness. In one embodiment of the invention, step c) involves the use of a machine learning method with optimized parameters for grading or classifying cells, wherein said parameters comprise one or more of the parameters selected from textural information, the mean red chromaticity and the distance to center variance. In such an embodiment, the parameters can comprise one of the parameters selected from the textural information, the mean red chromaticity or the distance to center variance, or the parameters can comprise a combination of the textural information and mean red chromaticity, a combination of the textural information and distance to center variance, a combination of the textural information, mean red chromaticity and distance to center variance, or a combination of the mean red chromaticity and distance to center variance.
Even more preferably, in the method of the present invention step c) involves the use of a machine learning method with optimized parameters for grading or classifying cells, wherein said parameters comprise at least the textural information, the mean difference between green chromaticity and blue chromaticity, the profile area, the mean red chromaticity, the mean 30% highest intensity, the blue chromaticity 30%, the red chromaticity 30%, the red chromaticity variance, the circularity, the extrinsic red chromaticity, the distance to center variance, and the combined circularity and distance to center skewness.
In yet a preferred embodiment, step c) involves the use of a decision tree with optimized parameters for grading or classifying cells, with a structure as defined in Table 2. Note that inf is an abbreviation for infinity towards higher/positive values and -inf is an abbreviation for infinity towards smaller/negative values.
Table 2

No.   Parameter                                     Minimum    Maximum    Left child  Right child  Grade
                                                    boundary   boundary   (inside)    (outside)
1     Textural Information                          10.000     800.000    2           143
2     Green chromaticity - blue chromaticity mean   -inf       0.060      3           142
3     Profile Area                                  100.000    800.000    4           141
4     Red Chromaticity Mean                         0.375      inf        5           66
5     Textural Information                          10.000     508.522    6           45
6     Mean 30% highest intensity                    384.545    535.348    7           28
7     Blue Chromaticity 30%                         0.309      0.332      8           19
8     Red Chromaticity 30%                          0.395      0.417      9           16
9     Red Chromaticity Variance                     303.827    607.190    10          13
10    Circularity                                   44.601     inf        11          12
11    (leaf)                                                                           Positive
12    (leaf)                                                                           borderline
13    Textural Information                          74.938     800.000    14          15
14    (leaf)                                                                           borderline
15    (leaf)                                                                           Negative
16    Red Chromaticity, extrinsic                   0.221      0.237      17          18
17    (leaf)                                                                           Negative
18    (leaf)                                                                           borderline
19    Profile Area                                  328.120    800.000    20          21
20    (leaf)                                                                           Positive
21    Distance to Center Variance                   14.600     inf        22          23
22    (leaf)                                                                           borderline
23    Red Chromaticity 30%                          0.424      0.453      24          27
24    Profile Area                                  175.480    800.000    25          26
25    (leaf)                                                                           Positive
26    (leaf)                                                                           borderline
27    (leaf)                                                                           Positive
28    Circularity, combined                         39.188     inf        29          30
29    (leaf)                                                                           borderline
30    Distance to Center Skewness                   -0.237     0.197      31          36
31    Distance to Center Variance                   7.986      inf        32          35
32    Red Chromaticity 30%                          0.351      inf        33          34
33    (leaf)                                                                           Negative
34    (leaf)                                                                           Positive
35    (leaf)                                                                           borderline
36    Red Chromaticity 30%                          0.388      0.410      37          40
37    Red Chromaticity Variance                     569.270    inf        38          39
38    (leaf)                                                                           borderline
39    (leaf)                                                                           Negative
40    Red Chromaticity Variance                     379.668    607.190    41          44
41    Distance to Center Skewness                   -0.381     -0.266     42          43
42    (leaf)                                                                           borderline
43    (leaf)                                                                           Negative
44    (leaf)                                                                           borderline
45    Textural Information                          797.579    800.000    46          49
46    Profile Area                                  -inf       800.000    47          48
47    (leaf)                                                                           borderline
48    (leaf)                                                                           borderline
49    Red Chromaticity Mean                         0.393      0.429      50          55
50    Distance to Center Variance                   3.577      25.623     51          52
51    (leaf)                                                                           borderline
52    Red Chromaticity Variance                     379.668    683.031    53          54
53    (leaf)                                                                           Negative
54    (leaf)                                                                           borderline
55    Red Chromaticity Variance                     379.668    569.270    56          63
56    Circularity, combined                         24.763     37.585     57          60
57    Mean 30% highest intensity                    -inf       676.097    58          59
58    (leaf)                                                                           Negative
59    (leaf)                                                                           borderline
60    Circularity, combined                         48.804     87.271     61          62
61    (leaf)                                                                           Positive
62    (leaf)                                                                           borderline
63    Red Chromaticity Mean                         0.378      inf        64          65
64    (leaf)                                                                           borderline
65    (leaf)                                                                           Negative
66    Mean 30% highest intensity                    374.492    565.509    67          106
67    Distance to Center Skewness                   -0.555     -0.121     68          93
68    Green chromaticity - blue chromaticity mean   0.012      0.035      69          82
69    Textural Information                          74.938     508.522    70          75
70    Circularity, combined                         27.969     31.174     71          74
71    Red Chromaticity, extrinsic                   0.182      inf        72          73
72    (leaf)                                                                           borderline
73    (leaf)                                                                           Negative
74    (leaf)                                                                           borderline
75    Green chromaticity - blue chromaticity mean   0.016      0.025      76          79
76    Distance to Center Variance                   5.782      inf        77          78
77    (leaf)                                                                           borderline
78    (leaf)                                                                           Positive
79    Textural Information                          653.050    800.000    80          81
80    (leaf)                                                                           borderline
81    (leaf)                                                                           Negative
82    Red Chromaticity Mean                         0.366      inf        83          90
83    Red Chromaticity Mean                         0.369      inf        84          87
84    Mean 30% highest intensity                    424.759    535.348    85          86
85    (leaf)                                                                           borderline
86    (leaf)                                                                           Positive
87    Blue Chromaticity 30%                         -inf       0.408      88          89
88    (leaf)                                                                           Negative
89    (leaf)                                                                           Positive
90    Profile Area                                  99.160     800.000    91          92
91    (leaf)                                                                           Negative
92    (leaf)                                                                           borderline
93    Green chromaticity - blue chromaticity mean   0.030      0.060      94          99
94    Red Chromaticity Variance                     531.349    inf        95          98
95    Distance to Center Variance                   3.577      inf        96          97
96    (leaf)                                                                           borderline
97    (leaf)                                                                           Negative
98    (leaf)                                                                           Negative
99    Blue Chromaticity 30%                         0.317      0.348      100         103
100   Profile Area                                  99.160     800.000    101         102
101   (leaf)                                                                           Negative
102   (leaf)                                                                           borderline
103   Textural Information                          74.938     800.000    104         105
104   (leaf)                                                                           borderline
105   (leaf)                                                                           Negative
106   Red Chromaticity 30%                          0.373      inf        107         128
107   Circularity, combined                         23.160     39.188     108         119
108   Red Chromaticity 30%                          0.380      0.388      109         114
109   Mean 30% highest intensity                    615.776    666.044    110         111
110   (leaf)                                                                           Negative
111   Green chromaticity - blue chromaticity mean   0.016      0.049      112         113
112   (leaf)                                                                           borderline
113   (leaf)                                                                           Negative
114   Circularity                                   28.229     44.601     115         116
115   (leaf)                                                                           Negative
116   Green chromaticity - blue chromaticity mean   -0.007     0.060      117         118
117   (leaf)                                                                           borderline
118   (leaf)                                                                           Negative
119   Blue Chromaticity 30%                         0.309      0.317      120         127
120   Profile Area                                  251.800    442.600    121         124
121   Textural Information                          219.466    800.000    122         123
122   (leaf)                                                                           borderline
123   (leaf)                                                                           Negative
124   Profile Area                                  99.160     800.000    125         126
125   (leaf)                                                                           Negative
126   (leaf)                                                                           borderline
127   (leaf)                                                                           borderline
128   Mean 30% highest intensity                    364.438    605.723    129         138
129   Green chromaticity - blue chromaticity mean   0.012      0.025      130         135
130   Red Chromaticity Variance                     379.668    inf        131         134
131   Textural Information                          363.994    800.000    132         133
132   (leaf)                                                                           Negative
133   (leaf)                                                                           borderline
134   (leaf)                                                                           Negative
135   Profile Area                                  99.160     800.000    136         137
136   (leaf)                                                                           Negative
137   (leaf)                                                                           borderline
138   Profile Area                                  99.160     800.000    139         140
139   (leaf)                                                                           Negative
140   (leaf)                                                                           borderline
141   (leaf)                                                                           Negative
142   (leaf)                                                                           Negative
143   (leaf)                                                                           Negative

Tagging
The cells are tagged by staining using markers. The markers may be any type of cellular marker, such as nuclear markers, cytoplasmic markers and membrane markers.
The tag or one of the tags may be an in situ staining marker, such as an in situ staining marker identifying a target selected from the group consisting of transcription factors ΑΡ-2γ, OCT3/4, NANOG, GATA-4, GATA-6 and FOG-2, alkaline phosphatase, epidermal growth factor receptor (EGFR), and Ki67.
One or more of the tags may be an in situ staining marker, such as an in situ staining marker identifying a target selected from the group consisting of ΑΡ-2γ, OCT3/4, NANOG, GATA-4, GATA-6, FOG-2, alkaline phosphatase, epidermal growth factor receptor (EGFR), SOX2, SOX15, SOX17, E2F1, IFI16, TEAD4, TLE1, TATDN2, NFIB, LMO2, MECP2, HHEX, XBP1, RRS1, MYCN, ETV4, ETV5, MYCL1, HIST1H1C, WDHD1, RCC2, TP53, MDC1, ALPL, DPPA4, TCL1A, CDH1, GLDC, CDK5, CD14, FGD1, NEURL, HLA-DOA, DYSF, MTHFD1, ENAH, ZDHHC9, NME1, SDCBP, SLC25A16, ATP6AP2, PODXL, PDK4, PCDH8, RAB15, EVI2B, LRP4, B4GALT4, CHST2, FCGR3A, CD53, CD38, PIGL, CKMT1B, RAB3B, NRCAM, KIT, ALK2, PDPN, HRASLS3, and TRA-1-60.
In a more preferred embodiment the in situ staining marker identifies transcription factor ΑΡ-2γ. In another more preferred embodiment the in situ staining marker identifies intrinsic enzyme activity of alkaline phosphatase.
In an even more preferred embodiment, two or more in situ staining markers are used, wherein said two or more in situ staining markers are capable of identifying two or more targets selected from the group consisting of ΑΡ-2γ, OCT3/4, NANOG, GATA-4, GATA-6, FOG-2, alkaline phosphatase, epidermal growth factor receptor (EGFR), SOX2, SOX15, SOX17, E2F1, IFI16, TEAD4, TLE1, TATDN2, NFIB, LMO2, MECP2, HHEX, XBP1, RRS1, MYCN, ETV4, ETV5, MYCL1, HIST1H1C, WDHD1, RCC2, TP53, MDC1, ALPL, DPPA4, TCL1A, CDH1, GLDC, CDK5, CD14, FGD1, NEURL, HLA-DOA, DYSF, MTHFD1, ENAH, ZDHHC9, NME1, SDCBP, SLC25A16, ATP6AP2, PODXL, PDK4, PCDH8, RAB15, EVI2B, LRP4, B4GALT4, CHST2, FCGR3A, CD53, CD38, PIGL, CKMT1B, RAB3B, NRCAM, KIT, ALK2, PDPN, HRASLS3, and TRA-1-60; preferably, double staining is achieved by using two or more in situ staining markers for identifying transcription factor ΑΡ-2γ and intrinsic enzyme activity of alkaline phosphatase, respectively.
In one embodiment it is preferred to perform a double staining, i.e. staining the cell to be identified by two different tags in order to increase the likelihood of identifying the correct cells. The double staining may be achieved by using two in situ staining markers selected from the group above, such as selected for identifying transcription factor ΑΡ-2γ and intrinsic enzyme activity of alkaline phosphatase, respectively.
The in situ staining marker may be a chromogenic enzyme substrate, such as a chromogenic enzyme substrate selected from the group consisting of DAB, BCIP, Fast Red and AEC.
In another embodiment the in situ staining marker is a fluorophore selected from the group consisting of FITC, TRITC and Texas Red. The staining of the cells for classification or grading can be any type of staining method known in the prior art, such as for example immunostaining or enzymatic staining. In one preferred embodiment of the invention, the staining is immunostaining of ΑΡ-2γ and enzymatic staining using the activity of alkaline phosphatase.

Medium
The sample or part of the sample is preferably dispersed on, or contained in, an appropriate medium for performing the tagging procedure of the one or more distinctive cells types. The medium may be a reactor useful for incubation or flow of cells.
In another embodiment the medium is a dispersing surface useful for microscopy or scanning, such as a glass slide or a compact disc. Glass slides may include fiducial lines or fiducial points to ease autofocusing in microscopy or scanning of samples with limited intrinsic and tagging contrast.

Image acquisition
The digital representation of the tagged sample discussed above is acquired by adequate optics, such as a microscope, and an image acquisition device. The image acquisition device is preferably a digital camera.
In a preferred embodiment the digital representation of the tagged sample is in the form of a virtual slide, typically acquired by whole slide scanning.
Automated system

In one embodiment the present invention further encompasses an automated or semi-automated system suitable for carrying out one or more of the methods disclosed herein, said automated or semi-automated system comprising, in combination:
a database capable of including a plurality of digital images of the samples;
a software module for analyzing a plurality of pixels from a digital image of the samples; and

a control module comprising instructions for carrying out said method(s).
Said automated or semi-automated system can also further comprise one or more of: a slide loader, a barcode reader, a microscope (preferably motorized), and a stage (preferably motorized).
In a preferred embodiment the system includes a slide scanner whereby each slide is scanned for production of digital images of the slides. In a more preferred embodiment the slide scanner provides digital slides at high resolution, i.e. virtual slides, from each slide.
The system preferably includes an image processor and digitizer, and a general processor with peripherals for printing, storage, etc.
The system can additionally provide an opportunity for a user to provide guidance during the process. For example, the user can specify a particular area of interest by selecting it on the screen. The system can also provide a way to eliminate a specified area or areas.
Computer readable medium
In another aspect, the present invention further encompasses a computer readable medium comprising instructions for carrying out one or more of the methods disclosed herein. Suitable computer-readable media can for example be a hard disk to provide storage of data, data structures, computer-executable instructions, and the like. Other types of media which are readable by a computer, such as removable magnetic disks, CDs, magnetic cassettes, flash memory cards, digital video disks, and the like, may also be used.
Examples
The following examples (1-3) illustrate how a digital image of a cytological sample can be processed by algorithmic steps of the invention, for the purpose of automated detection of objects in digital images representing or resembling a target cell of interest. This initial part of the process relies mainly on colour and profile area information sufficient for image segmentation and detection of pixel clusters (Example 1), which must then be further analyzed for various more advanced descriptive parameters (see Example 2), and subsequently automatically graded for their individual degree of resemblance to an archetypical target cell, or fractions thereof, by use of a
classification tree (see Example 3).
Obviously, the details of the automated detection procedure will depend closely on the expected outcome of the actual staining method(s), including but not limited to the range of morphologies of the target cell, e.g. size range, shapes, and degradation profile; the cellular localization of the tagged marker(s), e.g. nucleus, cytoplasm or membrane; the detection principle, e.g. chromogenic or fluorogenic, and the colour(s) of the tag(s).
Example 1
Detection of CIS cell-like clusters of pixels

In the actual example, the target cells are CIS cells in cytospins of semen samples. Under such conditions, the CIS cell will typically include a cell nucleus and some remains of cytoplasm, but may be somewhat degraded due to the lytic effects of the seminal fluid. Two markers of the CIS cell are tagged: the transcription factor ΑΡ-2γ, typically located in its nucleus, and the enzyme alkaline phosphatase, typically located in its cytoplasm. Their tagging is secured by a double staining method described by Nielsen et al (3), which includes immunostaining for ΑΡ-2γ and active enzyme-staining for alkaline phosphatase. The preferred colour of the ΑΡ-2γ tag is reddish due to the choice of the colourless substrate 3-Amino-9-Ethylcarbazole (AEC), which in the presence of peroxide is converted into a red precipitate by the indirect peroxidase-label. The preferred colour of the alkaline phosphatase tag is bluish due to the choice of the colourless substrate 5-Bromo-4-chloro-3-indolyl phosphate (BCIP), which in the presence of the oxidant nitro blue tetrazolium chloride (NBT) is converted by the intrinsic enzyme activity into a blue precipitate.
Figure 1a shows a B/W photocopy of an original digital colour representation of a microscopically magnified area of a semen sample on a glass slide after the abovementioned double staining for CIS cells. In the original image, there are reddish and bluish colours resembling the immunostaining for ΑΡ-2γ and the active enzyme-staining for alkaline phosphatase, respectively.
In the first step of the process, the Red-Green-Blue (RGB) colours of the original digital image were transformed into the corresponding chromaticities according to the following definitions:
I = R + G + B (Formula 3), where I is the intensity, and R, G and B are the RGB colour values for red, green and blue, respectively.
Typically, the RGB values were represented on an 8-bit scale with integer values ranging from 0 to 255, and thereby I could vary from 0 to 765. Other RGB scales could have been used.

C_R = R / I, C_G = G / I and C_B = B / I (Formula 4), where the C values represent the chromaticities.

For visualization purposes, the red and blue chromaticities of each pixel in the original digital colour image are presented in Figures 1b and 1c, respectively, according to a greyscale representation, where the chromaticity values were normalized.
For the identification of pixels with an admissible red chromaticity, possibly
representing the immunostaining for ΑΡ-2γ, a binomial representation of Figure 1b was made in Figure 1d by segmentation. Pixels with a red chromaticity higher than the experimentally derived red chromaticity threshold T_R = 0.361 were converted to black pixels, and all other pixels were converted to white pixels. The red chromaticity threshold was determined by studies of the digital images of archetypical CIS cells in cytospins stained by immunocytochemistry for ΑΡ-2γ. Subsequently, black pixels were converted to white if the intensity of the corresponding pixel in the original digital colour image represented in Figure 1a was below the selected intensity threshold T_I = 150 according to the RGB values on an 8-bit scale. This conversion was done to avoid false identification of darkish pixels in the original image, for which small variations in R-, G- and B-values could result in large variations in chromaticity.
Similarly, the identification of pixels with an admissible blue chromaticity, possibly representing the staining for intrinsic alkaline phosphatase activity, was done by making a binomial representation of Figure 1c in Figure 1e by segmentation. Pixels with a blue chromaticity higher than the experimentally derived blue chromaticity threshold T_B = 0.353 were converted to black pixels, and all other pixels were converted to white pixels. As before, the black pixels were subsequently converted to white if the intensity of the corresponding pixel in the original digital colour image was below T_I = 150.
Also, the identification of admissible purplish pixels was done by making a binomial representation. However, this representation (Figure 1f) was based on the segmentation of data from both Figures 1b and 1c, since only pixels with a red chromaticity value above the experimentally derived red chromaticity threshold T_R2 = 0.341 and a blue chromaticity value above the experimentally derived blue chromaticity threshold T_B2 = 0.341 were converted to black pixels. All other pixels were converted to white pixels. As before, the black pixels were subsequently converted to white if the intensity of the corresponding pixel in the original digital colour image was below T_I = 150.
The automated detection of relevant objects resembling CIS cells requires that the analysis can be done on clusters of pixels, and not only on individual pixels. For example, the analysis of clusters of pixels is useful for the detection of stained nuclei of a target cell represented in a digital image. Pixels belonging to the same cluster can be defined by various measures of connectivity. In the current example, the most restrictive connectivity rule was applied, the Four Neighbour Connectivity. As shown in Figure 1g, any pixel in an image (except for those on the edge) is touched by 8 neighbour pixels, of which 4 pixels (in white) are diagonal and more distant than the 4 pixels (in grey), which are orthogonal and closest to the pixel being analysed for cluster-association (marked with "X" in Figure 1g). In the current example, a cluster of pixels is defined as all pixels linked directly or indirectly via orthogonal neighbour pixels, i.e. according to the Four Neighbour Connectivity rule. A pixel belonging to a given segmentation class can either be individual or be included in a cluster, but never in more than one cluster.
Since a cluster can be very small (down to two pixels), or very large (comprising up to almost all pixels of the digital image), clusters are typically discarded if they are too small or too large to fit expectations of an admissible profile area.
For the actual example, four (4) clusters of reddish pixels with appropriate profile areas were detected (Figure 1h). They were defined by applying the Four Neighbour Connectivity rule to the black pixels in Figure 1d, and by restricting the allowed profile area of admissible clusters to being within a range defined by a minimum and a maximum reddish profile area threshold of A_R,min = 100 pixels and A_R,max = 2,000 pixels, respectively. This range was defined by studying digital images, acquired by slide scanning at a 20x objective with a Hamamatsu NanoZoomer, of the reddish profile area of nuclei of archetypical CIS cells in semen sample cytospins after immunostaining for ΑΡ-2γ. Other thresholds would be relevant for images acquired at other resolutions and/or by other scanners or microscope cameras. For the purpose of the following analysis, the bluish and purplish pixels were combined before applying the Four Neighbour Connectivity rule. The black pixels of Figure 1i are those pixels which are black in Figure 1e and/or in Figure 1f, whereas the white pixels are those pixels which were white in both these figures.
For the actual example, one (1) cluster of bluish/purplish pixels with an appropriate profile area was detected (Figure 1j). It was defined by applying the Four Neighbour Connectivity rule to the black pixels in Figure 1i, and by restricting the allowed profile area of admissible clusters to being within a range defined by a minimum and a maximum bluish/purplish profile area threshold of A_B/P,min = 100 pixels and A_B/P,max = 5,000 pixels, respectively. This range was defined by studying digital images of the bluish/purplish profile area of cytoplasmic regions of archetypical CIS cells in semen sample cytospins after intrinsic enzyme activity staining for alkaline phosphatase.
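Cluster formation under the Four Neighbour Connectivity rule, followed by the profile-area filtering just described, can be sketched as follows in Python; the default thresholds are the reddish-area values quoted above, and the helper name is illustrative.

```python
import numpy as np
from scipy import ndimage

def admissible_clusters(binary_map, a_min=100, a_max=2000):
    """Label clusters by Four Neighbour Connectivity and keep those
    whose profile area lies within [a_min, a_max]."""
    four_connectivity = np.array([[0, 1, 0],
                                  [1, 1, 1],
                                  [0, 1, 0]])    # orthogonal neighbours only
    labels, n_clusters = ndimage.label(binary_map, structure=four_connectivity)
    areas = ndimage.sum(binary_map, labels, index=np.arange(1, n_clusters + 1))
    admissible = [label for label, area in enumerate(areas, start=1)
                  if a_min <= area <= a_max]
    return labels, admissible
```

Applied to the black pixels of Figure 1d with the defaults above, this would return the four reddish clusters; with a_max = 5000 on the combined blue/purple map, it would return the single bluish/purplish cluster.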
Thus, in the present example, a total of five (5) admissible clusters were detected: four reddish clusters and one bluish/purplish cluster. They may alone or together be part of an object of interest, and this is further evaluated in Examples 2 and 3.

Example 2
Calculation of relevant parameters for CIS cell like pixel clusters
For each pixel cluster detected by image segmentation according to algorithmic steps of the invention as having an admissible colour and profile area (see Example 1), a number of relevant parameters can be calculated, in order to automatically grade the resemblance of the cluster to the target cell of interest.
Obviously, the selection of relevant parameters will depend closely on the expected most characteristic features of the stained target cell and of other likely resembling but irrelevant objects in the digital images of the cytological samples, since the selected parameters should be suitable for distinguishing pixel clusters representing the target cell from clusters representing irrelevant objects.
Some parameters are determined only by contributions from pixels belonging to the same cluster (intrinsic parameters), whereas other parameters include contributions from pixels not belonging to the cluster, but typically located near the cluster of interest (extrinsic parameters).
In the present example, as was also the case in Example 1, the target cells are CIS cells in cytospins of semen samples double stained for the ΑΡ-2γ antigen (reddish nucleus) and for intrinsic alkaline phosphatase activity (bluish cytoplasm).
The actual example is based on the five clusters identified in Figures 1h and 1j, and in particular on cluster #3 (see Figure 1h). This cluster is of special interest, since according to manual inspection by experts in the field it represents a CIS cell nucleus stained reddish by immunohistochemistry for ΑΡ-2γ.
Descriptions of some of the invention's most relevant parameters, and examples of low and high values of each of these parameters are summarized in Table 1 and Figure 2 as mentioned above. Notice that the original colour images of Figure 2, which were all acquired from cytospins of double stained semen samples, are reproduced here as B/W copies.
Examples of relevant intrinsic parameters are given below, with their values for cluster #3 (Figure 1h):

• The profile area of the pixel cluster: 694 (here measured as number of pixels)
• The mean chromaticity value: 0.423 (in this case red chromaticity)
• The variance of the chromaticity values: 2.16 (in this case red chromaticity)
• The mean of the 30% highest chromaticity values: 0.476 (in this case red chromaticity)
• The mean of the difference between the green chromaticity values and the blue chromaticity values: 0.00670
• The variance of the distances to the centre: 28.3 (here measured in (pixel length)²)
• The skewness of the distances to the centre: -0.38677
• The circularity of the pixel cluster: 36.6
• The mean of the 30% highest intensity values: 437
• The textural information of the pixel cluster: 173
Though not used in the actual algorithms, examples of other relevant intrinsic parameters are the mean distance to the centre, the perimeter of the pixel cluster, the skewness and kurtosis of the chromaticity, the kurtosis of the distance to the centre and the ratio between height and width.
Examples of relevant extrinsic or combined intrinsic and extrinsic parameters are given below, with their values for the cluster of interest (in this case cluster #3 in the red representation in Figure 1h), also using information provided by its nearby clusters (in this case cluster #5 in the blue/purple representation in Figure 1j):

• The profile area of nearby* cluster(s): 979 (here measured as number of pixels)
• The mean of the 30% highest chromaticity values of nearby* cluster(s): 0.401 (in this case blue chromaticity)
• The circularity of the combined actual cluster and nearby* cluster(s): 58.6 (in this case the combination of cluster #3 and cluster #5)
Though not used in the actual algorithms, examples of other relevant extrinsic or combined intrinsic and extrinsic parameters are the mean, variance, skewness and kurtosis of the chromaticity of nearby* clusters, the mean, variance, skewness and kurtosis of the distance to the center of the cluster of interest of pixels of nearby* clusters and the perimeter of the nearby* clusters.
* Here, a nearby cluster is defined as a cluster with one or more pixels within a distance of 5 pixels from a pixel in the cluster of interest. The values of the extrinsic parameters are based on the values of all pixels contained in one or more nearby clusters. Typically, a nearby cluster will be derived from the output of another segmentation than the one which led to the definition of the cluster of interest, but it may also be a second cluster derived from the output of the same segmentation. For the actual case, the nearby cluster (#5) was derived from the blue/purple segmentation in Figure 1j, whereas the cluster of interest (#3) was derived from the red segmentation in Figure 1h.
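The nearby-cluster test can be expressed compactly with a distance transform: a cluster is nearby if any of its pixels lies within 5 pixels of the cluster of interest. A minimal Python sketch (function name illustrative):

```python
import numpy as np
from scipy import ndimage

def is_nearby(cluster_of_interest, other_cluster, max_dist=5):
    """True if any pixel of `other_cluster` lies within `max_dist` pixels
    of `cluster_of_interest` (both boolean masks of equal shape)."""
    # Euclidean distance from every pixel to the cluster of interest.
    dist = ndimage.distance_transform_edt(~cluster_of_interest)
    return bool(np.any(other_cluster & (dist <= max_dist)))
```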
Example 3
The construction and use of a classification tree
When an admissible cluster of pixels has been detected (see Example 1), and its values for the selected relevant intrinsic and extrinsic parameters have been determined (see Example 2), it can be automatically graded for its resemblance to the target cell of interest by use of an appropriate classification tree. The design and construction of the classification tree will depend closely on the parameters selected to distinguish the features of the stained target cell from those of other resembling but irrelevant objects in the digital images of the cytological samples.

In the present example, as was also the case in Examples 1 and 2, the target cells are CIS cells in cytospins of semen samples double stained for the ΑΡ-2γ antigen (reddish nucleus) and for intrinsic alkaline phosphatase activity (bluish cytoplasm).
The actual example is based on digital representations of microscopically magnified areas of cytospins of double stained semen samples collected by expert pathologists. The digital representations were acquired due to the presence of an object assumed to be relevant to the preparation of a grading algorithm for clusters of pixels representing objects resembling CIS cells. Based on a total of 2,437 collected and manually graded objects in these digital representations, derived from 506 whole slide semen cytospins of men in a couple infertility work-up, a classification tree was constructed (Table 3). The objects were individually graded on a well-established manual classification scale of [1, 2, 2½, 3, 3½, 4], covering from artefacts (1) to archetypical CIS cells (4); see Table 2 and Figure 3. Table 2 shows the manual grading distributions for the 2,437 relevant objects selected by expert assessors in 506 whole slide semen cytospin scanning images from men in routine subfertility work-up. Figure 3 shows corresponding B/W photocopies of colour images of exemplifying objects. All 2,437 objects were used for developing and optimizing the classification tree, which ensures the automated grading of CIS cell like objects.
Table 2

Class       Grade  Description                                                        No.    See examples in
Negative    1      Artifact and/or staining precipitate. Not a CIS cell.              2,039  Figs. 3A-C
            2      Cell-like, but too small or wrong morphology to be a CIS cell.     201
Borderline  2½     More CIS cell like than a grade 2. Less CIS cell like than a       117    Figs. 3D-F
                   grade 3.
            3      Acceptable morphology and partial staining. Could be a CIS cell.   54
Positive    3½     More CIS cell like than a grade 3. Less CIS cell like than a       19     Figs. 3G-I
                   grade 4.
            4      Stained nucleus and cytoplasm. Correct morphology. Definite        7
                   CIS cell.
More specifically, the official grades represent the following object types:

1. Artifact and/or staining precipitate. The object is definitely not a CIS cell.

2. The object has a high probability of not being a CIS cell, typically due to:
a. The red-stained object is too small to resemble a typical CIS cell nucleus, or
b. The red-stained object has a dubious or wrong morphology compared to a typical CIS cell nucleus.

2½. More CIS cell like than a grade 2, and less CIS cell like than a grade 3.

3. Borderline positive. The object could be a CIS cell, but uncertainty is typically due to:
a. A partially stained nucleus with an acceptable morphology, or
b. A low staining intensity, but correct morphology.

3½. More CIS cell like than a grade 3, but still not a clear grade 4.

4. A definite CIS cell with stained nucleus and correct morphology, and possibly also stained cytoplasm.
Clusters of pixels representing these 2,437 objects were detected and analysed as described in Examples 1 and 2, and the expert-provided grades of these objects served as the basis for construction of the classification tree.
The classification tree, however, works with a less complex grade structure than the one detailed above, namely grading the objects Positive, Borderline and Negative (P/B/N). The Positive category includes the grades 3½ and 4, the Borderline category includes the grades 2, 2½ and 3, and the Negative category includes the objects graded 1.
A matrix representation has been chosen to visualize the classification tree, see Table 3.
Table 3 is a tabular overview of a typical classification tree. Each node in the tree has been assigned an index number. Furthermore, each node is either a split or an end node where a grade is assigned. Splits divide objects based on the given parameter and boundaries and pass them on to other nodes based on the indices given in the left child / right child columns.
Table 3

No.   Parameter                                     Minimum    Maximum    Left child  Right child  Grade
                                                    boundary   boundary   (inside)    (outside)
1     Textural Information                          10.000     800.000    2           143
2     Green chromaticity - blue chromaticity mean   -inf       0.060      3           142
3     Profile Area                                  100.000    800.000    4           141
4     Red Chromaticity Mean                         0.375      inf        5           66
5     Textural Information                          10.000     508.522    6           45
6     Mean 30% highest intensity                    384.545    535.348    7           28
7     Blue Chromaticity 30%                         0.309      0.332      8           19
8     Red Chromaticity 30%                          0.395      0.417      9           16
9     Red Chromaticity Variance                     303.827    607.190    10          13
10    Circularity                                   44.601     inf        11          12
11    (leaf)                                                                           Positive
12    (leaf)                                                                           borderline
13    Textural Information                          74.938     800.000    14          15
14    (leaf)                                                                           borderline
15    (leaf)                                                                           Negative
16    Red Chromaticity, extrinsic                   0.221      0.237      17          18
17    (leaf)                                                                           Negative
18    (leaf)                                                                           borderline
19    Profile Area                                  328.120    800.000    20          21
20    (leaf)                                                                           Positive
21    Distance to Center Variance                   14.600     inf        22          23
22    (leaf)                                                                           borderline
23    Red Chromaticity 30%                          0.424      0.453      24          27
24    Profile Area                                  175.480    800.000    25          26
25    (leaf)                                                                           Positive
26    (leaf)                                                                           borderline
27    (leaf)                                                                           Positive
28    Circularity, combined                         39.188     inf        29          30
29    (leaf)                                                                           borderline
30    Distance to Center Skewness                   -0.237     0.197      31          36
31    Distance to Center Variance                   7.986      inf        32          35
32    Red Chromaticity 30%                          0.351      inf        33          34
33    (leaf)                                                                           Negative
34    (leaf)                                                                           Positive
35    (leaf)                                                                           borderline
36    Red Chromaticity 30%                          0.388      0.410      37          40
37    Red Chromaticity Variance                     569.270    inf        38          39
38    (leaf)                                                                           borderline
39    (leaf)                                                                           Negative
40    Red Chromaticity Variance                     379.668    607.190    41          44
41    Distance to Center Skewness                   -0.381     -0.266     42          43
42    (leaf)                                                                           borderline
43    (leaf)                                                                           Negative
44    (leaf)                                                                           borderline
45    Textural Information                          797.579    800.000    46          49
46    Profile Area                                  -inf       800.000    47          48
47    (leaf)                                                                           borderline
48    (leaf)                                                                           borderline
49    Red Chromaticity Mean                         0.393      0.429      50          55
50    Distance to Center Variance                   3.577      25.623     51          52
51    (leaf)                                                                           borderline
52    Red Chromaticity Variance                     379.668    683.031    53          54
53    (leaf)                                                                           Negative
54    (leaf)                                                                           borderline
55    Red Chromaticity Variance                     379.668    569.270    56          63
56    Circularity, combined                         24.763     37.585     57          60
57    Mean 30% highest intensity                    -inf       676.097    58          59
58    (leaf)                                                                           Negative
59    (leaf)                                                                           borderline
60    Circularity, combined                         48.804     87.271     61          62
61    (leaf)                                                                           Positive
62    (leaf)                                                                           borderline
63    Red Chromaticity Mean                         0.378      inf        64          65
64    (leaf)                                                                           borderline
65    (leaf)                                                                           Negative
66    Mean 30% highest intensity                    374.492    565.509    67          106
67    Distance to Center Skewness                   -0.555     -0.121     68          93
68    Green chromaticity - blue chromaticity mean   0.012      0.035      69          82
69    Textural Information                          74.938     508.522    70          75
70    Circularity, combined                         27.969     31.174     71          74
71    Red Chromaticity, extrinsic                   0.182      inf        72          73
72    (leaf)                                                                           borderline
73    (leaf)                                                                           Negative
74    (leaf)                                                                           borderline
75    Green chromaticity - blue chromaticity mean   0.016      0.025      76          79
76    Distance to Center Variance                   5.782      inf        77          78
77    (leaf)                                                                           borderline
78    (leaf)                                                                           Positive
79    Textural Information                          653.050    800.000    80          81
80    (leaf)                                                                           borderline
81    (leaf)                                                                           Negative
82    Red Chromaticity Mean                         0.366      inf        83          90
83    Red Chromaticity Mean                         0.369      inf        84          87
84    Mean 30% highest intensity                    424.759    535.348    85          86
85    (leaf)                                                                           borderline
86    (leaf)                                                                           Positive
87    Blue Chromaticity 30%                         -inf       0.408      88          89
88    (leaf)                                                                           Negative
89    (leaf)                                                                           Positive
90    Profile Area                                  99.160     800.000    91          92
91    (leaf)                                                                           Negative
92    (leaf)                                                                           borderline
93    Green chromaticity - blue chromaticity mean   0.030      0.060      94          99
94    Red Chromaticity Variance                     531.349    inf        95          98
95    Distance to Center Variance                   3.577      inf        96          97
96    (leaf)                                                                           borderline
97    (leaf)                                                                           Negative
98    (leaf)                                                                           Negative
99    Blue Chromaticity 30%                         0.317      0.348      100         103
100   Profile Area                                  99.160     800.000    101         102
101   (leaf)                                                                           Negative
102   (leaf)                                                                           borderline
103   Textural Information                          74.938     800.000    104         105
104   (leaf)                                                                           borderline
105   (leaf)                                                                           Negative
106   Red Chromaticity 30%                          0.373      inf        107         128
107   Circularity, combined                         23.160     39.188     108         119
108   Red Chromaticity 30%                          0.380      0.388      109         114
109   Mean 30% highest intensity                    615.776    666.044    110         111
110   (leaf)                                                                           Negative
111   Green chromaticity - blue chromaticity mean   0.016      0.049      112         113
112   (leaf)                                                                           borderline
113   (leaf)                                                                           Negative
114   Circularity                                   28.229     44.601     115         116
115   (leaf)                                                                           Negative
116   Green chromaticity - blue chromaticity mean   -0.007     0.060      117         118
117   (leaf)                                                                           borderline
118   (leaf)                                                                           Negative
119   Blue Chromaticity 30%                         0.309      0.317      120         127
120   Profile Area                                  251.800    442.600    121         124
121   Textural Information                          219.466    800.000    122         123
122   (leaf)                                                                           borderline
123   (leaf)                                                                           Negative
124   Profile Area                                  99.160     800.000    125         126
125   (leaf)                                                                           Negative
126   (leaf)                                                                           borderline
127   (leaf)                                                                           borderline
128   Mean 30% highest intensity                    364.438    605.723    129         138
129   Green chromaticity - blue chromaticity mean   0.012      0.025      130         135
130   Red Chromaticity Variance                     379.668    inf        131         134
131   Textural Information                          363.994    800.000    132         133
132   (leaf)                                                                           Negative
133   (leaf)                                                                           borderline
134   (leaf)                                                                           Negative
135   Profile Area                                  99.160     800.000    136         137
136   (leaf)                                                                           Negative
137   (leaf)                                                                           borderline
138   Profile Area                                  99.160     800.000    139         140
139   (leaf)                                                                           Negative
140   (leaf)                                                                           borderline
141   (leaf)                                                                           Negative
142   (leaf)                                                                           Negative
143   (leaf)                                                                           Negative
The root of the tree is node 1, where the classification is initiated for all pixel clusters. If a cluster has a value for the given parameter of a node which is inside the corresponding boundary, the pixel cluster is sent to the left child. Otherwise it is sent to the right child. If the pixel cluster reaches a node which only contains a grade (a leaf node), it is assigned that grade, and the classification of the corresponding object is completed.

A classification tree was first constructed by the basic method. To get a classification tree which was more independent of data variation, cross-validation was used. This method splits the data multiple (typically 10) times into two sets: a training set and a test set. Thus multiple trees can be built, and all these trees can be validated and their success at classifying correctly can be compared using the respective test sets. Table 3 shows an optimized, cross-validated classification tree based on the same data as the basic classification tree.
Using the classification tree for analysis of the digital representation of the microscopically magnified area of a stained semen sample described in Example 1, the detected clusters can be classified. For cluster #3 (see Figure 1h), the classification takes the following path:

1. Starting at the root node [#1], the first question is: is the textural information of the pixel cluster between 10 and 800? As it is so, the left child node is visited.

2. At this node [#2] the question is: is the mean of the difference between the green chromaticity values and the blue chromaticity values between -infinity and 0.06? As it is so, the left child node is visited.

3. At this node [#3] the question is: is the profile area between 100 and 800? As it is inside these boundaries, the left child node is visited.

4. At this node [#4] the question is: is the mean red chromaticity value between 0.375 and infinity? As it is inside these boundaries, the left child node is visited.

5. At this node [#5] the question is: is the textural information of the pixel cluster between 10 and 508.522? As it is inside these boundaries, the left child node is visited.

6. At this node [#6] the question is: is the mean of the 30% highest intensity values between 385 and 535? As it is inside this boundary, the left child node is visited.

7. At this node [#7] the question is: is the mean of the 30% highest blue chromaticity values between 0.309 and 0.332? As it is outside this boundary, the right child node is visited.

8. At this node [#19] the question is: is the profile area between 328 and 800? As it is inside this boundary, the left child node is visited.

9. At this node [#20] there is no question, and the object is assigned the grade Positive, meaning 3½ or 4 on the original scale.

As shown in Table 4, the 2,437 manually selected and graded objects were automatically analysed by the optimized, cross-validated classification tree. Notice that none of the 26 objects graded as positive by experts were negative according to image analysis.
[Table 4, reproduced as an image in the original: cross-tabulation of the manual grades of the 2,437 objects against the grades determined by the automated analysis.]
Since the proposed CIS screening procedure may include a final manual review by an expert of all objects automatically graded as borderline or positive by the classification tree, and since the presence of objects classified as borderline by manual inspection per se has no clinical consequences, Table 4 can be modified according to the practical clinical consequences of the manual inspection of objects and of the automated image analysis of objects, and thereby used for estimation of specificity and sensitivity, as well as the positive and negative predictive values, at the individual object level, as shown in Table 5.
Table 5. Grades determined by automated analysis using the classification tree.
[Table 5, reproduced as an image in the original: the modified cross-tabulation used for estimating the object-level performance figures below.]
Agreement: 93%
Cohen's kappa: 0.23
Specificity: 93%
Sensitivity: 100%
Positive predictive value: 14%
Negative predictive value: 100%
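Since Tables 4 and 5 are reproduced here only as images, the underlying counts are not available; the following is a generic sketch of how such object-level figures would be derived from a 2 x 2 clinical-consequence table, with positives taken as the objects prone for manual review.

```python
def screening_metrics(tp, fp, tn, fn):
    """Object-level performance from a 2 x 2 table of automated grading
    (borderline/positive vs. negative) against manual grading."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "positive predictive value": tp / (tp + fp),
        "negative predictive value": tn / (tn + fn),
        "agreement": (tp + tn) / (tp + fp + tn + fn),
    }
```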
The major purpose of the automated screening is to reduce the workload of manual inspection, thereby allowing the screening of semen samples in a routine setting without causing an overwhelming manual workload. This is achieved both by the automated object identification in general, and by reducing the number of relevant objects for review more than 10-fold by the automated grading. Still, the method can only be used if no or very few CIS cell like objects are missed by the image analysis. This seems to be achieved, since all 26 positive objects (grade 3½ or 4 according to manual inspection) were identified by the automated grading as borderline or positive, i.e. as prone for manual review.

Example 4
Clinical results of screening for CIS cells in semen samples
Participants: All 765 men (median age 30.7 years) attending routine semen analysis in the course of couple infertility work-up at the University Department of Growth and
Reproduction, Rigshospitalet (Copenhagen, Denmark) over a continuous period of 18 months.
Samples: 1,175 ejaculates, ranging from 1-5 ejaculates per man. The ejaculation abstinence period was 3.5 days, the median sperm concentration 2.5 × 10⁶ spermatozoa/ml and the median pH 8. The median ejaculate volume was 3.9 ml, but the volume available for cytospin on glass slides and subsequent screening for CIS cells varied between 100 and 600 μl, and was 400 μl for more than 90% of the samples. 13% of the samples were classified as azoospermic, including 5% obstructive azoospermic.
Staining: The cytospin samples were double-stained for intrinsic alkaline phosphatase activity and for the antigenic presence of ΑΡ-2γ, before image acquisition.
Image acquisition: Slide scanning was done using a NanoZoomer HT version 2.0 (Hamamatsu Photonics, Japan). The areas of stained cytospins were scanned in 3 Z-stacks (+1 and +2 microns from the autofocus layer) and with settings corresponding to a 40X objective in traditional microscopy. For each slide, the corresponding digital image included 2 regions of interest (ROIs), each of which was approx. 57 mm² in size, and the file (approx. 1.5 GB) was saved on a server until manual digital reading on a computer monitor and automated image analysis.
In silico evaluation: In addition to the manual inspection, all stained specimens were also evaluated by the automated in silico procedures. The algorithms were based on automated identification and quantification, respectively, of the typical spectral and spatial parameters for a double stained CIS cell: a round and reddish nucleus (Ø ≈ 10 μm) surrounded by a bluish halo of cytoplasm. The initial detection of CIS cell-like objects was based on calculation of the red chromaticity for each pixel, and segmentation according to this parameter. Clusters were subsequently defined by the 4-pixel neighbourhood connectivity rule. Only red pixel clusters with a relevant profile area were selected for further analysis and grading according to their CIS cell resemblance. For pixel clusters that met the inclusion criteria, a number of relevant parameters were subsequently calculated, in order to automatically grade their resemblance to archetypical CIS cells. Some parameters were determined by contributions from pixels belonging to the red cluster only (intrinsic parameters), whereas others included contributions from pixels not belonging to the cluster, but typically located near the cluster of interest, e.g. blue pixels representing stained cytoplasm (extrinsic parameters). The final algorithms are included in version 4.0 of the VIS software (Visiopharm A/S, Hoersholm, Denmark) to facilitate easy import and analysis of scanned cytospins.
Results: As outlined in Table 6, exfoliated CIS cell like objects were detected in the ejaculates from five of the 765 men (0.65%). For two of these men (cases 1 and 2), all four ejaculates were positive for each of them, whereas the three ejaculates from two other men (cases 3 and 4) included one positive and one borderline for each of them. For the last man (case 5), the positive ejaculate has been the only sample collected until now.
Testicular biopsies have been taken from four of these five men, and CIS was detected by histological analysis in three of them (Cases 2, 3 and 4). In contrast, the small testicular biopsies collected so far in Case 1 were negative, and a new biopsy has not been collected. For Case 5, no testicular biopsy is yet available. For one of these five men (Case 3), the only symptom preceding the CIS diagnosis was couple infertility, since his semen quality and ultrasound result were normal, and no history of cryptorchidism or testicular disease was recorded. The four other CIS-positive men all had poor semen quality, and either a history of cryptorchidism (Case 1), treated unilateral teratoma (Case 2), or treated unilateral non-seminoma (Cases 4 and 5).
Finally, Table 6 shows that objects graded as borderline were identified in the ejaculates of 18 of the 765 men (2.4%). One of these men provided only this borderline sample; for 15 men a second ejaculate was graded negative, while for two men the second sample was also classified as borderline. For the remaining 742 men (97%), CIS cell-like objects were not detected in their ejaculates.
Table 6. Summary of the results for 765 men in couple infertility work-up.

                     Men   Samples   §Men (%)
Total                765     1,175     100.0%
Positive               5        11       0.7%
Borderline            18       *22       2.4%
Negative             736     1,117      96.2%
Technical problem      6        25       0.8%
* Beyond the 18 initial Borderline Samples from Borderline Men, 2 Borderline Samples were second samples from Borderline Men, and 2 Borderline Samples were from Positive Men.
§ The 95% confidence interval for the expected approx. 2% life-time risk of testicular cancer among men experiencing couple infertility is 1.0-3.1%.
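For orientation, a confidence interval of this kind can be reproduced with an exact (Clopper-Pearson) binomial interval; the choice of interval method is an assumption, since the footnote does not state how its interval was computed. The sketch below applies it to the 5 positive men out of 765 observed here:

```python
# Sketch: exact (Clopper-Pearson) 95% confidence interval for a proportion,
# applied to the 5 positive men out of 765. The interval method is an
# assumption, chosen for illustration.
from scipy.stats import beta

def clopper_pearson(k, n, alpha=0.05):
    lower = beta.ppf(alpha / 2, k, n - k + 1) if k > 0 else 0.0
    upper = beta.ppf(1 - alpha / 2, k + 1, n - k) if k < n else 1.0
    return lower, upper

lo, hi = clopper_pearson(5, 765)
print(f"5/765 = {5 / 765:.2%}, 95% CI {lo:.2%} to {hi:.2%}")
```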
The following items further serve to describe the invention.
Items
1) A method for assisting the diagnosis, prognostication or monitoring of
testicular carcinoma, and/or for assisting the prediction of the outcome of treatment of testicular carcinoma in an individual, based on a cytological sample from said individual, which has been tagged with one or more tags capable of tagging one or more distinctive cell types characteristic of testicular carcinoma, said method comprising the steps of:
a) obtaining a digital representation of the tagged sample,
b) segmenting the digital representation according to an image analysis
detection algorithm into one or more pixel segments potentially representing the one or more distinctive cell types, and one or more pixel segments not representing said one or more distinctive cell types,
c) grading images and/or clusters of pixels or individual pixels belonging to the pixel classes potentially representing the one or more distinctive cell types according to their degree of resemblance to a corresponding tagged archetypical distinctive cell.
2) The method according to any of the preceding items, wherein the distinctive cell type is a cancer cell or a precursor of a cancer cell.
3) The method according to any of the preceding items, wherein the cancer cell or the precursor of the cancer cell is a CIS cell, seminoma or non-seminoma.
4) The method according to any of the preceding items wherein the cancer cell or the precursor of the cancer cell is a CIS cell.
5) The method according to any of the preceding items, wherein the cytological sample is a semen sample.
6) The method according to any of the preceding items, wherein the cytological sample is a semen sample and the distinctive cell type is a CIS cell.
7) The method according to any of the preceding items, wherein one or more of the tags are in situ staining markers.
8) The method according to any of the preceding items, wherein the one or more tags are in situ staining markers capable of identifying one or more targets selected from the group consisting of AP-2γ, OCT3/4, NANOG, alkaline phosphatase, SOX2, SOX15, SOX17, E2F1, IFI16, TEAD4, TLE1, TATDN2, NFIB, LMO2, MECP2, HHEX, XBP1, RRS1, MYCN, ETV4, ETV5, MYCL1, HIST1H1C, WDHD1, RCC2, TP53, MDC1, ALPL, DPPA4, TCL1A, CDH1, GLDC, CDK5, CD14, FGD1, NEURL, HLA-DOA, DYSF, MTHFD1, ENAH, ZDHHC9, NME1, SDCBP, SLC25A16, ATP6AP2, PODXL, PDK4, PCDH8, RAB15, EVI2B, LRP4, B4GALT4, CHST2, FCGR3A, CD53, CD38, PIGL, CKMT1B, RAB3B, NRCAM, KIT, ALK2, PDPN, HRASLS3, and TRA-1-60.
9) The method according to any of the preceding items, wherein the one or more tags are one or more in situ staining markers capable of identifying one or more targets selected from the group consisting of transcription factors AP-2γ, OCT3/4, NANOG, GATA-4, GATA-6 and FOG-2, alkaline phosphatase, epidermal growth factor receptor (EGFR), and Ki67.
10) The method according to items 7 to 9, wherein one or more in situ staining markers are capable of identifying transcription factor AP-2γ.
11) The method according to items 7 to 10, wherein one or more in situ staining markers are capable of identifying intrinsic enzyme activity of alkaline phosphatase.
12) The method according to items 7 to 11, wherein double staining is achieved by using two or more in situ staining markers for identifying transcription factor AP-2γ and intrinsic enzyme activity of alkaline phosphatase, respectively.
13) The method according to items 7 to 12, wherein one or more of the in situ staining markers is a chromogenic enzyme substrate selected from the group comprising DAB, BCIP, Fast Red and AEC.
14) The method according to items 7 to 13, wherein one or more of the in situ staining markers is a fluorophore selected from the group comprising FITC, TRITC and Texas Red.
15) The method according to any of the preceding items for screening of a disease in a population of individuals.
16) The method according to any of the preceding items, wherein the individual is a male suspected to have an increased risk of carcinoma of the testis.
17) The method according to the preceding items, wherein the individual is a male examined for infertility.
18) The method according to any of the preceding items, wherein the segmentation in step b) includes thresholding the image with respect to colour or fluorescence of the tag.
19) The method according to any of the preceding items, wherein pixels potentially representing the one or more distinctive cell types are clustered before the segmentation.
20) The method according to any of the preceding items, wherein pixels potentially representing the one or more distinctive cell types are sorted with respect to size of clusters before segmentation.
21) The method according to item 20, wherein pixels potentially representing the one or more distinctive cell types are sorted with respect to size of clusters both in relation to an upper limit and a lower limit.
22) The method according to any of the preceding items, where the segmentation discriminates pixels in the digital image resembling pixels in digital representations of tagged distinctive cells, from pixels not resembling pixels in digital representations of tagged distinctive cells, by use of an image analysis detection algorithm segmenting the image according to spectral, spatial, contextual and/or morphological information.
23) The method according to item 22, where the spectral information is suitable for discriminating pixels resembling pixels in digital representations of tagged distinctive cells from pixels not resembling pixels in digital representations of tagged distinctive cells.
24) The method according to items 22 to 23, where the spectral information is
obtained by tagging the nucleus and/or the cytoplasm with one or more chromogens and/or fluorophores, thereby providing a characteristic colour and/or fluorescence to the nucleus and/or cytoplasm of the distinctive cell.
25) The method according to items 22 to 24, where the spatial, contextual and/or morphological information is suitable for discriminating pixels resembling pixels in digital representations of tagged distinctive cells from pixels not resembling pixels in digital representations of tagged distinctive cells.
26) The method according to items 22 to 25, where the spatial, contextual and/or morphological information is emphasized by tagging the nucleus and/or the cytoplasm with one or more chromogens or fluorophores, thereby facilitating the identification of structures contributing to the location of, relationship between, and form or shape of the nucleus and/or cytoplasm.
27) The method according to any of the preceding items, where the grading of each cluster of pixels or individual pixel selected by segmentation for their resemblance to pixels in digital representations of tagged archetypical distinctive cells is done according to quantitative characteristics measured by analysis of spectral, spatial, contextual and/or morphological information for each of these selected clusters of pixels or individual pixels.
28) The method according to any of the preceding items, wherein the segmentation in step b) includes classification of the form of the pixels potentially representing the one or more distinctive cell types, such as area and/or periphery of the clusters of pixels.
29) The method according to any of the preceding items, wherein the segmentation includes classification of the variation of intensity in clusters of pixels potentially representing the one or more distinctive cell types.
30) The method according to any of the preceding items, wherein the segmentation includes classification of the texture of clusters of pixels potentially representing the one or more distinctive cell types.
31) The method according to any of the preceding items, including tagging using at least two different stains, each stain tagging a different part of the cell.
32) The method according to any of the preceding items, including classification of variation of intensity of pixels representing at least one other tag.
33) The method according to any of the preceding items, including classification of texture of pixels representing at least one other tag.
34) The method according to any of the preceding items, where the digital representation of the tagged cytological sample is achieved by a slide scanner or virtual microscope providing whole slide digital images.
35) The method according to any of the preceding items, wherein the one or more clusters of pixels or individual pixels, which have obtained a characteristic grade in step c), is reported as a corresponding digital representation and/or grade, which without further modification can be used for the automatic diagnosis, prognosis, monitoring or treatment prediction of a disease.
36) The method according to any of the preceding items, wherein the one or more clusters of pixels or individual pixels, which have obtained a characteristic grade in step c), is presented as a corresponding digital representation and/or grade, to a medical professional for assessment of the diagnosis, prognosis, monitoring or treatment prediction of a disease.
37) The method according to any of the preceding items, wherein the cytological sample or part of the sample is dispersed on, or contained in, an appropriate medium for performing the tagging procedure of the one or more distinctive cell types.
38) The method according to item 37, wherein the medium is a reactor useful for incubation or flow of cells.
39) The method according to item 37, wherein the medium is a dispersing surface useful for microscopy or scanning.
40) The method according to item 37, wherein the medium is a glass slide or a compact disc.
41) The method according to item 40, wherein the glass slide includes fiducial lines or fiducial points for easing the autofocusing in microscopy or scanning of samples with limited intrinsic and tagging contrast.
42) The method according to the previous items, wherein the one or more clusters of pixels or individual pixels, which have obtained a critical grade in step c), is presented as a corresponding digital representation and/or grade, to a medical professional for assessment of the screening result of each of said individuals.
43) The method according to the preceding items, wherein the step c) involves the use of an automated method.
44) The method according to the preceding items, wherein step c) involves the use of a machine learning method with optimized parameters for grading or classifying cells, wherein said machine learning method is selected from the group consisting of a decision tree, an artificial neural network or a Bayesian network.
45) The method according to the preceding items, wherein step c) involves the use of a machine learning method with optimized parameters for grading or classifying cells, wherein said machine learning method is a decision tree or a classification tree.
46) The method according to the preceding items, wherein step c) involves a transformation of the data of the digital representation.
47) The method according to the preceding items, wherein step c) involves the use of a machine learning method with optimized parameters for grading or classifying cells, wherein said parameters comprise one or more parameters selected from the textural information, the difference between green chromaticity and blue chromaticity mean, the profile area, the mean red chromaticity, the mean 30% highest intensity, the blue chromaticity 30%, the red chromaticity 30%, the red chromaticity variance, the circularity, the extrinsic red chromaticity, the distance to center variance, the combined circularity and the distance to center skewness.
48) The method according to the preceding items, wherein step c) involves the use of a machine learning method with optimized parameters for grading or classifying cells, wherein said parameters comprise one or more of the parameters selected from the textural information, the mean red chromaticity and the distance to center variance.
49) The method according to the preceding items, wherein step c) involves the use of a machine learning method with optimized parameters for grading or classifying cells, wherein said parameters comprise the textural information.
50) The method according to the preceding items, wherein step c) involves the use of a machine learning method with optimized parameters for grading or classifying cells, wherein said parameters comprise the mean red chromaticity.
51) The method according to the preceding items, wherein step c) involves the use of a machine learning method with optimized parameters for grading or classifying cells, wherein said parameters comprise the distance to center variance.
52) The method according to the preceding items, wherein step c) involves the use of a machine learning method with optimized parameters for grading or classifying cells, wherein said parameters comprise at least the textural information, the difference between green chromaticity and blue chromaticity mean, the profile area, the mean red chromaticity, the mean 30% highest intensity, the blue chromaticity 30%, the red chromaticity 30%, the red chromaticity variance, the circularity, the extrinsic red chromaticity, the distance to center variance, the combined circularity and the distance to center skewness.
53) The method according to the preceding items, wherein step c) involves the use of a decision tree having one or more of the following parameters and corresponding thresholds (a minimal sketch of such interval tests is given after this items list):
Textural Information: minimum threshold 10.000, maximum threshold 800.000, and/or,
Mean green chromaticity - blue chromaticity: minimum threshold -inf, maximum threshold 0.060, and/or,
Profile Area: minimum threshold 100.000, maximum threshold 800.000, and/or, Mean Red Chromaticity: minimum threshold 0.375, maximum threshold inf, and/or,
Textural Information: minimum threshold 10.000, maximum threshold 508.522, and/or,
Mean 30% highest intensity: minimum threshold 384.545, maximum threshold 535.348, and/or,
Blue Chromaticity 30%: minimum threshold 0.309, maximum threshold 0.332, and/or,
Red Chromaticity 30%: minimum threshold 0.395, maximum threshold 0.417, and/or,
Red Chromaticity Variance: minimum threshold 303.827, maximum threshold 607.190, and/or,
Circularity: minimum threshold 44.601, maximum threshold inf, and/or,
Textural Information: minimum threshold 74.938, maximum threshold 800.000, and/or,
Extrinsic Red Chromaticity: minimum threshold 0.221, maximum threshold 0.237, and/or,
Profile Area: minimum threshold 328.120, maximum threshold 800.000, and/or, Distance to Center Variance: minimum threshold 14.600, maximum threshold inf, and/or,
Red Chromaticity 30%: minimum threshold 0.424, maximum threshold 0.453, and/or,
Profile Area: minimum threshold 175.480, maximum threshold 800.000, and/or, Combined Circularity: minimum threshold 39.188, maximum threshold inf, and/or,
Distance to Center Skewness: minimum threshold -0.237, maximum threshold 0.197, and/or,
Distance to Center Variance: minimum threshold 7.986, maximum threshold inf, and/or,
Red Chromaticity 30%: minimum threshold 0.351, maximum threshold inf, and/or, Red Chromaticity 30%: minimum threshold 0.388, maximum threshold 0.410, and/or, Red Chromaticity Variance: minimum threshold 569.270, maximum threshold inf, and/or,
Red Chromaticity Variance: minimum threshold 379.668, maximum threshold 607.190, and/or,
Distance to Center Skewness: minimum threshold -0.381, maximum threshold -0.266, and/or,
Textural Information: minimum threshold 797.579, maximum threshold 800.000, and/or,
Profile Area: minimum threshold -inf, maximum threshold 800.000, and/or, Mean Red Chromaticity: minimum threshold 0.393, maximum threshold 0.429, and/or,
Distance to Center Variance: minimum threshold 3.577, maximum threshold 25.623, and/or,
Red Chromaticity Variance: minimum threshold 379.668, maximum threshold 683.031, and/or,
Red Chromaticity Variance: minimum threshold 379.668, maximum threshold 569.270, and/or,
Combined Circularity: minimum threshold 24.763, maximum threshold 37.585, and/or,
Mean 30% highest intensity: minimum threshold -inf, maximum threshold 676.097, and/or, Combined Circularity: minimum threshold 48.804, maximum threshold 87.271, and/or,
Mean Red Chromaticity: minimum threshold 0.378, maximum threshold inf, and/or,
Mean 30% highest intensity: minimum threshold 374.492, maximum threshold
565.509, and/or,
Distance to Center Skewness: minimum threshold -0.555, maximum threshold -0.121, and/or,
Mean Green chromaticity - blue chromaticity: minimum threshold 0.012, maximum threshold 0.035, and/or,
Textural Information: minimum threshold 74.938, maximum threshold 508.522, and/or,
Combined Circularity: minimum threshold 27.969, maximum threshold 31.174, and/or,
Extrinsic Red Chromaticity: minimum threshold 0.182, maximum threshold inf, and/or,
Mean Green chromaticity - blue chromaticity: minimum threshold 0.016, maximum threshold 0.025, and/or,
Distance to Center Variance: minimum threshold 5.782, maximum threshold inf, and/or,
Textural Information: minimum threshold 653.050, maximum threshold 800.000, and/or,
Mean Red Chromaticity: minimum threshold 0.366, maximum threshold inf, and/or,
Mean Red Chromaticity: minimum threshold 0.369, maximum threshold inf, and/or,
Mean 30% highest intensity: minimum threshold 424.759, maximum threshold 535.348, and/or,
Blue Chromaticity 30%: minimum threshold -inf, maximum threshold 0.408, and/or,
Profile Area: minimum threshold 99.160, maximum threshold 800.000, and/or, Mean Green chromaticity - blue chromaticity: minimum threshold 0.030, maximum threshold 0.060, and/or,
Red Chromaticity Variance: minimum threshold 531.349, maximum threshold inf, and/or, Distance to Center Variance: minimum threshold 3.577, maximum threshold inf, and/or,
Blue Chromaticity 30%: minimum threshold 0.317, maximum threshold 0.348, and/or,
Profile Area: minimum threshold 99.160, maximum threshold 800.000, and/or, Textural Information: minimum threshold 74.938, maximum threshold 800.000, and/or,
Red Chromaticity 30%: minimum threshold 0.373, maximum threshold inf, and/or,
Combined Circularity: minimum threshold 23.160, maximum threshold 39.188, and/or,
Red Chromaticity 30%: minimum threshold 0.380, maximum threshold 0.388, and/or,
Mean 30% highest intensity: minimum threshold 615.776, maximum threshold 666.044, and/or,
Mean Green chromaticity - blue chromaticity: minimum threshold 0.016, maximum threshold 0.049, and/or,
Circularity: minimum threshold 28.229, maximum threshold 44.601, and/or, Mean Green chromaticity - blue chromaticity: minimum threshold -0.007, maximum threshold 0.060, and/or,
Blue Chromaticity 30%: minimum threshold 0.309, maximum threshold 0.317, and/or,
Profile Area: minimum threshold 251.800, maximum threshold 442.600, and/or,
Textural Information: minimum threshold 219.466, maximum threshold 800.000, and/or,
Profile Area: minimum threshold 99.160, maximum threshold 800.000, and/or, Mean 30% highest intensity: minimum threshold 364.438, maximum threshold 605.723, and/or,
Mean Green chromaticity - blue chromaticity: minimum threshold 0.012, maximum threshold 0.025, and/or,
Red Chromaticity Variance: minimum threshold 379.668, maximum threshold inf, and/or,
Textural Information: minimum threshold 363.994, maximum threshold 800.000, and/or,
Profile Area: minimum threshold 99.160, maximum threshold 800.000.
54) An automated or semi-automated system suitable for carrying out a method as defined by items 1 to 53, wherein said automated or semi-automated system comprises in combination:
a) a database capable of including a plurality of digital images (representations) of the samples,
b) a software module for analyzing a plurality of pixels from a digital image of the samples, and
c) a control module comprising instructions for carrying out said method.
55) A computer readable medium comprising instructions for carrying out a method as defined by items 1 to 53.
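As referenced in item 53, each entry in the threshold list above amounts to an interval test: a node of the decision tree passes when the named parameter lies between its minimum and maximum threshold. The sketch below shows such interval tests in isolation; how the nodes are wired into the actual tree, and the resulting grade labels, are not disclosed above and are therefore assumptions for illustration only:

```python
# Sketch of the interval tests underlying the decision tree of item 53.
# Each node checks minimum <= value <= maximum for one parameter. The tree
# topology and the grade labels are assumptions; only the two example
# thresholds are taken verbatim from the list above.
from dataclasses import dataclass

@dataclass
class IntervalTest:
    parameter: str
    minimum: float = float("-inf")
    maximum: float = float("inf")

    def passes(self, features: dict) -> bool:
        return self.minimum <= features[self.parameter] <= self.maximum

tests = [
    IntervalTest("Mean Red Chromaticity", 0.375),       # maximum = inf
    IntervalTest("Profile Area", 100.000, 800.000),
]

features = {"Mean Red Chromaticity": 0.41, "Profile Area": 350.0}  # example
grade = "CIS-like" if all(t.passes(features) for t in tests) else "negative"
print(grade)  # -> CIS-like
```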
References
1. Almstrup K, Hoei-Hansen CE, Wirkner U, Blake J, Schwager C, Ansorge W, Nielsen JE, Skakkebaek NE, Rajpert-De Meyts E, Leffers H. Embryonic stem cell-like features of testicular carcinoma in situ revealed by genome-wide gene expression profiling. Cancer Res. 2004; 64: 4736-43.
2. Hoei-Hansen CE, Carlsen E, Jorgensen N, Leffers H, Skakkebaek NE, Rajpert-De Meyts E. Towards a non-invasive method for early detection of testicular neoplasia in semen samples by identification of fetal germ cell-specific markers. Hum Reprod. 2007; 22: 167-73.
3. Nielsen JE, Kristensen D, Almstrup K, Jorgensen A, Olesen I, Jacobsen GK, Horn T, Skakkebaek NE, Leffers H, Rajpert-De Meyts E. A novel double staining strategy for improved detection of testicular carcinoma in situ (CIS) cells in human semen samples. Andrologia. 2011 Feb 16.

Claims

1) A method for assisting the diagnosis, prognostication or monitoring of
testicular carcinoma, and/or for assisting the prediction of the outcome of treatment of testicular carcinoma in an individual, based on a cytological sample from said individual, which has been tagged with one or more tags capable of tagging one or more distinctive cell types characteristic of testicular carcinoma, said method comprising the steps of:
a) obtaining a digital representation of the tagged sample,
b) segmenting the digital representation according to an image analysis
detection algorithm into one or more pixel segments potentially representing the one or more distinctive cell types, and one or more pixel segments not representing said one or more distinctive cell types,
c) grading images and/or clusters of pixels or individual pixels belonging to the pixel classes potentially representing the one or more distinctive cell types according to their degree of resemblance to a corresponding tagged archetypical distinctive cell.
2) The method according to any of the preceding claims, wherein the distinctive cell type is a cancer cell or a precursor of a cancer cell.
3) The method according to any of the preceding claims, wherein the cancer cell or the precursor of the cancer cell is a CIS cell, seminoma or non-seminoma.
4) The method according to any of the preceding claims wherein the cancer cell or the precursor of the cancer cell is a CIS cell.
5) The method according to any of the preceding claims, wherein the cytological sample is a semen sample.
6) The method according to any of the preceding claims, wherein the cytological sample is a semen sample and the distinctive cell type is a CIS cell.
8) The method according to any of the preceding claims, wherein the one or more tags are in situ staining markers capable of identifying one or more targets selected from the group consisting of AP-2γ, OCT3/4, NANOG, alkaline phosphatase, SOX2, SOX15, SOX17, E2F1, IFI16, TEAD4, TLE1, TATDN2, NFIB, LMO2, MECP2, HHEX, XBP1, RRS1, MYCN, ETV4, ETV5, MYCL1, HIST1H1C, WDHD1, RCC2, TP53, MDC1, ALPL, DPPA4, TCL1A, CDH1, GLDC, CDK5, CD14, FGD1, NEURL, HLA-DOA, DYSF, MTHFD1, ENAH, ZDHHC9, NME1, SDCBP, SLC25A16, ATP6AP2, PODXL, PDK4, PCDH8, RAB15, EVI2B, LRP4, B4GALT4, CHST2, FCGR3A, CD53, CD38, PIGL, CKMT1B, RAB3B, NRCAM, KIT, ALK2, PDPN, HRASLS3, and TRA-1-60.
9) The method according to any of the preceding claims, wherein the one or more tags are one or more in situ staining markers capable of identifying one or more targets selected from the group consisting of transcription factors AP-2γ, OCT3/4, NANOG, GATA-4, GATA-6 and FOG-2, alkaline phosphatase, epidermal growth factor receptor (EGFR), and Ki67.
10) The method according to claims 7 to 9, wherein one or more in situ staining markers are capable of identifying transcription factor ΑΡ-2γ.
11) The method according to claims 7 to 10, wherein one or more in situ staining markers are capable of identifying intrinsic enzyme activity of alkaline phosphatase.
12) The method according to claims 7 to 11, wherein double staining is achieved by using two or more in situ staining markers for identifying transcription factor AP-2γ and intrinsic enzyme activity of alkaline phosphatase, respectively.
13) The method according to claims 7 to 12, wherein one or more of the in situ staining markers is a chromogenic enzyme substrate selected from the group comprising DAB, BCIP, Fast Red and AEC.
14) The method according to claims 7 to 13, wherein one or more of the in situ staining markers is a fluorophore selected from the group comprising FITC, TRITC and Texas Red.
15) The method according to any of the preceding claims for screening of a disease in a population of individuals.
16) The method according to any of the preceding claims, wherein the individual is a male suspected to have an increased risk of carcinoma of the testis.
17) The method according to the preceding claims, wherein the individual is a male examined for infertility.
18) The method according to any of the preceding claims, wherein the
segmentation in step b) includes thresholding the image with respect to colour or fluorescence of the tag.
19) The method according to any of the preceding claims, wherein pixels potentially representing the one or more distinctive cell types are clustered before the segmentation.
20) The method according to any of the preceding claims, wherein pixels potentially representing the one or more distinctive cell types are sorted with respect to size of clusters before segmentation.
21) The method according to claim 20, wherein pixels potentially representing the one or more distinctive cell types are sorted with respect to size of clusters both in relation to an upper limit and a lower limit.
22) The method according to any of the preceding claims, where the segmentation discriminates pixels in the digital image resembling pixels in digital representations of tagged distinctive cells, from pixels not resembling pixels in digital representations of tagged distinctive cells, by use of an image analysis detection algorithm segmenting the image according to spectral, spatial, contextual and/or morphological information.
23) The method according to claim 22, where the spectral information is suitable for discriminating pixels resembling pixels in digital representations of tagged distinctive cells from pixels not resembling pixels in digital representations of tagged distinctive cells.
24) The method according to claims 22 to 23, where the spectral information is obtained by tagging the nucleus and/or the cytoplasm with one or more chromogens and/or fluorophores, thereby providing a characteristic colour and/or fluorescence to the nucleus and/or cytoplasm of the distinctive cell.
25) The method according to claims 22 to 24, where the spatial, contextual and/or morphological information is suitable for discriminating pixels resembling pixels in digital representations of tagged distinctive cells from pixels not resembling pixels in digital representations of tagged distinctive cells.
26) The method according to claims 22 to 25, where the spatial, contextual and/or morphological information is emphasized by tagging the nucleus and/or the cytoplasm with one or more chromogens or fluorophores, thereby facilitating the identification of structures contributing to the location of, relationship between, and form or shape of the nucleus and/or cytoplasm.
27) The method according to any of the preceding claims, where the grading of each cluster of pixels or individual pixel selected by segmentation for their resemblance to pixels in digital representations of tagged archetypical distinctive cells is done according to quantitative characteristics measured by analysis of spectral, spatial, contextual and/or morphological information for each of these selected clusters of pixels or individual pixels. 28) The method according to any of the preceding claims, wherein the
segmentation in step b) includes classification of the form of the pixels potentially representing the one or more distinctive cell types, such as area and/or periphery of the clusters of pixels. 29) The method according to any of the preceding claims, wherein the
segmentation includes classification of the variation of intensity in clusters of pixels potentially representing the one or more distinctive cell types.
30) The method according to any of the preceding claims, wherein the
segmentation includes classification of the texture of clusters of pixels potentially representing the one or more distinctive cell types.
31 ) The method according to any of the preceding claims, including tagging using at least two different stains, each stain tagging a different part of the cell.
32) The method according to any of the preceding claims, including classification of variation of intensity of pixels representing at least one other tag.
33) The method according to any of the preceding claims, including classification of texture of pixels representing at least one other tag.
34) The method according to any of the preceding claims, where the digital representation of the tagged cytological sample is achieved by a slide scanner or virtual microscope providing whole slide digital images.
35) The method according to any of the preceding claims, wherein the one or more clusters of pixels or individual pixels, which have obtained a characteristic grade in step c), is reported as a corresponding digital representation and/or grade, which without further modification can be used for the automatic diagnosis, prognosis, monitoring or treatment prediction of a disease.
36) The method according to any of the preceding claims, wherein the one or more clusters of pixels or individual pixels, which have obtained a characteristic grade in step c), is presented as a corresponding digital representation and/or grade, to a medical professional for assessment of the diagnosis, prognosis, monitoring or treatment prediction of a disease.
37) The method according to any of the preceding claims, wherein the cytological sample or part of the sample is dispersed on, or contained in, an appropriate medium for performing the tagging procedure of the one or more distinctive cell types.
38) The method according to claim 37, wherein the medium is a reactor useful for incubation or flow of cells.
39) The method according to claim 37, wherein the medium is a dispersing surface useful for microscopy or scanning.
40) The method according to claim 37, wherein the medium is a glass slide or a compact disc.
41) The method according to claim 40, wherein the glass slide includes fiducial lines or fiducial points for easing the autofocusing in microscopy or scanning of samples with limited intrinsic and tagging contrast.
42) The method according to the previous claims, wherein the one or more clusters of pixels or individual pixels, which have obtained a critical grade in step c), is presented as a corresponding digital representation and/or grade, to a medical professional for assessment of the screening result of each of said individuals.
43) The method according to the preceding claims, wherein the step c) involves the use of an automated method.
44) The method according to the preceding claims, wherein step c) involves the use of a machine learning method with optimized parameters for grading or classifying cells, wherein said machine learning method is selected from the group consisting of a decision tree, an artificial neural network or a Bayesian network.
45) The method according to the preceding claims, wherein step c) involves the use of a machine learning method with optimized parameters for grading or classifying cells, wherein said machine learning method is a decision tree or a classification tree.
46) The method according to the preceding claims, wherein the step c) involves a transformation of the data of the digital representation.
47) The method according to the preceding claims, wherein step c) involves the use of a machine learning method with optimized parameters for grading or classifying cells, wherein said parameters comprise one or more parameters selected from the textural information, the difference between green chromaticity and blue chromaticity mean, the profile area, the mean red chromaticity, the mean 30% highest intensity, the blue chromaticity 30%, the red chromaticity 30%, the red chromaticity variance, the circularity, the extrinsic red chromaticity, the distance to center variance, the combined circularity and the distance to center skewness.
48) The method according to the preceding claims, wherein the step c) involves the use of a machine learning method with optimized parameters for grading or classifying cells, wherein said parameters comprise one or more of the parameters selected from textural information, the mean red chromaticity and the distance to center variance.
49) The method according to the preceding claims, wherein the step c) involves the use of a machine learning method with optimized parameters for grading or classifying cells, wherein said parameters comprise textural information.
50) The method according to the preceding claims, wherein the step c) involves the use of a machine learning method with optimized parameters for grading or classifying cells, wherein said parameters comprise the mean red chromaticity.
51) The method according to the preceding claims, wherein step c) involves the use of a machine learning method with optimized parameters for grading or classifying cells, wherein said parameters comprise the distance to center variance.
52) The method according to the preceding claims, wherein step c) involves the use of a machine learning method with optimized parameters for grading or classifying cells, wherein said parameters comprise at least the textural information, the difference between green chromaticity and blue chromaticity mean, the profile area, the mean red chromaticity, the mean 30% highest intensity, the blue chromaticity 30%, the red chromaticity 30%, the red chromaticity variance, the circularity, the extrinsic red chromaticity, the distance to center variance, the combined circularity and the distance to center skewness.
53) The method according to the preceding claims, wherein step c) involves the use of a decision tree having one or more of the following parameters and corresponding thresholds:
Textural Information: minimum threshold 10.000, maximum threshold 800.000, and/or,
Mean green chromaticity - blue chromaticity: minimum threshold -inf, maximum threshold 0.060, and/or,
Profile Area: minimum threshold 100.000, maximum threshold 800.000, and/or,
Mean Red Chromaticity: minimum threshold 0.375, maximum threshold inf, and/or,
Textural Information: minimum threshold 10.000, maximum threshold 508.522, and/or,
Mean 30% highest intensity: minimum threshold 384.545, maximum threshold
535.348, and/or,
Blue Chromaticity 30%: minimum threshold 0.309, maximum threshold 0.332, and/or,
Red Chromaticity 30%: minimum threshold 0.395, maximum threshold 0.417, and/or,
Red Chromaticity Variance: minimum threshold 303.827, maximum threshold 607.190, and/or,
Circularity: minimum threshold 44.601, maximum threshold inf, and/or,
Textural Information: minimum threshold 74.938, maximum threshold 800.000, and/or, Extrinsic Red Chromaticity: minimum threshold 0.221, maximum threshold 0.237, and/or,
Profile Area: minimum threshold 328.120, maximum threshold 800.000, and/or, Distance to Center Variance: minimum threshold 14.600, maximum threshold inf, and/or,
Red Chromaticity 30%: minimum threshold 0.424, maximum threshold 0.453, and/or,
Profile Area: minimum threshold 175.480, maximum threshold 800.000, and/or, Combined Circularity: minimum threshold 39.188, maximum threshold inf, and/or,
Distance to Center Skewness: minimum threshold -0.237, maximum threshold 0.197, and/or,
Distance to Center Variance: minimum threshold 7.986, maximum threshold inf, and/or,
Red Chromaticity 30%: minimum threshold 0.351, maximum threshold inf, and/or,
Red Chromaticity 30%: minimum threshold 0.388, maximum threshold 0.410, and/or, Red Chromaticity Variance: minimum threshold 569.270, maximum threshold inf, and/or,
Red Chromaticity Variance: minimum threshold 379.668, maximum threshold 607.190, and/or,
Distance to Center Skewness: minimum threshold -0.381, maximum threshold -0.266, and/or,
Textural Information: minimum threshold 797.579, maximum threshold 800.000, and/or,
Profile Area: minimum threshold -inf, maximum threshold 800.000, and/or,
Mean Red Chromaticity: minimum threshold 0.393, maximum threshold 0.429, and/or,
Distance to Center Variance: minimum threshold 3.577, maximum threshold 25.623, and/or,
Red Chromaticity Variance: minimum threshold 379.668, maximum threshold
683.031, and/or,
Red Chromaticity Variance: minimum threshold 379.668, maximum threshold 569.270, and/or,
Combined Circularity: minimum threshold 24.763, maximum threshold 37.585, and/or, Mean 30% highest intensity: minimum threshold -inf, maximum threshold 676.097, and/or,
Combined Circularity: minimum threshold 48.804, maximum threshold 87.271, and/or,
Mean Red Chromaticity: minimum threshold 0.378, maximum threshold inf, and/or,
Mean 30% highest intensity: minimum threshold 374.492, maximum threshold 565.509, and/or,
Distance to Center Skewness: minimum threshold -0.555, maximum threshold -0.121, and/or,
Mean Green chromaticity - blue chromaticity: minimum threshold 0.012, maximum threshold 0.035, and/or,
Textural Information: minimum threshold 74.938, maximum threshold 508.522, and/or,
Combined Circularity: minimum threshold 27.969, maximum threshold 31.174, and/or,
Extrinsic Red Chromaticity: minimum threshold 0.182, maximum threshold inf, and/or,
Mean Green chromaticity - blue chromaticity: minimum threshold 0.016, maximum threshold 0.025, and/or,
Distance to Center Variance: minimum threshold 5.782, maximum threshold inf, and/or,
Textural Information: minimum threshold 653.050, maximum threshold 800.000, and/or,
Mean Red Chromaticity: minimum threshold 0.366, maximum threshold inf, and/or,
Mean Red Chromaticity: minimum threshold 0.369, maximum threshold inf, and/or,
Mean 30% highest intensity: minimum threshold 424.759, maximum threshold 535.348, and/or,
Blue Chromaticity 30%: minimum threshold -inf, maximum threshold 0.408, and/or,
Profile Area: minimum threshold 99.160, maximum threshold 800.000, and/or, Mean Green chromaticity - blue chromaticity: minimum threshold 0.030, maximum threshold 0.060, and/or, Red Chromaticity Variance: minimum threshold 531.349, maximum threshold inf, and/or,
Distance to Center Variance: minimum threshold 3.577, maximum threshold inf, and/or,
Blue Chromaticity 30%: minimum threshold 0.317, maximum threshold 0.348, and/or,
Profile Area: minimum threshold 99.160, maximum threshold 800.000, and/or, Textural Information: minimum threshold 74.938, maximum threshold 800.000, and/or,
Red Chromaticity 30%: minimum threshold 0.373, maximum threshold inf, and/or,
Combined Circularity: minimum threshold 23.160, maximum threshold 39.188, and/or,
Red Chromaticity 30%: minimum threshold 0.380, maximum threshold 0.388, and/or,
Mean 30% highest intensity: minimum threshold 615.776, maximum threshold 666.044, and/or,
Mean Green chromaticity - blue chromaticity: minimum threshold 0.016, maximum threshold 0.049, and/or,
Circularity: minimum threshold 28.229, maximum threshold 44.601, and/or,
Mean Green chromaticity - blue chromaticity: minimum threshold -0.007, maximum threshold 0.060, and/or,
Blue Chromaticity 30%: minimum threshold 0.309, maximum threshold 0.317, and/or,
Profile Area: minimum threshold 251.800, maximum threshold 442.600, and/or, Textural Information: minimum threshold 219.466, maximum threshold 800.000, and/or,
Profile Area: minimum threshold 99.160, maximum threshold 800.000, and/or, Mean 30% highest intensity: minimum threshold 364.438, maximum threshold 605.723, and/or,
Mean Green chromaticity - blue chromaticity: minimum threshold 0.012, maximum threshold 0.025, and/or,
Red Chromaticity Variance: minimum threshold 379.668, maximum threshold inf, and/or,
Textural Information: minimum threshold 363.994, maximum threshold 800.000, and/or, Profile Area: minimum threshold 99.160, maximum threshold 800.000.
54) An automated or semi-automated system suitable for carrying out a method as defined by claims 1 to 53, wherein said automated or semi-automated system comprises in combination:
a) a database capable of including a plurality of digital images (representations) of the samples,
b) a software module for analyzing a plurality of pixels from a digital image of the samples, and
c) a control module comprising instructions for carrying out said method.
55) A computer readable medium comprising instructions for carrying out a method as defined by claims 1 to 53.
PCT/DK2011/050374 2010-09-30 2011-09-30 Automated imaging, detection and grading of objects in cytological samples WO2012041333A1 (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
DKPA201070414 2010-09-30 2010-09-30
DKPA201070434 2010-10-11 2010-10-11

Publications (1)

Publication Number Publication Date
WO2012041333A1 true WO2012041333A1 (en) 2012-04-05

Family

ID=44946928

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/DK2011/050374 WO2012041333A1 (en) 2010-09-30 2011-09-30 Automated imaging, detection and grading of objects in cytological samples

Country Status (1)

Country Link
WO (1) WO2012041333A1 (en)



Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100002929A1 (en) * 2004-05-13 2010-01-07 The Charles Stark Draper Laboratory, Inc. Image-based methods for measuring global nuclear patterns as epigenetic markers of cell differentiation
WO2010017822A1 (en) * 2008-08-15 2010-02-18 Visiopharm A/S A method and a system for determining a target in a biological sample by image analysis

Non-Patent Citations (8)

* Cited by examiner, † Cited by third party
Title
ALMSTRUP K, HOEI-HANSEN CE, WIRKNER U, BLAKE J, SCHWAGER C, ANSORGE W, NIELSEN JE, SKAKKEBAEK NE, RAJPERT-DE MEYTS E, LEFFERS H.: "Embryonic stem cell-like features of testicular carcinoma in situ revealed by genome-wide gene expression profiling", CANCER RES., vol. 64, 2004, pages 4736 - 43, XP002607478
CHRISTINA E. HOEI-HANSEN ET AL.: "Current approaches for detection of carcinoma in situ testis", INTERNATIONAL JOURNAL OF ANDROLOGY, vol. 30, no. 4, 9 August 2007 (2007-08-09), pages 398 - 405, XP002666303 *
HOEI-HANSEN C E ET AL: "Towards a non-invasive method for early detection of testicular neoplasia in semen samples by identification of fetal germ cell-specific markers", HUMAN REPRODUCTION, OXFORD UNIVERSITY PRESS, GB, vol. 22, no. 1, 1 January 2007 (2007-01-01), pages 167 - 173, XP002524492, ISSN: 0268-1161, [retrieved on 20060818], DOI: 10.1093/HUMREP/DEL320 *
HOEI-HANSEN CE, CARLSEN E, JORGENSEN N, LEFFERS H, SKAKKEBAEK NE, RAJPERT-DE MEYTS E.: "Towards a non-invasive method for early detection of testicular neoplasia in semen samples by identification of fetal germ cell-specific markers", HUM REPROD., vol. 22, 2007, pages 167 - 73, XP002524492, DOI: doi:10.1093/HUMREP/DEL320
JEROEN VAN DER LAAK: "Automated Identification of Cell and Tissue Components in Pathology", 1 January 2001 (2001-01-01), University Medical Center St Radboud - The Netherlands, pages 1 - 149, XP055015408, ISBN: 978-9-01-234567-5, Retrieved from the Internet <URL:http://dare.ubn.kun.nl/bitstream/2066/19066/1/19066_autoidofc.pdf> [retrieved on 20111222] *
K. ALMSTRUP ET AL.: "Screening of subfertile men for testicular carcinoma in situ by an automated image analysis-based cytological test of the ejaculate", INTERNATIONAL JOURNAL OF ANDROLOGY, vol. 34, 22 June 2011 (2011-06-22), pages 21 - 31, XP002666322 *
LE M T ET AL: "A novel semi-automatic image processing approach to determine Plasmodium falciparum parasitemia in Giemsa -stained thin blood smears", vol. 9, 1 January 2008 (2008-01-01), pages 1 - 12, XP002540787, ISSN: 1471-2121, Retrieved from the Internet <URL:http://www.pubmedcentral.nih.gov/picrender.fcgi?artid=2330144&blobtype=pdf> [retrieved on 20090811] *
NIELSEN JE, KRISTENSEN D, ALMSTRUP K, JORGENSEN A, OLESEN I, JACOBSEN GK, HORN T, SKAKKEBAEK NE, LEFFERS H, RAJPERT-DE MEYTS E: "A novel double staining strategy for improved detection of testicular carcinoma in situ (CIS) cells in human semen samples", ANDROLOGIA, 16 February 2011 (2011-02-16)

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2016517515A (en) * 2013-03-15 2016-06-16 ホロジック, インコーポレイテッドHologic, Inc. System and method for observing and analyzing cytological specimens
CN104282008A (en) * 2013-07-01 2015-01-14 株式会社日立制作所 Method for performing texture segmentation on image and device thereof
CN104282008B (en) * 2013-07-01 2017-07-28 株式会社日立制作所 The method and apparatus that Texture Segmentation is carried out to image
US20150269411A1 (en) * 2014-03-18 2015-09-24 Samsung Electro-Mechanics Co., Ltd. Cell analysis apparatus and method
US9530043B2 (en) * 2014-03-18 2016-12-27 Samsung Electro-Mechanics Co., Ltd. Cell analysis apparatus and method
CN103994964A (en) * 2014-05-23 2014-08-20 天津大学 Quantitative analysis method aiming at apoptotic cell morphology of fluorescence microscopic image
CN104156928B (en) * 2014-08-25 2017-04-26 深圳先进技术研究院 Ultrasonoscopy speckle noise filtering method based on Bayesian model
CN106046134B (en) * 2016-06-03 2019-08-30 南通大学 The application of micromolecule polypeptide NFIB
CN106046134A (en) * 2016-06-03 2016-10-26 南通大学 Application of micro-molecular polypeptide NFIB<delta>
CN108550148A (en) * 2018-04-13 2018-09-18 重庆大学 Nucleus in histotomy micro-image divides automatically and classifying identification method
WO2020168284A1 (en) * 2019-02-15 2020-08-20 The Regents Of The University Of California Systems and methods for digital pathology
CN110059723A (en) * 2019-03-19 2019-07-26 北京工业大学 A kind of robust smog detection method based on integrated depth convolutional neural networks
CN110059723B (en) * 2019-03-19 2021-01-05 北京工业大学 Robust smoke detection method based on integrated deep convolutional neural network
CN110111310A (en) * 2019-04-17 2019-08-09 广州思德医疗科技有限公司 A kind of method and device of assessment tag picture
CN110111310B (en) * 2019-04-17 2021-03-05 广州思德医疗科技有限公司 Method and device for evaluating tag picture
US20220392203A1 (en) * 2019-11-04 2022-12-08 Ummon Healthtech Method of, and computerized system for labeling an image of cells of a patient
CN110853022A (en) * 2019-11-14 2020-02-28 腾讯科技(深圳)有限公司 Pathological section image processing method, device and system and storage medium

Similar Documents

Publication Publication Date Title
EP3486836B1 (en) Image analysis method, apparatus, program, and learned deep learning algorithm
JP7231631B2 (en) Methods for calculating tumor spatial heterogeneity and intermarker heterogeneity
WO2012041333A1 (en) Automated imaging, detection and grading of objects in cytological samples
Sheikhzadeh et al. Automatic labeling of molecular biomarkers of immunohistochemistry images using fully convolutional networks
JP5184087B2 (en) Methods and computer program products for analyzing and optimizing marker candidates for cancer prognosis
US20200388033A1 (en) System and method for automatic labeling of pathology images
US9355445B2 (en) Breast cancer pathological image diagnosis support system, breast cancer pathological image diagnosis support method, and recording medium recording breast cancer pathological image diagnosis support program
JP6192747B2 (en) Machine learning system based on tissue objects for automatic scoring of digital hall slides
EP3155592B1 (en) Predicting breast cancer recurrence directly from image features computed from digitized immunohistopathology tissue slides
US10083340B2 (en) Automated cell segmentation quality control
JP5040597B2 (en) Evaluation system, evaluation method, and evaluation program
JP2021506022A (en) Deep learning systems and methods for solidarity cell and region classification in bioimaging
RU2690224C2 (en) Image processing and analysis device
US20220351860A1 (en) Federated learning system for training machine learning algorithms and maintaining patient privacy
JP5469070B2 (en) Method and system using multiple wavelengths for processing biological specimens
JP2023510915A (en) Non-tumor segmentation to aid tumor detection and analysis
Puri et al. Automated computational detection, quantitation, and mapping of mitosis in whole-slide images for clinically actionable surgical pathology decision support
WO2022246294A1 (en) Automated segmentation of artifacts in histopathology images
JP7492650B2 (en) Automated identification of necrotic regions in digital images of multiplex immunofluorescence stained tissues
Ulaganathan et al. A clinicopathological study of various oral cancer diagnostic techniques
CN113178228B (en) Cell analysis method based on nuclear DNA analysis, computer device, and storage medium
CN117529750A (en) Digital synthesis of histological staining using multiple immunofluorescence imaging
US20240070904A1 (en) Optimized data processing for medical image analysis
JP6592854B2 (en) Method for determining c-MYC gene translocation
Zhang et al. Automated scoring system of HER2 in pathological images under the microscope

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 11782030

Country of ref document: EP

Kind code of ref document: A1

122 Ep: pct application non-entry in european phase

Ref document number: 11782030

Country of ref document: EP

Kind code of ref document: A1