EP1380005A1 - Computer-implemented methods for pattern recognition in organic material - Google Patents

Computer-implemented methods for pattern recognition in organic material

Info

Publication number
EP1380005A1
EP1380005A1 EP02751960A EP02751960A EP1380005A1 EP 1380005 A1 EP1380005 A1 EP 1380005A1 EP 02751960 A EP02751960 A EP 02751960A EP 02751960 A EP02751960 A EP 02751960A EP 1380005 A1 EP1380005 A1 EP 1380005A1
Authority
EP
European Patent Office
Prior art keywords
tissue
class
image
cell
parameter
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP02751960A
Other languages
English (en)
French (fr)
Inventor
Glenna C. Burmer
Christopher A. Ciarcia
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
LifeSpan BioSciences Inc
Original Assignee
LifeSpan BioSciences Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by LifeSpan BioSciences Inc
Publication of EP1380005A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/60 Type of objects
    • G06V20/69 Microscopic objects, e.g. biological cells or cellular parts

Definitions

  • the human brain functions as a very powerful image processing system. As a consequence of extensive training and experience, a human histologist learns to recognize, either through a microscope or in an image, the distinctive features of hundreds of different tissue types and identify the distinctive features of structures, substructures, cell types, and nuclei that are the constituents of each type of tissue. By repeatedly observing these characteristic patterns, the human brain then generalizes this knowledge to accurately classify tissue types, tissue structures, tissue substructures, cell types, and nucleus types in novel specimens or images.
  • the human pathologist learns to distinguish the appearance of normal tissues from the appearance of tissues affected by one or more diseases that modify the appearance of particular cells, structures, or substructures within the specimen or alter the overall appearance of the tissue. With extensive training and experience, the human pathologist learns to distinguish and classify many different diseases that are associated with each tissue type. Also, if a particular tissue component includes a molecule that is visible or has been marked using a chemical that shows a distinctive color through a microscope or in an image, the human can note the presence of this component and identify the type of cell or other tissue constituent in which the component appears.
  • the present invention includes an expert system that performs, in an automated fashion, various functions that are typically carried out by a histologist and/or pathologist, such as one or more of those described above, for tissue specimens where features spanning a pattern are detectable.
  • the expert system comprises systems and methods that analyze images of such tissue specimens and (1) classify the tissue type, (2) determine whether a designated tissue structure, tissue substructure, or nucleus type is present, (3) identify with visible marking or with pixel coordinates such tissue structure, substructure, or nuclei in the image, and/or (4) classify the structure type, substructure type, cell type, and nuclei of a tissue constituent at a particular location in the image.
  • the automated systems and methods can classify such tissue constituents as normal or abnormal (e.g.
  • the systems and methods can identify the locations where a sought component that includes a distinctive molecule appears in such specimens and classify the tissue type, tissue structure and substructure, as well as cell type that contains the sought component and whether the component is in the nucleus.
  • the invented systems and methods can be scaled up to perform large numbers of such analyses per hour. This makes it feasible, for example, to identify tissue constituents within an organism where a drug or other compound has bound, where a product of a specific gene sequence is expressed, or where a particular tissue component is localized.
  • the invented systems and methods can be scaled to screen tens of thousands of compounds or genetic sequences in an organism with a single set of tissue samples. While this information could be gathered using a histologist and/or pathologist, the cost would be high and, even if cost were no object, the time required for such an analysis would interfere with completion of the project within an acceptable amount of time.
  • the invented systems and methods make use of image pattern recognition capabilities to discover information about images showing features of many cells fixed in relation to each other as a part of a tissue of an organism. They can also recognize a pattern across two dimensions in the surface appearance of cell nuclei for cells that are fixed in a tissue or are dissociated from their tissue of origin.
  • the systems and methods can be used for cells from any kind of organism, including plants and animals.
  • One value of the systems and methods in the near term is for the automated analysis of human tissues.
  • the systems and methods provide the ability to automate, with an image capture system and a computer, a process to identify and classify tissue types, tissue structures, tissue substructures, cell types, and nuclear characteristics within a specimen.
  • the image capture system can be any device that captures a high resolution image showing features of a tissue sample, including any device or process that involves scanning the sample in two or three spatial dimensions.
  • the process used by histologists includes looking at tissue samples that contain many cells in fixed relationship to each other and identifying patterns that occur within the tissue. Different tissue types produce distinctive patterns that involve multiple cells, groups of cells, and/or multiple cell types. Different tissue structures and substructures also produce distinctive patterns that involve multiple cells and/or multiple cell types.
  • the intercellular patterns are used by the expert system, as by a histologist, to identify tissue types, tissue structures, and tissue substructures within the tissues. Recognition of these characteristics by the automated systems and methods need not require the identification of individual nuclei, cells, or cell types within the sample, although identification can be aided by simultaneous use of such methods.
  • the automated systems and methods can identify individual cell types within the specimen from their relationships with each other across many cells, from their relationships with cells of other types, or from the appearance of their nuclei.
  • the invented systems use analysis of patterns across at least two spatial dimensions in the nuclear image to identify individual cell types within the sample.
  • features spanning many cells as they occur in the tissue must be detectable in the image.
  • the system examines patterns across the image of the nucleus. Depending upon the tissue type, the cell type of interest, and the method for generating the image, staining of the sample may or may not be desired. Some tissue components can be adequately detected without staining.
  • Visible light received through an optical lens is one method for generating the image.
  • any other process that captures a large enough image with high enough resolution can be used, including methods that utilize other frequencies of electromagnetic radiation or scanning techniques with a highly focused beam such as X-ray beam, or electron microscopy.
  • the tissue samples are thin-sliced and mounted on microscope slides by conventional methods.
  • an image of multiple cells within a tissue may be generated without removing the tissue from the organism.
  • invasive probes can be inserted into human tissues and used for in vivo imaging. The same methods for image analysis can be applied to images collected using these methods.
  • Other in vivo image generation methods can also be used provided they can distinguish features in a multi-cellular image or distinguish a pattern on the surface of a nucleus with adequate resolution.
  • These include image generation methods such as CT scan, MRI, ultrasound, or PET scan.
  • a set of data for each image is typically stored in the computer system. In one embodiment, approximately one million pixels per image and 256 different intensity levels for each of three colors for each pixel, for a total of 24 bits of information per pixel, at a minimum, are stored for each image.
  • parameters are computed from the data to reduce the quantity by looking for patterns within the data across at least two spatial dimensions using the full range of 256 intensity values for each pixel. Once the parameters are computed, the amount of data required to represent the parameters of an image can be very small compared to the original image content. Thus, the parameter computation process retains information of interest and discards the rest of the information contained within the image.
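As a rough check on the storage figures above, the following sketch computes the raw size of one image under the stated embodiment (about one million pixels, three colors, 8 bits per color). The function name `raw_image_bytes` and the comparison against a set of 50 four-byte parameters are illustrative assumptions; the source does not state how many parameters are kept or how they are stored.

```python
def raw_image_bytes(pixels: int, channels: int = 3, bits_per_channel: int = 8) -> int:
    """Raw storage for one uncompressed image, in bytes."""
    bits_per_pixel = channels * bits_per_channel  # 3 * 8 = 24 bits per pixel
    return pixels * bits_per_pixel // 8


image_bytes = raw_image_bytes(1_000_000)
print(image_bytes)  # 3000000 bytes, about 3 MB per image

# Hypothetical parameter set: 50 values stored as 4-byte floats (assumed numbers).
parameter_bytes = 50 * 4
print(image_bytes // parameter_bytes)  # size ratio between raw image and parameters
```

Under these assumed numbers the parameters occupy a small fraction of the raw image, which is the kind of reduction the paragraph describes.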
  • a signature can be generated for each tissue type, tissue structure, tissue substructure, and nucleus type, and this information can be assembled into a knowledge base for use by the expert system, preferably using a set of neural networks.
  • using the expert system, the data contained within each parameter from an unknown image is compared to corresponding parameters previously computed from other images where the tissue type, tissue structure, tissue substructure, cell types or nuclear characteristics are known.
  • the expert system computes a similarity between the unknown image and the known images previously supplied to the expert system and a probability of likeness is computed for each comparison.
  • Normal tissues contain specific cell types that exhibit characteristic morphological features, functions and/or arrangements with other cells by virtue of their genetic programming. Normal tissues contain particular cell types in particular numbers or ratios, with precise spatial relationships relative to one another. These features tend to be within a fairly narrow range within the same normal tissues between different individuals.
  • In addition to the cell types that provide a particular organ with the ability to serve its unique functions (for example, the epithelial or parenchymal cells), normal tissues also have cells that perform functions that are common across organs, such as blood vessels that contain hematologic cells, nerves that contain neurons and Schwann cells, structural cells such as fibroblasts (stromal cells) outside the central nervous system or glial cells in the brain, some inflammatory cells, and cells that provide the ability for motion or contraction of an organ (e.g., smooth muscle).
  • the combinations of cells that perform these particular functions form patterns that are reproduced between different individuals for a particular organ or tissue, and can be recognized by the methods described herein as "normal" for a particular tissue.
  • alterations in the tissue that are detectable by this method can occur in one or more of several forms: (1) in the appearance of tissue structures, (2) in the morphology of the nuclear characteristics of the cells, (3) in the ratios of particular cells, (4) in the appearance of cells that are not normal constituents of the organ, (5) in the loss of cells that should normally be present, or (6) by accumulations of abnormal material.
  • Whether the source of injury is genetic, environmental, chemical, toxic, inflammatory, autoimmune, developmental, infectious, proliferative, neoplastic, accidental, or nutritional, characteristic changes occur that are outside the bounds of the normal features within an organ and can therefore be recognized and categorized by the methods of the present invention.
  • a signature for each normal tissue type and each known abnormal tissue type can be generated.
  • the expert system can then replace the pathologist for determining whether a novel tissue sample is normal or fits a known abnormal tissue type.
  • the computed parameters can also be used to determine which individual structures appear abnormal and which cells display abnormal nuclei and then compute measurements of the magnitudes of the abnormalities.
  • Genes can show very different patterns of expression across tissues. Some genes may be widely expressed whereas others may show very discrete, localized patterns of expression. Gene products such as mRNA and/or proteins may be expressed in one or more cell types, in one or more tissue structures or substructures, within one or more tissues. Some genes may not be expressed in normal tissues but may be expressed during development or as a consequence of disease. Finding the cell types, tissue structures, tissue substructures, and tissue types in which a gene is expressed, producing a gene product, can be of great value.
  • the system can be used to find any localized component with an identifiable, distinctive structure or identifiable molecule, including metabolic by-products.
  • the system can be used to find material that is secreted by a cell and/or material that is associated with the exterior of the cell, such as proteins, fatty acids, carbohydrates and lipids that have a distinctive structure or identifiable molecule that can be located in an image.
  • the component of interest need not be fixed within the cell but may be confined instead to a certain domain within the cell. Examples of other localized tissue components that may be found include: a neural tangle, a neural plaque, or any drug, adjuvant, bacterium, virus, or prion that becomes localized.
  • the automated system can be used to find and identify nuclei types, cell types, tissue structures, tissue substructures, and tissue types where the component of interest occurs.
  • the component of interest can be a drug or compound that is in the specimen. In this case, the drug or compound may act as a marker for another component within the image. Therefore, the system can be used to find components that are fixed within a cell, components that are localized to a part of a cell while not being fixed, and components that occur on the outside of a cell.
  • researchers have searched for locations of tissue and/or cellular components having an identifiable molecular structure by first applying to the tissue a marker that is known to attach to a component in a particular cell type within a particular tissue. Then, they also apply a second marker that will mark the molecular structure that is sought. If the two markers occur together, the cell where the sought molecular structure is expressed can be identified. A determination of whether the two markers occur together within an image can be made with a computer system, even though the computer system cannot identify cell locations or cell types except by detecting the location of the first marker in the image.
  • This prior art has a serious limitation because it is typically used when there is already a known marker that can mark a known cell type without marking other cell types. Such specific and selective markers are only known for a very small portion of the more than 1500 cell types found in the body.
  • the invented systems and methods can be used for tissue analysis without applying a marker that marks a known cell type.
  • a single marker that attaches to a component of interest can be applied to one or more tissues from an organism.
  • the systems and methods identify, in an automated fashion, the tissue type, the tissue structure and/or substructure, the cell type, and/or in some cases, the subcellular region in which the particular component of interest occurs.
  • This system is particularly valuable for studying the expression of genes across multiple tissues.
  • the researcher utilizes a marker that selectively attaches to the mRNA, or other gene product for a gene of interest, and applies this marker to many tissue samples from many locations within the organism.
  • the invented systems and methods are then used to analyze an image of each desired tissue sample, identify each location of a marker within the images, and then identify and classify the tissue types, tissue structures, tissue substructures, cell types and/or subcellular structures where the marker occurs.
  • the number of molecules of the marker that attach to the tissue specimen is related to the number of molecules of the component that is present in the tissue.
  • the number of molecules of the marker can be approximately determined by the intensity of the signal at a pixel within the image generated from the marker.
  • Figure 2 shows object segmentation.
  • Figure 3 shows how sample analysis windows may be taken from an object.
  • Figure 4 lists six parameter computation (feature extraction) methods.
  • Figure 5 shows the IDG parameter extraction method.
  • Figure 6 shows a typical neural network or subnet used for recognition.
  • Figure 7 shows the voting matrix for nuclei recognition.
  • Figure 8 shows the voting matrix for tissue or structure recognition.
  • Tissue samples can consist of cells fixed in tissue or of cells dissociated from their tissues, such as blood cells, inflammatory cells, or PAP smear cells. Tissue samples can be mounted onto microscope slides by conventional methods to present an exposed surface for viewing. Tissues can be fresh or immersed in preservative to preserve the tissue and tissue antigens and avoid postmortem deterioration. For example, tissues that have been fresh-frozen, or immersed in preservative and then frozen or embedded in a substance such as paraffin, plastic, epoxy resin, or celloidin, can be sectioned on a cryostat, sliding microtome, or vibratome and mounted onto microscope slides.
  • staining of the sample may or may not be required. Some cellular components can be adequately detected without staining. Methods that may be used to generate images without staining include contrasting techniques such as differential interference contrast, Nomarski differential interference contrast, stop-contrast (darkfield), phase-contrast, and polarization-contrast. Additional methods that may be used include techniques that do not depend upon reflectance, such as Raman spectroscopy, as well as techniques that rely upon the excitation and emission of light, such as epi-fluorescence. In one embodiment, a general histological nuclear stain such as hematoxylin is used.
  • Eosin, which colors many constituents within each tissue specimen and cell, can also be used.
  • Hematoxylin is a blue to purple dye that imparts this color to basophilic substances (i.e., substances that have an affinity for bases). Therefore, areas around the nucleus, for instance, which contain high concentrations of nucleic acids, will appear blue.
  • Eosin, conversely, is a red to pink dye that colors acidophilic substances. Protein, therefore, would stain red or pink. Glycogen appears as empty ragged spaces within the sample because glycogen is not stained by either hematoxylin or eosin.
  • Other stains may also be used, such as those used to visualize cell nuclei (Feulgen reaction), mast cells (Giemsa, toluidine blue), carbohydrates (periodic acid-Schiff, Alcian blue), connective tissue (trichrome), lipids (Sudan black, oil red O), micro-organisms
  • a marker is added to the samples. Because stain materials may reduce adhesion of the marker, the marker is typically added before the sample is stained. Alternatively, in some embodiments, it may be added after staining.
  • a marker is a molecule designed to adhere to a specific type of site in the tissue to render the site detectable in the image.
  • the invented methods for determining tissue constituents at the location of a sought component detect the presence of some molecule that is detectable in the image at that location.
  • the sought component is directly detectable, such as where it is a drug that fluoresces or where it is a structure that, with or without stain, shows a distinctive shape that can be identified by pattern recognition.
  • the sought component can be identified by adding a marker that will adhere to the sought component and facilitate its detection.
  • Some markers cannot be detected directly and a tag may be added to the marker, such as by adding a radioactive molecule to the marker before the marker is applied to the sample.
  • Molecules such as digoxigenin or biotin or enzymes such as horseradish peroxidase or alkaline phosphatase are tags that are commonly incorporated into markers to facilitate their indirect detection.
  • markers that are considered to be highly specific are markers that attach to known cellular components in known cells.
  • the objective is to search for components within tissue samples when it is not known in which tissue type, tissue structure, tissue substructure, and/or nucleus type the component might occur. This is accomplished by designing a marker that will find the component, applying the marker to tissue specimens that may contain many different tissues, structures, substructures, and cell types, and then determining whether any part of the specimens contains the marker and, therefore, the component of interest.
  • Markers may be antibodies, drugs, ligands, or other compounds that attach or bind to the component of interest and are radioactive or fluorescent, or have a distinctive color, or are otherwise detectable.
  • Antibody markers and other markers may be used to bind to and identify an antibody, drug, ligand, or compound in the tissue specimen.
  • An antibody or other primary binding marker that attaches to the component of interest may be indirectly detected by attaching to it another antibody (e.g., a secondary antibody) or other marker where the secondary antibody or marker is detectable.
  • Nucleic acid probes can also be used as markers.
  • a probe is a nucleic acid that attaches or hybridizes to a gene product such as mRNA by nucleic acid type bonding (base pairing) or by steric interactions.
  • the probe can be radioactive, fluorescent, have a distinctive color, or contain a tagging molecule such as digoxigenin or biotin. Probes can be directly detected or indirectly detected using a secondary marker that is in turn detectable.
  • Markers and tags that have distinctive colors or fluorescence or other visible indicia can be seen directly through a microscope or in an image.
  • Other types of markers and tags can provide indicia that can be converted to detectable emissions or images.
  • radioactive molecules can be detected by such techniques as adding another material that fluoresces or emits light upon receiving radioactive emissions or adding materials that change color, like photographic emulsion or film, upon receiving radioactive energy.
  • the next step in the process is to acquire an image 1 that can be processed by computer algorithms.
  • the stored image data is transferred into numeric arrays, allowing computation of parameters and other numerical transformations.
  • Some basic manipulations of the raw data that can be used include color separation, computation of gray scale statistics, thresholding and binarization operations, morphological operations, and convolution filters. These methods are commonly used to compute parameters from images.
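A minimal sketch of several of the basic manipulations named above (color separation, gray scale statistics, thresholding and binarization, and a convolution filter), written with the Python standard library on nested lists. All function names, the threshold of 128, and the 3x3 mean kernel are illustrative assumptions rather than details from the source; morphological operations are omitted for brevity.

```python
from statistics import mean, pstdev


def split_channels(rgb):
    """Color separation: turn rows of (r, g, b) tuples into three planes."""
    red = [[p[0] for p in row] for row in rgb]
    green = [[p[1] for p in row] for row in rgb]
    blue = [[p[2] for p in row] for row in rgb]
    return red, green, blue


def gray_stats(plane):
    """Gray scale statistics: mean and population standard deviation."""
    values = [v for row in plane for v in row]
    return mean(values), pstdev(values)


def binarize(plane, threshold):
    """Thresholding: 1 where the pixel is darker than the threshold, else 0."""
    return [[1 if v < threshold else 0 for v in row] for row in plane]


def box_blur3(plane):
    """Convolution filter: 3x3 mean filter, borders handled by clamping."""
    h, w = len(plane), len(plane[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    total += plane[yy][xx]
            out[y][x] = total / 9
    return out


# Tiny synthetic 3x3 image: one dark "nucleus" pixel on a bright background.
rgb = [[(200, 180, 220)] * 3,
       [(200, 180, 220), (40, 30, 90), (200, 180, 220)],
       [(200, 180, 220)] * 3]
red, green, blue = split_channels(rgb)
mask = binarize(red, threshold=128)
for row in mask:
    print(row)
```

In practice an array library would replace the nested loops; plain lists are used here only to keep the sketch self-contained.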
  • the slides are placed under a light microscope such as a Zeiss Axioplan 2, which has a motorized XY stage, such as those marketed by Ludl and Prior, and an RGB (red-green-blue) digital camera, such as a DNC1310C, mounted on it.
  • This exemplary camera captures 1300 by 1030 pixels.
  • the camera is connected to a computer by an image capture board, such as the pixeLY ⁇ X board by Epix, and the acquired images are saved to the computer's hard disk drive.
  • the camera is controlled by software, such as the CView software that is supplied by DNC, and the computer is connected to an RGB monitor for viewing of the color images.
  • the microscope is set at a magnification that allows discrimination of cell features for many cells at one time.
  • a 10x or 20x magnification is preferred but other magnifications can be used.
  • the field diaphragm and the condenser height and diaphragm are adjusted, the aperture is set, the illumination level is adjusted, the image is focused, and the image is taken. These steps are preferably automated by integration software that drives the microscope, motorized stage, and camera.
  • the images 1 are saved in a TIFF format, or other suitable format, which saves three color signals (typically red, green, and blue) in a 24-bit file format (8 bits per color).
  • A resolution of 1 micron of tissue per pixel is sufficient. This is the equivalent of using a camera having 10 micron pixels with a microscope having a 10x objective lens.
  • a typical field of view at 10x is 630 microns by 480 microns. Given that the average cell in tissue has a 20 micron diameter, this view shows about 32 cells by 24 cells.
  • For tissue recognition, the image must show tissue having a minimum dimension spanning at least about 120 microns.
  • For tissue structure recognition, some very small structures can be recognized from an image showing tissue with a minimum dimension of at least about 60 microns.
  • For nucleus recognition, the image need only be as large as a typical nucleus, about 20 microns, and the pixel size need only be as small as about 0.17 microns.
  • each image represents 0.87 mm by 0.69 mm and each pixel represents 0.66 microns by 0.66 microns.
  • the objective lens can be changed to 20x and the resolution can be 0.11 microns of tissue per pixel.
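The resolution and field-of-view figures quoted above follow from simple arithmetic, sketched below. The function names are hypothetical, and the calculation assumes the tissue distance per pixel is simply the sensor pixel pitch divided by the objective magnification, with no additional relay optics.

```python
def microns_per_pixel(sensor_pixel_um: float, objective_mag: float) -> float:
    """Tissue distance covered by one pixel: sensor pixel pitch / magnification."""
    return sensor_pixel_um / objective_mag


def cells_across(field_um: float, cell_diameter_um: float = 20.0) -> float:
    """Approximate number of cells spanning one side of the field of view."""
    return field_um / cell_diameter_um


# 10 micron camera pixels behind a 10x objective.
print(microns_per_pixel(10.0, 10.0))

# 630 x 480 micron field with 20 micron cells.
print(cells_across(630.0), cells_across(480.0))
```

The first value is the 1 micron per pixel figure, and the second line gives 31.5 by 24 cells, which the text rounds to about 32 by 24.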
  • an embodiment of the image processing systems and methods contains three major components: (1) an object segmentation module 51 whose function is the extraction of object data relating to tissue/cell sample structures from background signals, (2) a parameter computation (or "feature extraction") module 52 that computes the characteristic structural pattern features across two (or three) spatial dimensions within the data and computes pixel intensity variations within this data across the spatial dimensions, and (3) a structural pattern recognition module 53 that makes the assessment of recognition probability (level of confidence) using an associative voting matrix architecture, typically using a plurality of neural networks.
  • Each component is described
  • the invention may be embodied in software, on a computer readable medium or on a network signal, to be run on a general purpose computer or on a network of general purpose computers.
  • the neural network component may be implemented with dedicated circuits rather than with one or more general purpose computers.
  • One embodiment employs a signal segmentation procedure to extract and enhance color-coded (stained) signals and background structures to be used for form content-based feature analysis.
  • the method separates the subject color image into three (3) RGB multi-spectral bands and computes the covariance matrix. This matrix is then diagonalized to determine the eigenvectors, which represent a set of de-correlated planes ordered by decreasing levels of variance as a function of 'color-clustered' (structure correlated) signal strengths. Further steps in the segmentation procedure vary with each parameter extraction method.
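The covariance step above can be sketched as follows. This is a simplified illustration, not the patented procedure: it computes the 3x3 covariance matrix of the R, G, and B bands and then uses power iteration (a swapped-in technique chosen to avoid a linear algebra dependency) to find only the leading eigenvector, i.e. the plane with the highest variance. A full diagonalization would return all three eigenvectors. All function names are hypothetical.

```python
import math


def covariance_3x3(pixels):
    """Covariance matrix of the R, G, B bands over a list of (r, g, b) pixels."""
    n = len(pixels)
    means = [sum(p[c] for p in pixels) / n for c in range(3)]
    cov = [[0.0] * 3 for _ in range(3)]
    for p in pixels:
        d = [p[c] - means[c] for c in range(3)]
        for i in range(3):
            for j in range(3):
                cov[i][j] += d[i] * d[j] / n
    return cov, means


def leading_eigenvector(matrix, iterations=200):
    """Power iteration: approaches the eigenvector with the largest eigenvalue."""
    v = [1.0, 1.0, 1.0]
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(3)) for i in range(3)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # flat image: no variance in any band, keep current guess
            return v
        v = [x / norm for x in w]
    return v


def project(pixels, means, axis):
    """Project each mean-centered pixel onto the axis (one de-correlated plane)."""
    return [sum((p[c] - means[c]) * axis[c] for c in range(3)) for p in pixels]


# Synthetic pixels that vary only along the color direction (1, 1, 0).
pixels = [(10, 10, 50), (20, 20, 50), (30, 30, 50), (40, 40, 50)]
cov, means = covariance_3x3(pixels)
axis = leading_eigenvector(cov)
scores = project(pixels, means, axis)
print(axis)
print(scores)
```

For this synthetic input the leading axis has equal red and green weights and no blue weight, so the projection captures all of the variation in a single plane.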
  • Some aspects of the parameter extraction methods of the present invention require finding meaningful pattern information across two or three spatial dimensions in very small changes in pixel intensity values. For this reason, pixel data must be captured and processed with fine gradations in intensity.
  • One embodiment employs a scale of 256 possible values (8 significant bits) for precision. 128 values (7 significant bits) will also work, although not as well, while 64 values (6 significant bits) yields serious degradation, and 32 values (5 significant bits) is beyond the limit for extraction of meaningful parameters using the methods of this aspect of the invention.
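To illustrate why coarser intensity scales lose small intensity changes, the sketch below reduces 8-bit values to fewer significant bits by dropping low-order bits. The truncation scheme and the name `requantize` are assumptions made for illustration; the source does not say how lower-precision data would be produced.

```python
def requantize(value: int, bits: int) -> int:
    """Reduce an 8-bit intensity (0-255) to `bits` significant bits."""
    if not 1 <= bits <= 8:
        raise ValueError("bits must be between 1 and 8")
    shift = 8 - bits
    return (value >> shift) << shift  # drop the low-order bits


# Two pixels that differ by only 3 intensity levels.
a, b = 100, 103
for bits in (8, 7, 6, 5):
    levels = 2 ** bits
    print(bits, levels, requantize(a, bits), requantize(b, bits))
```

With this pair of values the difference survives at 8 and 7 bits but disappears at 6 and 5 bits, where both pixels map to the same level.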
  • the pixel intensity data values are used in parameter extraction algorithms that operate in two or three dimensions, rather than in a one dimensional scan across the data, by using vector operations. To obtain pattern data across two dimensions, at least 6 pixels in each dimension are required to avoid confusion with noise. Thus, each element of the parameters is extracted from at least a two dimensional grid of pixels having a minimum dimension of 6 pixels. The smallest such object is 24 pixels in an octagon shape.
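One way to arrive at the 24-pixel octagon mentioned above is to start from a 6 by 6 grid and trim a small triangle from each corner. The layout below is an assumption that happens to give 24 pixels; the source does not specify the exact arrangement, and `octagon_mask` is a hypothetical name.

```python
def octagon_mask(size: int = 6, cut: int = 2):
    """Return a size x size grid of 0/1 with triangular corners of side `cut` removed."""
    mask = []
    for y in range(size):
        # Distance to the nearest horizontal edge decides how much the row is trimmed.
        edge = min(y, size - 1 - y)
        trim = max(cut - edge, 0)
        mask.append([1 if trim <= x < size - trim else 0 for x in range(size)])
    return mask


mask = octagon_mask()
for row in mask:
    print("".join("#" if v else "." for v in row))
print(sum(map(sum, mask)))  # total number of pixels in the octagon
```

The rows contain 2, 4, 6, 6, 4, and 2 pixels, so the shape spans 6 pixels in each dimension while using 24 pixels in total.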
  • An embodiment of the system incorporates a parameter extraction module that computes the characteristic structural patterns within each of the segmented signals/objects.
  • the tissue/cell structural patterns are distinctive and type specific. As such they make excellent type recognition discriminators.
  • For tissue recognition and tissue structure recognition in one embodiment, six different parameters are computed across a window that spans some of or all of the (sometimes segmented) image.
  • the parameters can be computed independently for each region/object of interest and only one of the parameter computation algorithms, called IDG for integrated diffusion gradient transform, described below, is used.
  • pixels representing nuclei are segmented from the rest of the data so that computation intensive steps will not get bogged down with data that has no useful information.
  • the segmentation procedure isolates imaged structures 2-9 that are defined as local regions where object recognition will be applied. These object-regions are imaged structures that have a high probability of encompassing nuclei. They will be subjected to form content based parameter computation that examines their 2-dimensional spatial and intensity distributive content to compute a signature of the nuclear material.
  • the initial image 1 is acquired as a color RGB image and then converted to an 8-bit grayscale data array with 256 possible intensity values for each pixel by employing a principal component analysis of the three color planes and extracting a composite image of the R, G and B color planes that is enhanced for contrast and detail.
  • the composite 8-bit image is then subjected to a signal discontinuity enhancement procedure that is designed to increase the contrast between imaged object-regions and overall average background content so that the nuclei, which are stained dark, can be segmented into objects of interest and the remainder of the data can be discarded.
  • the intermediate intensity pixels are dampened to a lower intensity, thereby creating a sharp edge around each clump of pixels showing one or more nuclei.
  • Segmentation of the objects 2-9 is then achieved by applying a localized NxN box deviation filter of a size approximately the same size as that of an individual nucleus, in a point to point, pixel-to-pixel fashion across the entire enhanced image.
  • Those pixels that have significant intensity amplitude above the deviation filter statistical limits and are clustered together forming grouped objects of a size greater than or equal to an individual nucleus are identified individually, mapped and then defined as object-regions of interest.
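  • The box deviation filter can be sketched as follows, assuming that "statistical limits" means the local mean plus `k` local standard deviations inside an odd NxN box, and that the enhanced image has been arranged so that nuclei are the high-amplitude pixels. The value of `k` and the function name are hypothetical. Grouping the flagged pixels into object-regions of at least nucleus size would be a separate connected-component step that is not shown.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def box_deviation_mask(image, n, k=2.0):
    """Flag pixels whose intensity exceeds local mean + k * local std in an n x n box."""
    if n % 2 == 0:
        raise ValueError("n must be odd so the box is centred on each pixel")
    pad = n // 2
    padded = np.pad(image.astype(np.float64), pad, mode="reflect")
    windows = sliding_window_view(padded, (n, n))   # one n x n window per pixel
    local_mean = windows.mean(axis=(-1, -2))
    local_std = windows.std(axis=(-1, -2))
    return image > local_mean + k * local_std
```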
  • a clump of nuclei appears as a singular object-region 7 which is a mapping that defines which pixels will be subjected to the feature extraction procedure; with actual measurements being made on the principal component enhanced 8-bit image at the same points indicated by the segmented object-region mapping.
  • a center-line 10 is defined that substantially divides the object-region along its longitudinal median.
  • a series of six regional sampling analysis windows 11-16 are then centered on the median and placed in a uniform fashion along that line, and individual distributive intensity pattern measurements are computed across two spatial dimensions within each window. These measurements are normalized to be substantially invariant and comparative between different object-regional measurements taken from different images. By taking sample analysis windows from the center of each clump of pixels representing nuclei, the chances of including one or more nucleoli are very good.
  • Nucleoli are one example of a nuclear component that shows distinctive patterns that are effective discriminants for nucleus types.
  • the parameter calculation used on each of the sampling windows 11-16 is called the 'integrated diffusion gradient' (IDG) of the spatial intensity distribution, discussed below. It is a set of measurements that automatically separate type specific pattern features by relative amplitude, spatial distribution, imaged form, and form variance into a set of characteristic form differentials. In one embodiment, twenty-one discrete IDG measures are computed for each of the six sample windows 11-16, for a total of 126 IDG calculations per object-region. In one embodiment for recognition of nuclei, once the IDG parameters have been calculated for each window, a characteristic vector for each object-region 7 is then created by combining the 126 measures from the six sample windows with two additional parameters.
  • the first additional parameter is a measure of the object-region's intensity surface fill factor across the two spatial dimensions, thereby computing a "three-dimensional surface fractal" measurement.
  • the second additional parameter is a measure of the region's relative working size compared to the entire imaged field-of-view. In combination, this set of measurements becomes a singular characteristic vector for each object-region. It contains 128 measures of the patterned form. All of the measures are independent of traditional cross-sectional nuclear boundary shape characterization and they may not incorporate or require nuclear boundary definition or delineation. Ideally, they are taken entirely within the boundary of a single nucleus or cluster of pixels representing nuclei.
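  • The size of the nucleus characteristic vector follows from the counts given above: 6 windows x 21 IDG measures = 126, plus the fill factor and the relative size, for 128 values. A small sketch of the assembly, with hypothetical function and argument names:

```python
import numpy as np

N_WINDOWS = 6    # sample windows 11-16 along the centre-line
N_IDG = 21       # IDG measures per window in this embodiment

def characteristic_vector(idg_per_window, fill_factor, relative_size):
    """Concatenate 6 x 21 IDG measures with the two additional parameters."""
    idg = np.asarray(idg_per_window, dtype=float)
    if idg.shape != (N_WINDOWS, N_IDG):
        raise ValueError("expected a 6 x 21 array of IDG measures")
    return np.concatenate([idg.ravel(), [fill_factor, relative_size]])
```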
  • the methods employ procedures to compute six different characteristic form parameters for a window within each image 1 which generally is as large as the entire image.
  • Such parameters computed from an image are often referred to as "features" that have been "extracted."
  • parameter (or feature) extraction (or computation) methods that would produce effective results for this expert system.
  • the parameter computations all compute measures of characteristic patterns across two or three spatial dimensions using intensity values with a precision of at least 6 significant bits for each pixel and including a measure of variance in the pixel intensities.
  • One embodiment computes the six parameters described below. All six parameters contain information specific to the basic form of the physical tissue and cell structures as regards their statistical, distributive, and variance properties.
  • IDG - Integrated Diffusion Gradient. The IDG transform procedure can be used to compute the basic 'signal form response profile' of structural patterns within a tissue/cell image.
  • the procedure automatically separates type-specific signal structures by relative amplitude, spatial distribution, signal form and signal shape variance into a set of characteristic modes called the 'characteristic form differentials'.
  • These form differentials have been modeled as a set of signal form response functions which, when decoupled (for example, in a linear least-squares fashion) from the form response profile, represent excellent type recognition discriminators.
  • the IDG for each window 23 (which, in one embodiment, is a small window 11-16 for nucleus recognition and is the entire image 1 for tissue or structure recognition) is calculated by examining the two dimensional spatial intensity distribution at different intensity levels 17-19 and computing their local intensity form differential variance. The placement of each level is a function of intensity amplitude in the window.
  • Figure 5 shows three intensity peaks 20-22, that extend through the first level 17 and the second level 18. Only two of them extend through the third level 19.
  • the computations are made at all intensity levels (256) for the entire image.
  • the computations are made at only 3 levels, as shown in Figure 5, because there are a large number of objects 2-9 for each image and there are 6 sample windows 11-16 for each object.
  • the IDG parameters are extracted from image data in the following manner:
  • the pattern image data is fitted with a self-optimizing nth order polynomial fit, i.e., the chi-squared quality of fit is computed over n ranging from 2 to 5 and the order of the best fit is selected.
  • This fit is used to define a flux-flow 'diffusion' surface for measurement of the characteristic form differential function. Depending on gain variances across the pattern, this diffusion surface can be warped (order of the fit greater than 2). This ensures that, in this embodiment, the form differential measurements are always taken normal to the diffusion plane.
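  • A sketch of the self-optimizing polynomial fit, assuming a 2-dimensional polynomial surface fitted by linear least squares and "chi-squared quality of fit" taken as the reduced chi-squared (squared residuals divided by degrees of freedom) with a uniform noise level `sigma`. Those choices and all names are assumptions made for illustration.

```python
import numpy as np

def fit_surface(image, order):
    """Least-squares fit of a 2-D polynomial of total degree <= order."""
    h, w = image.shape
    y, x = np.mgrid[0:h, 0:w]
    x = x.ravel() / max(w - 1, 1)               # normalise coordinates to [0, 1]
    y = y.ravel() / max(h - 1, 1)
    terms = [x**i * y**j for i in range(order + 1) for j in range(order + 1 - i)]
    design = np.stack(terms, axis=1)
    coeffs, *_ = np.linalg.lstsq(design, image.ravel(), rcond=None)
    return (design @ coeffs).reshape(h, w), design.shape[1]

def best_order_fit(image, orders=range(2, 6), sigma=1.0):
    """Try n = 2..5 and keep the order with the lowest reduced chi-squared."""
    image = np.asarray(image, dtype=float)
    best = None
    for n in orders:
        fitted, n_params = fit_surface(image, n)
        dof = image.size - n_params             # degrees of freedom
        chi2 = np.sum(((image - fitted) / sigma) ** 2) / dof
        if best is None or chi2 < best[0]:
            best = (chi2, n, fitted)
    return best[1], best[2]
```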
  • the diffusion plane is positioned above the enhanced signal pattern and lowered one unit level at a time (dH).
  • the resulting function automatically separates type-specific signal structures by relative amplitude, signal strength distribution, signal form and signal shape variance into a function called the characteristic form differential (dNp/dH).
  • Each of the peaks and valleys within the form differential function represents the occurrence of a different signal component, and the transition gradients between the structures are characteristic of the signal shape variance.
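  • A simplified sketch of the form differential, under two assumptions that the text does not confirm: the diffusion surface is a flat plane (the unwarped case), and Np(H) is the number of pixels whose intensity reaches the plane at height H. With an 8-bit image and dH = 1 this reduces to the intensity histogram read from bright to dark, giving 256 values.

```python
import numpy as np

def form_differential(image, levels=256):
    """Lower a flat plane one level at a time and record dNp/dH at each step."""
    img = np.asarray(image)
    heights = np.arange(levels - 1, -1, -1)                    # 255, 254, ..., 0
    n_above = np.array([(img >= h).sum() for h in heights])    # Np(H)
    d_np = np.diff(n_above, prepend=0)                         # newly reached pixels per step
    return heights, d_np
```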
  • the characteristic form differential is then decomposed into a linear superposition of these signal specific response profiles. This is accomplished by fitting the form differential function in a linear least-squares fashion, optimizing for (1) response profile amplitude, (2) extent as profile full-width-at-half-height (FWHH) and (3) their relative placement.
  • FWHH full-width-at-half-height
  • the response function fitting criteria can be used to determine the location of the background baseline as an added feature component (or for signal segmentation purposes). This can be accomplished by examining the relative change in the response profile measures over the entire dNp/dH function to identify the onset of the signal baseline as the diffusion surface is lowered. From this analysis, the bounding signal responses and the signal baseline threshold (THD) are computed.
  • the IDG transform extracts 256 form differentials which are then fitted with 8 characteristic response functions. Location of each fit is specified with one value and the amplitude is specified with a second value, making 16 total values. Along with two baseline parameters, which are the minimum for the 256 point curve and the area under the curve, this generates an input vector of 18 input values for the neural network.
  • PPF Two-Dimensional Pattern Projection Fractal
  • the PPF can be computed by projecting the tissue/cell segmentation signals into a 2-dimensional binary point-pattern distribution. This distribution is then subjected to an analysis procedure that maps the clustered distributions of the projection over a broad range of sampling intervals across the segmented image. The sample measurement is based on the computation of the fractal probability density function.
  • PPF focuses on the fundamental statistical and distributive nature of the characteristic aspects of form within tissue samples. It is based on a technique that takes advantage of the naturally occurring properties of tissue patterns that exhibit spatial homogeneity (invariance under displacement), scaling (invariance under moderate scale change) and self-similarity (same basic form throughout), e.g., characteristics of basic fractal form; with different tissue/cell structural patterns having unique fractal forms. The mix of tissue cell types and the way they are distributed in the tissue type provides unique differences in the imaged tissue structures.
  • the measurement of the PPF parameter is implemented as a form of the computation of the fractal probability density function using new procedures for the generation of a point-pattern projection and variant magnification sampling.
  • Further signal segmentation comprises an analysis of the 2-dimensional distributive pattern of the imaged intensity profile, segmented when the optimum contrast image is computed employing principal component analysis, fitted with an nth order polynomial surface and then binarized to generate a positive residual projection.
  • the segmented pattern data is signal-gain (intensity) de-biased. This can be accomplished by iteratively replacing each pixel value within the pattern image with the minimum localized value defined within an octagonal area between about 5 and 15 pixels across. This results in a pattern that is not changed as regards uniformity or gradual variance. However, regions of high variance, smaller than the radius of the region of interest (ROI), are reduced to the minimum level of the local background.
  • ROI region of interest
  • the pattern image is then fitted with a self-optimizing nth order polynomial fit, i.e., the chi-squared quality of fit is computed over n ranging from 2 to 5 and the order of the best fit is selected. This fit is then used to compute the positive residual of the patterned image and binarized to generate a point pattern distribution.
  • d is the density of tissue/cell pattern points at a given location
  • C is a constant
  • r is the distance from the center of a cluster
  • D is the Hausdorff fractal dimension.
  • Actual computation of the fractal dimension is accomplished using a box-counting procedure.
  • a grid is superimposed onto the tissue point pattern image and the number of grid boxes containing any part of the fractal pattern is counted. The size of the box grid is then increased and the process is iteratively repeated until the pattern sample size limits the number of measurements.
  • Extraction of the PPF feature set can be accomplished by computing the Hausdorff dimension for multiple overlapping regions of interest (ROIs) that span the entire image domain with additional phased samplings varying in ROI scale size.
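  • The box-counting procedure can be sketched as below, assuming a binary point-pattern image and box sizes that double at each step. The dimension is estimated as the slope of log(box count) against log(1/box size). The function name and the default sizes are illustrative.

```python
import numpy as np

def box_count_dimension(points, sizes=(1, 2, 4, 8, 16)):
    """Estimate the fractal dimension of a binary pattern by box counting."""
    pts = np.asarray(points, dtype=bool)
    counts = []
    for s in sizes:
        h = (pts.shape[0] // s) * s              # crop to a whole number of boxes
        w = (pts.shape[1] // s) * s
        blocks = pts[:h, :w].reshape(h // s, s, w // s, s)
        counts.append(blocks.any(axis=(1, 3)).sum())   # boxes touching the pattern
    counts = np.array(counts, dtype=float)
    if np.any(counts == 0):
        raise ValueError("pattern is empty at one of the box sizes")
    inv_size = 1.0 / np.array(sizes, dtype=float)
    slope, _ = np.polyfit(np.log(inv_size), np.log(counts), 1)
    return slope
```

  • On a completely filled square the estimate is 2, and on a single straight row of points it is 1, which gives a quick sanity check before applying it to tissue patterns.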
  • ROIs regions of interest
  • the ROIs can be selected to be 128 pixels by 128 pixels or 256 pixels by 256 pixels.
  • the result is 240 individual fractal measurements of the tissue/cell point distribution pattern with a sampling cell magnification varying from 0.156 to 1.0.
  • the PPF algorithm extracts 240 different phased positional and scaled fractal measurements, generating an input vector of 240 input values to the neural networks.
  • the SVA procedure involves the separation of a tissue/cell color image into three (3) RGB multi-spectral bands which then form the basis of a principal components transform.
  • the covariance matrix can be computed and diagonalized to determine the eigenvectors, a set of de-correlated planes ordered by decreasing levels of variance as a function of 'color-clustered' signal strengths.
  • This procedure for the 2-dimensional tissue/cell patterns represents a rotational transform that maps the tissue/cell structural patterns into the signal variance domain.
  • the resultant 3x3 re-mapping diagonalized matrix and its corresponding relative eigenvector magnitudes form the basis of a characteristic statistical variance parameter set delineating tissue cell signals, nuclei and background signatures.
  • the principal component images (E1, E2, E3) are therefore uncorrelated and ordered by decreasing levels of signal variance, i.e., E1 has the largest variance and E3 has the lowest.
  • the result is the removal of the correlation that was present between the axes of the original RGB spectral data with a simultaneous compression of pattern variance into fewer dimensions.
  • the principal components transformation represents a rotation of the original RGB coordinate axis to coincide with the directions of maximum and minimum variance in the signal (pattern specific) clusters.
  • the re-mapping shifts the origin to the center of the variance distribution with the distribution about the mean being multi-modal for the different signal patterns (e.g., cell, nuclei, background) within the tissue imagery.
  • the canonical transform does maximize the separability of defined signal structures. Since the nature of the stains is specific to class species within a singular tissue type, this separability correlates directly with signal recognition.
  • the parameter sets are the resultant 3x3 re-mapping diagonalization matrix and its corresponding relative eigenvector magnitudes.
  • the SVA algorithm extracts 9 parameters derived from the RGB color 3x3 diagonalization matrix, generating an input vector of 9 input values to the neural networks.
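  • A sketch of how 9 values might be obtained from the colour covariance, assuming the "3x3 diagonalization matrix" is the matrix of eigenvectors of the RGB covariance matrix, ordered by decreasing eigenvalue and flattened. The relative eigenvalue magnitudes are returned separately. The function name and the flattening order are assumptions.

```python
import numpy as np

def sva_parameters(rgb):
    """Return the flattened 3x3 eigenvector matrix and relative eigenvalue magnitudes."""
    pixels = rgb.reshape(-1, 3).astype(np.float64)
    cov = np.cov(pixels, rowvar=False)           # 3x3 covariance of R, G, B
    eigvals, eigvecs = np.linalg.eigh(cov)       # ascending eigenvalues
    order = np.argsort(eigvals)[::-1]            # reorder: largest variance first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    total = eigvals.sum()
    relative = eigvals / total if total > 0 else np.zeros(3)
    return eigvecs.ravel(), relative
```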
  • This linearization projection procedure reduces the dynamic range of the tissue/cell signal segmentation while conserving the structural pattern distributions.
  • the resultant PPT computation then generates a re-mapped function that is constrained by the requirement of
  • parameter extraction is based on analysis of the 2-dimensional distributive line-pattern of the imaged intensity profile, segmented when the optimum contrast image is computed employing principal component analysis, fitted with an nth order polynomial surface, binarized to generate a positive residual projection and then subjected to a 2-dimensional linearization procedure that forms a line drawing equivalent of the entire tissue image.
  • the first two steps of the PPT parameter calculation algorithm are the same as for the PPF parameter, above.
  • the method then continues as follows: (3)
  • the binarized characteristic pattern is then subjected to a selective morphological erosion operator that reduces regions of pixels into singular points along median lines defined within the method as the projection linearization of form. This is accomplished by applying a modified form of the standard erosion kernel to the residual image in an iterative process.
  • the erosion operator has been changed to include a rule that considers the occupancy of nearest neighbors, i.e., if a central erosion point does not have connected neighbors that form a continuous distribution, the point cannot be removed.
  • This process reduces the projection into a linearized pattern that contains significant topological and metric information based on the numbers of end points, nodes where branches meet and internal holes within the regions of the characteristic pattern.
  • the methods compute actual PPT features by mapping the linearized pattern from a Cartesian space into a polar form using a modified Hough Transform that employs a masking algorithm that bounds the selection of Hough accumulation cells into specific ranges of slope and intercept.
  • the PPT algorithm extracts 1752 parameters from the Hough transform of the line drawing of the two dimensional tissue intensity image, generating an input vector of 1752 input values to the neural networks.
  • TTFWT Tissue Type Fractal Wavelet Transform
  • characteristic geometrical forms can represent fractal primitives and form the basis for a set of mother-wavelets employable in a multidimensional wavelet decomposition.
  • the TTFWT parameter extraction procedure extracts a fractal representation of the tissue/cell structural patterns via a discrete wavelet transform (DWT) based on the mappings of self-similar regions of a tissue/cell signal pattern image using the shape of the IDG characteristic form differentials as the class of mother-wavelets.
  • DWT discrete wavelet transform
  • Parameter extraction is based on the re-sampling and integration of the multi-dimensional wavelet decomposition on a radial interval to generate a characteristic waveform containing elements relative to the fractal wavelet coefficient densities.
  • the procedure includes the following steps: (1) The image pattern is resized and sampled to fit on a 2^N interval, for example as a
  • a characteristic mother wavelet is defined by a study of signal type-specific structures relative to amplitude, spatial distribution, signal form and signal shape variance in a statistical fashion across a large set of tissue/cell images under the IDG procedures previously discussed.
  • the re-sampled image is then subjected to a 2-dimensional wavelet transform using the uniquely defined fractal form mother wavelet.
  • the 2-dimensional wavelet transform space is then sampled and integrated on intervals of wavelet coefficient (scaling and translation intervals) and renormalized on unit area. These represent the relative element energy densities of the transform.
  • the TTFWT algorithm generates an input vector of 128 input values to the neural networks.
  • the RDPH parameter extraction procedure is designed to enhance the measurement of the local fractal probability density functions (FPDFs) within tissue/cell patterns on a sampling interval which is rotationally and scaling invariant.
  • the procedure builds on the characteristic of local self-similarities within tissue/cell imagery. Image components can be seen as re-scaled with intensity transformed mappings yielding a self-referential distribution of the tissue/cell structural data.
  • Implementation involves the measurement of a series of fractal dimensions measured across two spatial dimensions (based on range dependent signal intensity variance) on a centered radial 360 degree scan interval. The resulting radial fractal probability density curve is then normalized and subjected to a Polar Fourier
  • parameter extraction is based on analysis of the 2-dimensional distributive de-biased pattern of the imaged intensity profile, segmented when the optimum contrast image is computed employing principal component analysis with regions of high variance being reduced to the minimum level of the local background generating a signal-gain (intensity) de-biased image.
  • the first step of the RDPH parameter calculation algorithm is the same as for the PPF parameter, above. The method then continues as follows:
  • the enhanced pattern is then signal-gain (intensity) de-biased. This is accomplished by iteratively replacing each pixel value within the enhanced pattern image with the minimum localized value defined within an octagonal region-of-interest (ROI).
  • ROI region-of-interest
  • a set of 360 profiles are generated from a centered analysis scheme within the de-biased image.
  • this represents the measurement of the occupation density on a unit radial interval bounded by image size constraints.
  • the profiles represent area integrated signal intensities.
  • the fractal dimension of each of the angle-dependent profiles is computed.
  • the fractal measurements are normalized to unit magnitude to remove scale dependence.
  • the function is then operated on by a polar Fourier transform (PFT) to generate a set of polar harmonics with each component above the zero order representing increasing degree of deviation from circular form. These represent the RDPH parameter set.
  • PFT polar Fourier transform
  • the RDPH algorithm extracts 128 parameters from the polar Fourier transform of the 360 2-dimensional distribution dependent fractal dimension measurements, generating an input vector of 128 input values to the neural networks.
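  • A rough sketch of the last two RDPH steps, assuming the 360 angle-dependent fractal dimensions are already available, that "unit magnitude" means unit Euclidean norm, and that the polar harmonics are the magnitudes of the discrete Fourier coefficients over the angular index. Keeping the first 128 harmonics is one possible way to reach 128 parameters; the text does not say how that number is obtained.

```python
import numpy as np

def polar_harmonics(fractal_dims, n_out=128):
    """Normalise 360 angular measurements and return the first n_out harmonic magnitudes."""
    values = np.asarray(fractal_dims, dtype=float)
    if values.shape != (360,):
        raise ValueError("expected one fractal dimension per degree (360 values)")
    norm = np.linalg.norm(values)
    if norm > 0:
        values = values / norm                   # remove scale dependence
    spectrum = np.fft.rfft(values)               # harmonics 0..180 of the angular profile
    return np.abs(spectrum[:n_out])
```

  • A perfectly circular (constant) profile has energy only in the zero-order component, which matches the statement that higher components measure deviation from circular form.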
  • One embodiment of the systems and methods has been structured to meet three primary design specifications. These are: (1) the ability to handle high-throughput automated classification of tissue and cell structures, (2) the ability to generate correlated assessments of the characteristic nature of the tissue/cell structures being classified and (3) the ability to adaptively extend trained experience and provide for self-expansive evolutionary growth. Achievement of these design criteria has been accomplished through the use of an association decision matrix that operates on the outputs of multiple neural networks.
  • Figure 6 shows one of the neural networks. As described above, several of the parameter computation processes yield a set of 128 values which are the inputs to feed the 128 input nodes 31 of a neural network. Others of the parameter computations require other numbers of input nodes. For each neural network, a second layer has half as many neurons.
  • the network shown in Figure 6 has 64 neurons 32 in a second layer and a singular output neuron 33.
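  • A minimal sketch of the topology in Figure 6: 128 inputs, a second layer with half as many neurons, and a single output between 0 and 1. The sigmoid activation, the random initial weights and the function names are assumptions; the text does not specify the activation or the training method.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_network(n_in=128, rng=None):
    """Create untrained weights for an n_in -> n_in/2 -> 1 network."""
    rng = np.random.default_rng(0) if rng is None else rng
    n_hidden = n_in // 2                         # second layer has half as many neurons
    return {
        "w1": rng.normal(0.0, 0.1, (n_hidden, n_in)),
        "b1": np.zeros(n_hidden),
        "w2": rng.normal(0.0, 0.1, (1, n_hidden)),
        "b2": np.zeros(1),
    }

def forward(net, x):
    """Return the association level in (0, 1) for one parameter vector."""
    hidden = sigmoid(net["w1"] @ x + net["b1"])
    return float(sigmoid(net["w2"] @ hidden + net["b2"])[0])
```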
  • Each of these neural networks may be comprised of subnetworks as further described below.
  • Each network can be trained to classify the image into one of many classes as is known. In this case, each network is trained on all the classes.
  • each network is trained on only one pattern and is designed to return a level of associative recognition ranging from 0, as totally unlike, to 1, as completely similar.
  • the network is trained on only two classes of images, those that show the sought material and others like them expected within the image to be analyzed that do not.
  • the output of each network is a probability value, expressed as 0 - 1, that the material in the image is the item on which the network was trained. For output to a human, the probability may be restated as a percent as shown in Figure 8.
  • the outputs of the many neural networks are then aggregated to yield a single most probable determination.
  • each neural network compares the input vector (parameter) to a "template" that was created by training the network on a single pattern with many images of that pattern. Therefore, a separate network is used for each pattern to be recognized. If a sample is to be classified into one of 50 tissue types, 50 networks are used.
  • the networks can be implemented with software on a general purpose computer, and each of the 50 networks can be loaded on a single computer in series for the computations. Alternatively, they can be run simultaneously on 50 computers in parallel, or otherwise as desired.
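  • Running one template network per class in series and keeping the highest association level can be sketched as follows. Each network is represented here by a hypothetical callable that maps a parameter vector to a value between 0 and 1; the dictionary layout is an assumption.

```python
def classify(parameter_vector, networks):
    """Score the vector against every class template and return the best match."""
    scores = {name: net(parameter_vector) for name, net in networks.items()}
    best = max(scores, key=scores.get)           # class with the highest association level
    return best, scores
```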
  • a systems analysis from acquisition to feature extraction, can be used to identify different sources of data degradation variance within the tissue processing procedures and within the data acquisition environment that influence the ability to isolate and measure characteristic patterns.
  • These sources of data degradation can be identified by human experience and intuition. Because these sources generally are not independent, they typically cannot be linearly decoupled, removed or separately corrected for.
  • Identified modal aspects of data degradation include (1) tissue processing artifacts such as stain type, stain application method, multiple stain interference/obscuration and physical tissue quality control issues, (2) data acquisition aspects relating to microscope imaging aberrations such as spherical and barrel distortions, RGB color control, pixel dynamic range and resolution, digital quantization, and aliasing effects, (3) systematic noise effects and pattern measurement variance based on statistical sampling densities, and (4) effects from undesirable variation in level of stain applied. In one embodiment, these are grouped into 7 categories. To compensate for these variance-modes of data degradation and enhance recognition ability, one embodiment employs for each neural network a set of eight different subnetworks that each account for a different systematic variance aspect (mode): seven individual modes and one composite mode.
  • mode systematic variance aspect
  • Each subnetwork processes the same input pattern vector, but each subnetwork has been trained on data that demonstrate significant effects specific to a different variance-mode and its relative coupling to other modal data degradation aspects.
  • This processing architecture is one way to provide the association-decision matrix with the ability to dampen and minimize the level of loss in recognition based on obscuration of patternable form from tissue preparation, data acquisition, and other artifacts, interference, or noise, by directly incorporating recognition of the inherent range of artifacts in an image.
  • a human can select images of known content showing the desired data degradation effects and train a subnetwork with images that show the characteristic source of data degradation.
  • the eighth subnetwork can be trained with all or a subset of the images. For each image, the subnetwork can be instructed whether the image shows the type of tissue or structure or nuclei for which the network is being trained.
  • only the IDG parameter is used for each nucleus or clump and only one neural network is used for comparison to each recognition "template" (although that network may include a subnet for each data degradation mode).
  • only one neural net is required, but it can still have 8 subnets for data degradation modes.
  • the IDG parameter yields a set of 128 values for each of the 8 subnetworks and there are 8 outputs 33 from the subnetworks. These 8 outputs are applied as inputs 36 to an associative voting matrix as shown in Figure 7. Each of the inputs may be adjusted with a weighting factor 37.
  • the present system uses weights of one; other weights can be selected as desired.
  • the weighted numbers 38 with a range of association levels from 0 to 1, are added to produce a final number 39 between, in this embodiment, 0 and 8. This sum of modal association levels is called the association matrix vote.
  • a vote of 4.0 or greater is considered to be positive recognition of the nucleus type being tested for.
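  • The association matrix vote described above is a weighted sum of the 8 subnetwork outputs compared with a threshold of 4.0. A short sketch, with hypothetical function and argument names and unit weights as the default:

```python
def association_vote(outputs, weights=None, threshold=4.0):
    """Sum the weighted modal association levels and test for positive recognition."""
    if len(outputs) != 8:
        raise ValueError("expected one output per subnetwork (8 values)")
    weights = [1.0] * 8 if weights is None else weights
    vote = sum(w * o for w, o in zip(weights, outputs))   # between 0 and 8 with unit weights
    return vote, vote >= threshold
```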
  • Recognition of nuclei can typically determine not only whether a nucleus appears abnormal, but also the cell type.
  • a list of normal cell types that can be identified by the signature of their nuclei, along with a list of the tissues, tissue structures, and sub-structures that can be recognized is shown in Table 2, below.
  • Abnormal cell types suitable for use with the present invention include, for example, the following four categories: (1) Neoplastic and Proliferative Diseases
  • the altered nuclear characteristics of neoplastic cells and their altered growth arrangements allow the method to identify both benign and malignant proliferations, distinguish them from the surrounding normal or reactive tissues, distinguish between benign and malignant lesions, and identify the invasive and pre-invasive components of malignant lesions.
  • benign proliferative lesions include (but are not necessarily limited to) scars, desmoplastic tissue reactions, fibromuscular and glandular hyperplasias (such as those of breast and prostate); adenomas of breast, respiratory tract, gastrointestinal tract, salivary gland, liver, gall bladder, endocrine glands; benign growths of soft tissues such as fibromas, neuromas, neurofibromas, meningiomas, gliomas, and leiomyomas; benign epithelial and adnexal tumors of skin, benign melanocytic nevi; oncocytomas of kidney, and the benign tumors of ovarian surface epithelium.
  • malignant tumors suitable for use with the methods, systems, and the like discussed herein, in either their invasive or preinvasive phases, both at a primary site and at a site to which they have metastasized, are listed in the following Table 1.
  • Adrenal: pheochromocytoma, neuroblastoma. Blood vessels: hemangiosarcoma, lymphangiosarcoma, Kaposi's sarcoma. Bone: osteosarcoma, chondrosarcoma, giant cell tumor, osteoid osteoma, enchondroma, chondromyxoid fibroma, osteoblastoma. Bone marrow & Spleen: chronic lymphocytic leukemia, acute lymphoblastic leukemia, multiple myeloma, acute myelogenous leukemia, chronic myelogenous leukemia, hairy cell leukemia.
  • Cervix: squamous carcinoma, malignant melanoma. Colon: invasive colorectal carcinoma, non-invasive carcinomas, adenomas, dysplasias.
  • Hodgkin's lymphoma. Muscle: rhabdomyosarcoma, leiomyoma, leiomyosarcoma. Nervous: schwannoma, neurofibroma, neuroblastoma, glioblastoma, ependymoma, oligodendroglioma, astrocytoma, medulloblastoma, ganglioneuroma, meningioma. Oral and nasal: squamous carcinoma, non-invasive carcinoma, dysplasia, malignant melanoma. Ovary: invasive carcinoma, borderline epithelial tumors, germ cell tumors, stromal tumors. Prostate: invasive carcinoma, intraepithelial neoplasia, benign prostatic hyperplasia. Salivary gland: pleomorphic adenoma & mixed tumor, mucoepidermoid tumor, adenoid cystic carcinoma. Skin: malignant melanoma, squamous carcinoma, non-invasive carcinoma, dysplasia, adnexal tumors, dermatofibroma, basal cell carcinoma, kerato
  • the method can be used to identify diseases that involve the immune system, including infectious, inflammatory and autoimmune diseases.
  • inflammatory cells become activated and infiltrate tissues in defined populations that contain characteristics that can be detected by the method, as well as producing characteristic changes in the tissue architecture that are a consequence of cell injury or repair within the resident cell types that are present within the tissue.
  • Inflammatory cells include neutrophils, mast cells, plasma cells, immunoblasts of lymphocytes, eosinophils, histiocytes, and macrophages.
  • inflammatory diseases include granulomatous diseases such as sarcoidosis and Crohn's colitis, and bacterial, viral, fungal or other organismal infectious diseases such as tuberculosis, Helicobacter pylori induced ulcers, meningitis, and pneumonia
  • allergic diseases include asthma, allergic rhinitis (hay fever), and celiac sprue
  • autoimmune diseases such as rheumatoid arthritis, psoriasis, Type I diabetes and ulcerative colitis, multiple sclerosis, hypersensitivity reactions such as transplant rejection, and other such disorders of the immune system or inflammatory conditions (such as endocarditis or myocarditis, glomerulonephritis, pancreatitis, bronchitis, encephalitis, thyroiditis, prostatitis, gingivitis, cholecystitis, cervicitis, thyroiditis or hepatitis) that produce characteristic patterns involving the presence of infiltrating immune cells or alterations to
  • the method is useful for detecting diseases that involve the loss of particular cell types, or the presence of injured and degenerating cell types.
  • neurodegenerative diseases include Alzheimer's disease, Parkinson's disease and amyotrophic lateral sclerosis, which involve the loss of neurons and characteristic changes within injured neurons.
  • diseases that involve injury to cell types by ischemic insult (loss of blood supply) include stroke, myocardial infarct (heart attack), and thrombotic or embolic injury to organs.
  • diseases that involve loss or alteration of particular cell types include osteoarthritis in joints.
  • chronic forms of injury include hypertension, cirrhosis and heart failure.
  • Examples of chemical or toxic injuries that produce characteristics of cell death include acute tubular necrosis of the kidney. Examples of aging within organs include aging in the skin and hair.

(4) Metabolic and Genetic Diseases

  • Certain genetic diseases also produce characteristic changes in cell populations that can be recognized by this method. Examples of such diseases include cystic fibrosis, retinitis pigmentosa, neurofibromatosis, and storage diseases such as Gaucher's and Tay-Sachs. Examples of diseases that produce characteristic alterations in the bone marrow or peripheral blood cell components include anemias or thrombocytopenias.
  • a desired set of images of known tissue / structure types is subjected to the parameter extractions described above and separate associative class templates are generated using artificial neural networks for use, not as classifiers into one of many classes but as structural pattern references to a single template for the tissue or structure to be recognized. These references indicate the 'degree of similarity' between the reference and a test tissue or structure and may simultaneously estimate the recognition probability (level of confidence).
  • Each network then contributes to the table of associative assessments that make up the 'association matrix' as shown in Figure 8.
  • each of these subnets can be comprised of additional subnets, for example one for each mode of data degradation in the training set.
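  • A minimal sketch of how such an association matrix might be assembled. It assumes each template network can be reduced to a callable that returns a degree of similarity in (0, 1]. The stand-in scorer (inverse distance to a reference parameter vector), the names `make_template` and `association_matrix`, and the tissue labels and parameter values are all hypothetical; the templates described above are trained artificial neural networks.

```python
import math

def make_template(reference):
    """Return a stand-in 'template network': similarity of a parameter
    vector to one reference vector, mapped into (0, 1]."""
    def similarity(params):
        dist = math.sqrt(sum((p - r) ** 2 for p, r in zip(params, reference)))
        return 1.0 / (1.0 + dist)
    return similarity

def association_matrix(samples, templates):
    """Rows are test samples, columns are template names; each entry is the
    degree of similarity reported by that template for that sample."""
    names = sorted(templates)
    return names, [[templates[n](s) for n in names] for s in samples]

# Hypothetical templates, one per tissue type (values are illustrative).
templates = {
    "colon": make_template([0.2, 0.8, 0.5]),
    "kidney": make_template([0.9, 0.1, 0.4]),
}
names, matrix = association_matrix([[0.2, 0.8, 0.5], [0.8, 0.2, 0.4]], templates)
```

  • Each column here plays the role of one single-template reference rather than one output of a many-class classifier, which is the distinction drawn above.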
  • the system can recognize with sufficient certainty to be useful many of the same tissue types and structures that can be recognized by a pathologist with a microscope, including those in Table 2 below.
  • In operation of the system, there is no functional difference between a structure and a substructure. They are both recognized by the same methods.
  • a substructure is simply a form that is found within a larger structure form.
  • this relative hierarchy is shown in the following Table 2, which also lists normal cell types.
  • Table 2 Tissues, Structures, Sub-Structures, and Normal Cell Types
TISSUE | STRUCTURE | SUBSTRUCTURE | CELL
adrenal | Cortex | zona fasciculata | spongiocyte
adrenal | Cortex | zona glomerulosa |
adrenal | Cortex | zona reticularis |
adrenal | medulla | | chromaffin cell
adrenal | medulla | | ganglion cell
artery | Tunica adventitia | | adipocytes
artery | Tunica adventitia | vasa vasorum | endothelial cell
artery | Tunica adventitia | | fibroblast
artery | Tunica adventitia | nerve | Schwann cell
artery | Tunica adventitia | vasa vasorum | smooth muscle cell
artery | Tunica intima | | endothelial cell
artery | tunica intima | | myointimal cell
artery | tunica media | | smooth muscle cell
artery | tunica media | external elastic lamella |
artery | tunica intima | internal elastic lamella |
bladder | mucosa | | transitional cell
bladder | muscularis | | smooth muscle cell
bone | periosteum | | osteoprogenitor cell
bone | matrix | |
bone | periosteum | |
bone | osteo
  • the brain is the most complex tissue in the body. There are myriad brain structures, and other structures, cell types, tissues, etc., that can be imaged with brain scans and recognized by this system that are not listed above.
  • Some diseases can be identified by accumulations of material within tissues that are used as hallmarks of that disease. These accumulations of material often form abnormal structures within tissues. Such accumulations can be located within cells (e.g., Lewy bodies in dopaminergic neurons of the substantia nigra in Parkinson's disease) or be found extracellularly (e.g., neuritic plaques in Alzheimer's disease). They can be, for example, glycoprotein, proteinaceous, lipid, crystalline, glycogen, and/or nucleic acid accumulations. Some can be identified in the image without the addition of markers, and others require selective markers to be attached to them.
  • proteinaceous accumulations useful for the diagnosis of specific diseases include: neuritic plaques and tangles in Alzheimer's disease, plaques in multiple sclerosis, prion proteins in spongiform encephalopathy, collagen in scleroderma, hyalin deposits or Mallory bodies in hyalin disease, deposits in Kimmelstiel-Wilson disease, Lewy bodies in Parkinson's disease and Lewy body disease, alpha-synuclein inclusions in glial cells in multiple system atrophies, atheromatous plaques in atherosclerosis, collagen in Type II diabetes, caseating granulomas in tuberculosis, and amyloid-beta precursor protein in inclusion-body myositis.
  • lipid accumulations include: deposits in nutritional liver diseases, atheromatous plaques in atherosclerosis, fatty change in liver, foamy macrophages in atherosclerosis, xanthomas and other lipid accumulation disorders, and fatty streaks in atherosclerosis.
  • crystalline accumulations include: uric acid and calcium oxalate crystals in kidney stones, uric acid crystals in gout, calcium crystals in atherosclerotic plaques, calcium deposits in nephrolithiasis, calcium deposits in valvular heart disease, and psammoma bodies in papillary carcinoma.
  • examples of nucleic acid accumulations or inclusions include: viral DNA in herpes, viral DNA in cytomegalovirus, viral DNA in human papilloma virus, viral DNA in HIV, Councilman bodies in viral hepatitis, and molluscum bodies in molluscum contagiosum.
  • the classification/recognition decision is defined by evaluating the accumulated weight of the associated template assessments against the existing trained experience for a tissue / structure type.
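  • One way to read "accumulated weight" is as a weighted average of the template assessments for each trained type, with the highest-scoring type winning if it clears a threshold. The sketch below works under that assumption; the function name `classify`, the threshold of 0.5, and the sample values are hypothetical and not taken from the source.

```python
def classify(assessments, weights, threshold=0.5):
    """assessments: {tissue_type: [similarity per template or subnet]}
    weights:     {tissue_type: [weight per template or subnet]}
    Returns (best_type, score), or (None, score) when the best score
    does not reach the threshold."""
    scores = {}
    for tissue, sims in assessments.items():
        w = weights[tissue]
        # Accumulate the weighted assessments and normalise by total weight.
        scores[tissue] = sum(s * wi for s, wi in zip(sims, w)) / sum(w)
    best = max(scores, key=scores.get)
    if scores[best] < threshold:
        return None, scores[best]
    return best, scores[best]

# Illustrative values only.
assessments = {"colon": [0.9, 0.8], "kidney": [0.3, 0.4]}
weights = {"colon": [1.0, 1.0], "kidney": [1.0, 1.0]}
label, score = classify(assessments, weights)
```

  • Returning `None` when no type reaches the threshold leaves room for the "outside previous experience" case rather than forcing a label.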
  • the present methods can include dynamic system adaptability and self-organized evolution.
  • the system can automatically upgrade the training of each of the parameter-reference template recognition envelopes to include the slight variations in current sample experience.
  • the system dynamically and automatically increases the density of its trained experience. If the referential assessment is outside previous experience, the nature of that divergence is apparent from the associations to each of the trained types (self-teaching), and upon statistically significant recurrence of similar divergent types, new references can be automatically generated and dynamically added to the association matrix.

Locating and Quantifying Components that include Distinctive Molecules

  • pixels which show colors emitted by a marker or a tag on a marker, or are otherwise wavelength distinguishable, can be identified, and the intensity of the color can be correlated with the quantity of the marked component.
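  • A minimal sketch of marker-pixel identification, assuming an RGB image and a marker that appears as a strong red excess over the other two channels. The rule, the `margin` value, and the name `marker_map` are hypothetical; a real marker would need its own calibrated color signature.

```python
def marker_map(image, margin=50):
    """image: rows of (r, g, b) tuples with values 0-255.
    A pixel counts as marker-positive when red exceeds both other channels
    by at least `margin`; its intensity is that excess, otherwise 0."""
    out = []
    for row in image:
        out_row = []
        for r, g, b in row:
            excess = r - max(g, b)
            out_row.append(excess if excess >= margin else 0)
        out.append(out_row)
    return out

# Illustrative 2x2 image: two marker-positive pixels, two background pixels.
image = [
    [(200, 40, 30), (120, 110, 100)],
    [(90, 90, 90), (255, 100, 20)],
]
intensity = marker_map(image)
total = sum(map(sum, intensity))  # crude proxy for quantity of marked component
```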
  • tissue components include molecules that can be directly distinguished in an image without the use of a marker.
  • the level of association of the primary signal emitted by the component or marker or tag can be determined and localized to structures, cell types, etc. There are several suitable methods.
  • One method begins by identifying one pixel or contiguous pixels that show a distinctive signature indicating presence of the sought component, checks to determine if they are within or close to a nucleus, and, if so, identifies the nucleus type. If the component appears within a nucleus or within a radius so small that the component must be within the cell, the above described method can determine the cell type and whether the nucleus is normal or abnormal where the component appears. The system can also identify the tissue type. The tissue type will have a limited number of structures within it and each of those will be comprised of a limited number of cell types. If the identified cell type occurs in only one structure type within that tissue type, the structure is known.
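  • The nucleus check and the structure inference described above can be sketched as follows. The lookup table is abridged from the artery rows of Table 2; the function names, the data layout for nuclei, and the radius value are hypothetical.

```python
import math

# Abridged from the artery rows of Table 2: structure -> cell types found in it.
TISSUE_MODEL = {
    "artery": {
        "tunica adventitia": {"adipocytes", "endothelial cell", "fibroblast",
                              "Schwann cell", "smooth muscle cell"},
        "tunica intima": {"endothelial cell", "myointimal cell"},
        "tunica media": {"smooth muscle cell"},
    }
}

def nearest_nucleus(pixel, nuclei, max_radius):
    """nuclei: list of dicts with 'center' (x, y) and 'cell_type'.
    Return the closest nucleus within max_radius of pixel, else None."""
    best, best_d = None, max_radius
    for n in nuclei:
        d = math.dist(pixel, n["center"])
        if d <= best_d:
            best, best_d = n, d
    return best

def infer_structure(tissue, cell_type):
    """Return the structure only if the cell type occurs in exactly one
    structure of that tissue; otherwise the structure stays unknown."""
    hits = [s for s, cells in TISSUE_MODEL[tissue].items() if cell_type in cells]
    return hits[0] if len(hits) == 1 else None

nuclei = [
    {"center": (10, 10), "cell_type": "myointimal cell"},
    {"center": (40, 40), "cell_type": "endothelial cell"},
]
hit = nearest_nucleus((12, 11), nuclei, max_radius=5.0)
structure = infer_structure("artery", hit["cell_type"])
```

  • An endothelial cell, which appears in more than one artery structure, yields `None` here; that ambiguous case is what the window-based search addresses.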
  • a structure which may be a substructure of a larger structure
  • to determine whether the sought component is included in the structure, a large number of sample windows, which may overlap and are typically each large enough to capture at least one possible candidate for a structure type in that tissue, are taken from the image.
  • Each sample is compared to a template for the structure type using the neural networks as described above. Sample windows that are identified as showing the structure are then reduced in size at each edge in turn until the size reduction reduces the certainty of recognition.
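  • A sketch of the window sampling and edge-by-edge reduction, under a loud assumption: the neural-network template comparison is replaced by a toy `certainty` function that reports the fraction of a known structure's pixels captured by the window. All names and the 6x6 test image are hypothetical.

```python
# Toy image: a structure occupies rows 2-3, columns 1-3 of a 6x6 grid.
IMAGE = [[1 if 2 <= y <= 3 and 1 <= x <= 3 else 0 for x in range(6)]
         for y in range(6)]
TOTAL = sum(map(sum, IMAGE))

def certainty(window):
    """Stand-in for template recognition certainty (assumption, not the
    patent's network): fraction of structure pixels inside the window."""
    t, l, b, r = window
    return sum(IMAGE[y][x] for y in range(t, b) for x in range(l, r)) / TOTAL

def sliding_windows(height, width, size, stride):
    """Yield overlapping square windows as (top, left, bottom, right)."""
    for top in range(0, height - size + 1, stride):
        for left in range(0, width - size + 1, stride):
            yield (top, left, top + size, left + size)

def shrink_window(window, score, step=1):
    """Trim each edge in turn while the score does not fall below its
    starting value; stop when no edge can be trimmed further."""
    base = score(window)
    moves = [(step, 0, 0, 0), (0, step, 0, 0), (0, 0, -step, 0), (0, 0, 0, -step)]
    changed = True
    while changed:
        changed = False
        for move in moves:
            t, l, b, r = (w + m for w, m in zip(window, move))
            if b - t < 1 or r - l < 1:
                continue
            if score((t, l, b, r)) >= base:
                window = (t, l, b, r)
                changed = True
    return window

candidates = [w for w in sliding_windows(6, 6, size=4, stride=1)
              if certainty(w) >= 1.0]
tight = {shrink_window(w, certainty) for w in candidates}
```

  • With this stand-in, every candidate window collapses onto the same tight box around the structure, which is the behavior the reduction step is meant to produce.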
  • the structure where the component occurs is one that has known substructures
  • many smaller windows which may be overlapping can be sampled from the reduced window and compared to templates for the substructures. If a substructure is found, the smaller window is again reduced on each edge in turn until the certainty of recognition goes down.
  • the boundary of the structure or substructure within the window or smaller window can be identified as a loop of pixels and each pixel showing the component can be checked to determine if it is on or within or outside the loop.
  • the component intensities for all pixels on or within the loop can be summed to quantify the presence of the sought component.
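  • The containment check and summation can be sketched with a standard ray-casting test. As a simplifying assumption, the boundary is given as an ordered list of polygon vertices rather than a full loop of pixels; the function names and sample values are hypothetical.

```python
def on_segment(p, a, b):
    """True if point p lies on the segment from a to b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    cross = (bx - ax) * (py - ay) - (by - ay) * (px - ax)
    if cross != 0:
        return False
    return min(ax, bx) <= px <= max(ax, bx) and min(ay, by) <= py <= max(ay, by)

def on_or_within(p, loop):
    """Ray-casting test against a closed loop of (x, y) vertices;
    points on the boundary count as inside."""
    px, py = p
    inside = False
    for i in range(len(loop)):
        a, b = loop[i], loop[(i + 1) % len(loop)]
        if on_segment(p, a, b):
            return True
        (ax, ay), (bx, by) = a, b
        if (ay > py) != (by > py):
            # x coordinate where this edge crosses the horizontal ray from p
            x_cross = ax + (py - ay) * (bx - ax) / (by - ay)
            if px < x_cross:
                inside = not inside
    return inside

def quantify(component_pixels, loop):
    """component_pixels: {(x, y): intensity}.
    Sum the intensities of pixels on or within the loop."""
    return sum(v for p, v in component_pixels.items() if on_or_within(p, loop))

loop = [(0, 0), (4, 0), (4, 4), (0, 4)]
pixels = {(2, 2): 10, (4, 1): 5, (6, 6): 7}  # inside, on boundary, outside
amount = quantify(pixels, loop)
```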
  • the above methods can be reversed to start with each set of one or more contiguous pixels that show the presence of the component above a threshold. Then, a window surrounding the set of pixels is taken and checked for the presence of a structure known to occur in that tissue type. If none is found, the window is enlarged and the process is repeated until a structure is found. Then the boundary of the structure can be identified and a determination is made whether it includes the set of pixels showing the component.
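  • The reversed procedure can be sketched as a window that grows around the component pixels until a structure is recognized. The recognizer is again a toy stand-in (a structure is "found" once its bounding box lies wholly inside the window); the names, step sizes, and coordinates are hypothetical.

```python
def grow_until_found(seed, find_structure, image_size, start=2, step=2):
    """seed: (x, y) of a component pixel set.
    find_structure(window) returns a structure record or None.
    Enlarge a square window around the seed until a structure is found
    or the window covers the whole image."""
    height, width = image_size
    half = start
    while True:
        window = (max(0, seed[1] - half), max(0, seed[0] - half),
                  min(height, seed[1] + half + 1), min(width, seed[0] + half + 1))
        found = find_structure(window)
        if found is not None:
            return window, found
        if window == (0, 0, height, width):
            return window, None
        half += step

# Toy recognizer (assumption): one structure with a known bounding box.
STRUCTURE = {"name": "example structure", "box": (5, 5, 12, 12)}

def find_structure(window):
    t, l, b, r = window
    bt, bl, bb, br = STRUCTURE["box"]
    return STRUCTURE if t <= bt and l <= bl and b >= bb and r >= br else None

def contains(box, pixel):
    """True if pixel (x, y) falls inside box (top, left, bottom, right)."""
    t, l, b, r = box
    return t <= pixel[1] < b and l <= pixel[0] < r

seed = (8, 8)
window, found = grow_until_found(seed, find_structure, image_size=(20, 20))
component_in_structure = found is not None and contains(found["box"], seed)
```

  • In practice the bounding-box containment test would be replaced by the boundary-loop check, so that the decision uses the actual outline of the structure.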

EP02751960A 2001-04-09 2002-04-09 Computerimplementierte methoden zur mustererkennung in organischem material Withdrawn EP1380005A1 (de)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US28267701P 2001-04-09 2001-04-09
US282677P 2001-04-09
PCT/US2002/011568 WO2002097714A1 (en) 2001-04-09 2002-04-09 Computer method for image pattern recognition in organic material

Publications (1)

Publication Number Publication Date
EP1380005A1 (de) 2004-01-14

Family

ID=23082627

Family Applications (1)

Application Number Title Priority Date Filing Date
EP02751960A Withdrawn EP1380005A1 (de) 2001-04-09 2002-04-09 Computerimplementierte methoden zur mustererkennung in organischem material

Country Status (3)

Country Link
EP (1) EP1380005A1 (de)
JP (1) JP2004535569A (de)
WO (1) WO2002097714A1 (de)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10430943B2 (en) 2016-10-07 2019-10-01 Sony Corporation Automated nuclei area/number estimation for IHC image analysis

Families Citing this family (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6754380B1 (en) * 2003-02-14 2004-06-22 The University Of Chicago Method of training massive training artificial neural networks (MTANN) for the detection of abnormalities in medical images
SE0400318D0 (sv) * 2004-02-12 2004-02-12 Carl Henrik Grunditz Inspektion av kartografiska bilder genom multilager, neuralhybrid klassificering
US8041090B2 (en) 2005-09-10 2011-10-18 Ge Healthcare Uk Limited Method of, and apparatus and computer software for, performing image processing
US8326037B1 (en) 2005-11-23 2012-12-04 Matrox Electronic Systems, Ltd. Methods and apparatus for locating an object in an image
JP5347272B2 (ja) 2008-01-18 2013-11-20 日本電気株式会社 スポット定量装置、スポット定量方法及びプログラム
JP2012524276A (ja) * 2009-04-14 2012-10-11 ザ ジェネラル ホスピタル コーポレーション 生体組織をマルチモーダルにイメージングするための方法及び装置
CN104160264A (zh) * 2012-03-07 2014-11-19 索尼公司 观测设备、观测程序和观测方法
US9518914B2 (en) * 2012-09-24 2016-12-13 Brigham And Women's Hospital, Inc. Portal and method for management of dialysis therapy
JP6324201B2 (ja) * 2013-06-20 2018-05-16 キヤノン株式会社 分光データ処理装置、及び分光データ処理方法
CN103439271B (zh) * 2013-08-29 2015-10-28 华南理工大学 一种猪肉成熟状况的可视化检测方法
WO2016105313A2 (en) * 2014-12-22 2016-06-30 Heksagon Muhendislik Ve Tasarim Anonim Sirketi A compost mixing system and method
CN110573883B (zh) * 2017-04-13 2023-05-30 美国西门子医学诊断股份有限公司 用于在样本表征期间确定标签计数的方法和装置
JP7137935B2 (ja) * 2018-02-27 2022-09-15 シスメックス株式会社 画像解析方法、画像解析装置、プログラム、学習済み深層学習アルゴリズムの製造方法および学習済み深層学習アルゴリズム
AU2019340215B2 (en) 2018-09-12 2022-08-18 Auckland Uniservices Limited Methods and systems for ocular imaging, diagnosis and prognosis
WO2021104410A1 (zh) * 2019-11-28 2021-06-03 北京小蝇科技有限责任公司 血涂片全视野智能分析方法血细胞分割模型及识别模型的构造方法
TWI790572B (zh) * 2021-03-19 2023-01-21 宏碁智醫股份有限公司 影像相關的檢測方法及檢測裝置
CN113838028A (zh) * 2021-09-24 2021-12-24 无锡祥生医疗科技股份有限公司 一种颈动脉超声自动多普勒方法、超声设备及存储介质

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4097845A (en) * 1976-11-01 1978-06-27 Rush-Presbyterian-St. Luke's Medical Center Method of and an apparatus for automatic classification of red blood cells
US4965725B1 (en) * 1988-04-08 1996-05-07 Neuromedical Systems Inc Neural network based automated cytological specimen classification system and method
US5299269A (en) * 1991-12-20 1994-03-29 Eastman Kodak Company Character segmentation using an associative memory for optical character recognition
US5881124A (en) * 1994-03-31 1999-03-09 Arch Development Corporation Automated method and system for the detection of lesions in medical computed tomographic scans
DE19616997A1 (de) * 1996-04-27 1997-10-30 Boehringer Mannheim Gmbh Verfahren zur automatisierten mikroskopunterstützten Untersuchung von Gewebeproben oder Körperflüssigkeitsproben

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
SHIH-JONG LEE J: "INTEGRATION OF NEURAL NETWORKS AND DECISION TREE CLASSIFIERS FOR AUTOMATED CYTOLOGY SCREENING", PROCEEDINGS OF THE INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS 1991, 8 July 1991 (1991-07-08), New York, pages 257 - 262, XP000238302 *

Also Published As

Publication number Publication date
JP2004535569A (ja) 2004-11-25
WO2002097714A1 (en) 2002-12-05

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE TR

AX Request for extension of the european patent

Extension state: AL LT LV MK RO SI

17P Request for examination filed

Effective date: 20031107

17Q First examination report despatched

Effective date: 20051220

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 20070731