CA2707015A1 - Novel methods for cellular image analysis and simulation - Google Patents
Novel methods for cellular image analysis and simulation Download PDFInfo
- Publication number
- CA2707015A1 CA2707015A1 CA2707015A CA2707015A CA2707015A1 CA 2707015 A1 CA2707015 A1 CA 2707015A1 CA 2707015 A CA2707015 A CA 2707015A CA 2707015 A CA2707015 A CA 2707015A CA 2707015 A1 CA2707015 A1 CA 2707015A1
- Authority
- CA
- Canada
- Prior art keywords
- scattering
- intensity
- image
- images
- band
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Links
- 238000000034 method Methods 0.000 title claims abstract description 65
- 230000001413 cellular effect Effects 0.000 title claims description 37
- 238000004088 simulation Methods 0.000 title abstract description 51
- 238000010191 image analysis Methods 0.000 title description 6
- 238000012152 algorithmic method Methods 0.000 claims abstract description 3
- 210000004027 cell Anatomy 0.000 claims description 92
- 210000003463 organelle Anatomy 0.000 claims description 63
- 210000003470 mitochondria Anatomy 0.000 claims description 32
- 238000004458 analytical method Methods 0.000 claims description 28
- 230000005855 radiation Effects 0.000 claims description 9
- 210000004962 mammalian cell Anatomy 0.000 claims description 8
- 238000004891 communication Methods 0.000 claims description 7
- 230000005540 biological transmission Effects 0.000 claims description 5
- 230000001427 coherent effect Effects 0.000 claims description 5
- 230000005670 electromagnetic radiation Effects 0.000 claims description 5
- 238000003672 processing method Methods 0.000 claims description 2
- 238000005286 illumination Methods 0.000 claims 2
- 238000001514 detection method Methods 0.000 abstract description 19
- 238000012805 post-processing Methods 0.000 abstract description 9
- 238000000205 computational method Methods 0.000 abstract description 3
- 239000003795 chemical substances by application Substances 0.000 description 109
- 238000012360 testing method Methods 0.000 description 57
- 238000009826 distribution Methods 0.000 description 46
- 230000006870 function Effects 0.000 description 25
- 238000004163 cytometry Methods 0.000 description 22
- 238000002441 X-ray diffraction Methods 0.000 description 21
- 238000004422 calculation algorithm Methods 0.000 description 19
- 230000008569 process Effects 0.000 description 19
- 238000000605 extraction Methods 0.000 description 17
- 230000011218 segmentation Effects 0.000 description 15
- 230000009467 reduction Effects 0.000 description 14
- 238000013459 approach Methods 0.000 description 12
- 238000000149 argon plasma sintering Methods 0.000 description 12
- 230000002438 mitochondrial effect Effects 0.000 description 12
- 210000003850 cellular structure Anatomy 0.000 description 11
- 229920002521 macromolecule Polymers 0.000 description 9
- 239000007787 solid Substances 0.000 description 9
- 239000000243 solution Substances 0.000 description 9
- 230000000694 effects Effects 0.000 description 8
- 239000000523 sample Substances 0.000 description 8
- 230000006399 behavior Effects 0.000 description 7
- 230000002441 reversible effect Effects 0.000 description 7
- 230000000007 visual effect Effects 0.000 description 7
- 210000004940 nucleus Anatomy 0.000 description 6
- 230000003044 adaptive effect Effects 0.000 description 5
- 238000004364 calculation method Methods 0.000 description 5
- 230000000875 corresponding effect Effects 0.000 description 5
- 230000007423 decrease Effects 0.000 description 5
- 238000002474 experimental method Methods 0.000 description 5
- 230000003993 interaction Effects 0.000 description 5
- 238000002372 labelling Methods 0.000 description 5
- 238000012545 processing Methods 0.000 description 5
- 230000008901 benefit Effects 0.000 description 4
- 238000012512 characterization method Methods 0.000 description 4
- 239000013078 crystal Substances 0.000 description 4
- 210000000805 cytoplasm Anatomy 0.000 description 4
- 238000007418 data mining Methods 0.000 description 4
- 238000011156 evaluation Methods 0.000 description 4
- 239000000284 extract Substances 0.000 description 4
- 210000005260 human cell Anatomy 0.000 description 4
- 238000003709 image segmentation Methods 0.000 description 4
- 238000003384 imaging method Methods 0.000 description 4
- 230000003287 optical effect Effects 0.000 description 4
- 230000002829 reductive effect Effects 0.000 description 4
- 238000007619 statistical method Methods 0.000 description 4
- 206010028980 Neoplasm Diseases 0.000 description 3
- 241000950638 Symphysodon discus Species 0.000 description 3
- 238000000333 X-ray scattering Methods 0.000 description 3
- 230000009471 action Effects 0.000 description 3
- 201000011510 cancer Diseases 0.000 description 3
- 238000002790 cross-validation Methods 0.000 description 3
- 238000011161 development Methods 0.000 description 3
- 230000018109 developmental process Effects 0.000 description 3
- 201000010099 disease Diseases 0.000 description 3
- 208000037265 diseases, disorders, signs and symptoms Diseases 0.000 description 3
- 238000013101 initial test Methods 0.000 description 3
- 230000003834 intracellular effect Effects 0.000 description 3
- HOQADATXFBOEGG-UHFFFAOYSA-N isofenphos Chemical compound CCOP(=S)(NC(C)C)OC1=CC=CC=C1C(=O)OC(C)C HOQADATXFBOEGG-UHFFFAOYSA-N 0.000 description 3
- 238000010801 machine learning Methods 0.000 description 3
- 238000005457 optimization Methods 0.000 description 3
- 239000002245 particle Substances 0.000 description 3
- 238000002360 preparation method Methods 0.000 description 3
- 238000004451 qualitative analysis Methods 0.000 description 3
- 238000012113 quantitative test Methods 0.000 description 3
- 238000013517 stratification Methods 0.000 description 3
- 238000012549 training Methods 0.000 description 3
- 235000017284 Pometia pinnata Nutrition 0.000 description 2
- 240000007653 Pometia tomentosa Species 0.000 description 2
- 230000004913 activation Effects 0.000 description 2
- 230000002411 adverse Effects 0.000 description 2
- 210000002421 cell wall Anatomy 0.000 description 2
- 230000002596 correlated effect Effects 0.000 description 2
- 238000003745 diagnosis Methods 0.000 description 2
- 238000012631 diagnostic technique Methods 0.000 description 2
- 238000006073 displacement reaction Methods 0.000 description 2
- 230000036541 health Effects 0.000 description 2
- 230000010354 integration Effects 0.000 description 2
- 230000002452 interceptive effect Effects 0.000 description 2
- 238000013178 mathematical model Methods 0.000 description 2
- 238000005259 measurement Methods 0.000 description 2
- 230000007246 mechanism Effects 0.000 description 2
- 230000001832 mitochondrialike Effects 0.000 description 2
- 238000012986 modification Methods 0.000 description 2
- 230000004048 modification Effects 0.000 description 2
- 239000002086 nanomaterial Substances 0.000 description 2
- 230000001537 neural effect Effects 0.000 description 2
- 230000008520 organization Effects 0.000 description 2
- 238000003909 pattern recognition Methods 0.000 description 2
- 230000001902 propagating effect Effects 0.000 description 2
- 108090000623 proteins and genes Proteins 0.000 description 2
- 102000004169 proteins and genes Human genes 0.000 description 2
- 238000005549 size reduction Methods 0.000 description 2
- 230000007704 transition Effects 0.000 description 2
- 238000010200 validation analysis Methods 0.000 description 2
- 239000013598 vector Substances 0.000 description 2
- 208000031229 Cardiomyopathies Diseases 0.000 description 1
- 241000288113 Gallirallus australis Species 0.000 description 1
- 108700020796 Oncogene Proteins 0.000 description 1
- 230000002159 abnormal effect Effects 0.000 description 1
- 238000010521 absorption reaction Methods 0.000 description 1
- 239000013543 active substance Substances 0.000 description 1
- 238000003491 array Methods 0.000 description 1
- 238000013473 artificial intelligence Methods 0.000 description 1
- 239000012472 biological sample Substances 0.000 description 1
- 239000008364 bulk solution Substances 0.000 description 1
- 230000015556 catabolic process Effects 0.000 description 1
- 210000003855 cell nucleus Anatomy 0.000 description 1
- 230000008859 change Effects 0.000 description 1
- 210000000349 chromosome Anatomy 0.000 description 1
- 239000003086 colorant Substances 0.000 description 1
- 238000005094 computer simulation Methods 0.000 description 1
- 238000000354 decomposition reaction Methods 0.000 description 1
- 230000003247 decreasing effect Effects 0.000 description 1
- 239000000428 dust Substances 0.000 description 1
- 238000003708 edge detection Methods 0.000 description 1
- 230000003628 erosive effect Effects 0.000 description 1
- 210000003743 erythrocyte Anatomy 0.000 description 1
- 206010016256 fatigue Diseases 0.000 description 1
- 239000012530 fluid Substances 0.000 description 1
- 238000012252 genetic analysis Methods 0.000 description 1
- 230000005571 horizontal transmission Effects 0.000 description 1
- 230000002401 inhibitory effect Effects 0.000 description 1
- 238000013383 initial experiment Methods 0.000 description 1
- 238000011835 investigation Methods 0.000 description 1
- 230000001788 irregular Effects 0.000 description 1
- 238000002955 isolation Methods 0.000 description 1
- 238000003064 k means clustering Methods 0.000 description 1
- 210000004698 lymphocyte Anatomy 0.000 description 1
- 230000036210 malignancy Effects 0.000 description 1
- 230000003211 malignant effect Effects 0.000 description 1
- 238000012067 mathematical method Methods 0.000 description 1
- 230000003278 mimic effect Effects 0.000 description 1
- 230000030544 mitochondrion distribution Effects 0.000 description 1
- 239000013610 patient sample Substances 0.000 description 1
- 230000008447 perception Effects 0.000 description 1
- 238000011056 performance test Methods 0.000 description 1
- 230000003334 potential effect Effects 0.000 description 1
- 230000000644 propagated effect Effects 0.000 description 1
- 238000004445 quantitative analysis Methods 0.000 description 1
- 238000011158 quantitative evaluation Methods 0.000 description 1
- 238000001338 self-assembly Methods 0.000 description 1
- 238000000638 solvent extraction Methods 0.000 description 1
- 238000001228 spectrum Methods 0.000 description 1
- 238000000528 statistical test Methods 0.000 description 1
- 238000012795 verification Methods 0.000 description 1
- 230000005570 vertical transmission Effects 0.000 description 1
Classifications
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N15/00—Investigating characteristics of particles; Investigating permeability, pore-volume or surface-area of porous materials
- G01N15/02—Investigating particle size or size distribution
- G01N15/0205—Investigating particle size or size distribution by optical means
- G01N15/0211—Investigating a scatter or diffraction pattern
Landscapes
- Chemical & Material Sciences (AREA)
- Dispersion Chemistry (AREA)
- Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Analytical Chemistry (AREA)
- Biochemistry (AREA)
- General Health & Medical Sciences (AREA)
- General Physics & Mathematics (AREA)
- Immunology (AREA)
- Pathology (AREA)
- Investigating Or Analysing Biological Materials (AREA)
Abstract
The present invention provides for a rapid computational method (mtPatterns) to simulate the scattering patterns of biological cells, comprised of an algorithmic method that considers the scattering contributions of a three-dimensional array of scatterers and uses this information to formulate detailed large-angle two-dimensional scattering plots. This method can readily generate scatter pattern simulations that are consistent with those found in published experimental and theoretical results, can and facilitate the identification of experimental scattering images via pattern comparison. Also provided for is a multi-agent system to parameterize laser scattering images of the kind produced by a wide-angle 2D
cytometer comprising a three stage pipeline of. feature detection, feature clustering, and post-processing to create a parametric representation of an input scattering image.
cytometer comprising a three stage pipeline of. feature detection, feature clustering, and post-processing to create a parametric representation of an input scattering image.
Description
NOVEL METHODS FOR CELLULAR IMAGE ANALYSIS AND SIMULATION
FIELD OF TM INVENTION
The present invention pertains to the field of cellular analysis, in particular identification of the type, state, or presence of cellular organelles through analysis of transmitted, reflected or refracted light through cells.
BACKGROUND OF THE INVENTION
All of the publications, patents and patent applications cited within this application are herein incorporated by reference in their entirety to the same extent as if the disclosure of each individual publication, patent application or patent was specifically and individually indicated to be incorporated by reference in its entirety.
There is a great need for methods to extract and recognize patterns in cellular scattering images. Scattering patterns contain vital information about the scattering source, and their interpretation facilitates diagnostic techniques ranging from the analysis of protein and DNA structure from x-ray diffraction, to the assessment of cell health based on patterns of laser light scattered by cellular components. In perhaps the best known example, Watson and Crick used information from patterns seen in two-dimensional x-ray scatter plots to infer the double-helix nature of DNA. In assessing cellular structure, Sem'yanov et at. and Ghosh at al. recognized regular patterns in one-dimensional cell scattering plots, and were able to use a parameterization of these patterns to extract microstructural cell information (K.A. Sem'yanov, et at. Appl Opt 39:5884 (2000); N. Ghosh, et al.Appl Phys Lett 88:084101 (2006); Z. Ulanowski, et al. Appl Opt 37:4027 (1998)).
Scattering pattern analysis techniques are especially crucial in light of new medically-relevant optical analysis methods-specifically the development of the wide-angle cytometer. Wide-angle cytometry devices are rapid, cost effective systems able to capture two-dimensional scattering patterns from a single cell or particle suspended within a fluidic channel. In these devices, laser light is propagated through a cellular body, where it scatters and is collected by a digital imaging device, as described by Singh et al. (K.
Singh, et al. Cytometiy 69A:307 (2006)).
Building on traditional cytometry schemes-which typically only capture scattered light at a few fixed angles or an angular slice-these label-free (i.e. non-fluorescent) detection devices provide extensive information about the internal structure of cells and are highly relevant to medical diagnostic practices (K. Singh, et al. C),tometry 69A:307 (2006)). It is important to be able to rapidly ascertain small deviations in cell structure, as the structure of a cell can be an indicator for the progression of diseases (such as cancer) in patients (P,L. Gourley IEEE J Quantum Electronl 1:8 18 (2005)). However, the art requires a system to infer cell structure from two-dimensional scattering plots through extraction and parameterizing intensity patterns in cytometric scattering images.
Previous work has shown that when light scatters through the cellular body it generates a complex and information-rich pattern of overlapping intensity regions. These regions are created by interfering waves propagating through a variety of cellular structures with differing size and optical properties (P.L. Gourley IEEE J Quantum Electron 11:818 (2005)). Based on the current understanding of the scattering mechanisms present in biological cells (as indicated experimentally (K. Singh, et al. Cytometry 69A:307 (2006);
K.A. Sem'yanov, et al. Appl Opt 39:5884 (2000)) and through numerical simulation (C.
Liu, et al. JBiomed Opt 10:014007 (2005)), these two-dimensional scattering images are typically comprised of a set of scattering bands of varying intensity and width, with a number of additional high-frequency intensity regions. For examples of experimentally acquired scattering signatures, please see the recent work of Singh et al. (K.
Singh, et al.
Cytometry 69A:307 (2006)) and the recent work of Su et al. (X.-T. Su, et at Optics Express 15:17 (2007)).
Scattering intensity contributions in cells typically come from three sources:
large cell structures with diameter d greater than the incident wavelength h (geometric scattering, d > X, on the order of micrometers), cell structures slightly less than the wavelength of incident light (Mie scattering, 7/15 < d < ?,), and very small organelles (Rayleigh scattering, sizes on the order of nanometers, d < 7/15) (P.L. Gourley IEEE J
Quantum Electronl 1:818 (2005)), These lead to three general image cases.
In the first case (geometric scattering, and Mie scattering as d approaches X), the scattered light will form large regular intensity bands, which-in the case of wide-angle cytometers-appear as vertical stripes in captured wide-angle scattering images (K. Singh, et al. Cytometry 69A:307 (2006)). While bands may arc at low scattering angles, as shown by the images of Singh et al. (K. Singh, et al. Cytometry 69A:307 (2006)), they appear approximately linear over smaller solid angles-particularly in the side-scattering region (Fig. 1, left). These larger intensity bands are most prominent (e.g. highest intensity) in the forward and back scatter regions of a 180 X 180 degree scattering image, and are primarily due to the geometry of the cell wall and the larger organelles within the cell (C.
Liu, at al. JBiomed Opt 10:014007 (2005); K. Singh, et al. Cytometry 69A:307 (2006);
V.P. Maltsev, Rev Sci Instrum 71:243 (2000); P.L. Gourley IEEE JQuantum Electron11:818 (2005)) In the second case, combining the influence of both large and medium-sized microstructural elements (e.g. both geometric scatterers and larger Mie scatters), a scattering image may contain bands that vary greatly in intensity along their length.
Interference can lead to lighter or darker regions positioned within the intensity band structure.
For cellular scattering, the presence of smaller micro-and nano-scale cellular structures, like the mitochondria, which are primarily responsible for scattering at large angles (P,L.
Gourley IEEE J Quantum Electronl 1:818 (2005)); will lead to a set of small randomly-distributed intensity regions. The number, frequency, and size of the regions relates to the internal complexity of the cell. This is a result of the third case: Rayleigh scattering (and also Mie scattering where d approaches 1./ 15). Intensity contributions from spatially distributed organelles will constructively and destructively interfere to create a number of high-frequency intensity regions (similar to those in Fig. 6, middle).
The end result is a complex scattering pattern that is comprised of interfering contributions from high-frequency intensity components and a series of vertical intensity bands and which indicates the detailed internal morphology of the cellular body. The combination of image cases one+two, one three, or one+two+three will all lead to images similar to the one presented in Fig.6 right. This complex structure can be observed from wide-angle cytometers and numerical Finite Difference Time Domain (FDTD) simulations.
Computational methods have done little to take advantage of this rich image structure.
One of the major factors inhibiting the development of wide-angle diagnostic devices is the computational effort needed to analyze the scattered light signatures. To deduce cellular information from scattered laser light he inverse scattering problem for light through biological cells must be addressed. This inverse scattering problem involves recreating the geometric parameters of a cell based on the observed path of light propagating through its cytoplasm, This is a largely unsolved problem, and any direct mathematical methods are either computationally intractable and/or lead to non-unique solutions (V.P. Maltsev, Rev Sci Instrum 71:243 (2000)). While numerous attempts have been made to simulate the effects of scattering in cellular bodies, the art is in need of a method and system for quickly inferring the geometric structure, both internally and externally, of a cell based solely on its light scattering data still eludes researchers (V.P.
Maltsev, Rev Sci Instrum 71:243 (2000)).
Previous work has shown that the light scattered by biological cells can be used to infer some aspects of internal cellular structure (V.P. Maltsev, Rev Sci Instrum 71:243 (2000);
X.-T. Su, et al. Optics Express 15:17 (2007)). This is pertinent to wide-angle cytometry systems (e.g. Singh et al. (K. Singh, et al. Cytometry 69A:307 (2006))) where the rich structure of two-dimensional scattering signatures from single cells may be used to explore the micro- and nano-structural makeup of the scattering source (K.
Singh, et al.
Cytometry 69A:307 (2006); X.-T. Su, et al. Optics Express 15:17 (2007)).
Structural information on the intra-cellular components (such as the organization and number of mitochondria) has great clinical relevance, as it may be used to detect and characterize certain diseases, such as cancer and cardiomyopathy (P.f.. Gourley, IEEE
JQuantum Electron 11:818 (2005); b. C. Wallace, Science 283:5407 (1999); M. Brandon, at al.
Oncogene 25:34 (2006)). As such, there is a great need for models to describe the relationship between scattering patterns and cell structure; this is critical to the evaluation of experimental scattering patterns.
Rapid simulation of cellular scattering (specifically the scattering from small component organelles) is a pivotal part of this pursuit (X.-T. Su, at at. Optics Express 15:17 (2007)).
By quickly simulating large libraries of images that approximate the results of cellular and mitochondria) scattering, computer methods could be used to identify key correlations between scattering patterns and initial scatterer structure.
SU'MMAR'Y' OF THE INVENTION
In one aspect, the present invention provides for a method for the analysis of scattering profiles arising from the relfiection, transmission, refraction and diffraction of coherent electromagnetic radiation through a mammalian cell comprising the steps of: 1) directing electro-magnetic radiation at the cell 2) receiving the reflected, transmitted, refracted and diffracted radiation, and 3) applying computer processing methods to the received radiation intensity patterns.
In another aspect, the present invention provides for a rapid computational method (mtPatterns) to simulate the scattering patterns of biological cells, comprising an algorithmic method that considers the scattering contributions of a three-dimensional array of scatterers and uses this information to formulate detailed large-angle two-dimensional scattering plots. In a further aspect, the scatterers may be in any spatial arrangement. In a still further aspect, the small scatterers may share the properties (e.g.
isotropic radiation) of small cellular organelles. In a further aspect, the scatterers may mimic the properties of any other organelle or structure via optical point-spread functions or other numerical methods.
In another aspect, the present invention provides for the prediction of cell structure directly from an observed scattering image, comprising mitochondria spacing within a scattering distribution, through scattering pattern blob spacing, and number of .
mitochondria within a scattering distribution, through scattering pattern intensity.
In another aspect, the present invention provides for a multi-agent system (Cythe) to parameterize laser scattering images of the kind produced by a wide-angle 2D
cytometer comprising a three stage pipeline of: feature detection, feature clustering, and post-processing to create a parametric representation of an input scattering image, wherein the result of the system is a parameter set numerically representing the complex image features created by light scattering through a cellular body that facilitates a parametric solution to the inverse scattering problem of laser light through a single biological cell. In a further aspect, the resulting parameterization may be used to classify experimental scattering patterns. In a still further aspect, the resulting parameterization may be used to segment experimental scattering patterns. In a further aspect, the resulting parameterization may be used to generate classifiers and predictive algorithms to classify experimental scattering patterns. In a still further aspect, the resulting parameterization may be used to identify and/or classify frequency information and Fourier information in scattering patterns.
In another aspect, the present invention provides for a diagnostic sequence comprising a creation of a scattering pattern from a mammalian cell or subpopulation of mammalian and the parametric analysis of obtained scattering patterns using the Cythe system of the present invention. In a further aspect, the sequence may be iterative (e.g.
comprised of the repetition of the following steps: pattern generation, pattern analysis/parameterization, and pattern comparison). In a still further aspect, the system may take as input both simulated and experimental scattering images, as generated by the present invention or a physical light collection device (including, but not limited to, wide-angle 2D laser cytometers), In a further aspect, the system may use feature extraction and/or iterative simulation to separate out the scattering contributions of different cellular structures, allowing detailed determination of cell content (in terms of nuclear size and cell size, microstructures, and nanostructures), the integration of feature extraction/parameterization results into pattern simulation and comparison, and the quick decomposition of scattering images into medically relevant parameters. In a still further aspect, the system may be used to classify experimentally obtained scattering patterns, predict the effect of intracellular changes on scattering patterns, and generate large datasets which be compared to experimental results (by way of other aspects of the present invention, iterative methods such as evolutionary computing, or any other means available) to allow for both real-time and off-line medical diagnostic techniques. In a further aspect, the extracted information could be used alongside phase reconstruction to deduce the structure of a scattering structure or structures.
The accompanying description illustrates preferred embodiments of the present invention and serves to explain the principles of the present invention.
BRIEF DESCRIPTION OF THE FIGURE
FIGURE 1 shows an example of how changes to spatial distribution (p(r), bottom row) impacts blob spacing of two-dimensional scattering patterns (I (S), top row) in an X-ray scattering situation;
FIGURE 2 shows the geometric layout of the scattering simulation process;
FIGURE 3 shows the two angular point-spread functions M(O(u,vlr), (p(u,vlr)) for individual scatterers: pure isotropic radiation (left, characteristic of mitochondria much smaller than),), and a theoretical pattern of a single mitochondrial scatterer (right, characteristic of larger mitochondria in the Mie scattering regime, as described by the polar plot of Gourley et al. ((P.1.. Gourley, IEEE J Quantum Electron 11:818 (2005)));
FIGURE 4 shows a blob spacing comparison between the experimental cytometry data for two Raji cells (left and middle) and the mtPatterns algorithm (right) over a thirty degree solid angle centred on 90 FIGURE 5 shows a series of sample images generated using the mtPatterns algorithm.
Images are shown for both uniform scattering (row A) and scattering using a complex point-spread function (row B);
FIGURE 6 shows simplified example images containing features known to be present in experimental and numerically simulated scattering patterns: a series of vertical intensity bands, like those found in micro-structural scattering (left), and a number of randomly-placed high-frequency intensity regions, characteristic of nano-structural Rayleigh scattering (middle);
FIGURE 7 shows agent fixation which is determined by comparing the image intensity at an agent's position to the average intensity ( a) within its view radius (left);
FIGURE 8 shows an example of horizontal bridge removal (before and after removal -left and right respectively), following the agent fixation shown in Fig. 7;
FIGURE 9 shows the two parts of a single propagate_idO cycle for an active agent (center pixel;
FIGURE 10 shows the schematic for the multilayer perceptron for parameter analysis;
FIGURE 11 shows Average of Band Intensity Nearest Neighbour Deviation (top), Average of Band Width Deviation (bottom), and Average of Band Width Maximums (bottom) with respect to Organelle Count for the full library of cells (L e. a set of thirty-eight simulations with varying nucleus size, position, and random organelle distribution;
image size reduced to 125x125 pixels);
FIGURE 12 shows the region identification for FDTD images, with (left) and without (right) organelles, with white depicting detected pixels;
FIGURE 13 shows a visual comparison of Cythe region detection (row B) with the initial test image (row A) for images with vertical intensity bands (left), high-frequency intensity regions (middle), and high-frequency regions overlayed onto a band structure (right, similar to those observed in FDTD simulations);
FIGURE 14 shows a visual example of horizontal bridge removal;
FIGURE 15 shows extraction of a band hierarchy for a simple noise-free image (top row) and for the same image with 10% of the images pixels assigned a random 8-bit intensity value (i.e. random noise; bottom row): the initial image (left), the agent population after the agent_fixationO routine (middle), and the final regions after post-processing (right);
FIGURE 16 shows scattering patterns with fixed cell size and increasing mitochondrial content (RS = 300, Rrn = 10 m, Ruxr = 20 m);
FIGURE 17 shows several sizes of randomly distributed mitochondrial shell with increasing radius (Rh, = R, =r, RS = 65, mitochondria = 50);
FIGURE 18 shows comparison of absolute one dimensional FFT power (with respect to special frequency) for scattering images of five different organelle scattering distributions;
FIGURE 19 shows a comparison of absolute one dimensional FFT power (with respect to spatial frequency) for scattering images with different organelle counts;
FIGURE 20 shows a comparison of FFT power for FDTD simulation of varying mitochondrial content, along both 0 (top) and 0 (bottom) axes.
DETAILED DESCRIPTION OF THE PRESENT INVENTION
RapidScattering Pattern Simulation A number of groups have recently worked on the assessment of scatter structure from light intensity measurements or predictions, and these studies typically fall into two categories:
forward methods and reverse methods. Forward methods rely on the prediction of a scattering pattern based on a knowledge of the scattering structure and model of light propagation (e.g. Finite Difference Time Domain-FDTD-simulations, (C. Liu, et al. .1 Biomed Opt 10:014007 (2005)), while reverse methods attempt to deduce some aspect of scatterer geometry from a pattern of scattered light (again using a model or algorithm to relate pattern to structure).
A true reverse method (i.e. a direct solution to the inverse scattering problem) would give an exact and detailed geometric structure of a cellular scatterer from a pattern of scattered light. Such a direct reverse solution has been shown to be computationally intractable for the problem of biological cells (V.P. Maltsev, Rev Sci Instrutn 71:243 (2000)). However, some advances in the reverse domain have been able extract one or two physical aspects of the scatterer. Of note, an `indicatrix' has been used on one-dimensional scattering slices and collections of angular slices to determine cell size and haemoglobin content (V.P.
Maltsev, Rev Sci Instrum 71:243 (2000); K.A. Sem'yanov, et al. Appl Opt 39:5884 (2000)). Scattering has also been used to determine red blood cell size and refractive index via a Fourier transform (N. Ghosh, et al. Appl Phys Lett 88:084101 (2006)).
Other groups have worked on predicting the properties of bulk solutions of multiple scattering bodies using Mie/Rayleigh-Gans theory fitting and Light Scattering Spectroscopy (LSS) (V. L.
Kim, et al. IEEE J Quant Electron 9:2 (2003); H. Fang, et at. IEEE J Quant Electron 9:2 (2003)) or elastic/angularly resolved light scattering (A. Katz, et al. IEEE J
Quant Electron 9:2 (2003)).
Due to the complexity of the forward scattering problem, scattering from individual organelles like medium-sized mitochondria, for example ellipsoids or spheroids in the size range of 400nm to 800ntn in diameter and 800nm to 3000nm in length (J. S.
Modica-Napolitano and K. Singh, Expert Rev Mol Med 4:9 (2004)), is typically described in the literature by Mie and Rayleigh-Gans (R-G) theory (A. Katz, et al. IEEE J Quant Electron 9:2 (2003); C. F. Bohren and D. R. Huffman, Absorption and scattering of light by small particles (Wiley, New York, 1998)). These simulation methods give an analytical basis for the intensity patterns observed from larger Mie and geometric scattering structures (e.g.
intensity banding (K. Singh, et al. Cytometty 69A:307 (2006)), but are generally constrained by the restrictive scatterer geometries (e.g. for Mie theory, treating scattering bodies as spheres).
More recently, FDTD code has been used to predict the two-dimensional scattering from cells; while extremely detailed and true to experimental wide-angle cytometry results (X.-T. Su, et al. Optics Express 15:17 (2007)), this discretized solution to Maxwell's equation is time consuming and computationally intensive for any realistic simulation step size (X.-T. Su, at al. Optics Express 15:17 (2007)). As such, simulations must be run on large super-computer arrays and may take hours or days to generate a simulated pattern. To summarize the passage above, forward methods for cellular simulation require either highly regular and simplified simulation parameters (such as the spheres in Mie theory) or long and detailed computations.
Though there are many forward and reverse simulation methods, as discussed above, to date FDTD appears to be the most effective method capable of generating extremely realistic wide-angle scattering patterns from a three-dimensional cell model (X.-T. Su, at al. Optics Express 15:17 (2007)). However, this leads to a computational bottleneck when generating large numbers of simulations, and does not provide a computationally tractable solution to the reverse problem of determining scatterer geometry from an experimental scattering pattern. If researchers hope to characterize and rapidly classify the effect of nano-structural cell components on scattering, another simulation method must be developed.As described above, the analysis of interesting nanostructural contributions to scattering in complete cells has so far proved quite difficult, However, it is known that organelles such as the mitochondria, which have an important role in a number of diseases, are the dominant cause of large-angle scattering (P.L. Gourley, IEEE
.J Quantum Electron 11:818 (2005)) (i.e. light scattered perpendicular to the path of the incident light, commonly called the side-scattering region / large-angle scattering domain).
As shown by the present invention, it is possible to rapidly simulate and analyze the important aspects of large-angle cellular scattering by only considering the contributions of mitochondrial scatterers.
It has been shown that approximately 90% of side-scattered light from cells is due to the presence of mitochondria (P.L. Gourley, IEEE J Quantum Electron 11:819 (2005)). Small structures, such as the mitochondria, readily scatter at large angles and are in fact the dominant cause of intensity artifacts in this region. This is due in part to the mitochondria's complex internal structure and numerous index-of-refraction changes (P.L.
Gourley, IEEE JQuantum Electron 11:818 (2005)). As seen from recent experimental and simulation work, side-scattered light from the mitochondria in human cells, for example Raji cell-lines (X.-T. Su, et at. Optics Express 15:17 (2007)), typically takes the form of small asymmetrical `blobs' in a two-dimensional scattering pattern, where blob size and spacing is thought to be related to the distribution of small scattering bodies within the cell (X.-T. Su, et al. Optics Express 15:17 (2007)). Conversely, larger cell structures-such as the nucleus and the cell wall-lead to a high amount of intensity in the forward and back scatter regions (i.e. angles approaching the path of the incident light (P.L.
Gourley, IEEE J
Quantum Electron 11:818 (2005)) and broad intensity banding in scattering images (X.-T.
Su, et al. Optics Express 15:17 (2007)). This is easily understood by examining the scattering regimes present: smaller cellular organelles will scatter isotropically or pseudo-isotropically via the Rayleigh and Mie regimes (size ¾ . ), while larger bodies will scatter along the light path according to geometric transmission (size > > ? ) (E.
Hecht, Optics, 4th ed (Addison Wesley, San Francisco, 2002)). For more information, a detailed description of light scattering in human cells is presented by Gourley et al, (P.L.
Gourley, IEEE ,I
Quantum Electron 11:818 (2005)).
Given the dominance of mitochondrial scattering in the large-angle domain, as verified by the recent experiments of Su et al. (X.-T. Su, et at. Optics Express 15:17 (2007), it is possible to ignore or filter out the intensity contributions of larger cell structures. This reduces the problem to the collective scattering from a series of points. As a result, the large-angle light scattering from cellular mitochondria begins to closely resemble the classical x-ray diffraction problem. This allows a dramatic simplification of the scattering prediction problem, as a number of useful parallels can be drawn from x-ray diffraction theory (XRD), greatly reducing the computational burden and providing a means (as for XRD) of partially solving the inverse problem.
To simplify the discussion, from this point on we will use and refer to the standard XR1) terminology and notation presented by Kasai and Kakudo (N. Kasai and M.
Kakudo, X-Ray Daction by Macromolecules (Springer, New York, 2005)); relevant terms will be redefined as needed.
A basis in x-ray diffraction and Fourier theory XRD is based on a number of assumptions regarding the experimental setup and the molecular or atomic structure being analyzed (N. Kasai and M. Kakudo, X-Ray Diffraction by Macromolecules (Springer, New York, 2005)). The four most prominent constraints and assumptions (listed below) hinge on the relationship between the incident wavelength (I) and the scattering aperture (i.e. the spacing between neighbouring scattering points or electrons), and the spatial relationship between the scatterer distribution and the receptive field (i.e. a planar recording device, such as x-ray film or a CCD camera).
They are:
Thompson scattering-the incident and scattered wavelengths are the same;
Scatterers radiate isotropically; the Fraunhofer approximation-energy arrives as a plane wave at the receptive field; and the kinematical theory of diffraction-secondary and tertiary scattering interactions (i.e. multi-scatter) should be negligible compared to primary scattering and wave interference (N. Kasai and M. Kakudo, X-Ray Diactlon by Macromolecules (Springer, New York, 2005)).
With this in mind, it can be shown that a simplified model of large-angle cellular light scattering fulfils the fimdamental constraints and assumptions for XRD. First, like XRD, the incident and emission radiation for a sample in a wide-angle cytometer also share a common dominant wavelength (i.e. elastic scattering). Secondly, in the limit of very small mitochondria (e.g. sub-wavelength nanostructres), side-scatter will be very close to isotropic, especially when observed via a small solid angle in the side-scatter region. In addition, for the geometric arrangement of a wide-angle cytometer the distance to the receptive plane (millimetre) is much greater than the distance between scatterers (organelles, micron scale), thus fulfilling the Fraunhofer approximation (i.e.
a2 /(X R) _ 0.0013 <<1, where R is the distance to the receptive plane and a is the scatterer spacing or scattering aperture aperture (E. Hecht, Optics, 4th ed. Addison Wesley, San Francisco, 2002)).
Finally, for a standard cytometry system, the impact of multi-scatter, especially in the case of a single analysed cell, is expected to be minimal. This result can be inferred from the large (micron) spacing between scatterers and the low scattering amplitude at large angles (P.L. Gourley, IEEE J Quantum Electron 11:818 (2005)); forward scatter greatly exceeds side scatter, leading to rapid multi-scatter intensity falloff.
Given these relationships between the physical cytometry setup and XRD, we can begin to examine the mitochondrial scattering problem from the perspective of Fourier scattering theory. In XRD Fourier theory, the wavelength, spatial distribution (or density) of a structure's scattering points (i.e. electrons) and their relationship to the recording location are the principal variables needed to calculate a structure's recorded scattering pattern (N.
Kasai and M. Kakudo, X-Ray Diffraction by Macromolecules (Springer, New York, 2005)). Through Fourier theory (e.g. Eq. 1, below) it is possible to relate scatterer density to the spacing and placement of intensity regions in the recorded scattering pattern a.
small scatterer spacing will result in a wide spacing between regions of high intensity in the recorded scattering pattern, while large spacing between scatterers will result in small tightly packed (i.e. high-frequency) intensity regions . This fact is commonly used to predict macro-molecule and protein structure from scattering and diffraction patterns with the help of phase reconstruction algorithms.
It is known from Fourier theory that a 3D spatial distribution of isotropic scatterers p(r) in the real domain can be related to scattering amplitude A(S) and intensity i(S) = A(S)A*(S) in the Fourier domain, where S is the set of vectors describing the geometry of the receptive field and r is the set of vectors describing the array of scattering points.
Therefore A(S) is the Fourier transform of the scattering distribution, and relates directly to the recorded intensity l(S) on the receptive field. From Kasai and Kakudo (their Eq.
FIELD OF TM INVENTION
The present invention pertains to the field of cellular analysis, in particular identification of the type, state, or presence of cellular organelles through analysis of transmitted, reflected or refracted light through cells.
BACKGROUND OF THE INVENTION
All of the publications, patents and patent applications cited within this application are herein incorporated by reference in their entirety to the same extent as if the disclosure of each individual publication, patent application or patent was specifically and individually indicated to be incorporated by reference in its entirety.
There is a great need for methods to extract and recognize patterns in cellular scattering images. Scattering patterns contain vital information about the scattering source, and their interpretation facilitates diagnostic techniques ranging from the analysis of protein and DNA structure from x-ray diffraction, to the assessment of cell health based on patterns of laser light scattered by cellular components. In perhaps the best known example, Watson and Crick used information from patterns seen in two-dimensional x-ray scatter plots to infer the double-helix nature of DNA. In assessing cellular structure, Sem'yanov et at. and Ghosh at al. recognized regular patterns in one-dimensional cell scattering plots, and were able to use a parameterization of these patterns to extract microstructural cell information (K.A. Sem'yanov, et at. Appl Opt 39:5884 (2000); N. Ghosh, et al.Appl Phys Lett 88:084101 (2006); Z. Ulanowski, et al. Appl Opt 37:4027 (1998)).
Scattering pattern analysis techniques are especially crucial in light of new medically-relevant optical analysis methods-specifically the development of the wide-angle cytometer. Wide-angle cytometry devices are rapid, cost effective systems able to capture two-dimensional scattering patterns from a single cell or particle suspended within a fluidic channel. In these devices, laser light is propagated through a cellular body, where it scatters and is collected by a digital imaging device, as described by Singh et al. (K.
Singh, et al. Cytometiy 69A:307 (2006)).
Building on traditional cytometry schemes-which typically only capture scattered light at a few fixed angles or an angular slice-these label-free (i.e. non-fluorescent) detection devices provide extensive information about the internal structure of cells and are highly relevant to medical diagnostic practices (K. Singh, et al. C),tometry 69A:307 (2006)). It is important to be able to rapidly ascertain small deviations in cell structure, as the structure of a cell can be an indicator for the progression of diseases (such as cancer) in patients (P,L. Gourley IEEE J Quantum Electronl 1:8 18 (2005)). However, the art requires a system to infer cell structure from two-dimensional scattering plots through extraction and parameterizing intensity patterns in cytometric scattering images.
Previous work has shown that when light scatters through the cellular body it generates a complex and information-rich pattern of overlapping intensity regions. These regions are created by interfering waves propagating through a variety of cellular structures with differing size and optical properties (P.L. Gourley IEEE J Quantum Electron 11:818 (2005)). Based on the current understanding of the scattering mechanisms present in biological cells (as indicated experimentally (K. Singh, et al. Cytometry 69A:307 (2006);
K.A. Sem'yanov, et al. Appl Opt 39:5884 (2000)) and through numerical simulation (C.
Liu, et al. JBiomed Opt 10:014007 (2005)), these two-dimensional scattering images are typically comprised of a set of scattering bands of varying intensity and width, with a number of additional high-frequency intensity regions. For examples of experimentally acquired scattering signatures, please see the recent work of Singh et al. (K.
Singh, et al.
Cytometry 69A:307 (2006)) and the recent work of Su et al. (X.-T. Su, et at Optics Express 15:17 (2007)).
Scattering intensity contributions in cells typically come from three sources:
large cell structures with diameter d greater than the incident wavelength h (geometric scattering, d > X, on the order of micrometers), cell structures slightly less than the wavelength of incident light (Mie scattering, 7/15 < d < ?,), and very small organelles (Rayleigh scattering, sizes on the order of nanometers, d < 7/15) (P.L. Gourley IEEE J
Quantum Electronl 1:818 (2005)), These lead to three general image cases.
In the first case (geometric scattering, and Mie scattering as d approaches X), the scattered light will form large regular intensity bands, which-in the case of wide-angle cytometers-appear as vertical stripes in captured wide-angle scattering images (K. Singh, et al. Cytometry 69A:307 (2006)). While bands may arc at low scattering angles, as shown by the images of Singh et al. (K. Singh, et al. Cytometry 69A:307 (2006)), they appear approximately linear over smaller solid angles-particularly in the side-scattering region (Fig. 1, left). These larger intensity bands are most prominent (e.g. highest intensity) in the forward and back scatter regions of a 180 X 180 degree scattering image, and are primarily due to the geometry of the cell wall and the larger organelles within the cell (C.
Liu, at al. JBiomed Opt 10:014007 (2005); K. Singh, et al. Cytometry 69A:307 (2006);
V.P. Maltsev, Rev Sci Instrum 71:243 (2000); P.L. Gourley IEEE JQuantum Electron11:818 (2005)) In the second case, combining the influence of both large and medium-sized microstructural elements (e.g. both geometric scatterers and larger Mie scatters), a scattering image may contain bands that vary greatly in intensity along their length.
Interference can lead to lighter or darker regions positioned within the intensity band structure.
For cellular scattering, the presence of smaller micro-and nano-scale cellular structures, like the mitochondria, which are primarily responsible for scattering at large angles (P,L.
Gourley IEEE J Quantum Electronl 1:818 (2005)); will lead to a set of small randomly-distributed intensity regions. The number, frequency, and size of the regions relates to the internal complexity of the cell. This is a result of the third case: Rayleigh scattering (and also Mie scattering where d approaches 1./ 15). Intensity contributions from spatially distributed organelles will constructively and destructively interfere to create a number of high-frequency intensity regions (similar to those in Fig. 6, middle).
The end result is a complex scattering pattern that is comprised of interfering contributions from high-frequency intensity components and a series of vertical intensity bands and which indicates the detailed internal morphology of the cellular body. The combination of image cases one+two, one three, or one+two+three will all lead to images similar to the one presented in Fig.6 right. This complex structure can be observed from wide-angle cytometers and numerical Finite Difference Time Domain (FDTD) simulations.
Computational methods have done little to take advantage of this rich image structure.
One of the major factors inhibiting the development of wide-angle diagnostic devices is the computational effort needed to analyze the scattered light signatures. To deduce cellular information from scattered laser light he inverse scattering problem for light through biological cells must be addressed. This inverse scattering problem involves recreating the geometric parameters of a cell based on the observed path of light propagating through its cytoplasm, This is a largely unsolved problem, and any direct mathematical methods are either computationally intractable and/or lead to non-unique solutions (V.P. Maltsev, Rev Sci Instrum 71:243 (2000)). While numerous attempts have been made to simulate the effects of scattering in cellular bodies, the art is in need of a method and system for quickly inferring the geometric structure, both internally and externally, of a cell based solely on its light scattering data still eludes researchers (V.P.
Maltsev, Rev Sci Instrum 71:243 (2000)).
Previous work has shown that the light scattered by biological cells can be used to infer some aspects of internal cellular structure (V.P. Maltsev, Rev Sci Instrum 71:243 (2000);
X.-T. Su, et al. Optics Express 15:17 (2007)). This is pertinent to wide-angle cytometry systems (e.g. Singh et al. (K. Singh, et al. Cytometry 69A:307 (2006))) where the rich structure of two-dimensional scattering signatures from single cells may be used to explore the micro- and nano-structural makeup of the scattering source (K.
Singh, et al.
Cytometry 69A:307 (2006); X.-T. Su, et al. Optics Express 15:17 (2007)).
Structural information on the intra-cellular components (such as the organization and number of mitochondria) has great clinical relevance, as it may be used to detect and characterize certain diseases, such as cancer and cardiomyopathy (P.f.. Gourley, IEEE
JQuantum Electron 11:818 (2005); b. C. Wallace, Science 283:5407 (1999); M. Brandon, at al.
Oncogene 25:34 (2006)). As such, there is a great need for models to describe the relationship between scattering patterns and cell structure; this is critical to the evaluation of experimental scattering patterns.
Rapid simulation of cellular scattering (specifically the scattering from small component organelles) is a pivotal part of this pursuit (X.-T. Su, at at. Optics Express 15:17 (2007)).
By quickly simulating large libraries of images that approximate the results of cellular and mitochondria) scattering, computer methods could be used to identify key correlations between scattering patterns and initial scatterer structure.
SU'MMAR'Y' OF THE INVENTION
In one aspect, the present invention provides for a method for the analysis of scattering profiles arising from the relfiection, transmission, refraction and diffraction of coherent electromagnetic radiation through a mammalian cell comprising the steps of: 1) directing electro-magnetic radiation at the cell 2) receiving the reflected, transmitted, refracted and diffracted radiation, and 3) applying computer processing methods to the received radiation intensity patterns.
In another aspect, the present invention provides for a rapid computational method (mtPatterns) to simulate the scattering patterns of biological cells, comprising an algorithmic method that considers the scattering contributions of a three-dimensional array of scatterers and uses this information to formulate detailed large-angle two-dimensional scattering plots. In a further aspect, the scatterers may be in any spatial arrangement. In a still further aspect, the small scatterers may share the properties (e.g.
isotropic radiation) of small cellular organelles. In a further aspect, the scatterers may mimic the properties of any other organelle or structure via optical point-spread functions or other numerical methods.
In another aspect, the present invention provides for the prediction of cell structure directly from an observed scattering image, comprising mitochondria spacing within a scattering distribution, through scattering pattern blob spacing, and number of .
mitochondria within a scattering distribution, through scattering pattern intensity.
In another aspect, the present invention provides for a multi-agent system (Cythe) to parameterize laser scattering images of the kind produced by a wide-angle 2D
cytometer comprising a three stage pipeline of: feature detection, feature clustering, and post-processing to create a parametric representation of an input scattering image, wherein the result of the system is a parameter set numerically representing the complex image features created by light scattering through a cellular body that facilitates a parametric solution to the inverse scattering problem of laser light through a single biological cell. In a further aspect, the resulting parameterization may be used to classify experimental scattering patterns. In a still further aspect, the resulting parameterization may be used to segment experimental scattering patterns. In a further aspect, the resulting parameterization may be used to generate classifiers and predictive algorithms to classify experimental scattering patterns. In a still further aspect, the resulting parameterization may be used to identify and/or classify frequency information and Fourier information in scattering patterns.
In another aspect, the present invention provides for a diagnostic sequence comprising a creation of a scattering pattern from a mammalian cell or subpopulation of mammalian and the parametric analysis of obtained scattering patterns using the Cythe system of the present invention. In a further aspect, the sequence may be iterative (e.g.
comprised of the repetition of the following steps: pattern generation, pattern analysis/parameterization, and pattern comparison). In a still further aspect, the system may take as input both simulated and experimental scattering images, as generated by the present invention or a physical light collection device (including, but not limited to, wide-angle 2D laser cytometers), In a further aspect, the system may use feature extraction and/or iterative simulation to separate out the scattering contributions of different cellular structures, allowing detailed determination of cell content (in terms of nuclear size and cell size, microstructures, and nanostructures), the integration of feature extraction/parameterization results into pattern simulation and comparison, and the quick decomposition of scattering images into medically relevant parameters. In a still further aspect, the system may be used to classify experimentally obtained scattering patterns, predict the effect of intracellular changes on scattering patterns, and generate large datasets which be compared to experimental results (by way of other aspects of the present invention, iterative methods such as evolutionary computing, or any other means available) to allow for both real-time and off-line medical diagnostic techniques. In a further aspect, the extracted information could be used alongside phase reconstruction to deduce the structure of a scattering structure or structures.
The accompanying description illustrates preferred embodiments of the present invention and serves to explain the principles of the present invention.
BRIEF DESCRIPTION OF THE FIGURE
FIGURE 1 shows an example of how changes to spatial distribution (p(r), bottom row) impacts blob spacing of two-dimensional scattering patterns (I (S), top row) in an X-ray scattering situation;
FIGURE 2 shows the geometric layout of the scattering simulation process;
FIGURE 3 shows the two angular point-spread functions M(O(u,vlr), (p(u,vlr)) for individual scatterers: pure isotropic radiation (left, characteristic of mitochondria much smaller than),), and a theoretical pattern of a single mitochondrial scatterer (right, characteristic of larger mitochondria in the Mie scattering regime, as described by the polar plot of Gourley et al. ((P.1.. Gourley, IEEE J Quantum Electron 11:818 (2005)));
FIGURE 4 shows a blob spacing comparison between the experimental cytometry data for two Raji cells (left and middle) and the mtPatterns algorithm (right) over a thirty degree solid angle centred on 90 FIGURE 5 shows a series of sample images generated using the mtPatterns algorithm.
Images are shown for both uniform scattering (row A) and scattering using a complex point-spread function (row B);
FIGURE 6 shows simplified example images containing features known to be present in experimental and numerically simulated scattering patterns: a series of vertical intensity bands, like those found in micro-structural scattering (left), and a number of randomly-placed high-frequency intensity regions, characteristic of nano-structural Rayleigh scattering (middle);
FIGURE 7 shows agent fixation which is determined by comparing the image intensity at an agent's position to the average intensity ( a) within its view radius (left);
FIGURE 8 shows an example of horizontal bridge removal (before and after removal -left and right respectively), following the agent fixation shown in Fig. 7;
FIGURE 9 shows the two parts of a single propagate_idO cycle for an active agent (center pixel;
FIGURE 10 shows the schematic for the multilayer perceptron for parameter analysis;
FIGURE 11 shows Average of Band Intensity Nearest Neighbour Deviation (top), Average of Band Width Deviation (bottom), and Average of Band Width Maximums (bottom) with respect to Organelle Count for the full library of cells (L e. a set of thirty-eight simulations with varying nucleus size, position, and random organelle distribution;
image size reduced to 125x125 pixels);
FIGURE 12 shows the region identification for FDTD images, with (left) and without (right) organelles, with white depicting detected pixels;
FIGURE 13 shows a visual comparison of Cythe region detection (row B) with the initial test image (row A) for images with vertical intensity bands (left), high-frequency intensity regions (middle), and high-frequency regions overlayed onto a band structure (right, similar to those observed in FDTD simulations);
FIGURE 14 shows a visual example of horizontal bridge removal;
FIGURE 15 shows extraction of a band hierarchy for a simple noise-free image (top row) and for the same image with 10% of the images pixels assigned a random 8-bit intensity value (i.e. random noise; bottom row): the initial image (left), the agent population after the agent_fixationO routine (middle), and the final regions after post-processing (right);
FIGURE 16 shows scattering patterns with fixed cell size and increasing mitochondrial content (RS = 300, Rrn = 10 m, Ruxr = 20 m);
FIGURE 17 shows several sizes of randomly distributed mitochondrial shell with increasing radius (Rh, = R, =r, RS = 65, mitochondria = 50);
FIGURE 18 shows comparison of absolute one dimensional FFT power (with respect to special frequency) for scattering images of five different organelle scattering distributions;
FIGURE 19 shows a comparison of absolute one dimensional FFT power (with respect to spatial frequency) for scattering images with different organelle counts;
FIGURE 20 shows a comparison of FFT power for FDTD simulation of varying mitochondrial content, along both 0 (top) and 0 (bottom) axes.
DETAILED DESCRIPTION OF THE PRESENT INVENTION
RapidScattering Pattern Simulation A number of groups have recently worked on the assessment of scatter structure from light intensity measurements or predictions, and these studies typically fall into two categories:
forward methods and reverse methods. Forward methods rely on the prediction of a scattering pattern based on a knowledge of the scattering structure and model of light propagation (e.g. Finite Difference Time Domain-FDTD-simulations, (C. Liu, et al. .1 Biomed Opt 10:014007 (2005)), while reverse methods attempt to deduce some aspect of scatterer geometry from a pattern of scattered light (again using a model or algorithm to relate pattern to structure).
A true reverse method (i.e. a direct solution to the inverse scattering problem) would give an exact and detailed geometric structure of a cellular scatterer from a pattern of scattered light. Such a direct reverse solution has been shown to be computationally intractable for the problem of biological cells (V.P. Maltsev, Rev Sci Instrutn 71:243 (2000)). However, some advances in the reverse domain have been able extract one or two physical aspects of the scatterer. Of note, an `indicatrix' has been used on one-dimensional scattering slices and collections of angular slices to determine cell size and haemoglobin content (V.P.
Maltsev, Rev Sci Instrum 71:243 (2000); K.A. Sem'yanov, et al. Appl Opt 39:5884 (2000)). Scattering has also been used to determine red blood cell size and refractive index via a Fourier transform (N. Ghosh, et al. Appl Phys Lett 88:084101 (2006)).
Other groups have worked on predicting the properties of bulk solutions of multiple scattering bodies using Mie/Rayleigh-Gans theory fitting and Light Scattering Spectroscopy (LSS) (V. L.
Kim, et al. IEEE J Quant Electron 9:2 (2003); H. Fang, et at. IEEE J Quant Electron 9:2 (2003)) or elastic/angularly resolved light scattering (A. Katz, et al. IEEE J
Quant Electron 9:2 (2003)).
Due to the complexity of the forward scattering problem, scattering from individual organelles like medium-sized mitochondria, for example ellipsoids or spheroids in the size range of 400nm to 800ntn in diameter and 800nm to 3000nm in length (J. S.
Modica-Napolitano and K. Singh, Expert Rev Mol Med 4:9 (2004)), is typically described in the literature by Mie and Rayleigh-Gans (R-G) theory (A. Katz, et al. IEEE J Quant Electron 9:2 (2003); C. F. Bohren and D. R. Huffman, Absorption and scattering of light by small particles (Wiley, New York, 1998)). These simulation methods give an analytical basis for the intensity patterns observed from larger Mie and geometric scattering structures (e.g.
intensity banding (K. Singh, et al. Cytometty 69A:307 (2006)), but are generally constrained by the restrictive scatterer geometries (e.g. for Mie theory, treating scattering bodies as spheres).
More recently, FDTD code has been used to predict the two-dimensional scattering from cells; while extremely detailed and true to experimental wide-angle cytometry results (X.-T. Su, et al. Optics Express 15:17 (2007)), this discretized solution to Maxwell's equation is time consuming and computationally intensive for any realistic simulation step size (X.-T. Su, at al. Optics Express 15:17 (2007)). As such, simulations must be run on large super-computer arrays and may take hours or days to generate a simulated pattern. To summarize the passage above, forward methods for cellular simulation require either highly regular and simplified simulation parameters (such as the spheres in Mie theory) or long and detailed computations.
Though there are many forward and reverse simulation methods, as discussed above, to date FDTD appears to be the most effective method capable of generating extremely realistic wide-angle scattering patterns from a three-dimensional cell model (X.-T. Su, at al. Optics Express 15:17 (2007)). However, this leads to a computational bottleneck when generating large numbers of simulations, and does not provide a computationally tractable solution to the reverse problem of determining scatterer geometry from an experimental scattering pattern. If researchers hope to characterize and rapidly classify the effect of nano-structural cell components on scattering, another simulation method must be developed.As described above, the analysis of interesting nanostructural contributions to scattering in complete cells has so far proved quite difficult, However, it is known that organelles such as the mitochondria, which have an important role in a number of diseases, are the dominant cause of large-angle scattering (P.L. Gourley, IEEE
.J Quantum Electron 11:818 (2005)) (i.e. light scattered perpendicular to the path of the incident light, commonly called the side-scattering region / large-angle scattering domain).
As shown by the present invention, it is possible to rapidly simulate and analyze the important aspects of large-angle cellular scattering by only considering the contributions of mitochondrial scatterers.
It has been shown that approximately 90% of side-scattered light from cells is due to the presence of mitochondria (P.L. Gourley, IEEE J Quantum Electron 11:819 (2005)). Small structures, such as the mitochondria, readily scatter at large angles and are in fact the dominant cause of intensity artifacts in this region. This is due in part to the mitochondria's complex internal structure and numerous index-of-refraction changes (P.L.
Gourley, IEEE JQuantum Electron 11:818 (2005)). As seen from recent experimental and simulation work, side-scattered light from the mitochondria in human cells, for example Raji cell-lines (X.-T. Su, et at. Optics Express 15:17 (2007)), typically takes the form of small asymmetrical `blobs' in a two-dimensional scattering pattern, where blob size and spacing is thought to be related to the distribution of small scattering bodies within the cell (X.-T. Su, et al. Optics Express 15:17 (2007)). Conversely, larger cell structures-such as the nucleus and the cell wall-lead to a high amount of intensity in the forward and back scatter regions (i.e. angles approaching the path of the incident light (P.L.
Gourley, IEEE J
Quantum Electron 11:818 (2005)) and broad intensity banding in scattering images (X.-T.
Su, et al. Optics Express 15:17 (2007)). This is easily understood by examining the scattering regimes present: smaller cellular organelles will scatter isotropically or pseudo-isotropically via the Rayleigh and Mie regimes (size ¾ . ), while larger bodies will scatter along the light path according to geometric transmission (size > > ? ) (E.
Hecht, Optics, 4th ed (Addison Wesley, San Francisco, 2002)). For more information, a detailed description of light scattering in human cells is presented by Gourley et al, (P.L.
Gourley, IEEE ,I
Quantum Electron 11:818 (2005)).
Given the dominance of mitochondrial scattering in the large-angle domain, as verified by the recent experiments of Su et al. (X.-T. Su, et at. Optics Express 15:17 (2007), it is possible to ignore or filter out the intensity contributions of larger cell structures. This reduces the problem to the collective scattering from a series of points. As a result, the large-angle light scattering from cellular mitochondria begins to closely resemble the classical x-ray diffraction problem. This allows a dramatic simplification of the scattering prediction problem, as a number of useful parallels can be drawn from x-ray diffraction theory (XRD), greatly reducing the computational burden and providing a means (as for XRD) of partially solving the inverse problem.
To simplify the discussion, from this point on we will use and refer to the standard XR1) terminology and notation presented by Kasai and Kakudo (N. Kasai and M.
Kakudo, X-Ray Daction by Macromolecules (Springer, New York, 2005)); relevant terms will be redefined as needed.
A basis in x-ray diffraction and Fourier theory XRD is based on a number of assumptions regarding the experimental setup and the molecular or atomic structure being analyzed (N. Kasai and M. Kakudo, X-Ray Diffraction by Macromolecules (Springer, New York, 2005)). The four most prominent constraints and assumptions (listed below) hinge on the relationship between the incident wavelength (I) and the scattering aperture (i.e. the spacing between neighbouring scattering points or electrons), and the spatial relationship between the scatterer distribution and the receptive field (i.e. a planar recording device, such as x-ray film or a CCD camera).
They are:
Thompson scattering-the incident and scattered wavelengths are the same;
Scatterers radiate isotropically; the Fraunhofer approximation-energy arrives as a plane wave at the receptive field; and the kinematical theory of diffraction-secondary and tertiary scattering interactions (i.e. multi-scatter) should be negligible compared to primary scattering and wave interference (N. Kasai and M. Kakudo, X-Ray Diactlon by Macromolecules (Springer, New York, 2005)).
With this in mind, it can be shown that a simplified model of large-angle cellular light scattering fulfils the fimdamental constraints and assumptions for XRD. First, like XRD, the incident and emission radiation for a sample in a wide-angle cytometer also share a common dominant wavelength (i.e. elastic scattering). Secondly, in the limit of very small mitochondria (e.g. sub-wavelength nanostructres), side-scatter will be very close to isotropic, especially when observed via a small solid angle in the side-scatter region. In addition, for the geometric arrangement of a wide-angle cytometer the distance to the receptive plane (millimetre) is much greater than the distance between scatterers (organelles, micron scale), thus fulfilling the Fraunhofer approximation (i.e.
a2 /(X R) _ 0.0013 <<1, where R is the distance to the receptive plane and a is the scatterer spacing or scattering aperture aperture (E. Hecht, Optics, 4th ed. Addison Wesley, San Francisco, 2002)).
Finally, for a standard cytometry system, the impact of multi-scatter, especially in the case of a single analysed cell, is expected to be minimal. This result can be inferred from the large (micron) spacing between scatterers and the low scattering amplitude at large angles (P.L. Gourley, IEEE J Quantum Electron 11:818 (2005)); forward scatter greatly exceeds side scatter, leading to rapid multi-scatter intensity falloff.
Given these relationships between the physical cytometry setup and XRD, we can begin to examine the mitochondrial scattering problem from the perspective of Fourier scattering theory. In XRD Fourier theory, the wavelength, spatial distribution (or density) of a structure's scattering points (i.e. electrons) and their relationship to the recording location are the principal variables needed to calculate a structure's recorded scattering pattern (N.
Kasai and M. Kakudo, X-Ray Diffraction by Macromolecules (Springer, New York, 2005)). Through Fourier theory (e.g. Eq. 1, below) it is possible to relate scatterer density to the spacing and placement of intensity regions in the recorded scattering pattern a.
small scatterer spacing will result in a wide spacing between regions of high intensity in the recorded scattering pattern, while large spacing between scatterers will result in small tightly packed (i.e. high-frequency) intensity regions . This fact is commonly used to predict macro-molecule and protein structure from scattering and diffraction patterns with the help of phase reconstruction algorithms.
It is known from Fourier theory that a 3D spatial distribution of isotropic scatterers p(r) in the real domain can be related to scattering amplitude A(S) and intensity i(S) = A(S)A*(S) in the Fourier domain, where S is the set of vectors describing the geometry of the receptive field and r is the set of vectors describing the array of scattering points.
Therefore A(S) is the Fourier transform of the scattering distribution, and relates directly to the recorded intensity l(S) on the receptive field. From Kasai and Kakudo (their Eq.
2.13), the forward version of this Fourier relation is described as follows:
A(S) Jp(r)ep{2i(S.r)}dv, {EQ 1}
ry, the intensity contribution of all the scattering sources in a volume can be In summa mapped to a two dimensional receptive plane as related by the Fourier transform in Eq. 1.
This allows one to know something about the scatterer distribution p(r) based on the scattering amplitude A(S). More specifically, the spacing between blobs in the reciprocal (scattering) domain inversely corresponds to the spacing of real scatterer points in the scattering object in XRD, p(r) can be used to efficiently calculate A(S), and, although this is not the focus of the present invention, XRD methods exist to use A(S) to infer p(r).
Clearly then, an XRD-like approach to eytometry as contemplated by the present invention would have significant benefits.
As a simple visual example of this relationship, Fig. I shows how changes to spatial distribution (p (r), bottom row) impact blob spacing in the two-dimensional x-ray scattering pattern (A(S), top row) of a small 2 x 2 grid of atoms with horizontal spacings of lA and vertical spacings of IA, 0.5A, and 0.33A (left to right, respectively). As the y-axis spacing between atoms decreases by a factor of two and three (bottom middle, right), y-axis spacing in the reciprocal plot increases by a factor of two and three (top middle, right). Physical and Fourier dimensions are listed in A and 1/A on the plots.
Data was generated using DISCUS, the Fourier-transform based scattering simulator of Proffen and Neder (T. Proffen and R. B. Neder, JAppl Crystallogr 30 (1997)). This figure was generated using DISCUS, the widely-used X-ray scattering simulator of Proffen and Neder (T. Proffen and R. B. Neder, JAppl Crystallogr 30 (1997)). DISCUS uses continuous Fourier transforms to simulate the scattering from complex bounded (i.e. non-infinite and constrained) crystals and collections of atoms. In this example (Fig. 1), as the y-axis spacing between atoms decreases by a factor of two and three (from 1A
to 0.33A, bottom middle and right), y-axis spacing between blobs in the reciprocal plot (e.g. the scattering or Fourier domain, top middle and right) increases by a factor of two and three (from 1A-' to 3A-1 , top row). This holds with Eq. 1.
As another simple consistency check, one can see from standard diffraction theory that the characteristic spacing of a uniform diffracting lattice will directly impact the angular spacing of maxima in the resulting diffraction pattern. Using Bragg's Law (2d sin 0 m mX.
, with d as the lattice spacing and m as the order of principal maxima ), intensity maxima spacing varies as a function of scatterer point spacing. Using Bragg's Law (above) for a diffraction lattice with point spacings of 14m, 2.09 m (i.e. the characteristic organelle spacing expected from a 8 m human cell of nuclear size 4 m with 300 mitochondria uniformly distributed throughout the cytoplasm at a concentration of 0.21mt /
m3 , as per the cell dimensions of Su et al. X.-T. Su, et al. Optics Express 15:17 (2007)), and 3 m, one can generate a diffraction pattern that contains a characteristic spacing between maxima of X18 , X8.7 , and --6 respectively. Though real cell organelles and macromolecules are not organized in a rigid regular lattice (there are many theorized mitochondria) distributions, and the displacement caused by the nucleus will alter effective organelle spacing), it is easy to see that the range and behaviour of these values compares well with our expectations from Fourier theory (presented above).
With this theoretical background in mind, the present invention provides a method to 5 quickly and inexpensively simulate the large-angle scattering from a series of mitochondria-like scatterers.
The mtPatterns algorithm As shown by Eq. 1 and Fig. 1, the key component of a scattering pattern A(S) is actually derived from the spatial distribution of the scatterers p(r). This points to a rapid new 10 scattering simulation method based primarily on the spatial arrangement of mitochondria.
From Eq. 1 it is known that an arbitrary distribution of scattering points can be used to synthesize a final scattering pattern. Let us assume that p(r) is a distribution of small, isotropically radiating mitochondria. Given that each scatterer has position r and radiates isotropically, we see that the distribution p(r) is a Dirac delta function S(x - r)--the 15 distribution has unit intensity only at the exact position r of each mitochondria, and is zero everywhere else (i.e. a set of discrete points r e R3 ). This allows us to reduce Eq. 1 into the form of Eq. 2-the phase-shifted sum of each scatterer's influence on each point S of the receptive field A(S).
A(S) = Leap{-27ri(S = r)}
r {EQ 2) Eq. 2 may be easily implemented for the rapid computer simulation of scattering patterns.
One implementation is presented as the mtPattems algorithm (Tab. 1). As per Eq. 2, the mtPattems algorithm creates a three-dimensional array of isotropic scatterers r from a user-specified spatial distribution of scattering points. For this work it is assumed that the bounds of this distribution (r) are a spherical shell of inside radius R; and outside radius R0. However, any arbitrary volume may be specified. Next, it creates a two-dimensional receptive field A(u, v) of size U x V and positions it a specified distance d away from the isotropic scattering population centroid ra , along the z-axis (this simulates the CCD
camera). For realistic systems, d is typically than R0.
Table 2: The set of useful image parameters (P).
0 Create an array r of mt scattering points in .V= space, within radius bounds [Ri, R0]
1 Create a receptive field A as an array of size U x V, normal to the z-axis 2 Initialize all values in A to zero 3 Position receptive field A distance d above population center ro along z axis 4 ForallrEr:
For all u. V E A:
6 with wavelength A, calculate intensity iu v,lr, based on distance `r - (u, v) 7 if non-uniform scattering: lu.v1r = iu;vlr = M(euvir! Ou,vjr) 8 A(u, v) + = iu,vir 9 Return: A
Takes: (nit, R1, R0, U, V, d, M }, Returns: {A }
At this point, the algorithm determines the phase-shifted intensity contribution i(u,vir) of every scatter r E r at every point (u, v) E U , V of the receptive array A(u, v). This is done 5 by evaluating, for the given wavelength X , the phase and amplitude of the scattered signal after traveling a distance equivalent to the shortest path between the scatterer r and each point on the receptive field A(u, v). The individual scattering contributions at each point on the receptive field are then superimposed to generate the final scattering image.
Therefore, as an optional step, a point-spread function M(0(u,vir), cp(u,v}r)) may be applied to scale the isotropic radiation of each scatter by a known set of angular intensity values (which may be generated numerically or empirically). This allows for the simulation of non-uniform scattering, including the effect of varying shape and rotation of larger organelles. In one implementation of the present invention, M(O(u,vlr), (p(u,vlr)) was a look-up table of real values from 0.0 to 1.0, indexed by the angles 0(u,vlr) and (p(u,vJr) between the scatterer r and the point (u, v) on the receptive field; this was done to allow image files to be loaded as scattering functions. The present invention also allows for continuous point-spread functions to be used, as available (e.g. the Rayleigh-Gans approximation).
For the remainder of the description of the present invention, spherical coordinates will be used to describe the angle between a scattering object and the receptive plane. Specifically contemplated is 0 as the angle along the path of the incident light (i.e. with the axis normal to the incident light and parallel to the receptive field) and (p as the rotation around the SUBSTITUTE SHEET (RULE 26) path of incident light (e.g. 0 = 0 is pure forward scattering, 0 = 180 is pure back scattering, and 0 = 90 is pure side scattering along direction cp).
Scattering Pattern Parameterization and Classification Given the challenge of solving the inverse problem for scattering from a living cell, the literature to date has focused on the empirical classification of cells based on their scatter at a few specific angles or an angular slice through the center of the full two-dimensional scattering pattern (commonly called the "indicatrix"). It is evident from the rich structure of the scatter patterns (along both the 4, and a axis) that there is far more information present than is contained in simple angular slices.
As shown by recent efforts in the art, a more computationally tractable method is to effect a 'parametric solution' to the inverse scattering problem (V.P. Maltsev, Rev Soi Instrum 71:243 (2000); K.A. Sem'yanov, et al. Appl Opt 39:5884 (2000); N. Ghosh, et al. Appl Phys Lett 88:084101 (2006); Z. Ulanowski, et al. Appl Opt 37:4027 (1998)). In this two-step method (parameterization and pattern recognition), they parameterize some aspect of a scattering pattern and use a set of mathematical relations (K.A. Sem'yanov, et al. Appl Opt 39:5884 (2000); V.F. Maltsev, Rev Sci Instrum 71:243 (2000)), fast Fourier transforms (N. Ghosh, et al. Appl Phys Lett 88:084101 (2006)), or standard data mining techniques (Z. Ulanowski, et al. Appl Opt 37:4027 (1998)) to relate the extracted parameters to the initial structure of the scattering source. This process is rapid by comparison to forward methods, and may allow a degree of structural generalization that alleviates some of the problems caused by non-unique forward solutions.
Extracting viable parametric information from information-rich wide-angle scattering signatures presents a number of computational challenges. Because of complex cellular geometries, intensity bands may partially overlap in some places, the maximum intensity of each band may differ greatly from that of its neighbours, and the ambient background intensity is not consistent over the entire image. In addition, band boundaries are smooth gradients, not sharp intensity level transitions. These characteristics make it quite difficult to extract relevant features from an image and group them into meaningful regions.
While the art has addressed the individual components that make up this high-level segmentation problem (e.g. feature detection/extraction, connected component labeling, noise rejection, region clustering), the prior art has not developed a way to extract and analyze the full range of information present in two-dimensional cytometric scattering images. Analysis of information present in two dimensional scattering images involves partitioning two-dimensional scattering images into spatially distinct regions and S extracting high-level semantic information (i.e. image parameters) from the detected regions. Computer vision and image segmentation lie at the heart of most medical image analysis schemes. While there are many possible methods to segment wide-angle scattering images, after surveying the body of current segmentation literature the system of the present invention provides within the framework of a multi-agent image processing environment (described below) due to its demonstrated power, flexibility, and novelty.
Multi-agent segmentation systems have been thoroughly tested in a number of image processing situations, and demonstrate comparable or superior performance when compared to traditional methods. In addition, the distributed nature of multi-agent systems is a benefit for future hardware implementation. As such, they provide a solid basis for the development of a cytometric image processing pipeline.
A large body of recent image segmentation work relies on the use of multi-agent swarms, including particle swarm optimizations, evolutionary autonomous agents, and ant-colony optimizations. These multi-agent ('swarm') systems are composed of a population of autonomous or semi-autonomous 'agents' that collaborate (directly, indirectly, and/or competitively) to achieve a common goal. In this context, an agent is defined as a independent computational unit with a set of internal states and action rules;
an agent's future behaviour depends on its current condition, programmed rules, and the state of its environment (A.P. Engelbrecht, "Computational Intelligence: An Introduction", (John Wiley & Sons, 2002). (Multi-agent systems are finding widespread use in engineering and computer science applications, ranging from process optimization to computer vision, population modeling to industrial control; Engelbrecht provides a good introduction to this topic (A.P. Engelbrecht, "Computational Intelligence: An Introduction", (John Wiley &
Sons, 2002).
Prior art segmentation algorithms have one thing in common: they attempt to break a complex image into a set of smaller regions, where each region is homogeneous with respect to one or more parameters (e.g. contrast, intensity, texture) (N. Pal et al. Pattern Recognit 26:1277 (1993)). The effectiveness of each method varies depending on the size, texture, orientation, and shape of features contained in an image; no single approach will work well for every image (N. Pal et al. Pattern Recognit 26:1277 (1993)). In most cases, image sub-division is a two stage process-an image is segmented into smaller sections which are then clustered into groups based on some similarity metric (D.K.
Panjwani et al.
IEEE Trans Pattern Anal Mach Intell 17:939 (1995); S. Lee et al. IEEE Trans Image Process 14:312 (2005)); such as the split-and-merge or region-growing approach, recently used for tracking cells in diagnostic images (B. Prasad, at al Meas Sci Technol 16:909 (2005)). Unlike the prior art the system of the present invention does not involve agent reproduction or movement.
To parameterize scattering images detection of continuous intensity regions is needed, as well as characterization of the regions with respect to their spatial orientation within the image, their intensity profile, and their relationship to other parts of the image. This allows numerical representation of the low and high frequency intensity structures present in scattering images (as described above).
The system of the present invention provides the first computational system designed to comprehensively parameterize wide-angle scattering signatures. The system of the present invention is able to identify the overall structure and relationships present in a scattering image. The resulting parameterization scheme, built from the numerical characterization of intensity bands and independent intensity blobs, can be used in the identification of cellular morphology. This facilitates the rapid division of experimental samples into healthy and diseased categories for expedient medical diagnosis. The prior art has shown viable two-stage image segmentation systems: in stage one all salient pixels are labeled with one or more identifiers; in stage two all labeled pixels are clustered and grouped according to some similarity or congruency metric (A.K. lain at al. IEEE Trans Pattern Anal Mach Intell 18:195 (1996); S. Lee at al. IEEE Trans Image Process 14:312 (2005);
N. Pal et al. Pattern Recognit 26:1277 (1993); L.G. Shapiro and G.C. Stockman, Computer Vision (Prentice Hall, 2001); D.K. Panjwani et al. IEEE Trans Pattern Anal Mach Intell 17:939 (1995)). The present invention uses an additional stage to organize the clustered regions and extract a set of relevant parameters.
After performing an initial user-specified image size reduction, the first stage of the system of the present invention (Cythe) (feature detection) is responsible for creating and fixing a population of computational agents (A) to the salient features of the target image.
This stage effectively labels all the pixels corresponding to relevant intensity regions; an explanation of saliency determination will be presented in the following sub-section. The second stage of the present invention (feature clustering) is responsible for clustering the 5 fixed agent population (A) into a list of spatially distinct regional groups (G). The final stage of the system of the present invention (post processing) removes large-scale image noise, creates a band-like grouping structure from identified regions, and extracts a parametric representation (P) of the input data.
Detailed explanations of each stage are presented in the following subsections, which also 10 describe the parameterization equations and the four major algorithms used to implement the individual stages of the pipeline. These are: the agent_fixationO routine, which is responsible for fixing the agent population to the salient image features; the propagate_idO routine, responsible for clustering agents into connected groups; the scrub() routine, which removes image noise and erroneous groupings; and the joinO
15 routine, which joins groups into a band-like structure of super-groups.
DeJlnitions (Agent): An agent is a single computational unit that is assigned to a pixel or region of the image grid I. Each agent has a number of internal states and potential actions, and can alter these internal states and/or perform actions based on the information present in a 20 localized area of the image grid I.
(Agent Neighborhood): The agent neighborhood N is a n X n region of the image grid I
centered on the agent location x,,ye. This region determines where an agent will look for and communicate with other agents, see for example the "Neighbouring Region of an Agent" of Liu et al (J. Liu, et al. IEEE Trans Evol Comput 1:141 (1997)).
(Agent View Radius): The agent view radius R is a (2r+1) x (2r+1) region of the image grid I centered on the agent location xa,ya (Fig. 7, left). This area helps determine agent feature detection preferences, and the pixels within this area are used in the calculation of Average Pixel Intensity e= This is akin to the image region used in the "local binary pattern and contrast measure" of Ojala and Pietikainen (T. Ojala et al.
Pattern Recogn 32:477 (1999)) and the area used to acquire local stimulus by Liu et al. (J.M.
Liu et al.
IEEE Trans Pattern Anal Mach Intell 21 544 (1999); J. Liu, et al. IEEE Trans Evol Comput 1:141 (1997)) and Mirzayans et al (T. Mirzayans, et al. Neural Netw World 15:243 (2005)).
(Average Pixel Intensity): This value, denoted ,, is the average pixel intensity value that agent a observes within its view radius R. Average Pixel Intensity is equivalent to the "mean pixel value" component of Liu et al.'s texture definition, as used in their multi-agent feature detection routine (J. Liu, et al. IEEE Trans Evol Comput 1:141 (1997)).
Feature Detection The first stage of the Cythe parameterization pipeline takes the input image 1, scales it to user specified dimensions (u X v), renders the image as a two-dimensional array (X.L.
Wu, IEEE Trans Inf Theory 38:1755 (1992)), and creates an agent population A
equal in size to the number of pixels in the image grid. A single agent is assigned to every pixel in the image grid. These agents then use the information available in their local neighbourhood to detect features and sort themselves into regions; this is the standard approach used in most agent-based image processing systems.
During the feature detection stage of the pipeline, each agent calls on a fixation routine-agent_fixation() -to determine its immediate behaviour. When the fixation routine is called, the agent will perform one of two actions: an agent will affix to (and therefore identify as a salient region) a pixel at image grid location xa,y. if the pixel has an intensity value greater than the agent-computed average pixel intensity B., or, if this condition is not satisfied, the agent a will be removed from the agent population A. In this way, agents are able to detect salient intensity edges in the image I independent of differing background intensity values (i.e. a fixation routine based on an agent's relation to its "average pixel intensity" is an adaptive thresholding function, as described by Pal and Pal (N. Pal et at. Pattern Recognit 26:1277 (1993)).
After the entire agent population has been polled for fixation, only agents that reside on salient pixels will remain in the population (T. Mirzayans, et al. Neural Netw World 15:243 (2005)) (Fig. 7, right). To aid in effective region segmentation, the entire fixed agent population is scanned and all agents with more horizontal neighbours than vertical neighbours removed. As the intensity bands present in the scattering images are vertical (as discussed in the introductory section), this helps eliminate any horizontal intensity `bridges'; much like a horizontally-selective version of the "opening"
operator used in binary image analysis (L,.G. Shapiro et al. Computer Vision (Prentice Hall, 2001), the removal of these weakly-connected 'bridges' facilitates region discrimination (as shown by Wang and Yuan (Y. Wang et at. IEE Proc-Vis Image Signal Process 149:173 (2002)). As shown in Fig. 7, after the agent_fixationO routine, members of the agent population will be fixed on areas of high intensity relative to the local image texture (right - shown here for an agent view radius of r 1). This adaptive process detects edges independent of differing background levels. Pixel color indicates 8-bit intensity, from 0 (black) to 255 (white)An example of the removal process is shown in Fig. 8. In Fig. 8, solid circles indicate fixed agents and striped circles represent agents that will be removed, severing the connection between the two minimally-connected bands;numbers inside the pixels represent the ratio of horizontal to vertical neighbours within the 4-neighbourhood of a given agent (H:V).
Once the agent population has completely labeled all relevant pixels in the image grid, a clustering process-propagate _idO-takes over to form the population A into a set of spatially distinct regions G (i.e. it links all adjacent agents to create spatially-connected sub-regions). propagate_idO is a form of the classical connected components labeling algorithm (L.G. Shapiro et al. Computer Vision (Prentice Hall, 2001)), traditionally used to identify spatially-connected image features. Each time the propagate-idO
routine is called, a sweep is done over the entire agent population; each agent in the population polls all other agents in its local neighbourhood (N) for their current ID value.
Based on its initial scan, an agent records the highest ID value, idm0,, in its local neighbourhood. The agent then re-propagates the value id ax, to all neighbours with ID values less than idm.õ
and the receiving agents take on the new maximum ID value. The entire agent population is iterated through until no further ID changes are observed (T. Ojala of al.
Pattern Recogn 32:477 (1999)). At this point all agents in a separate physical region will share a unique ID number. A single iteration of the propagation process is shown graphically in Fig. 9.
As shown in Fig. 9, initially, the agent surveys its local neighbourhood and records the ID
values of its neighbours (left). Seeing there is a higher ID in the area (shown with striped circles), it takes on this ID value and re-broadcasts the new ID to its neighbourhood (right). This leads to an agent neighbourhood that is homogeneous with respect to the highest ID value.
It is important to note that ID propagation occurs in an agent's 4-neighbourhood (i.e. to agents left, right, above, and below the agent, but not on diagonal corners (L.G. Shapiro et at. Computer Vision (Prentice Hall, 2001)). This aids in band discrimination and removes additional band bridges. Due to the close horizontal proximity of bands in the scaled image I, it was found that communication within an agents full 8-neighbourhood could lead to a number of intensity regions being erroneously grouped into a single region.
Allowing diagonal communication between agents did not facilitate any useful connections beyond those gained through purely horizontal and vertical transmission.
Since evey agent starts with a unique 1D value, the clustering process guarantees that every connected image region will have a common unique identifier (L.G.
Shapiro et at, Computer Vision (Prentice Hall, 2001)); a set of agent groups (G) is formed, where each group (G) contains a list of spatially-connected agents (Ar) that share a common ID value (As is a non-overlapping subset of the initial population A).
As in previous work, a feature detection stage followed by a clustering stage is able to effect image segmentation. However, to utilize (and parameterize) the detected regions in the context of scattering image analysis, a third stage to organize and parse the segmentation results is required.
After the creation of homogeneous ID regions, several post-processing routines take over to remove high-level noise, join vertically correlated regions into a band hierarchy (i.e.
create super-groups out of related regions), and extract the final parametric representation of the input file. The first process--scrub searches through the list of agent groups G
and removes all groups (and their member agents) smaller than a given percentage of the image size from A and G respectively; the removal size threshold 0 can be empirically set by human users to match the input image conditions. This method of removing small connected objects was used by Prasad et at to eliminate background noise in their cellular tracking system (B. Prasad, et at Mews Sci Technol 16:909 (2005)).
Each group that survives the scrubO routine is then analyzed for its dimensions and center point (gx,gy). This effects a simple geometric characterization of all surviving groups in G.
Next, horizontally-related regions are connected into band-like structures using the joinO
routine, a simplified variant of the standard one-dimensional k-means clustering algorithm (R.O. Duda, at al. Pattern Classification (2nd Ed., Wiley Interscience, New York, 2001)).
As in the k-means algorithm, join() creates list of super-groups and assigns one or more image regions g to each super-group g' based on the horizontal distance d=Ixg.
- x5J
between the group center and the super-group center. Assignment occurs if d is less than a user defined threshold 0 (specified as a percentage of the image size), and each group may be assigned to only one super-group. A super-group's center is iteratively re-calculated based on the location of its member sub-groups. The join() process continues until every group has been assigned, clustering image regions with respect to their horizontal proximity. This allows the recognition of vertical bands in a scattering image while still retaining the detailed statistics of each individual sub-group. As such, join0 creates a region hierarchy out of the agent population which can be stored at minimal cost for later retrieval and parameter estimation.
In the last step of the Cythe pipeline, the super-group hierarchy is traversed and cross-referenced with the initial image I to extract a number of useful global parameters P
(shown in Table 2). These parameters describe the overall structure and inherent complexity (in terms of spatial frequency components) of the image 1, and are used to numerically represent the image features generated by light scattering through a biological cell (i.e. the number of regions, their size/shape, their relation to each other, and the variance of region width and intensity).
Table 2: The set of useful image parameters (P).
# Parameter Description I B The number of bands in the image 2 BSmin The minimum band spacing 3 BSmaz The maximum band spacing 4 BSavg The average band spacing 5 aBWms,, The average over all bands of minimum band width 6 aBW,H1 The average over all bands of maximum band width 7 aBWa,,g The average over all bands of average band width 8 aBWdt, The average over all bands of band width deviation 9 aBlm,o The average over all bands of minimum band intensity aBI The average over all bands of maximum band intensity 11 -aBla'e The average over all bands of average band intensity 12 aBldr,, The average over all bands of band intensity deviation 13 aBInõ The average over all bands of nearest-neighbour band intensity deviation While finding a direct correlation between scattering signatures and the initial model parameters of a FDTD simulation or the structure of a cell has been shown to be an unsolved problem (V.P. Maltsev, Rev Sci Instrum 71:243 (2000)), the parameters in P
5 allow us to infer structural information from the presence of intensity regions with varying spatial frequency. The knowledge that certain cellular structures will generate intensity regions of a given spatial frequency allows relationships to be made between the extracted image parameters P and the initial layout of the scattering source. From initial experiments, it is expected that there will be direct correlations between these parameters 10 and the underlying cell model parameters; it has been observed that this is true for relations between small organelle content and several a .$r/alWparameters.
In this case, each super-group g' extracted by the Cythe pipeline corresponds to a detected intensity band `b' in the scattering image. Based on previous work (K. Singh, et al Cytometry 69A:307 (2006)) and a series of FDTD simulation experiments (which demonstrate the presence of vertical intensity bands in scattering images) it is observed that it is most effective to use a band-based parameterization scheme. In this approach, the small high-frequency intensity areas resulting from smaller scattering centers are effectively described by variations to the width and intensity of existing intensity bands (i.e the presence and magnitude of high-frequency intensity fluctuation is indicated by changes to parameters 5- 13, Table 2). A similar parameter set could be created for images without an observable band-like structure.
These parameters are extracted from the final super-group hierarchy through a series of mathematical operations, shown in Egs.(4) and (5) below. Every detected super-group is analyzed with Eqs.(4) and (5), and the resulting values are combined into the set of parameters P. Width statistics are derived by iterating through the agent population, intensity statistics are derived by taking a single-pixel wide intensity sample down the vertical center line of each super-group, and band spacing statistics are generated by comparing the horizontal centers of all super-groups.
min' (y) and minx (y) are defined as the minimum and maximum angular values that still contain pixels belonging to band b at the vertical image position y. The function intensity (xb,y) is the 8-bit intensity value at the horizontal center point x of band b, at the vertical positiony. Set yb is the set of vertical values for band b. The functions wino, max0, and avg O are the standard minimum, maximum, and average operations performed on the list of values for a band. Band spacing (BS) is defined as the distance between the horizontal centers of two neighbouring bands; (xb - .b+11 Values for the maximum, minimum, and average band spacing are calculated using the standard operations.
fWa(Y) nt xx(y) - rninA (y) (EQ
4(a)) min(BWV(y), )' E Y1') (EQ
4(b)}
BWrb.,x = max(BWb(y), y e Yb) (EQ
4(e)) BW = avg(BWb(Y), y E Yb) R {EQ
4(d)) 0Wd'', Y, IBWb(y) ~-BW. $1 Y{EQ
4(e)}
BIb(y) = intensity (xb,y) {EQ
5(a)) min(BIb(y), y E yb) (EQ
5(b)) Br,. = max(glb(y), y E Yb) {EQ
5(c)}
B1.11 a = avg(B11(y), y E Yb) {EQ
5(d)) ' yri~ ~g(b(y) -131~,,8~ {EQ
5(e)) 5(f)) There is a dramatic increase in the amount of information available when the number of values in this extended parameter set is compared to the number of indicatrix parameters derived from one-dimensional scattering intensity slices. It is expected that this increase in parametric image information will lead to a corresponding increase in the predictive power of future classification systems. Intensity band relationships (such as band spacing BS, Table 2) can be used to predict the nature of larger cell structures (K.A.
Sem'yanov, at al. Appl Opt 39:5884 (2000)), while variations in region width and region intensity due to high-frequency image components (Parameters 5-13, Table 2) may be used to detect the presence and number of micro- and nano- scale cellular organelles (work in preparation).
The final step in any automated diagnostic system is a method to deduce cellular structure from the extracted scattering pattern parameters P. There are a number of potential machine learning approaches that can successfully associate a set of extracted parameters with an initial labeled data set to create a classifier with predictive power (R.O. Duda, et al. Pattern Classification (2nd Ed., Wiley Interscience, New York, 2001); M.
Witten ei al. Data Mining. Practical Machine Learning Tools and Techniques (Morgan Kaufmann, 2005)). A pattern recognition system based on the parametrization approach has been developed as described below.
Example 1: Simulation and Verification of Scattering Patterns Using mtPatterns, an extensive set of test images were created and compared (via qualitative observations and quantitative values such as characteristic blob spacing) to experimental images generated by a miniaturized cytometry device. These results and comparisons are presented below. In addition to a bank of test images with cell size and nuclear size values R. ~ 0.11tm-20.0 m and R; = 0.1 m-20.0 m (mt = 5-1000), additional tests were performed with the specific cell parameters for the Raji human cell line given by Su et al (Ro -=8.0 m, R; ~4.0}rm, mt --=83-677 (X.-T. Su, et al.
Optics Express 15:17 (2007))). This allowed exploration of the behaviour of mtPatterns over a wide range of realistic simulation situations and also compare simulations to actual experimental cytometry results for human Raji cells. Each test was performed at least three times with different random placement of organelles.
The following parameters were used in all mtPatterns simulations: an incident light wavelength of X = 632nm, a CCD receptive field area of U x V = 3mm x 3mm, with the scatter centroid centered d = 5mm below the CCD plane. These dimensions conform to those achievable in a physical experimental system, and can be seen to fulfil the XRD
constraints presented in the previous section. Shown in Fig. 2, this setup gives a viewable side-scatter region between 77.3 and 106.7 in both the 0 and cp axes (a solid angle of approx. 30 ). Two point-spread functions M(0(u,vIr), cp(u,vjr)) were used:
pure isotropic radiation and a theoretical power function for a single spherical scatterer (mitochondria-like, generated from the angular scattering profile presented by Gourley et al. (P.L.
Gourley, IEEE J Quantum Electron 11:818 (2005))). For these cases, scaling factor M(0(u,vlr), cp(u,vjr)) was specified as the image intensity (normalized to a unit scale) at position 0(u,vir) , gp(u,vlr). These point-spread functions are shown in Fig.
3 where, pure white represents an intensity scaling of 1.0, pure black indicates a scaling of 0.0 (i.e. no power transmission).
The following sections present a detailed qualitative and quantitative analysis of the mtPatterns algorithm, and show that a scattering simulation based only on the structural layout of small organelles can accurately produce valid experimental images without considering complex optical paths and the interaction of large cellular components such as the nucleus.
In XRD, analysis is typically done by calculating the Fourier transform of potential crystal structures and comparing them to experimental scattering patterns. This allows researchers to infer correct crystal structure (a famous example being the discovery of the helical nature of DNA). Later work is then sometimes done to try and reconstruct the phases of incident X-rays, potentially yielding the exact structure of the scatterer.
A similar process is possible for cellular scattering; the following results show the possibility for predictive assessment of cell structure (e.g. mitochondria number and distribution) based solely on the observed layout of experimental images.
Example 2: Comparison of mtPatterns and experimental cytometry images To evaluate the mtPatterns algorithm, mtPatterns results were compared to experimental wide-angle cytometry patterns-captured using the method described by Su er at.
(X.-T.
Su, el al. Optics Express 15:17 (2007))-from the laser scattering of human Raji cells.
This experimental data was scaled and cropped to the same side-scatter angular range as the mtPattems data (a solid angle of 74-106 ) and normalized to the same intensity and contrast levels. Estimated cell parameters from the experimental data (X.-T.
Su, et at.
Optics Express 15:17 (2007)) were used as input to mtPatterns to generate corresponding simulations. This process allowed for direct visual and numerical comparison.
The similarity between images may be visually judged by examining the shape and image structure of scattering images obtained from mtPatterns and from real experimental cytometers. Fig. 4, top, shows the experimental scattering signatures for two human Raji cells, taken using a miniaturized cytometer, with estimated cellular parameters of R4 --=8.01m, R, - 4.0 m, mt -83 - 677, as per the experiments of Su et al. (X.-T.
Su, et at.
Optics Express 15:17 (2007)). As shown in Fig. 4, orange and yellow marks on the bottom row indicate blob center points / spacing gaps used in blob spacing calculations.
The horizontal axis corresponds to changes in 0, vertical to changes in Y. The patterns show n in Fig. 4 are structurally similar to the pattern generated by mtPatterns (Fig. 4, top, right) when initialized with the average parameters of a Raji cell estimated by Su et al. (Ro = 8.0 m, Ri = 4.Q m, mt = 300).
It is extremely difficult to objectively compare scattering images in quantitative terms.
However, since the spacing of the scattering distribution relates to scattering pattern blob spacing (Eqs. 1,2), one effective way to numerically compare the similarity of scattering images is to evaluate the characteristic angular spacing between neighbouring scattering intensity blobs. This metric also can help guide the process of inferring experimental scattering structure from simulated images.
For the numerical comparisons presented here, spacing (in image space) between neighbouring maximum intensity regions (i.e. blob peaks) was measured as shown by the mesh in Fig. 4 and normalized to the angular range of each image to give a set of angular blob spacing values for each image. These spacing values were then used to compute characteristic spacing.
Using this metric, the experimental images of Fig. 4 were compared to the mtPatterns simulation generated using the corresponding parameters. This comparison can be seen qualitatively in Fig. 4, bottom, and quantitatively as follows. As described above, mtPattems was initialized with the estimated parameters of a Raji cell (Ro =
8.0 m, Ri =
4.0 m, mt = 300, as given by Su et al.). With these parameters, an average spacing between the maxima of intensity regions (i.e. blob centers) of 5.51 (Std.Dev.
1,51, 73 samples; Fig. 4, right) was observed. This compared well to the spacing values from the experimental cytometry images: 5.12 (Std.Dev. 1.47, 85 samples; Raji Cell A-Fig. 4, left) and 6.04 (Std.Dev. 1.46, 62 samples; Raji Cell B-Fig. 4, middle).
As such, the mtPatterns characteristic spacing value was observed to be within the range of values from the two experimental cells: spacing Raji Cell A < spacing mtPatterns <
spacing Raji Cell B). The slight variations in average characteristic spacing between the two experimental samples are likely due to changes in internal cell structure (such as organelle number and placement) and are well within one standard deviation.
As a back-of-the-envelope consistency check, it is possible to confirm that these observed characteristic spacing values relate to a reasonable range of organelle spacing values by examining what characteristic spacing values we would expect from a uniform crystal lattice with structural parameters similar to those of a Raji cell. As mentioned above, for the parameters of a typical Raji cell of 8 m radius with 83-677 organelles uniformly distributed between its 4 m nucleus and the cell wall, we would expect to find a characteristic organelle spacing in the range of 1.58 m (677 mt) to 3.19 m (83 mt). Using these values in Bragg's law from the methods section, this leads to an expected blob spacing angular range of 5.69 -11.51 The observed blob spacing results from our mtPattems simulation (5.51 ) and experimental images (5.12 , 6.04 ) fit at the lower bounds of this range, which is reasonable considering the simulations and experimental images of Fig. 4 are not really a uniform perfectly-diffracting lattice.
Deviations from the characteristic scattering peak spacing of the uniform lattice are expected due to the displacement of the cell nucleus and the non-uniform (e.g. randomly distributed, or closely aggregated) arrangement of organelles.
Lastly, it was observed that the experimental results (Fig. 4, left, middle) appeared to have more intensity toward the centre of the image. This background intensity could indicate the contributions of larger cell components and microstructures (e.g. the nucleus), light from the experimental setup, and/or the washed-out superposition of any non-uniform mitochondrial scattering. As these differences are primarily broad low-frequency features, such variations could be potentially detected, isolated and/or filtered out in the analysis process to facilitate comparison with simulated images.
The combination of the ability to calculate blob spacing and the ability to rapidly generate realistic simulated scattering plots with the mtPatterns algorithm facilitates the prediction or cell structure and cell diagnosis based on mitochondria spacing directly from an observed scattering image. Once it is possible to quickly simulate realistic scattering patterns, simple visual and numerical comparisons could be used to perform real-time classification of cytometrically interrogated cells. As shown by XRD, great leaps in structural assessment can be made by comparing experimental and simulated scattering patterns.
In addition, it should be noted that once organelle scattering has been isolated, feature extraction methods as described in the present invention may be used to identify and classify the scattering contributions of larger microstructures, giving further information about the cell as a whole or providing context for organelle analysis.
Example 3:Non-uniform mitochondrial scattering As shown in Fig. 5, there was little difference between mtPatterns scattering images created using isotropic scatterers (Fig. 5A) and scatterers with the more complex point-spread functions characteristic of larger mitochondria (Fig. 5B). Increased scatterer spacing can be seen to lead to a decrease in scattering pattern blob spacing (e.g. 2A, 3A).
In terms of scattering blob placement and spacing, there is little difference between uniform (A) and non-uniform (B) mitochondrial scattering. To allow visual comparison, the images in Fig. 5 have been normalized with respect to the same minimum and maximum intensity values (i.e. 0-255), and cover the same solid angle (77.3 -106.7 in 0 and (p) Forward scatter is toward the right. Experiments showed little structural variation (in terms of scattering blob structure) between large-angle scattering patterns generated using a uniform point-spread function (Fig. 3, left) and those generated by radiators using the point-spread function described by Gourley et al. (Fig. 3, right). This was expected, as the point-spread function of a non-isotropic mitochondria is almost identical to an isotropic point-spread function when observed over a solid angle of approximately 30 , centered on the side-scatter region. As such, the use of isotropic scattering points to simulate mitochondria proved to be a valid first-order approximation in the large-angle scattering domain, Further justification can be posed in terms of Fourier convolution theorem.
Given that an individual (larger) mitochondria has a scattering pattern in isolation of M(S), we can infer that it has a complex structure in the spatial domain m(r) (i.e. the inverse Fourier transform of M(S)). M(S) can be quite complex in the case of molecules, or a simple low-order function for atoms (N. Kasai and M. Kakudo, X-Ray Diffraction by Macromolecules (Springer, New York, 2005)). Since a population of scatterers is described by p(r), one can see that for the case that they are larger mitochondria with similar complex scattering patterns-the actual structural make-up of the population will be the convolution p(r) m(r) (e.g. an array of complex individual structures). Using the convolution theorem, this will result in the multiplication of the Fourier transforms of F{p(r)} and F
{m(r)} in the scattering domain as follows (N. Kasai and M. Kakudo, X-Ray Diffraction by Macromolecules (Springer, New York, 2005)):
Atrue(S) = A(S)M(S) {EQ 61 In physical terms, this will be equivalent to the multiplication of a point-spread function (e.g. Fig. 3, right) with the result of an isotropic scattering simulation.
This is what one can see in Fig. 5, where row B is simply the pixel-by-pixel multiplication of row A by the appropriate angular slice of point-spread function Fig. 3, right. Once known, the scattering of an individual large mitochondria may be added or removed from an image as needed to reveal the underlying scattering structure. This result is known from XRD (N.
Kasai and M. Kakudo,X-Ray)fraction by Macromolecules (Springer, New York, 2005)).
Example 4: Observations on the effect of scatter distribution and interaction Fig. 5 also shows that an increase in the number of scatterers lead to increased image complexity (i.e, smaller blobs and smaller characteristic spacing). In addition, an increased scatterer distribution radius R. also lead to increased image complexity. A
tightly arranged organelle distribution (e.g. smaller organelle containing volume, such as Pig.
5, A2) resulted in larger homogeneous intensity regions with greater characteristic spacing, while wider distributions (such as Fig. 5, A5) generated a number of small, tightly spaced intensity regions.
This behaviour is expected from Fourier theory and X-ray diffraction literature, as described above. As in Fourier theory, the spacing between-and size of-intensity regions on the simulated CCD region were inversely proportional to the spacing of the original organelle distribution. The more dense the scattering distribution, the more distance between blobs in the scattering plot (Fig. 5). These trends will be useful in future feature-based classification systems, as image feature complexity can help deduce initial cell structure, whether by direct comparison of features or phase reconstruction techniques.
In the limit of a small, very tightly packed set of scattering points, the generated scattering pattern was very similar to that expected from a single scatterer located at the distributions centroid-irrespective of the number of mt, for very small scatterer volumes the scattering signatures washed out into a single bright intensity blob with no complex structure, though the un-normalized intensity of the blob increased in proportion to the number of scatters.
This result is expected from scattering theory; as R-0 A, scattered phases will align and all scattering contributions with be uniformly constructive regardless of scatterer number.
This could be used as a useful predictor of the raw number of organelles found within a given structural plot, allowing the classification of both the organelle spacing (through blob spacing) and number (through raw blob intensity) of scattering organelles from a single scattering plot.
The average mtPatterns simulation took between a fraction of a second to about a minute on a standard personal laptop computer running the python interpreter.
Example 5: Application of Analysis Functions Two testing methods were employed to verify the validity of the Cythe system:
qualitative image analysis, and a quantitative statistical breakdown. For the qualitative analysis, the system was presented with images representative of all three cellular scattering cases described herein (e.g. intensity bands with a number of randomly placed intensity regions, as in Fig. 6 right). Varying levels of high-and low-frequency intensity variation may be present in a single image, leading to complex, information-rich image structures (right). Due to the difficulty surrounding quantitative segmentation analysis, statistical analysis was performed on images containing the first two scattering cases (intensity bands and bands with interference). This is explained below.
In both cases the test images closely matched experimental scattering patterns (K.
Singh, et at Cytometry 69A:307 (2006)) and numerical FDTD simulations (C. Liu, et al.
JBiomed Opt 10:014007 (2005)), both visually and in the magnitude of the output parameters.
In an ideal testing environment FDTD simulations would be used in conjunction with experimental data to verify the success of the segmentation system. However, to numerically analyze system accuracy it is necessary to identify the 'true' segmentation and parameterization of experimental data. As 'true' image boundaries are subjective in all but the simplest segmentation problems, most segmentation evaluation methods rely on qualitative boundary assessments for comparison values (7.K. Udupa, et al.
Comput Med Imaging Graph 30(2):75 (2006)); the few attempts at true quantitative evaluation typically rely on correlation data, and still involve comparisons with a manual (i.e.
human) segmentation (N. Pal et at Pattern Recognit 26:1277 (1993); S. Lee at al. IEEE
Trans Image Process 14:312 (2005); J.K. Udupa, et al. Comput Med Imaging Graph 30(2):75 (2006)).
To quantitatively verify the validity of the Cythe extraction pipeline a mathematical model was used to create a set of viable test images. These images contained a fixed number of vertical intensity bands of varying intensity and width, irregularly placed high-frequency intensity components, intensity band overlap, differing background levels, blurring, and poorly defined band boundaries (i.e. qualities we observed in experimental scattering images). Unlike manually measured experimental scattering patterns, these model images were numerous and provided a well defined set of 'tnie' parameter values (derived directly from the generated mathematical image model) with which to statistically validate Cythe's parameter extraction.
As bands are represented in our test images by smooth intensity gradients with no discrete edges, the 'true' band width parameter BW'(y),,,,r was measured as the horizontal distance between band points where the pixel intensity was 80% of the band's maximum intensity, relative to a black background. This width most accurately reflected observations about real scattering band width. Because of this approximate edge value definition, the validation data for band width parameters is slightly less precise than for other parameters, as seen below.
These quantitative test images contained a more regular distribution of high-frequency intensity components than was found in experimental images or our qualitative analysis images; high-frequency intensity regions were randomly placed only on intensity bands, as in image cases relating to geometric and Mie scattering. This additional regularity was needed generate reliable true values for band parameterization---images containing a completely random distribution of high-frequency regions (as expected from Rayleigh scattering) would suffer from the same subjective evaluation problems as real experimental data.
Thus, each quantitative test image consisted of a varying number of Gaussian intensity regions superimposed on a series of vertical intensity bands. Like real scattering patterns, the test images contained bands of varying width and maximum intensity that were placed at intervals across a black background. The intensity profile of individual bands, the size and orientation of Gaussian intensity regions, and the variation of maximum band intensity across the image were picked to match the intensity profiles expected in actual scattering images. finally, a 5 by 5 Gaussian blur was applied to the images to smooth out any unrealistic intensity transitions.
These test images were then presented to the Cythe system for analysis. Each test image was processed by the full computational pipeline (i.e. feature extraction, feature clustering, and post-processing) to produce a set of output parameters (P,,,h,), Another set of parameters were derived directly from the mathematical model used to generate the test images; these parameters P,.a,irepresented the 'true parameter values' used in the creation of the test images. The true parameters P,,,l were then compared to the parameter values extracted by our pipeline P,,the (i.e. how well they demonstrated a correlated linear relationship that allowed accurate prediction of the true parameter values).
Both the true parameter set P,&al and Cythe parameter set Pyh, included all thirteen parameters outlined in Table 2. As explained in the previous section, this band-based parameterization scheme (calculated with Equations 4 and 5) can be used to represent the influence of both large scattering structures and nanostructure-derived high-frequency intensity variation.
To assess the pipeline's ability to detect changes in band width and band intensity, these tests were performed on 162 sample images. Two different test sets were generated. The first set (T,:143 images) was used to determine the system's ability to detect variation in band width and band intensity-parameters primarily influenced by the presence of smaller scattering sources. In this set the number of intensity bands was held constant while the number of Gaussian intensity regions present in the image was varied between zero and fifty. The second set (Tz:19 images) was used to test the system's ability to detect changes in band structure and spacing, which relate to scattering from larger microstructural cellular objects. In test T2, the number of intensity bands was varied between two and twenty, while the number of Gaussian regions inserted into the image was held constant.
After each test set the Cythe parameter extractions were compared to the true parameters.
System success was determined by measuring how closely the Cythe parameters matched the true parameters, as evaluated with a range of statistical tests for correlation and similarity (described in the following section).
This procedure is similar to the comparison metric of Bovenkamp et al., where the plot of human v.s. machine solutions was compared to a unity slope to determine accuracy (E.G.P. Bovenkamp, et al. Pattern Recogn 37:647 (2004)). In the absence of any methods to objectively compare and evaluate segmentation schemes J.K. Udupa, et al.
Comput Med Imaging Graph 30(2):75 (2006); N. Pal et al. Pattern Recognit 26:1277 (1993);
S. Lee at of. IEEE Trans Image Process 14:312 (2005); L. Bergman, et of. Image Vision Comput 23:417 (2005)), this approach allowed a quantitative characterization of system success.
In addition to these quantitative test images, we performed a series of qualitative tests on a range images of containing features from all three scattering image cases (e.g. Fig.6, right). This was done to determine the system's ability to remove region bridges, detect regions in noisy images, and join detected regions into a structural hierarchy. For these tests, a test series was generated consisting of images with vertical intensity bands, randomly distributed high-frequency intensity regions, and vertical intensity bands overlayed with a pattern of randomly distributed high-frequency regions of irregular shape and size.
For these tests the scrubO and joino thresholds (0, 0) were set to 0.0018% and 0.028%
respectively. The agent view radius was R=5 X 5, and the agent neighbourhood was N=3 X 3. Images were reduced to 1=125 X 125, These values were empirically derived on a small subset of the images, and subsequently used on the full set of test images without modification; one parameter setting performed well for an entire family of images.
As shown in Fig. 13, the Cythe pipeline was able to affix the agent population (A) to the vertical intensity bands in the test image. Region color was assigned based on each group's unique ID value; all regions were verified to contain distinct ID
values. A visual comparison of the Cythe labeling (Fig. 13, right) with the initial image (Fig.
13, left) showed that the Cythe extraction matched quite closely with our expectations from test image. It was also observed that the system was able to detect the presence and magnitude of width variations in the band structure (Fig. 13, bottom row). In addition to being able to detect linear bands, Cythe was able to detect small, arbitrarily-shaped intensity regions of varying brightness (Fig. 13, middle). As shown by the difference between Fig. 13 left and right, Cythe was also able to detect the level of high-frequency variation present in images containing high-frequency components that overlap a pattern of vertical intensity bands. This observation further supports the efficacy of the band-based parameterization scheme 1?. Random intensity regions (like those expected from Rayleigh scattering) were indicated by width and intensity deviations within the detected band structure-their intensity contributed to and noticeably altered the shape of existing bands.
It was observed that Cythe was able to remove parameter-degrading horizontal intensity bridges and use the clustering stage to group the agent population (A) into a set of distinct regions (G). The removal of horizontal bridging can be seen in Fig. 14, and the ability to form a population into spatially connected regions can be seen by the homogeneous region colors in Figs. 13 and 14. In Fig. 14(Left) - the initial test image. In Fig.
14(Middle) - the agent population directly after the agent_fixationO routine; there are three bridges at this point. In Fig. 14(Right) - final region identification after post-processing.
Weak connections between bands did not adversely affect region identification-the two horizontal bridges were removed in the feature detection stage, and the diagonal propagation restriction prevented ID leaking over the remaining bridge (which was subsequently removed by the scrub() routine). The dots represent fixed agents (middle), and different shades in the clustering image indicate spatially distinct regions (right).As shown in the difference between the two agent populations in Fig. 14 (middle and right), it was found that horizontal bridges less than three pixels in width were removed during the feature detection stage . In addition, the use of a 4-neighbourhood for communication in the feature clustering stage prevented distinct bands from being classified as a single region due to any remaining weak connections (Figs. 13 and 14).
The joinO routine constructs a set of vertical bands g' out of the detected image regions G.
For noisy images this process would not be possible without prior use of the scrubO
routine to filter out small unconnected intensity regions. Figure 15 illustrates the use of the joinO and scrub() routines in the creation of a vertical band structure for simple images with and without 10% of the images pixels assigned a random 8-bit intensity noise value (i.e. random or independent noise, as expected from dust on a lens or bad CCD
pixels). Lines indicate band position (xg), and differently shaded regions in the post-processing image indicate spatially distinct regions g. While portions of the agent population affixed to noise-related pixel clusters (Fig. 15, bottom middle), the scrubO
routine removed these small groups and the pipeline identified the same regions found in the noise-free image (Fig. 15, right). In addition, the detected regions were joined into the same band structure for both the noisy and noise-free image (as shown by the number and horizontal position of the yellow vertical lines, Fig. 15, right). This leads to the same parameters being extracted for both the noisy and noise-free images. Similar performance was observed for Poisson/couting noise, though high levels of Gaussian noise required the use of a larger scrub threshold due to larger detected noise regions. A join threshold of 0=0.08 was used for the tests in Fig. 15.
In addition to accurately parameterizing the model test images, Cythe was able to extract realistic parameters for a large set of FDTD scattering simulation images containing many arbitrarily-shaped randomly-distributed high-frequency intensity regions, as derived from complex cell structures with varying physical characteristics and organelle distributions (work in preparation).
The parameters extracted by Cythe from the test images allowed accurate prediction of the true parameters P,,,i extracted from the initial test images. This was statistically determined by calculating the correlation coefficient (r - the amount of covariance in the two populations, a good indicator of segmentation accuracy (N. Pal et al.
Pattern Recognir 26:1277 (1993))), the statistical significance of the correlation (P(r) -the probability that correlation value r could have arisen by pure chance for a given sample size), the chi-squared significance P(x2) - the probability of both input and output variables coming from the same distribution), and the standard error (SE) for each population of input/output variables, all calculated as per J.R. Taylor, An introduction to error analysis (2nd Ed., University Science Books, Sausalito, California, 1997; or through equations disclosed herein.
This comparison is shown in tabular form for tests T1 (band intensity parameters, Table 3, and band width parameters, Table 4 and T3 (band number/spacing parameters;
Table 5).
These tables present the statistics for an image reduction size of 1125 X 125.
From statistical theory (r.R. Taylor, An introduction to error analysis (2nd Ed., University Science Books, Sausalito, California, 1997)), r values greater than 0.216 (test TT, 143 samples) and 0.561 (test T2, 20 samples) indicate a statistical correlation (i.e. a probability P(r)<0.01$ that the correlation score could have originated by pure chance).
These threshold r values are based on the sample sizes used in our experiments. As shown in Tables 3-5, our derived values are consistently greater than these minimum values for statistical correlation. Similarly, chi-squared significance values approaching p(y2) = 1.00 indicate no difference in the distribution of input and output values.
The uncertainty in each parameter was estimated by adding Poisson/counting noise (i.e.
each pixel was varied according to a normal distribution equal to the square root of the pixel value) and processing the resulting image by the same method as the test data sets.
This was done for 56 images, allowing the extraction of a standard deviation that then allowed the calculation of chi-squared significance values.
As described above, the parameters shown in Tables 3 and 4 are used to characterize the intensity contributions from smaller Mie and Rayleigh scattering sources, while the parameters in Table 5 characterize the intensity contributions from larger Mie and geometric scattering objects.
Table 3. Statistical analysis for band intensity parameters.
Parameter Description r P(r) PO
aBl,,v Avg. Band Intensity Average 0.992 <0.0001 1.000 aBlmin Avg. Band Intensity Minimum 1.000 <0.0001 1.000 aBJmax Avg. Band Intensity Maximum 0.998 <0.0001 1.000 aBldeõ Avg. Band Intensity Deviation 1.000 <0.0001 1.000 aBl,,,, Avg. Band Intensity Minimum (NN) 1.000 <0.0001 1.000 *nearest neighbour Table 4. Statistical analysis for band width parameters.
Parameter Description P(r) P() aBWQõg Avg. Band Width Average 0.386 <0.0001 1.000 aBWj, Avg. Band Width Minimum 0.872 <0.0001 0,528 aBWm Avg. Band Width Maximum 0.724 <0.0001 1.000 aBWd,Y Avg. Band Width Deviation 0.907 <0.0001 0.286 Table S. Statistical analysis for band number/spacing parameters.
Parameter Description r P(r) F(() B Number of Bands 1.000 <0.0001 1.000 BS,,,, Minimum Band Spacing 0.995 <0.0001 1.000 BSmc Maximum Band Spacing 0.993 <0.0001 1.000 BSd, Average Band Spacing 1.000 <0.0001 1.000 With regard to the system's ability to correctly identify deviations in band intensity (test TI), we found that Cythe was able to identify the magnitude and variance of intensity to a high degree of certainty. At an image size of I=125 X 125 Cythe was able to correctly identify the number of bands in every test image. The input and output intensity parameters showed strong correlation, as indicated by the r, P(r), and P(X2) values (Table 3). Standard error for these intensity parameters was less than half an intensity step on a 8-bit intensity scale.
The close relationship between input and output parameters was also evident for band width and band width deviation parameters (aBWm;õ/mar/avg/dev). as shown by the values in Table 4. As explained in the previous section, difficulty defining the 'true' width values in the test images lead to greater variability in the evaluation statistics r and xz. While width statistics (Table 4) did show lower correlation between input and output values than the other parameters (Table 3, Table 5), all other values represented an excellent fit. Width values were still well above the thresholds for chance correlation, as indicated by r >
0-261, P(r) << 0.01. Despite having a high degree of correlation, the parameters aBWd,,õ
and aBW,,,,,, exhibited a low P(x2), and further investigation showed that this deviation in input/output distribution similarity was due to a shallow (i.e < 0.5) regression slope between the input and output parameter sets. Considering the lack of 'true-value' precision when quantitatively analyzing spatial parameters in this situation, the set of width statistics in Table 4 sufficiently demonstrated a distinct relation between the actual layout of the test images and the Cythe parameter extraction.
In addition to band width and intensity parameters, the system was able to accurately determine the number and spacing of bands (test T2). As shown in Table 5, the correlation coefficient (r) for each band-structure parameter approached 1.0 (i.e perfect correlation).
This indicates a one-to-one correspondence between the input parameters Põor and the output parameters P,,,h,. For the parameters BS,,,,,, BSmax, BS,,g there was a standard error of less than 1.1% of the image width for both reduction levels. There were no band number (B) identification errors in test T2, and the chi-squared significance test for all parameters in Table 5 indicated no statistical difference between the input and output parameters.
With respect to the magnitude of observed values from Equation 4, a typical range of 0.72-9.6 pixels was found for aBW;,,fõ,,;,~,.a, and 0.0-0.99 pixels for aBWde,=. For Equation 5, intensity parameter values were typically between 127.2-157.2 for aBjmp,fimar/arg, 0.59-4.29 for aBJ,,,,, and 0.26-72.7 for aBJ,.. Band spacing parameters varied greatly depending on the number of detected intensity bands in a sample image; for our tests we found spacing parameters between 6.46-14.8 (Test Ti) and 6.48-72.2 pixels (Test T2). The standard deviation of parameter values observed under conditions with counting noise and random noise was much less than the total parameter range observed during these tests.
As the size of images presented to the system increased (with the Agent View Radius being held constant), Cythe began to identify small erroneous bands within the larger regions of the image. At an image size of 1=150 X 150 the pipeline incorrectly identified one extra band in 67 of the 143 T, tests images. This lead to a noticeable decline in correlation values, and incurred a corresponding increase in standard error.
It is apparent that the relationship between image size and Agent View Radius plays a role in feature detection; this will be discussed in the following section.
Edge detection is by its very nature a local undertaking (N. Pal et al.
Pattern Recognit 26:1277 (1993)) and thus lends itself well to an agent-based framework. By determining fixation based on an adaptive local threshold (p,,, the average intensity value within an agent's view radius R), Cythe was able to effectively label all edges irrespective of the differing background intensity levels found in scattering images. By setting the adaptive threshold level greater than the local average (as in the agent axa~,onO
routine), the system consistently labeled the high-intensity side of all edges, isolating the band regions from the lower-intensity background.
Using an Agent View Radius of R=5 X 5$, we found 1= 125 X 125 to be the most appropriate size for image reduction. At this image size and view radius, the fixation routine was able to accurately divide the image into spatially distinct regions regardless of differing background levels and gradient slopes. The distance between identified band edges was small enough that the two edge-labeling agent populations for a given band connected along the center of their band. This allowed bands to be detected as continuous units in both our model test images and complex FOTD scattering simulations.
We found that it was important to select a view radius close to the size of target image features in the reduced image I. The two edge regions of a single band may not connect if the Agent View Radius is significantly smaller than the band size. This lead to the identification of extra bands by the clustering stage. By varying the size of the view radius (i.e. the adaptive thresholding region (N. Pal et at Pattern Recognit 26:1277 (1993)) to match the image reduction level, feature extraction remains accurate at any image size (though increased image size comes with an increased computational cost, as described below). This follows from recent work in image saliency detection and model matching (L. Itti, et al. IEEE Trans Pattern Anal Mach Intell 20:1254 (1998); V.
Navalpakkam et al Vision Res 45:205 (2005)).
The propagate-idO routine was a reliable and effective way to cluster the labeled pixels into contiguous regions. This routine, which was based on the connected components labeling algorithm commonly used in region identification problems (L.O.
Shapiro et al.
Computer Vision (Prentice Hall, 2001)), provided a simple way to identify groups of connected agents. Much like the self organizing tile behavior shown by Ghrist and Lipsky R. (Christ et al. "Gramatical Self Assembly for Planar Tiles", in Proceedings of International Conference on MEMS, NANO and Smart Systems, W. Badawy and W.
Moussa, eds. (Banff, Alberta, 2004), pp. 205-211), our system was able to effectively perform long-range organization through simple local interactions. In addition, the distributed approach lends itself well to parallelization-one of the major advantages of multi-agent systems (A.P. Engelbrecht, "Computational Intelligence: An Introduction", (John Wiley & Sons, 2002)).
The removal of band bridging (as described herein) was essential in the accurate clustering of spatially distinct regions. The agent fixation stage eliminated direct horizontal communication over bridges by eroding bridging agents (shown in Fig. 14), while the use of an agents 4-neighbourhood for communication prevented ID propagation over any remaining (weak) junctions between band protrusions. Without the removal of band bridging it was impossible to successfully parameterize complex images. A
similar restriction on the union of weakly-connected regions has proved effective in other segmentation and image identification situations (Y. Wang et al IEE Proc-Vis Image Signal Process 149:173 (2002); L.G. Shapiro et al, Computer Vision (Prentice Hall, 2001)). The assumption that there will be no strong horizontal links between bands follows from the structure of experimental scattering images and our understanding of cellular scattering mechanisms.
Using an agent neighbourhood of size N=3 X 3 further prevented erroneous ID
propagation between distinct bands. By only allowing communication between adjacent agents, ID information was not able to travel over gaps between neighbouring intensity bands.
Due to the nature of the local interactions and the multiple sweeps through the agent population, the ID propagation routine was found to be the largest computational component of the parameterization pipeline. As the proper choice of agent view radius and image reduction size decreased the final size of experimental images to approximately 1=125 X 125, scalability was not an issue for our application. In the case of larger images, the use of a union-find structure (described in L.G. Shapiro at al, Computer Vision (Prentice Hall, 2001)) in the connected components algorithm would greatly reduce the number of iterations through the agent population (though it would require a higher level of centralized control).
The joinO routine was found to be an effective way to model the structure inherent in scattering images. It has been shown that changes in the relationships between full bands are indicative of large changes in cellular structure (K.A. Sem'yanov, et al.
Appl Opt 39:5884 (2000)). By linking several vertically aligned regions to a single band structure, we were able to analyze the relationship between whole band units while still retaining specific information regarding the variation present in each band and its associated sub-regions.
As shown in the results section, the joinO routine managed to consistently group smaller regions into cohesive bands, even in the presence of noise. Noisy images were divided into the same number of bands (i.e. super-groups) as noise-free images (pig.15). This is important for the parameterization stage of the pipeline; band discrimination plays a large part in the calculation of band-based statistics, which in turn contain vital information about the nature of the scattering source.
The scrubO routine helped the parameter extraction process by removing any large noise regions that remained after the initial image reduction. By keeping the scrub threshold low, large features were still preserved (e.g. Fig. 15) while small agent clusters were rejected as noise; this parameter can be tuned to the specific nature (ambient noise and feature size) of the images under analysis. However, it should be noted that setting the scrub level too low can cause erroneous band identification, whereby bands of small pixel mass may appear near the edges of each real band. Extra bands will distort the extracted parameter space and should be avoided.
With regard to the selection of the variables for joinO and scrubO (i.e B and 0): these values are derived empirically based on observations regarding the size of noise artifacts inherent in a scattering image and the approximate size and frequency of bands in the image. Once selected, the parameters performed effectively on the entire test set. If significant changes are made to the ambient background noise or the angular range of a scattering image, the system threshold levels may need to be re-calculated.
While there are a number of other potential frequency analysis methods (such as Fast Fourier Transforms) which could be used to ascertain spatial frequency information from a scattering image, the Cythe parameterization routine allowed the extraction of a related structure of image regions within the context of a scattering situation (effectively embedding frequency information within an interpretable band-like structure).
This relationship information is of use to human observers, both for validating the extracted parameter data and for comparing results with those previously published in the cell scattering literature (which generally reference the number and size of bands, or the angular location and span of intensity band maxima).
Image size reduction was found to play an important role in both the generalization of region boundaries and the rejection of low-level noise, as it influences the degree of abstraction applied to the input image (L. Itti, et al_ IEEE Trans Pattern Anal Mach Intell 20:1254 (1998)). A similar reduction approach is used in saliency-based vision systems to detect high-level features in natural scenes, where detail (and the associated noise) is sacrificed to rapidly form an accurate structural impression of the image (1... Itti, et al.
IEEE Trans Pattern Anal Mach Intell 20:1254 (1998)).
Appropriate choice of image reduction size depends on the size of the Agent View Radius, and the number, spacing, and width of intensity bands within an image. A large reduction to an image with very narrow bands or important high-resolution features could merge independent intensity regions, or render some relevant features undetectable.
Failing to reduce an image with wide bands could lead to erroneous band detection or extra computational cost / increased run times. It was observed that disparity between image reduction size and Agent View Radius either lead to the identification of too many small intensity regions (e.g. when the true aBW R) or the grouping of many distinct initial regions into a small number of larger features (e.g. when the true aBW,,,,g>>
R).
Image reduction was also essential for manageable run times, as un-reduced experimental cytometer images are typically greater than 700 pixels on a side. At an image reduction size of Y = 100 X 100, an entire pipeline run took approximately six seconds.
At a reduction size of I = 300 X 300 or larger, runs lasted two minutes or more.
All performance tests were conducted on a Pentium IV desktop computer. The entire Cythe pipeline and all related routines were implemented in the Python programming language.
Cythe is expandable and may be readily adapted to new scattering situations;
once intensity regions are segmented and numerically represented (as in the group list G) it is possible to test for any number of spatial relationships. In addition, the pipeline should be robust to variations in expected image structure. By adjusting the post-processing settings, agent view radius, and image reduction size, Cythe can be made to detect intensity regions with greatly varying geometric properties. Furthermore, due to the local nature of the parameter calculation equations, slight band curvatures should not adversely affect parameter extraction.
In the event that images with different (e.g. non-vertical) spatial relationships need to be analyzed, the joino routine may be modified or replaced to create a different region hierarchy, and orientation-related changes may be made to the band bridge removal process. This would also allow identification of randomly-placed intensity regions, such as would be generated by a field of Rayleigh scatters, without imposing a band structure on the intensity data. The additional analysis of intensity region perimeter and area would allow further distinctions to be made between differing region types (e.g.
independent regions and full bands). This would facilitate the parameterization of heterogeneous images consisting of horizontal bands, tightly grouped region clusters, blobs arranged without a band structure, or any other arbitrary cluster shape.
To this end, Cythe can be used within other applications, including the identification of geometric objects in natural scenes, and the detection of bright fluorescent regions during the genetic analysis of cell populations (P.M. Pilarski, et at, "An artificial intelligence system for detecting abnormal chromosomes in malignant lymphocytes", in Proceedings of Canadian Society for Immunology, Annual Conference (Halifax, Canada, 2006), pp.
126.) The final goal of the Cythe system is the classification of biological samples based on light scattering. Our preliminary results have shown that cell classes (e.g. those with features indicating cell health or malignancy) typically reside at the extremes of the possible parameter space (work in preparation). The difference between cell classes appears to be much greater than the variation due to noise, such as from imperfections in a fluid wave-guide or CCD. Thus, based on the parameter deviation indicated from random and counting noise (as herein), measurement noise should only moderately detract from Cythe's classification ability and the correlations we observed between input and output parameters.
Example 6: Cell Cytometry Integration A set of FDTD test images were presented to the system of the present invention (Cythe).
This system returned a numerical parameterization of the scattering intensity image in terms of scattering region shape and variability. These parameters were then analyzed for information content; a selection of these extracted parameters were given to a simple multilayer pereeptron, which produced a prediction of cellular organelle content.
Many groups have shown that the use of 1 DTD simulations can produce realistic scattering signatures that match well with experimental results; their work has proved that such simulations are a reliable indication of light scattering from actual scattering sources (R. Drezek, et al. Applied Optics 38:3651 (1999); R. Drezek, et al. Optics Express 6:147 (2000); A.K. Dunn Light Scattering Properties of Cells Phd, University of Texas at Austin, 1997; C. Liu, et al. JBiomed Opt 10:014007 (2005)). A bank of tests were conducted on a "library" of FDTD scattering simulation images of varying complexity (generated using the method of Liu et al. (C. Liu, er al. JBiomed Opt 10:014007 (2005)).
The full 'library' of test images contained thirty-eight scattering simulations which covered a spectrum of 2pm spherical cells with varying nuclear sizes (between 0.6 pm and 1.6 pm), nuclear positions (0 pm to 0.4 Irm from the center point), organelle content (0 to 50 organelles of size 0.3 pin), and random organelle distributions (five distinct positional random seeds). For these tests, the following assumptions were made: a nuclear index of refraction of 1.39, an organelle index of 1.40, a cytoplasm index of 1.38, an external medium index of 1.334, and an incident laser wavelength of 632.8nm. All cellular components were modeled as spheres, speeding FDTD computation and allowing effective pattern validation with Mie theory; the scattering from more detailed cellular geometries will be the subject of future FDTD studies.
By analysing using the system of the present invention over a wide range of random distributions and nuclear orientations it is believed that any extracted parameter correlations should be largely invariant to rotated cells and/or randomly distributed cellular components, a problem that has attracted a fair amount of attention in the literature (C. Liu, et al. JBiomed Opt 10:014007 (2005)).
Of these parameters, it is the average minimum band width (aBWiõ ), the average band width deviation (aBWd ,) and the average nearest neighbour intensity deviation (aBJ,,,,) were the most relevant for determining organelle content in experimental images. The formulae for these parameters are shown in Equations 1, 8 and 9, modified from pixel based form to use discrete angular units. In these equations, b represents an individual intensity band, a indicates the set of possible 0-values for a band, BWb(4)) indicates band width (in degrees, along the 9 axis) at angle 0, while BWb(4) indicates the intensity value at the band center at angular height 0.
OBIõõ a avgy{ ~4 EI316(d) - DI6( " 1)i}
{EQ 71 aD6Y,~v = avgy{ 1~ E iDWb(~) - avg(DWb)i}
~~ me s {$Q 8) aBW.,i,, = av9b{min(BWb(0), ¾ E fib)) (EQ 9) A simple multi-layer perceptron was used to show how two image parameters can be used to classify a cell based on its scattering image. The classifier was generated using the standard WEKA data-mining toolkit (I.H. Witten et al. Data Mining. Practical Machine Learning Tools and Techniques (Morgan Kaufmann, 2005)). The most successful classifier consisted of a two node network; the first node used a sigmoid activation function and took as input the weighted parameters aBl,,,, and aBlda,,, and a threshold value.
The second node of the system used a linear activation function and took the weighted output of the first node and a threshold value to generate the final classification value.
Training lasted 500 epochs with a learning rate of 0.005 and a momentum term of 0.005;
training time was less than 0.05 seconds. One potential layout of this simple network is shown in Figure 10.
Ten-fold cross validation (i.e_ CV-10, (R.O. Duda, et al. Pattern Class1cation (2nd Ed., Wiley lnterscience, New York, 2001)) was used to verify network performance on the full training library. Each parameter set was evaluated based on its correlation coefficient (lie how well changes in true value matched changes in the output prediction), mean absolute error (average error in the number of predicted organelles), root mean squared error, and relative absolute error (in percent) (R.O. Duda, et al. Pattern Classification (2nd Ed., Wiley Interscience, New York, 2001); I.R. Taylor, An introduction to error analysis (2nd Ed., University Science Books, Sausalito, California, 1997)).
Using the system of the present invention it was possible to derive a set of parameter/cell-structure correlations that held for a full library containing a range of nuclear positions, random organelle distributions, cell sizes, and nuclear sizes. When performing tests on the entire library of randomly distributed cells with varying nuclear size and position, it was observed that the number of organelles in the simulated cell was related to the average value across all bands of the maximum band width (aBW,,,rõ) and the variability of band width (aBWa ). In addition, a correlation between the number of organelles in a simulated image and the average value across all bands of the band nearest-neighbor intensity deviation (aBW,,,,) was observed. These trends can be seen in Figure 11. An example of raw Cythe region detection for simulation images without and without organelles is presented in Figure 12.
By combining the information contained in all three parameters, it is possible to use a multilayer perceptron to accurately infer the number of small scattering in the cell. The multilayer perceptron was tested on the entire library of FDTD scattering images using standard 10-fold cross validation (R.O. Duda, et al. Pattern Classification (2nd Ed., Wiley Interscience, New York, 2001)). Several combinations of image parameters were tested; a selection of these tests are shown in Table 6.
A simple network based on the parameters aBWd,, and aBI,,,, achieved the best performance with a correlation coefficient of 0.9839, a mean absolute error less than 2.6 predicted organelles, and a relative absolute error of only 14.05% (Table 6).
This performance is based on 10-fold cross validation of the 38 item image library.
Classifier accuracy is expected to improve dramatically with a larger sample library (such as from actual patient samples).
The classifier accuracy is observed to decrease when other parameter combinations were used (e.g. aB1õõ and aBWõ), and when more or less than two parameters were given to the multilayer perception. This can be seen in Table 6. Experiments showed that the parameter aBI,,,, contained the most useful information regarding organelle content. This can be seen, in part, by the low uncertainty on datapoints in Figure 11, top.
The inclusion of small organelles lead to an increase in high-frequency intensity variations in a two-dimensional FDTD scattering plot, which was detectable in several Cythe image parameters. As they are based on high-frequency intensity variation, the parameters aBI,,,,, aBWvpr and aBWm, all proved useful in organelle content prediction.
rot an analysis system to be valid on a completely unknown (experimental) test instance, it must be able to detect changes in the parameter of interest despite potential cell rotations and random organelle placement. The library of test images contained a diverse set of cellular structures, and the system showed robust behaviour with respect to random organelle distribution, nuclear size, and nuclear placement (Figure 11).
All of the FDTD tests were performed on cells of 4 im diameter. In the event that an extracted of set analysis rules is not independent of cytoplasm size, a viable alternative is to determine a rule set for each range of cell sizes and apply the correct set based on high-level cell analysis (such as the raw number of intensity bands detected or the spacing between intensity bands (K.A. Sem'yanov, et al. Appl Opt 39:5884 (2000))_ A
larger library of simulations and experimental results will increase the range of detectable intracellular structures, and will dramatically increase the accuracy of the generated classifiers.
Fig. 16, and 17 show the scattering simulations for increasing numbers of simulated mitochondria (Fig. 16) and distribution patterns (Fig. 17). These simulations were generated by a three-dimensional array of isotropic scattering points, which is a rapid and accurate approximation to typical (time consuming) organelle scattering simulations and observed experimental scattering patterns. Single dimensional FFT analysis of mitochondrial distribution (Fig. 18), simulated increasing organelle number (Fig. 19) and FDTD simulated mitochondrial content (Fig. 20) was performed. As shown in Fig.
18, each plot is the super-average of six different images of differing random placement seed along both axes, and high frequency power increases with the radius of scattering distribution. Fig. 19, shows for both a nuclear distribution (left) and large-radius distribution (right) and each plot is the super-average of six different images of differing random placement seed along both axes. Fig. 20 shows both low and high y-axis resolution (left and right, respectively) and stratification of plots in the high frequency region of the 0 axis (from 0,15, 20, 25, 30, 40 and 0 mitochondria, bottom to top) indicate the ability to determine the mitochondrial content directly from scattering images (bottom right). Some stratification is also present for the 0axis, to a lesser degree (top right).
Of note, there is a noticeable qualitative difference in the scattering patterns with changing organelle content and distribution (Figs. 16 and 17) and a noticeable high-frequency FFT
power stratification due to changing FDTD organelle content (Fig. 20);
however, at lower spatial frequencies there is little noticeable power change due to the number of organelles, though changes to organelle distribution are immediately apparent (Fig. 19).
While particular embodiments of the present invention have been described in the foregoing, it is to be understood that other embodiments are possible within the scope of the invention and are intended to be included herein. It will be clear to any person skilled in the art that modifications of and adjustments to this invention, not shown, are possible without departing from the spirit of the invention as demonstrated through the exemplary embodiments. The invention is therefore to be considered limited solely by the scope of the appended claims.
A(S) Jp(r)ep{2i(S.r)}dv, {EQ 1}
ry, the intensity contribution of all the scattering sources in a volume can be In summa mapped to a two dimensional receptive plane as related by the Fourier transform in Eq. 1.
This allows one to know something about the scatterer distribution p(r) based on the scattering amplitude A(S). More specifically, the spacing between blobs in the reciprocal (scattering) domain inversely corresponds to the spacing of real scatterer points in the scattering object in XRD, p(r) can be used to efficiently calculate A(S), and, although this is not the focus of the present invention, XRD methods exist to use A(S) to infer p(r).
Clearly then, an XRD-like approach to eytometry as contemplated by the present invention would have significant benefits.
As a simple visual example of this relationship, Fig. I shows how changes to spatial distribution (p (r), bottom row) impact blob spacing in the two-dimensional x-ray scattering pattern (A(S), top row) of a small 2 x 2 grid of atoms with horizontal spacings of lA and vertical spacings of IA, 0.5A, and 0.33A (left to right, respectively). As the y-axis spacing between atoms decreases by a factor of two and three (bottom middle, right), y-axis spacing in the reciprocal plot increases by a factor of two and three (top middle, right). Physical and Fourier dimensions are listed in A and 1/A on the plots.
Data was generated using DISCUS, the Fourier-transform based scattering simulator of Proffen and Neder (T. Proffen and R. B. Neder, JAppl Crystallogr 30 (1997)). This figure was generated using DISCUS, the widely-used X-ray scattering simulator of Proffen and Neder (T. Proffen and R. B. Neder, JAppl Crystallogr 30 (1997)). DISCUS uses continuous Fourier transforms to simulate the scattering from complex bounded (i.e. non-infinite and constrained) crystals and collections of atoms. In this example (Fig. 1), as the y-axis spacing between atoms decreases by a factor of two and three (from 1A
to 0.33A, bottom middle and right), y-axis spacing between blobs in the reciprocal plot (e.g. the scattering or Fourier domain, top middle and right) increases by a factor of two and three (from 1A-' to 3A-1 , top row). This holds with Eq. 1.
As another simple consistency check, one can see from standard diffraction theory that the characteristic spacing of a uniform diffracting lattice will directly impact the angular spacing of maxima in the resulting diffraction pattern. Using Bragg's Law (2d sin 0 m mX.
, with d as the lattice spacing and m as the order of principal maxima ), intensity maxima spacing varies as a function of scatterer point spacing. Using Bragg's Law (above) for a diffraction lattice with point spacings of 14m, 2.09 m (i.e. the characteristic organelle spacing expected from a 8 m human cell of nuclear size 4 m with 300 mitochondria uniformly distributed throughout the cytoplasm at a concentration of 0.21mt /
m3 , as per the cell dimensions of Su et al. X.-T. Su, et al. Optics Express 15:17 (2007)), and 3 m, one can generate a diffraction pattern that contains a characteristic spacing between maxima of X18 , X8.7 , and --6 respectively. Though real cell organelles and macromolecules are not organized in a rigid regular lattice (there are many theorized mitochondria) distributions, and the displacement caused by the nucleus will alter effective organelle spacing), it is easy to see that the range and behaviour of these values compares well with our expectations from Fourier theory (presented above).
With this theoretical background in mind, the present invention provides a method to 5 quickly and inexpensively simulate the large-angle scattering from a series of mitochondria-like scatterers.
The mtPatterns algorithm As shown by Eq. 1 and Fig. 1, the key component of a scattering pattern A(S) is actually derived from the spatial distribution of the scatterers p(r). This points to a rapid new 10 scattering simulation method based primarily on the spatial arrangement of mitochondria.
From Eq. 1 it is known that an arbitrary distribution of scattering points can be used to synthesize a final scattering pattern. Let us assume that p(r) is a distribution of small, isotropically radiating mitochondria. Given that each scatterer has position r and radiates isotropically, we see that the distribution p(r) is a Dirac delta function S(x - r)--the 15 distribution has unit intensity only at the exact position r of each mitochondria, and is zero everywhere else (i.e. a set of discrete points r e R3 ). This allows us to reduce Eq. 1 into the form of Eq. 2-the phase-shifted sum of each scatterer's influence on each point S of the receptive field A(S).
A(S) = Leap{-27ri(S = r)}
r {EQ 2) Eq. 2 may be easily implemented for the rapid computer simulation of scattering patterns.
One implementation is presented as the mtPattems algorithm (Tab. 1). As per Eq. 2, the mtPattems algorithm creates a three-dimensional array of isotropic scatterers r from a user-specified spatial distribution of scattering points. For this work it is assumed that the bounds of this distribution (r) are a spherical shell of inside radius R; and outside radius R0. However, any arbitrary volume may be specified. Next, it creates a two-dimensional receptive field A(u, v) of size U x V and positions it a specified distance d away from the isotropic scattering population centroid ra , along the z-axis (this simulates the CCD
camera). For realistic systems, d is typically than R0.
Table 2: The set of useful image parameters (P).
0 Create an array r of mt scattering points in .V= space, within radius bounds [Ri, R0]
1 Create a receptive field A as an array of size U x V, normal to the z-axis 2 Initialize all values in A to zero 3 Position receptive field A distance d above population center ro along z axis 4 ForallrEr:
For all u. V E A:
6 with wavelength A, calculate intensity iu v,lr, based on distance `r - (u, v) 7 if non-uniform scattering: lu.v1r = iu;vlr = M(euvir! Ou,vjr) 8 A(u, v) + = iu,vir 9 Return: A
Takes: (nit, R1, R0, U, V, d, M }, Returns: {A }
At this point, the algorithm determines the phase-shifted intensity contribution i(u,vir) of every scatter r E r at every point (u, v) E U , V of the receptive array A(u, v). This is done 5 by evaluating, for the given wavelength X , the phase and amplitude of the scattered signal after traveling a distance equivalent to the shortest path between the scatterer r and each point on the receptive field A(u, v). The individual scattering contributions at each point on the receptive field are then superimposed to generate the final scattering image.
Therefore, as an optional step, a point-spread function M(0(u,vir), cp(u,v}r)) may be applied to scale the isotropic radiation of each scatter by a known set of angular intensity values (which may be generated numerically or empirically). This allows for the simulation of non-uniform scattering, including the effect of varying shape and rotation of larger organelles. In one implementation of the present invention, M(O(u,vlr), (p(u,vlr)) was a look-up table of real values from 0.0 to 1.0, indexed by the angles 0(u,vlr) and (p(u,vJr) between the scatterer r and the point (u, v) on the receptive field; this was done to allow image files to be loaded as scattering functions. The present invention also allows for continuous point-spread functions to be used, as available (e.g. the Rayleigh-Gans approximation).
For the remainder of the description of the present invention, spherical coordinates will be used to describe the angle between a scattering object and the receptive plane. Specifically contemplated is 0 as the angle along the path of the incident light (i.e. with the axis normal to the incident light and parallel to the receptive field) and (p as the rotation around the SUBSTITUTE SHEET (RULE 26) path of incident light (e.g. 0 = 0 is pure forward scattering, 0 = 180 is pure back scattering, and 0 = 90 is pure side scattering along direction cp).
Scattering Pattern Parameterization and Classification Given the challenge of solving the inverse problem for scattering from a living cell, the literature to date has focused on the empirical classification of cells based on their scatter at a few specific angles or an angular slice through the center of the full two-dimensional scattering pattern (commonly called the "indicatrix"). It is evident from the rich structure of the scatter patterns (along both the 4, and a axis) that there is far more information present than is contained in simple angular slices.
As shown by recent efforts in the art, a more computationally tractable method is to effect a 'parametric solution' to the inverse scattering problem (V.P. Maltsev, Rev Soi Instrum 71:243 (2000); K.A. Sem'yanov, et al. Appl Opt 39:5884 (2000); N. Ghosh, et al. Appl Phys Lett 88:084101 (2006); Z. Ulanowski, et al. Appl Opt 37:4027 (1998)). In this two-step method (parameterization and pattern recognition), they parameterize some aspect of a scattering pattern and use a set of mathematical relations (K.A. Sem'yanov, et al. Appl Opt 39:5884 (2000); V.F. Maltsev, Rev Sci Instrum 71:243 (2000)), fast Fourier transforms (N. Ghosh, et al. Appl Phys Lett 88:084101 (2006)), or standard data mining techniques (Z. Ulanowski, et al. Appl Opt 37:4027 (1998)) to relate the extracted parameters to the initial structure of the scattering source. This process is rapid by comparison to forward methods, and may allow a degree of structural generalization that alleviates some of the problems caused by non-unique forward solutions.
Extracting viable parametric information from information-rich wide-angle scattering signatures presents a number of computational challenges. Because of complex cellular geometries, intensity bands may partially overlap in some places, the maximum intensity of each band may differ greatly from that of its neighbours, and the ambient background intensity is not consistent over the entire image. In addition, band boundaries are smooth gradients, not sharp intensity level transitions. These characteristics make it quite difficult to extract relevant features from an image and group them into meaningful regions.
While the art has addressed the individual components that make up this high-level segmentation problem (e.g. feature detection/extraction, connected component labeling, noise rejection, region clustering), the prior art has not developed a way to extract and analyze the full range of information present in two-dimensional cytometric scattering images. Analysis of information present in two dimensional scattering images involves partitioning two-dimensional scattering images into spatially distinct regions and S extracting high-level semantic information (i.e. image parameters) from the detected regions. Computer vision and image segmentation lie at the heart of most medical image analysis schemes. While there are many possible methods to segment wide-angle scattering images, after surveying the body of current segmentation literature the system of the present invention provides within the framework of a multi-agent image processing environment (described below) due to its demonstrated power, flexibility, and novelty.
Multi-agent segmentation systems have been thoroughly tested in a number of image processing situations, and demonstrate comparable or superior performance when compared to traditional methods. In addition, the distributed nature of multi-agent systems is a benefit for future hardware implementation. As such, they provide a solid basis for the development of a cytometric image processing pipeline.
A large body of recent image segmentation work relies on the use of multi-agent swarms, including particle swarm optimizations, evolutionary autonomous agents, and ant-colony optimizations. These multi-agent ('swarm') systems are composed of a population of autonomous or semi-autonomous 'agents' that collaborate (directly, indirectly, and/or competitively) to achieve a common goal. In this context, an agent is defined as a independent computational unit with a set of internal states and action rules;
an agent's future behaviour depends on its current condition, programmed rules, and the state of its environment (A.P. Engelbrecht, "Computational Intelligence: An Introduction", (John Wiley & Sons, 2002). (Multi-agent systems are finding widespread use in engineering and computer science applications, ranging from process optimization to computer vision, population modeling to industrial control; Engelbrecht provides a good introduction to this topic (A.P. Engelbrecht, "Computational Intelligence: An Introduction", (John Wiley &
Sons, 2002).
Prior art segmentation algorithms have one thing in common: they attempt to break a complex image into a set of smaller regions, where each region is homogeneous with respect to one or more parameters (e.g. contrast, intensity, texture) (N. Pal et al. Pattern Recognit 26:1277 (1993)). The effectiveness of each method varies depending on the size, texture, orientation, and shape of features contained in an image; no single approach will work well for every image (N. Pal et al. Pattern Recognit 26:1277 (1993)). In most cases, image sub-division is a two stage process-an image is segmented into smaller sections which are then clustered into groups based on some similarity metric (D.K.
Panjwani et al.
IEEE Trans Pattern Anal Mach Intell 17:939 (1995); S. Lee et al. IEEE Trans Image Process 14:312 (2005)); such as the split-and-merge or region-growing approach, recently used for tracking cells in diagnostic images (B. Prasad, at al Meas Sci Technol 16:909 (2005)). Unlike the prior art the system of the present invention does not involve agent reproduction or movement.
To parameterize scattering images detection of continuous intensity regions is needed, as well as characterization of the regions with respect to their spatial orientation within the image, their intensity profile, and their relationship to other parts of the image. This allows numerical representation of the low and high frequency intensity structures present in scattering images (as described above).
The system of the present invention provides the first computational system designed to comprehensively parameterize wide-angle scattering signatures. The system of the present invention is able to identify the overall structure and relationships present in a scattering image. The resulting parameterization scheme, built from the numerical characterization of intensity bands and independent intensity blobs, can be used in the identification of cellular morphology. This facilitates the rapid division of experimental samples into healthy and diseased categories for expedient medical diagnosis. The prior art has shown viable two-stage image segmentation systems: in stage one all salient pixels are labeled with one or more identifiers; in stage two all labeled pixels are clustered and grouped according to some similarity or congruency metric (A.K. lain at al. IEEE Trans Pattern Anal Mach Intell 18:195 (1996); S. Lee at al. IEEE Trans Image Process 14:312 (2005);
N. Pal et al. Pattern Recognit 26:1277 (1993); L.G. Shapiro and G.C. Stockman, Computer Vision (Prentice Hall, 2001); D.K. Panjwani et al. IEEE Trans Pattern Anal Mach Intell 17:939 (1995)). The present invention uses an additional stage to organize the clustered regions and extract a set of relevant parameters.
After performing an initial user-specified image size reduction, the first stage of the system of the present invention (Cythe) (feature detection) is responsible for creating and fixing a population of computational agents (A) to the salient features of the target image.
This stage effectively labels all the pixels corresponding to relevant intensity regions; an explanation of saliency determination will be presented in the following sub-section. The second stage of the present invention (feature clustering) is responsible for clustering the 5 fixed agent population (A) into a list of spatially distinct regional groups (G). The final stage of the system of the present invention (post processing) removes large-scale image noise, creates a band-like grouping structure from identified regions, and extracts a parametric representation (P) of the input data.
Detailed explanations of each stage are presented in the following subsections, which also 10 describe the parameterization equations and the four major algorithms used to implement the individual stages of the pipeline. These are: the agent_fixationO routine, which is responsible for fixing the agent population to the salient image features; the propagate_idO routine, responsible for clustering agents into connected groups; the scrub() routine, which removes image noise and erroneous groupings; and the joinO
15 routine, which joins groups into a band-like structure of super-groups.
DeJlnitions (Agent): An agent is a single computational unit that is assigned to a pixel or region of the image grid I. Each agent has a number of internal states and potential actions, and can alter these internal states and/or perform actions based on the information present in a 20 localized area of the image grid I.
(Agent Neighborhood): The agent neighborhood N is a n X n region of the image grid I
centered on the agent location x,,ye. This region determines where an agent will look for and communicate with other agents, see for example the "Neighbouring Region of an Agent" of Liu et al (J. Liu, et al. IEEE Trans Evol Comput 1:141 (1997)).
(Agent View Radius): The agent view radius R is a (2r+1) x (2r+1) region of the image grid I centered on the agent location xa,ya (Fig. 7, left). This area helps determine agent feature detection preferences, and the pixels within this area are used in the calculation of Average Pixel Intensity e= This is akin to the image region used in the "local binary pattern and contrast measure" of Ojala and Pietikainen (T. Ojala et al.
Pattern Recogn 32:477 (1999)) and the area used to acquire local stimulus by Liu et al. (J.M.
Liu et al.
IEEE Trans Pattern Anal Mach Intell 21 544 (1999); J. Liu, et al. IEEE Trans Evol Comput 1:141 (1997)) and Mirzayans et al (T. Mirzayans, et al. Neural Netw World 15:243 (2005)).
(Average Pixel Intensity): This value, denoted ,, is the average pixel intensity value that agent a observes within its view radius R. Average Pixel Intensity is equivalent to the "mean pixel value" component of Liu et al.'s texture definition, as used in their multi-agent feature detection routine (J. Liu, et al. IEEE Trans Evol Comput 1:141 (1997)).
Feature Detection The first stage of the Cythe parameterization pipeline takes the input image 1, scales it to user specified dimensions (u X v), renders the image as a two-dimensional array (X.L.
Wu, IEEE Trans Inf Theory 38:1755 (1992)), and creates an agent population A
equal in size to the number of pixels in the image grid. A single agent is assigned to every pixel in the image grid. These agents then use the information available in their local neighbourhood to detect features and sort themselves into regions; this is the standard approach used in most agent-based image processing systems.
During the feature detection stage of the pipeline, each agent calls on a fixation routine-agent_fixation() -to determine its immediate behaviour. When the fixation routine is called, the agent will perform one of two actions: an agent will affix to (and therefore identify as a salient region) a pixel at image grid location xa,y. if the pixel has an intensity value greater than the agent-computed average pixel intensity B., or, if this condition is not satisfied, the agent a will be removed from the agent population A. In this way, agents are able to detect salient intensity edges in the image I independent of differing background intensity values (i.e. a fixation routine based on an agent's relation to its "average pixel intensity" is an adaptive thresholding function, as described by Pal and Pal (N. Pal et at. Pattern Recognit 26:1277 (1993)).
After the entire agent population has been polled for fixation, only agents that reside on salient pixels will remain in the population (T. Mirzayans, et al. Neural Netw World 15:243 (2005)) (Fig. 7, right). To aid in effective region segmentation, the entire fixed agent population is scanned and all agents with more horizontal neighbours than vertical neighbours removed. As the intensity bands present in the scattering images are vertical (as discussed in the introductory section), this helps eliminate any horizontal intensity `bridges'; much like a horizontally-selective version of the "opening"
operator used in binary image analysis (L,.G. Shapiro et al. Computer Vision (Prentice Hall, 2001), the removal of these weakly-connected 'bridges' facilitates region discrimination (as shown by Wang and Yuan (Y. Wang et at. IEE Proc-Vis Image Signal Process 149:173 (2002)). As shown in Fig. 7, after the agent_fixationO routine, members of the agent population will be fixed on areas of high intensity relative to the local image texture (right - shown here for an agent view radius of r 1). This adaptive process detects edges independent of differing background levels. Pixel color indicates 8-bit intensity, from 0 (black) to 255 (white)An example of the removal process is shown in Fig. 8. In Fig. 8, solid circles indicate fixed agents and striped circles represent agents that will be removed, severing the connection between the two minimally-connected bands;numbers inside the pixels represent the ratio of horizontal to vertical neighbours within the 4-neighbourhood of a given agent (H:V).
Once the agent population has completely labeled all relevant pixels in the image grid, a clustering process-propagate _idO-takes over to form the population A into a set of spatially distinct regions G (i.e. it links all adjacent agents to create spatially-connected sub-regions). propagate_idO is a form of the classical connected components labeling algorithm (L.G. Shapiro et al. Computer Vision (Prentice Hall, 2001)), traditionally used to identify spatially-connected image features. Each time the propagate-idO
routine is called, a sweep is done over the entire agent population; each agent in the population polls all other agents in its local neighbourhood (N) for their current ID value.
Based on its initial scan, an agent records the highest ID value, idm0,, in its local neighbourhood. The agent then re-propagates the value id ax, to all neighbours with ID values less than idm.õ
and the receiving agents take on the new maximum ID value. The entire agent population is iterated through until no further ID changes are observed (T. Ojala of al.
Pattern Recogn 32:477 (1999)). At this point all agents in a separate physical region will share a unique ID number. A single iteration of the propagation process is shown graphically in Fig. 9.
As shown in Fig. 9, initially, the agent surveys its local neighbourhood and records the ID
values of its neighbours (left). Seeing there is a higher ID in the area (shown with striped circles), it takes on this ID value and re-broadcasts the new ID to its neighbourhood (right). This leads to an agent neighbourhood that is homogeneous with respect to the highest ID value.
It is important to note that ID propagation occurs in an agent's 4-neighbourhood (i.e. to agents left, right, above, and below the agent, but not on diagonal corners (L.G. Shapiro et at. Computer Vision (Prentice Hall, 2001)). This aids in band discrimination and removes additional band bridges. Due to the close horizontal proximity of bands in the scaled image I, it was found that communication within an agents full 8-neighbourhood could lead to a number of intensity regions being erroneously grouped into a single region.
Allowing diagonal communication between agents did not facilitate any useful connections beyond those gained through purely horizontal and vertical transmission.
Since evey agent starts with a unique 1D value, the clustering process guarantees that every connected image region will have a common unique identifier (L.G.
Shapiro et at, Computer Vision (Prentice Hall, 2001)); a set of agent groups (G) is formed, where each group (G) contains a list of spatially-connected agents (Ar) that share a common ID value (As is a non-overlapping subset of the initial population A).
As in previous work, a feature detection stage followed by a clustering stage is able to effect image segmentation. However, to utilize (and parameterize) the detected regions in the context of scattering image analysis, a third stage to organize and parse the segmentation results is required.
After the creation of homogeneous ID regions, several post-processing routines take over to remove high-level noise, join vertically correlated regions into a band hierarchy (i.e.
create super-groups out of related regions), and extract the final parametric representation of the input file. The first process--scrub searches through the list of agent groups G
and removes all groups (and their member agents) smaller than a given percentage of the image size from A and G respectively; the removal size threshold 0 can be empirically set by human users to match the input image conditions. This method of removing small connected objects was used by Prasad et at to eliminate background noise in their cellular tracking system (B. Prasad, et at Mews Sci Technol 16:909 (2005)).
Each group that survives the scrubO routine is then analyzed for its dimensions and center point (gx,gy). This effects a simple geometric characterization of all surviving groups in G.
Next, horizontally-related regions are connected into band-like structures using the joinO
routine, a simplified variant of the standard one-dimensional k-means clustering algorithm (R.O. Duda, at al. Pattern Classification (2nd Ed., Wiley Interscience, New York, 2001)).
As in the k-means algorithm, join() creates list of super-groups and assigns one or more image regions g to each super-group g' based on the horizontal distance d=Ixg.
- x5J
between the group center and the super-group center. Assignment occurs if d is less than a user defined threshold 0 (specified as a percentage of the image size), and each group may be assigned to only one super-group. A super-group's center is iteratively re-calculated based on the location of its member sub-groups. The join() process continues until every group has been assigned, clustering image regions with respect to their horizontal proximity. This allows the recognition of vertical bands in a scattering image while still retaining the detailed statistics of each individual sub-group. As such, join0 creates a region hierarchy out of the agent population which can be stored at minimal cost for later retrieval and parameter estimation.
In the last step of the Cythe pipeline, the super-group hierarchy is traversed and cross-referenced with the initial image I to extract a number of useful global parameters P
(shown in Table 2). These parameters describe the overall structure and inherent complexity (in terms of spatial frequency components) of the image 1, and are used to numerically represent the image features generated by light scattering through a biological cell (i.e. the number of regions, their size/shape, their relation to each other, and the variance of region width and intensity).
Table 2: The set of useful image parameters (P).
# Parameter Description I B The number of bands in the image 2 BSmin The minimum band spacing 3 BSmaz The maximum band spacing 4 BSavg The average band spacing 5 aBWms,, The average over all bands of minimum band width 6 aBW,H1 The average over all bands of maximum band width 7 aBWa,,g The average over all bands of average band width 8 aBWdt, The average over all bands of band width deviation 9 aBlm,o The average over all bands of minimum band intensity aBI The average over all bands of maximum band intensity 11 -aBla'e The average over all bands of average band intensity 12 aBldr,, The average over all bands of band intensity deviation 13 aBInõ The average over all bands of nearest-neighbour band intensity deviation While finding a direct correlation between scattering signatures and the initial model parameters of a FDTD simulation or the structure of a cell has been shown to be an unsolved problem (V.P. Maltsev, Rev Sci Instrum 71:243 (2000)), the parameters in P
5 allow us to infer structural information from the presence of intensity regions with varying spatial frequency. The knowledge that certain cellular structures will generate intensity regions of a given spatial frequency allows relationships to be made between the extracted image parameters P and the initial layout of the scattering source. From initial experiments, it is expected that there will be direct correlations between these parameters 10 and the underlying cell model parameters; it has been observed that this is true for relations between small organelle content and several a .$r/alWparameters.
In this case, each super-group g' extracted by the Cythe pipeline corresponds to a detected intensity band `b' in the scattering image. Based on previous work (K. Singh, et al Cytometry 69A:307 (2006)) and a series of FDTD simulation experiments (which demonstrate the presence of vertical intensity bands in scattering images) it is observed that it is most effective to use a band-based parameterization scheme. In this approach, the small high-frequency intensity areas resulting from smaller scattering centers are effectively described by variations to the width and intensity of existing intensity bands (i.e the presence and magnitude of high-frequency intensity fluctuation is indicated by changes to parameters 5- 13, Table 2). A similar parameter set could be created for images without an observable band-like structure.
These parameters are extracted from the final super-group hierarchy through a series of mathematical operations, shown in Egs.(4) and (5) below. Every detected super-group is analyzed with Eqs.(4) and (5), and the resulting values are combined into the set of parameters P. Width statistics are derived by iterating through the agent population, intensity statistics are derived by taking a single-pixel wide intensity sample down the vertical center line of each super-group, and band spacing statistics are generated by comparing the horizontal centers of all super-groups.
min' (y) and minx (y) are defined as the minimum and maximum angular values that still contain pixels belonging to band b at the vertical image position y. The function intensity (xb,y) is the 8-bit intensity value at the horizontal center point x of band b, at the vertical positiony. Set yb is the set of vertical values for band b. The functions wino, max0, and avg O are the standard minimum, maximum, and average operations performed on the list of values for a band. Band spacing (BS) is defined as the distance between the horizontal centers of two neighbouring bands; (xb - .b+11 Values for the maximum, minimum, and average band spacing are calculated using the standard operations.
fWa(Y) nt xx(y) - rninA (y) (EQ
4(a)) min(BWV(y), )' E Y1') (EQ
4(b)}
BWrb.,x = max(BWb(y), y e Yb) (EQ
4(e)) BW = avg(BWb(Y), y E Yb) R {EQ
4(d)) 0Wd'', Y, IBWb(y) ~-BW. $1 Y{EQ
4(e)}
BIb(y) = intensity (xb,y) {EQ
5(a)) min(BIb(y), y E yb) (EQ
5(b)) Br,. = max(glb(y), y E Yb) {EQ
5(c)}
B1.11 a = avg(B11(y), y E Yb) {EQ
5(d)) ' yri~ ~g(b(y) -131~,,8~ {EQ
5(e)) 5(f)) There is a dramatic increase in the amount of information available when the number of values in this extended parameter set is compared to the number of indicatrix parameters derived from one-dimensional scattering intensity slices. It is expected that this increase in parametric image information will lead to a corresponding increase in the predictive power of future classification systems. Intensity band relationships (such as band spacing BS, Table 2) can be used to predict the nature of larger cell structures (K.A.
Sem'yanov, at al. Appl Opt 39:5884 (2000)), while variations in region width and region intensity due to high-frequency image components (Parameters 5-13, Table 2) may be used to detect the presence and number of micro- and nano- scale cellular organelles (work in preparation).
The final step in any automated diagnostic system is a method to deduce cellular structure from the extracted scattering pattern parameters P. There are a number of potential machine learning approaches that can successfully associate a set of extracted parameters with an initial labeled data set to create a classifier with predictive power (R.O. Duda, et al. Pattern Classification (2nd Ed., Wiley Interscience, New York, 2001); M.
Witten ei al. Data Mining. Practical Machine Learning Tools and Techniques (Morgan Kaufmann, 2005)). A pattern recognition system based on the parametrization approach has been developed as described below.
Example 1: Simulation and Verification of Scattering Patterns Using mtPatterns, an extensive set of test images were created and compared (via qualitative observations and quantitative values such as characteristic blob spacing) to experimental images generated by a miniaturized cytometry device. These results and comparisons are presented below. In addition to a bank of test images with cell size and nuclear size values R. ~ 0.11tm-20.0 m and R; = 0.1 m-20.0 m (mt = 5-1000), additional tests were performed with the specific cell parameters for the Raji human cell line given by Su et al (Ro -=8.0 m, R; ~4.0}rm, mt --=83-677 (X.-T. Su, et al.
Optics Express 15:17 (2007))). This allowed exploration of the behaviour of mtPatterns over a wide range of realistic simulation situations and also compare simulations to actual experimental cytometry results for human Raji cells. Each test was performed at least three times with different random placement of organelles.
The following parameters were used in all mtPatterns simulations: an incident light wavelength of X = 632nm, a CCD receptive field area of U x V = 3mm x 3mm, with the scatter centroid centered d = 5mm below the CCD plane. These dimensions conform to those achievable in a physical experimental system, and can be seen to fulfil the XRD
constraints presented in the previous section. Shown in Fig. 2, this setup gives a viewable side-scatter region between 77.3 and 106.7 in both the 0 and cp axes (a solid angle of approx. 30 ). Two point-spread functions M(0(u,vIr), cp(u,vjr)) were used:
pure isotropic radiation and a theoretical power function for a single spherical scatterer (mitochondria-like, generated from the angular scattering profile presented by Gourley et al. (P.L.
Gourley, IEEE J Quantum Electron 11:818 (2005))). For these cases, scaling factor M(0(u,vlr), cp(u,vjr)) was specified as the image intensity (normalized to a unit scale) at position 0(u,vir) , gp(u,vlr). These point-spread functions are shown in Fig.
3 where, pure white represents an intensity scaling of 1.0, pure black indicates a scaling of 0.0 (i.e. no power transmission).
The following sections present a detailed qualitative and quantitative analysis of the mtPatterns algorithm, and show that a scattering simulation based only on the structural layout of small organelles can accurately produce valid experimental images without considering complex optical paths and the interaction of large cellular components such as the nucleus.
In XRD, analysis is typically done by calculating the Fourier transform of potential crystal structures and comparing them to experimental scattering patterns. This allows researchers to infer correct crystal structure (a famous example being the discovery of the helical nature of DNA). Later work is then sometimes done to try and reconstruct the phases of incident X-rays, potentially yielding the exact structure of the scatterer.
A similar process is possible for cellular scattering; the following results show the possibility for predictive assessment of cell structure (e.g. mitochondria number and distribution) based solely on the observed layout of experimental images.
Example 2: Comparison of mtPatterns and experimental cytometry images To evaluate the mtPatterns algorithm, mtPatterns results were compared to experimental wide-angle cytometry patterns-captured using the method described by Su er at.
(X.-T.
Su, el al. Optics Express 15:17 (2007))-from the laser scattering of human Raji cells.
This experimental data was scaled and cropped to the same side-scatter angular range as the mtPattems data (a solid angle of 74-106 ) and normalized to the same intensity and contrast levels. Estimated cell parameters from the experimental data (X.-T.
Su, et at.
Optics Express 15:17 (2007)) were used as input to mtPatterns to generate corresponding simulations. This process allowed for direct visual and numerical comparison.
The similarity between images may be visually judged by examining the shape and image structure of scattering images obtained from mtPatterns and from real experimental cytometers. Fig. 4, top, shows the experimental scattering signatures for two human Raji cells, taken using a miniaturized cytometer, with estimated cellular parameters of R4 --=8.01m, R, - 4.0 m, mt -83 - 677, as per the experiments of Su et al. (X.-T.
Su, et at.
Optics Express 15:17 (2007)). As shown in Fig. 4, orange and yellow marks on the bottom row indicate blob center points / spacing gaps used in blob spacing calculations.
The horizontal axis corresponds to changes in 0, vertical to changes in Y. The patterns show n in Fig. 4 are structurally similar to the pattern generated by mtPatterns (Fig. 4, top, right) when initialized with the average parameters of a Raji cell estimated by Su et al. (Ro = 8.0 m, Ri = 4.Q m, mt = 300).
It is extremely difficult to objectively compare scattering images in quantitative terms.
However, since the spacing of the scattering distribution relates to scattering pattern blob spacing (Eqs. 1,2), one effective way to numerically compare the similarity of scattering images is to evaluate the characteristic angular spacing between neighbouring scattering intensity blobs. This metric also can help guide the process of inferring experimental scattering structure from simulated images.
For the numerical comparisons presented here, spacing (in image space) between neighbouring maximum intensity regions (i.e. blob peaks) was measured as shown by the mesh in Fig. 4 and normalized to the angular range of each image to give a set of angular blob spacing values for each image. These spacing values were then used to compute characteristic spacing.
Using this metric, the experimental images of Fig. 4 were compared to the mtPatterns simulation generated using the corresponding parameters. This comparison can be seen qualitatively in Fig. 4, bottom, and quantitatively as follows. As described above, mtPattems was initialized with the estimated parameters of a Raji cell (Ro =
8.0 m, Ri =
4.0 m, mt = 300, as given by Su et al.). With these parameters, an average spacing between the maxima of intensity regions (i.e. blob centers) of 5.51 (Std.Dev.
1,51, 73 samples; Fig. 4, right) was observed. This compared well to the spacing values from the experimental cytometry images: 5.12 (Std.Dev. 1.47, 85 samples; Raji Cell A-Fig. 4, left) and 6.04 (Std.Dev. 1.46, 62 samples; Raji Cell B-Fig. 4, middle).
As such, the mtPatterns characteristic spacing value was observed to be within the range of values from the two experimental cells: spacing Raji Cell A < spacing mtPatterns <
spacing Raji Cell B). The slight variations in average characteristic spacing between the two experimental samples are likely due to changes in internal cell structure (such as organelle number and placement) and are well within one standard deviation.
As a back-of-the-envelope consistency check, it is possible to confirm that these observed characteristic spacing values relate to a reasonable range of organelle spacing values by examining what characteristic spacing values we would expect from a uniform crystal lattice with structural parameters similar to those of a Raji cell. As mentioned above, for the parameters of a typical Raji cell of 8 m radius with 83-677 organelles uniformly distributed between its 4 m nucleus and the cell wall, we would expect to find a characteristic organelle spacing in the range of 1.58 m (677 mt) to 3.19 m (83 mt). Using these values in Bragg's law from the methods section, this leads to an expected blob spacing angular range of 5.69 -11.51 The observed blob spacing results from our mtPattems simulation (5.51 ) and experimental images (5.12 , 6.04 ) fit at the lower bounds of this range, which is reasonable considering the simulations and experimental images of Fig. 4 are not really a uniform perfectly-diffracting lattice.
Deviations from the characteristic scattering peak spacing of the uniform lattice are expected due to the displacement of the cell nucleus and the non-uniform (e.g. randomly distributed, or closely aggregated) arrangement of organelles.
Lastly, it was observed that the experimental results (Fig. 4, left, middle) appeared to have more intensity toward the centre of the image. This background intensity could indicate the contributions of larger cell components and microstructures (e.g. the nucleus), light from the experimental setup, and/or the washed-out superposition of any non-uniform mitochondrial scattering. As these differences are primarily broad low-frequency features, such variations could be potentially detected, isolated and/or filtered out in the analysis process to facilitate comparison with simulated images.
The combination of the ability to calculate blob spacing and the ability to rapidly generate realistic simulated scattering plots with the mtPatterns algorithm facilitates the prediction or cell structure and cell diagnosis based on mitochondria spacing directly from an observed scattering image. Once it is possible to quickly simulate realistic scattering patterns, simple visual and numerical comparisons could be used to perform real-time classification of cytometrically interrogated cells. As shown by XRD, great leaps in structural assessment can be made by comparing experimental and simulated scattering patterns.
In addition, it should be noted that once organelle scattering has been isolated, feature extraction methods as described in the present invention may be used to identify and classify the scattering contributions of larger microstructures, giving further information about the cell as a whole or providing context for organelle analysis.
Example 3:Non-uniform mitochondrial scattering As shown in Fig. 5, there was little difference between mtPatterns scattering images created using isotropic scatterers (Fig. 5A) and scatterers with the more complex point-spread functions characteristic of larger mitochondria (Fig. 5B). Increased scatterer spacing can be seen to lead to a decrease in scattering pattern blob spacing (e.g. 2A, 3A).
In terms of scattering blob placement and spacing, there is little difference between uniform (A) and non-uniform (B) mitochondrial scattering. To allow visual comparison, the images in Fig. 5 have been normalized with respect to the same minimum and maximum intensity values (i.e. 0-255), and cover the same solid angle (77.3 -106.7 in 0 and (p) Forward scatter is toward the right. Experiments showed little structural variation (in terms of scattering blob structure) between large-angle scattering patterns generated using a uniform point-spread function (Fig. 3, left) and those generated by radiators using the point-spread function described by Gourley et al. (Fig. 3, right). This was expected, as the point-spread function of a non-isotropic mitochondria is almost identical to an isotropic point-spread function when observed over a solid angle of approximately 30 , centered on the side-scatter region. As such, the use of isotropic scattering points to simulate mitochondria proved to be a valid first-order approximation in the large-angle scattering domain, Further justification can be posed in terms of Fourier convolution theorem.
Given that an individual (larger) mitochondria has a scattering pattern in isolation of M(S), we can infer that it has a complex structure in the spatial domain m(r) (i.e. the inverse Fourier transform of M(S)). M(S) can be quite complex in the case of molecules, or a simple low-order function for atoms (N. Kasai and M. Kakudo, X-Ray Diffraction by Macromolecules (Springer, New York, 2005)). Since a population of scatterers is described by p(r), one can see that for the case that they are larger mitochondria with similar complex scattering patterns-the actual structural make-up of the population will be the convolution p(r) m(r) (e.g. an array of complex individual structures). Using the convolution theorem, this will result in the multiplication of the Fourier transforms of F{p(r)} and F
{m(r)} in the scattering domain as follows (N. Kasai and M. Kakudo, X-Ray Diffraction by Macromolecules (Springer, New York, 2005)):
Atrue(S) = A(S)M(S) {EQ 61 In physical terms, this will be equivalent to the multiplication of a point-spread function (e.g. Fig. 3, right) with the result of an isotropic scattering simulation.
This is what one can see in Fig. 5, where row B is simply the pixel-by-pixel multiplication of row A by the appropriate angular slice of point-spread function Fig. 3, right. Once known, the scattering of an individual large mitochondria may be added or removed from an image as needed to reveal the underlying scattering structure. This result is known from XRD (N.
Kasai and M. Kakudo,X-Ray)fraction by Macromolecules (Springer, New York, 2005)).
Example 4: Observations on the effect of scatter distribution and interaction Fig. 5 also shows that an increase in the number of scatterers lead to increased image complexity (i.e, smaller blobs and smaller characteristic spacing). In addition, an increased scatterer distribution radius R. also lead to increased image complexity. A
tightly arranged organelle distribution (e.g. smaller organelle containing volume, such as Pig.
5, A2) resulted in larger homogeneous intensity regions with greater characteristic spacing, while wider distributions (such as Fig. 5, A5) generated a number of small, tightly spaced intensity regions.
This behaviour is expected from Fourier theory and X-ray diffraction literature, as described above. As in Fourier theory, the spacing between-and size of-intensity regions on the simulated CCD region were inversely proportional to the spacing of the original organelle distribution. The more dense the scattering distribution, the more distance between blobs in the scattering plot (Fig. 5). These trends will be useful in future feature-based classification systems, as image feature complexity can help deduce initial cell structure, whether by direct comparison of features or phase reconstruction techniques.
In the limit of a small, very tightly packed set of scattering points, the generated scattering pattern was very similar to that expected from a single scatterer located at the distributions centroid-irrespective of the number of mt, for very small scatterer volumes the scattering signatures washed out into a single bright intensity blob with no complex structure, though the un-normalized intensity of the blob increased in proportion to the number of scatters.
This result is expected from scattering theory; as R-0 A, scattered phases will align and all scattering contributions with be uniformly constructive regardless of scatterer number.
This could be used as a useful predictor of the raw number of organelles found within a given structural plot, allowing the classification of both the organelle spacing (through blob spacing) and number (through raw blob intensity) of scattering organelles from a single scattering plot.
The average mtPatterns simulation took between a fraction of a second to about a minute on a standard personal laptop computer running the python interpreter.
Example 5: Application of Analysis Functions Two testing methods were employed to verify the validity of the Cythe system:
qualitative image analysis, and a quantitative statistical breakdown. For the qualitative analysis, the system was presented with images representative of all three cellular scattering cases described herein (e.g. intensity bands with a number of randomly placed intensity regions, as in Fig. 6 right). Varying levels of high-and low-frequency intensity variation may be present in a single image, leading to complex, information-rich image structures (right). Due to the difficulty surrounding quantitative segmentation analysis, statistical analysis was performed on images containing the first two scattering cases (intensity bands and bands with interference). This is explained below.
In both cases the test images closely matched experimental scattering patterns (K.
Singh, et at Cytometry 69A:307 (2006)) and numerical FDTD simulations (C. Liu, et al.
JBiomed Opt 10:014007 (2005)), both visually and in the magnitude of the output parameters.
In an ideal testing environment FDTD simulations would be used in conjunction with experimental data to verify the success of the segmentation system. However, to numerically analyze system accuracy it is necessary to identify the 'true' segmentation and parameterization of experimental data. As 'true' image boundaries are subjective in all but the simplest segmentation problems, most segmentation evaluation methods rely on qualitative boundary assessments for comparison values (7.K. Udupa, et al.
Comput Med Imaging Graph 30(2):75 (2006)); the few attempts at true quantitative evaluation typically rely on correlation data, and still involve comparisons with a manual (i.e.
human) segmentation (N. Pal et at Pattern Recognit 26:1277 (1993); S. Lee at al. IEEE
Trans Image Process 14:312 (2005); J.K. Udupa, et al. Comput Med Imaging Graph 30(2):75 (2006)).
To quantitatively verify the validity of the Cythe extraction pipeline a mathematical model was used to create a set of viable test images. These images contained a fixed number of vertical intensity bands of varying intensity and width, irregularly placed high-frequency intensity components, intensity band overlap, differing background levels, blurring, and poorly defined band boundaries (i.e. qualities we observed in experimental scattering images). Unlike manually measured experimental scattering patterns, these model images were numerous and provided a well defined set of 'tnie' parameter values (derived directly from the generated mathematical image model) with which to statistically validate Cythe's parameter extraction.
As bands are represented in our test images by smooth intensity gradients with no discrete edges, the 'true' band width parameter BW'(y),,,,r was measured as the horizontal distance between band points where the pixel intensity was 80% of the band's maximum intensity, relative to a black background. This width most accurately reflected observations about real scattering band width. Because of this approximate edge value definition, the validation data for band width parameters is slightly less precise than for other parameters, as seen below.
These quantitative test images contained a more regular distribution of high-frequency intensity components than was found in experimental images or our qualitative analysis images; high-frequency intensity regions were randomly placed only on intensity bands, as in image cases relating to geometric and Mie scattering. This additional regularity was needed generate reliable true values for band parameterization---images containing a completely random distribution of high-frequency regions (as expected from Rayleigh scattering) would suffer from the same subjective evaluation problems as real experimental data.
Thus, each quantitative test image consisted of a varying number of Gaussian intensity regions superimposed on a series of vertical intensity bands. Like real scattering patterns, the test images contained bands of varying width and maximum intensity that were placed at intervals across a black background. The intensity profile of individual bands, the size and orientation of Gaussian intensity regions, and the variation of maximum band intensity across the image were picked to match the intensity profiles expected in actual scattering images. finally, a 5 by 5 Gaussian blur was applied to the images to smooth out any unrealistic intensity transitions.
These test images were then presented to the Cythe system for analysis. Each test image was processed by the full computational pipeline (i.e. feature extraction, feature clustering, and post-processing) to produce a set of output parameters (P,,,h,), Another set of parameters were derived directly from the mathematical model used to generate the test images; these parameters P,.a,irepresented the 'true parameter values' used in the creation of the test images. The true parameters P,,,l were then compared to the parameter values extracted by our pipeline P,,the (i.e. how well they demonstrated a correlated linear relationship that allowed accurate prediction of the true parameter values).
Both the true parameter set P,&al and Cythe parameter set Pyh, included all thirteen parameters outlined in Table 2. As explained in the previous section, this band-based parameterization scheme (calculated with Equations 4 and 5) can be used to represent the influence of both large scattering structures and nanostructure-derived high-frequency intensity variation.
To assess the pipeline's ability to detect changes in band width and band intensity, these tests were performed on 162 sample images. Two different test sets were generated. The first set (T,:143 images) was used to determine the system's ability to detect variation in band width and band intensity-parameters primarily influenced by the presence of smaller scattering sources. In this set the number of intensity bands was held constant while the number of Gaussian intensity regions present in the image was varied between zero and fifty. The second set (Tz:19 images) was used to test the system's ability to detect changes in band structure and spacing, which relate to scattering from larger microstructural cellular objects. In test T2, the number of intensity bands was varied between two and twenty, while the number of Gaussian regions inserted into the image was held constant.
After each test set the Cythe parameter extractions were compared to the true parameters.
System success was determined by measuring how closely the Cythe parameters matched the true parameters, as evaluated with a range of statistical tests for correlation and similarity (described in the following section).
This procedure is similar to the comparison metric of Bovenkamp et al., where the plot of human v.s. machine solutions was compared to a unity slope to determine accuracy (E.G.P. Bovenkamp, et al. Pattern Recogn 37:647 (2004)). In the absence of any methods to objectively compare and evaluate segmentation schemes J.K. Udupa, et al.
Comput Med Imaging Graph 30(2):75 (2006); N. Pal et al. Pattern Recognit 26:1277 (1993);
S. Lee at of. IEEE Trans Image Process 14:312 (2005); L. Bergman, et of. Image Vision Comput 23:417 (2005)), this approach allowed a quantitative characterization of system success.
In addition to these quantitative test images, we performed a series of qualitative tests on a range images of containing features from all three scattering image cases (e.g. Fig.6, right). This was done to determine the system's ability to remove region bridges, detect regions in noisy images, and join detected regions into a structural hierarchy. For these tests, a test series was generated consisting of images with vertical intensity bands, randomly distributed high-frequency intensity regions, and vertical intensity bands overlayed with a pattern of randomly distributed high-frequency regions of irregular shape and size.
For these tests the scrubO and joino thresholds (0, 0) were set to 0.0018% and 0.028%
respectively. The agent view radius was R=5 X 5, and the agent neighbourhood was N=3 X 3. Images were reduced to 1=125 X 125, These values were empirically derived on a small subset of the images, and subsequently used on the full set of test images without modification; one parameter setting performed well for an entire family of images.
As shown in Fig. 13, the Cythe pipeline was able to affix the agent population (A) to the vertical intensity bands in the test image. Region color was assigned based on each group's unique ID value; all regions were verified to contain distinct ID
values. A visual comparison of the Cythe labeling (Fig. 13, right) with the initial image (Fig.
13, left) showed that the Cythe extraction matched quite closely with our expectations from test image. It was also observed that the system was able to detect the presence and magnitude of width variations in the band structure (Fig. 13, bottom row). In addition to being able to detect linear bands, Cythe was able to detect small, arbitrarily-shaped intensity regions of varying brightness (Fig. 13, middle). As shown by the difference between Fig. 13 left and right, Cythe was also able to detect the level of high-frequency variation present in images containing high-frequency components that overlap a pattern of vertical intensity bands. This observation further supports the efficacy of the band-based parameterization scheme 1?. Random intensity regions (like those expected from Rayleigh scattering) were indicated by width and intensity deviations within the detected band structure-their intensity contributed to and noticeably altered the shape of existing bands.
It was observed that Cythe was able to remove parameter-degrading horizontal intensity bridges and use the clustering stage to group the agent population (A) into a set of distinct regions (G). The removal of horizontal bridging can be seen in Fig. 14, and the ability to form a population into spatially connected regions can be seen by the homogeneous region colors in Figs. 13 and 14. In Fig. 14(Left) - the initial test image. In Fig.
14(Middle) - the agent population directly after the agent_fixationO routine; there are three bridges at this point. In Fig. 14(Right) - final region identification after post-processing.
Weak connections between bands did not adversely affect region identification-the two horizontal bridges were removed in the feature detection stage, and the diagonal propagation restriction prevented ID leaking over the remaining bridge (which was subsequently removed by the scrub() routine). The dots represent fixed agents (middle), and different shades in the clustering image indicate spatially distinct regions (right).As shown in the difference between the two agent populations in Fig. 14 (middle and right), it was found that horizontal bridges less than three pixels in width were removed during the feature detection stage . In addition, the use of a 4-neighbourhood for communication in the feature clustering stage prevented distinct bands from being classified as a single region due to any remaining weak connections (Figs. 13 and 14).
The joinO routine constructs a set of vertical bands g' out of the detected image regions G.
For noisy images this process would not be possible without prior use of the scrubO
routine to filter out small unconnected intensity regions. Figure 15 illustrates the use of the joinO and scrub() routines in the creation of a vertical band structure for simple images with and without 10% of the images pixels assigned a random 8-bit intensity noise value (i.e. random or independent noise, as expected from dust on a lens or bad CCD
pixels). Lines indicate band position (xg), and differently shaded regions in the post-processing image indicate spatially distinct regions g. While portions of the agent population affixed to noise-related pixel clusters (Fig. 15, bottom middle), the scrubO
routine removed these small groups and the pipeline identified the same regions found in the noise-free image (Fig. 15, right). In addition, the detected regions were joined into the same band structure for both the noisy and noise-free image (as shown by the number and horizontal position of the yellow vertical lines, Fig. 15, right). This leads to the same parameters being extracted for both the noisy and noise-free images. Similar performance was observed for Poisson/couting noise, though high levels of Gaussian noise required the use of a larger scrub threshold due to larger detected noise regions. A join threshold of 0=0.08 was used for the tests in Fig. 15.
In addition to accurately parameterizing the model test images, Cythe was able to extract realistic parameters for a large set of FDTD scattering simulation images containing many arbitrarily-shaped randomly-distributed high-frequency intensity regions, as derived from complex cell structures with varying physical characteristics and organelle distributions (work in preparation).
The parameters extracted by Cythe from the test images allowed accurate prediction of the true parameters P,,,i extracted from the initial test images. This was statistically determined by calculating the correlation coefficient (r - the amount of covariance in the two populations, a good indicator of segmentation accuracy (N. Pal et al.
Pattern Recognir 26:1277 (1993))), the statistical significance of the correlation (P(r) -the probability that correlation value r could have arisen by pure chance for a given sample size), the chi-squared significance P(x2) - the probability of both input and output variables coming from the same distribution), and the standard error (SE) for each population of input/output variables, all calculated as per J.R. Taylor, An introduction to error analysis (2nd Ed., University Science Books, Sausalito, California, 1997; or through equations disclosed herein.
This comparison is shown in tabular form for tests T1 (band intensity parameters, Table 3, and band width parameters, Table 4 and T3 (band number/spacing parameters;
Table 5).
These tables present the statistics for an image reduction size of 1125 X 125.
From statistical theory (r.R. Taylor, An introduction to error analysis (2nd Ed., University Science Books, Sausalito, California, 1997)), r values greater than 0.216 (test TT, 143 samples) and 0.561 (test T2, 20 samples) indicate a statistical correlation (i.e. a probability P(r)<0.01$ that the correlation score could have originated by pure chance).
These threshold r values are based on the sample sizes used in our experiments. As shown in Tables 3-5, our derived values are consistently greater than these minimum values for statistical correlation. Similarly, chi-squared significance values approaching p(y2) = 1.00 indicate no difference in the distribution of input and output values.
The uncertainty in each parameter was estimated by adding Poisson/counting noise (i.e.
each pixel was varied according to a normal distribution equal to the square root of the pixel value) and processing the resulting image by the same method as the test data sets.
This was done for 56 images, allowing the extraction of a standard deviation that then allowed the calculation of chi-squared significance values.
As described above, the parameters shown in Tables 3 and 4 are used to characterize the intensity contributions from smaller Mie and Rayleigh scattering sources, while the parameters in Table 5 characterize the intensity contributions from larger Mie and geometric scattering objects.
Table 3. Statistical analysis for band intensity parameters.
Parameter Description r P(r) PO
aBl,,v Avg. Band Intensity Average 0.992 <0.0001 1.000 aBlmin Avg. Band Intensity Minimum 1.000 <0.0001 1.000 aBJmax Avg. Band Intensity Maximum 0.998 <0.0001 1.000 aBldeõ Avg. Band Intensity Deviation 1.000 <0.0001 1.000 aBl,,,, Avg. Band Intensity Minimum (NN) 1.000 <0.0001 1.000 *nearest neighbour Table 4. Statistical analysis for band width parameters.
Parameter Description P(r) P() aBWQõg Avg. Band Width Average 0.386 <0.0001 1.000 aBWj, Avg. Band Width Minimum 0.872 <0.0001 0,528 aBWm Avg. Band Width Maximum 0.724 <0.0001 1.000 aBWd,Y Avg. Band Width Deviation 0.907 <0.0001 0.286 Table S. Statistical analysis for band number/spacing parameters.
Parameter Description r P(r) F(() B Number of Bands 1.000 <0.0001 1.000 BS,,,, Minimum Band Spacing 0.995 <0.0001 1.000 BSmc Maximum Band Spacing 0.993 <0.0001 1.000 BSd, Average Band Spacing 1.000 <0.0001 1.000 With regard to the system's ability to correctly identify deviations in band intensity (test TI), we found that Cythe was able to identify the magnitude and variance of intensity to a high degree of certainty. At an image size of I=125 X 125 Cythe was able to correctly identify the number of bands in every test image. The input and output intensity parameters showed strong correlation, as indicated by the r, P(r), and P(X2) values (Table 3). Standard error for these intensity parameters was less than half an intensity step on a 8-bit intensity scale.
The close relationship between input and output parameters was also evident for band width and band width deviation parameters (aBWm;õ/mar/avg/dev). as shown by the values in Table 4. As explained in the previous section, difficulty defining the 'true' width values in the test images lead to greater variability in the evaluation statistics r and xz. While width statistics (Table 4) did show lower correlation between input and output values than the other parameters (Table 3, Table 5), all other values represented an excellent fit. Width values were still well above the thresholds for chance correlation, as indicated by r >
0-261, P(r) << 0.01. Despite having a high degree of correlation, the parameters aBWd,,õ
and aBW,,,,,, exhibited a low P(x2), and further investigation showed that this deviation in input/output distribution similarity was due to a shallow (i.e < 0.5) regression slope between the input and output parameter sets. Considering the lack of 'true-value' precision when quantitatively analyzing spatial parameters in this situation, the set of width statistics in Table 4 sufficiently demonstrated a distinct relation between the actual layout of the test images and the Cythe parameter extraction.
In addition to band width and intensity parameters, the system was able to accurately determine the number and spacing of bands (test T2). As shown in Table 5, the correlation coefficient (r) for each band-structure parameter approached 1.0 (i.e perfect correlation).
This indicates a one-to-one correspondence between the input parameters Põor and the output parameters P,,,h,. For the parameters BS,,,,,, BSmax, BS,,g there was a standard error of less than 1.1% of the image width for both reduction levels. There were no band number (B) identification errors in test T2, and the chi-squared significance test for all parameters in Table 5 indicated no statistical difference between the input and output parameters.
With respect to the magnitude of observed values from Equation 4, a typical range of 0.72-9.6 pixels was found for aBW;,,fõ,,;,~,.a, and 0.0-0.99 pixels for aBWde,=. For Equation 5, intensity parameter values were typically between 127.2-157.2 for aBjmp,fimar/arg, 0.59-4.29 for aBJ,,,,, and 0.26-72.7 for aBJ,.. Band spacing parameters varied greatly depending on the number of detected intensity bands in a sample image; for our tests we found spacing parameters between 6.46-14.8 (Test Ti) and 6.48-72.2 pixels (Test T2). The standard deviation of parameter values observed under conditions with counting noise and random noise was much less than the total parameter range observed during these tests.
As the size of images presented to the system increased (with the Agent View Radius being held constant), Cythe began to identify small erroneous bands within the larger regions of the image. At an image size of 1=150 X 150 the pipeline incorrectly identified one extra band in 67 of the 143 T, tests images. This lead to a noticeable decline in correlation values, and incurred a corresponding increase in standard error.
It is apparent that the relationship between image size and Agent View Radius plays a role in feature detection; this will be discussed in the following section.
Edge detection is by its very nature a local undertaking (N. Pal et al.
Pattern Recognit 26:1277 (1993)) and thus lends itself well to an agent-based framework. By determining fixation based on an adaptive local threshold (p,,, the average intensity value within an agent's view radius R), Cythe was able to effectively label all edges irrespective of the differing background intensity levels found in scattering images. By setting the adaptive threshold level greater than the local average (as in the agent axa~,onO
routine), the system consistently labeled the high-intensity side of all edges, isolating the band regions from the lower-intensity background.
Using an Agent View Radius of R=5 X 5$, we found 1= 125 X 125 to be the most appropriate size for image reduction. At this image size and view radius, the fixation routine was able to accurately divide the image into spatially distinct regions regardless of differing background levels and gradient slopes. The distance between identified band edges was small enough that the two edge-labeling agent populations for a given band connected along the center of their band. This allowed bands to be detected as continuous units in both our model test images and complex FOTD scattering simulations.
We found that it was important to select a view radius close to the size of target image features in the reduced image I. The two edge regions of a single band may not connect if the Agent View Radius is significantly smaller than the band size. This lead to the identification of extra bands by the clustering stage. By varying the size of the view radius (i.e. the adaptive thresholding region (N. Pal et at Pattern Recognit 26:1277 (1993)) to match the image reduction level, feature extraction remains accurate at any image size (though increased image size comes with an increased computational cost, as described below). This follows from recent work in image saliency detection and model matching (L. Itti, et al. IEEE Trans Pattern Anal Mach Intell 20:1254 (1998); V.
Navalpakkam et al Vision Res 45:205 (2005)).
The propagate-idO routine was a reliable and effective way to cluster the labeled pixels into contiguous regions. This routine, which was based on the connected components labeling algorithm commonly used in region identification problems (L.O.
Shapiro et al.
Computer Vision (Prentice Hall, 2001)), provided a simple way to identify groups of connected agents. Much like the self organizing tile behavior shown by Ghrist and Lipsky R. (Christ et al. "Gramatical Self Assembly for Planar Tiles", in Proceedings of International Conference on MEMS, NANO and Smart Systems, W. Badawy and W.
Moussa, eds. (Banff, Alberta, 2004), pp. 205-211), our system was able to effectively perform long-range organization through simple local interactions. In addition, the distributed approach lends itself well to parallelization-one of the major advantages of multi-agent systems (A.P. Engelbrecht, "Computational Intelligence: An Introduction", (John Wiley & Sons, 2002)).
The removal of band bridging (as described herein) was essential in the accurate clustering of spatially distinct regions. The agent fixation stage eliminated direct horizontal communication over bridges by eroding bridging agents (shown in Fig. 14), while the use of an agents 4-neighbourhood for communication prevented ID propagation over any remaining (weak) junctions between band protrusions. Without the removal of band bridging it was impossible to successfully parameterize complex images. A
similar restriction on the union of weakly-connected regions has proved effective in other segmentation and image identification situations (Y. Wang et al IEE Proc-Vis Image Signal Process 149:173 (2002); L.G. Shapiro et al, Computer Vision (Prentice Hall, 2001)). The assumption that there will be no strong horizontal links between bands follows from the structure of experimental scattering images and our understanding of cellular scattering mechanisms.
Using an agent neighbourhood of size N=3 X 3 further prevented erroneous ID
propagation between distinct bands. By only allowing communication between adjacent agents, ID information was not able to travel over gaps between neighbouring intensity bands.
Due to the nature of the local interactions and the multiple sweeps through the agent population, the ID propagation routine was found to be the largest computational component of the parameterization pipeline. As the proper choice of agent view radius and image reduction size decreased the final size of experimental images to approximately 1=125 X 125, scalability was not an issue for our application. In the case of larger images, the use of a union-find structure (described in L.G. Shapiro at al, Computer Vision (Prentice Hall, 2001)) in the connected components algorithm would greatly reduce the number of iterations through the agent population (though it would require a higher level of centralized control).
The joinO routine was found to be an effective way to model the structure inherent in scattering images. It has been shown that changes in the relationships between full bands are indicative of large changes in cellular structure (K.A. Sem'yanov, et al.
Appl Opt 39:5884 (2000)). By linking several vertically aligned regions to a single band structure, we were able to analyze the relationship between whole band units while still retaining specific information regarding the variation present in each band and its associated sub-regions.
As shown in the results section, the joinO routine managed to consistently group smaller regions into cohesive bands, even in the presence of noise. Noisy images were divided into the same number of bands (i.e. super-groups) as noise-free images (pig.15). This is important for the parameterization stage of the pipeline; band discrimination plays a large part in the calculation of band-based statistics, which in turn contain vital information about the nature of the scattering source.
The scrubO routine helped the parameter extraction process by removing any large noise regions that remained after the initial image reduction. By keeping the scrub threshold low, large features were still preserved (e.g. Fig. 15) while small agent clusters were rejected as noise; this parameter can be tuned to the specific nature (ambient noise and feature size) of the images under analysis. However, it should be noted that setting the scrub level too low can cause erroneous band identification, whereby bands of small pixel mass may appear near the edges of each real band. Extra bands will distort the extracted parameter space and should be avoided.
With regard to the selection of the variables for joinO and scrubO (i.e B and 0): these values are derived empirically based on observations regarding the size of noise artifacts inherent in a scattering image and the approximate size and frequency of bands in the image. Once selected, the parameters performed effectively on the entire test set. If significant changes are made to the ambient background noise or the angular range of a scattering image, the system threshold levels may need to be re-calculated.
While there are a number of other potential frequency analysis methods (such as Fast Fourier Transforms) which could be used to ascertain spatial frequency information from a scattering image, the Cythe parameterization routine allowed the extraction of a related structure of image regions within the context of a scattering situation (effectively embedding frequency information within an interpretable band-like structure).
This relationship information is of use to human observers, both for validating the extracted parameter data and for comparing results with those previously published in the cell scattering literature (which generally reference the number and size of bands, or the angular location and span of intensity band maxima).
Image size reduction was found to play an important role in both the generalization of region boundaries and the rejection of low-level noise, as it influences the degree of abstraction applied to the input image (L. Itti, et al_ IEEE Trans Pattern Anal Mach Intell 20:1254 (1998)). A similar reduction approach is used in saliency-based vision systems to detect high-level features in natural scenes, where detail (and the associated noise) is sacrificed to rapidly form an accurate structural impression of the image (1... Itti, et al.
IEEE Trans Pattern Anal Mach Intell 20:1254 (1998)).
Appropriate choice of image reduction size depends on the size of the Agent View Radius, and the number, spacing, and width of intensity bands within an image. A large reduction to an image with very narrow bands or important high-resolution features could merge independent intensity regions, or render some relevant features undetectable.
Failing to reduce an image with wide bands could lead to erroneous band detection or extra computational cost / increased run times. It was observed that disparity between image reduction size and Agent View Radius either lead to the identification of too many small intensity regions (e.g. when the true aBW R) or the grouping of many distinct initial regions into a small number of larger features (e.g. when the true aBW,,,,g>>
R).
Image reduction was also essential for manageable run times, as un-reduced experimental cytometer images are typically greater than 700 pixels on a side. At an image reduction size of Y = 100 X 100, an entire pipeline run took approximately six seconds.
At a reduction size of I = 300 X 300 or larger, runs lasted two minutes or more.
All performance tests were conducted on a Pentium IV desktop computer. The entire Cythe pipeline and all related routines were implemented in the Python programming language.
Cythe is expandable and may be readily adapted to new scattering situations;
once intensity regions are segmented and numerically represented (as in the group list G) it is possible to test for any number of spatial relationships. In addition, the pipeline should be robust to variations in expected image structure. By adjusting the post-processing settings, agent view radius, and image reduction size, Cythe can be made to detect intensity regions with greatly varying geometric properties. Furthermore, due to the local nature of the parameter calculation equations, slight band curvatures should not adversely affect parameter extraction.
In the event that images with different (e.g. non-vertical) spatial relationships need to be analyzed, the joino routine may be modified or replaced to create a different region hierarchy, and orientation-related changes may be made to the band bridge removal process. This would also allow identification of randomly-placed intensity regions, such as would be generated by a field of Rayleigh scatters, without imposing a band structure on the intensity data. The additional analysis of intensity region perimeter and area would allow further distinctions to be made between differing region types (e.g.
independent regions and full bands). This would facilitate the parameterization of heterogeneous images consisting of horizontal bands, tightly grouped region clusters, blobs arranged without a band structure, or any other arbitrary cluster shape.
To this end, Cythe can be used within other applications, including the identification of geometric objects in natural scenes, and the detection of bright fluorescent regions during the genetic analysis of cell populations (P.M. Pilarski, et at, "An artificial intelligence system for detecting abnormal chromosomes in malignant lymphocytes", in Proceedings of Canadian Society for Immunology, Annual Conference (Halifax, Canada, 2006), pp.
126.) The final goal of the Cythe system is the classification of biological samples based on light scattering. Our preliminary results have shown that cell classes (e.g. those with features indicating cell health or malignancy) typically reside at the extremes of the possible parameter space (work in preparation). The difference between cell classes appears to be much greater than the variation due to noise, such as from imperfections in a fluid wave-guide or CCD. Thus, based on the parameter deviation indicated from random and counting noise (as herein), measurement noise should only moderately detract from Cythe's classification ability and the correlations we observed between input and output parameters.
Example 6: Cell Cytometry Integration A set of FDTD test images were presented to the system of the present invention (Cythe).
This system returned a numerical parameterization of the scattering intensity image in terms of scattering region shape and variability. These parameters were then analyzed for information content; a selection of these extracted parameters were given to a simple multilayer pereeptron, which produced a prediction of cellular organelle content.
Many groups have shown that the use of 1 DTD simulations can produce realistic scattering signatures that match well with experimental results; their work has proved that such simulations are a reliable indication of light scattering from actual scattering sources (R. Drezek, et al. Applied Optics 38:3651 (1999); R. Drezek, et al. Optics Express 6:147 (2000); A.K. Dunn Light Scattering Properties of Cells Phd, University of Texas at Austin, 1997; C. Liu, et al. JBiomed Opt 10:014007 (2005)). A bank of tests were conducted on a "library" of FDTD scattering simulation images of varying complexity (generated using the method of Liu et al. (C. Liu, er al. JBiomed Opt 10:014007 (2005)).
The full 'library' of test images contained thirty-eight scattering simulations which covered a spectrum of 2pm spherical cells with varying nuclear sizes (between 0.6 pm and 1.6 pm), nuclear positions (0 pm to 0.4 Irm from the center point), organelle content (0 to 50 organelles of size 0.3 pin), and random organelle distributions (five distinct positional random seeds). For these tests, the following assumptions were made: a nuclear index of refraction of 1.39, an organelle index of 1.40, a cytoplasm index of 1.38, an external medium index of 1.334, and an incident laser wavelength of 632.8nm. All cellular components were modeled as spheres, speeding FDTD computation and allowing effective pattern validation with Mie theory; the scattering from more detailed cellular geometries will be the subject of future FDTD studies.
By analysing using the system of the present invention over a wide range of random distributions and nuclear orientations it is believed that any extracted parameter correlations should be largely invariant to rotated cells and/or randomly distributed cellular components, a problem that has attracted a fair amount of attention in the literature (C. Liu, et al. JBiomed Opt 10:014007 (2005)).
Of these parameters, it is the average minimum band width (aBWiõ ), the average band width deviation (aBWd ,) and the average nearest neighbour intensity deviation (aBJ,,,,) were the most relevant for determining organelle content in experimental images. The formulae for these parameters are shown in Equations 1, 8 and 9, modified from pixel based form to use discrete angular units. In these equations, b represents an individual intensity band, a indicates the set of possible 0-values for a band, BWb(4)) indicates band width (in degrees, along the 9 axis) at angle 0, while BWb(4) indicates the intensity value at the band center at angular height 0.
OBIõõ a avgy{ ~4 EI316(d) - DI6( " 1)i}
{EQ 71 aD6Y,~v = avgy{ 1~ E iDWb(~) - avg(DWb)i}
~~ me s {$Q 8) aBW.,i,, = av9b{min(BWb(0), ¾ E fib)) (EQ 9) A simple multi-layer perceptron was used to show how two image parameters can be used to classify a cell based on its scattering image. The classifier was generated using the standard WEKA data-mining toolkit (I.H. Witten et al. Data Mining. Practical Machine Learning Tools and Techniques (Morgan Kaufmann, 2005)). The most successful classifier consisted of a two node network; the first node used a sigmoid activation function and took as input the weighted parameters aBl,,,, and aBlda,,, and a threshold value.
The second node of the system used a linear activation function and took the weighted output of the first node and a threshold value to generate the final classification value.
Training lasted 500 epochs with a learning rate of 0.005 and a momentum term of 0.005;
training time was less than 0.05 seconds. One potential layout of this simple network is shown in Figure 10.
Ten-fold cross validation (i.e_ CV-10, (R.O. Duda, et al. Pattern Class1cation (2nd Ed., Wiley lnterscience, New York, 2001)) was used to verify network performance on the full training library. Each parameter set was evaluated based on its correlation coefficient (lie how well changes in true value matched changes in the output prediction), mean absolute error (average error in the number of predicted organelles), root mean squared error, and relative absolute error (in percent) (R.O. Duda, et al. Pattern Classification (2nd Ed., Wiley Interscience, New York, 2001); I.R. Taylor, An introduction to error analysis (2nd Ed., University Science Books, Sausalito, California, 1997)).
Using the system of the present invention it was possible to derive a set of parameter/cell-structure correlations that held for a full library containing a range of nuclear positions, random organelle distributions, cell sizes, and nuclear sizes. When performing tests on the entire library of randomly distributed cells with varying nuclear size and position, it was observed that the number of organelles in the simulated cell was related to the average value across all bands of the maximum band width (aBW,,,rõ) and the variability of band width (aBWa ). In addition, a correlation between the number of organelles in a simulated image and the average value across all bands of the band nearest-neighbor intensity deviation (aBW,,,,) was observed. These trends can be seen in Figure 11. An example of raw Cythe region detection for simulation images without and without organelles is presented in Figure 12.
By combining the information contained in all three parameters, it is possible to use a multilayer perceptron to accurately infer the number of small scattering in the cell. The multilayer perceptron was tested on the entire library of FDTD scattering images using standard 10-fold cross validation (R.O. Duda, et al. Pattern Classification (2nd Ed., Wiley Interscience, New York, 2001)). Several combinations of image parameters were tested; a selection of these tests are shown in Table 6.
A simple network based on the parameters aBWd,, and aBI,,,, achieved the best performance with a correlation coefficient of 0.9839, a mean absolute error less than 2.6 predicted organelles, and a relative absolute error of only 14.05% (Table 6).
This performance is based on 10-fold cross validation of the 38 item image library.
Classifier accuracy is expected to improve dramatically with a larger sample library (such as from actual patient samples).
The classifier accuracy is observed to decrease when other parameter combinations were used (e.g. aB1õõ and aBWõ), and when more or less than two parameters were given to the multilayer perception. This can be seen in Table 6. Experiments showed that the parameter aBI,,,, contained the most useful information regarding organelle content. This can be seen, in part, by the low uncertainty on datapoints in Figure 11, top.
The inclusion of small organelles lead to an increase in high-frequency intensity variations in a two-dimensional FDTD scattering plot, which was detectable in several Cythe image parameters. As they are based on high-frequency intensity variation, the parameters aBI,,,,, aBWvpr and aBWm, all proved useful in organelle content prediction.
rot an analysis system to be valid on a completely unknown (experimental) test instance, it must be able to detect changes in the parameter of interest despite potential cell rotations and random organelle placement. The library of test images contained a diverse set of cellular structures, and the system showed robust behaviour with respect to random organelle distribution, nuclear size, and nuclear placement (Figure 11).
All of the FDTD tests were performed on cells of 4 im diameter. In the event that an extracted of set analysis rules is not independent of cytoplasm size, a viable alternative is to determine a rule set for each range of cell sizes and apply the correct set based on high-level cell analysis (such as the raw number of intensity bands detected or the spacing between intensity bands (K.A. Sem'yanov, et al. Appl Opt 39:5884 (2000))_ A
larger library of simulations and experimental results will increase the range of detectable intracellular structures, and will dramatically increase the accuracy of the generated classifiers.
Fig. 16, and 17 show the scattering simulations for increasing numbers of simulated mitochondria (Fig. 16) and distribution patterns (Fig. 17). These simulations were generated by a three-dimensional array of isotropic scattering points, which is a rapid and accurate approximation to typical (time consuming) organelle scattering simulations and observed experimental scattering patterns. Single dimensional FFT analysis of mitochondrial distribution (Fig. 18), simulated increasing organelle number (Fig. 19) and FDTD simulated mitochondrial content (Fig. 20) was performed. As shown in Fig.
18, each plot is the super-average of six different images of differing random placement seed along both axes, and high frequency power increases with the radius of scattering distribution. Fig. 19, shows for both a nuclear distribution (left) and large-radius distribution (right) and each plot is the super-average of six different images of differing random placement seed along both axes. Fig. 20 shows both low and high y-axis resolution (left and right, respectively) and stratification of plots in the high frequency region of the 0 axis (from 0,15, 20, 25, 30, 40 and 0 mitochondria, bottom to top) indicate the ability to determine the mitochondrial content directly from scattering images (bottom right). Some stratification is also present for the 0axis, to a lesser degree (top right).
Of note, there is a noticeable qualitative difference in the scattering patterns with changing organelle content and distribution (Figs. 16 and 17) and a noticeable high-frequency FFT
power stratification due to changing FDTD organelle content (Fig. 20);
however, at lower spatial frequencies there is little noticeable power change due to the number of organelles, though changes to organelle distribution are immediately apparent (Fig. 19).
While particular embodiments of the present invention have been described in the foregoing, it is to be understood that other embodiments are possible within the scope of the invention and are intended to be included herein. It will be clear to any person skilled in the art that modifications of and adjustments to this invention, not shown, are possible without departing from the spirit of the invention as demonstrated through the exemplary embodiments. The invention is therefore to be considered limited solely by the scope of the appended claims.
Claims (6)
- WHAT IS CLAIMED IS:
l. A system for analyzing mammalian cells comprising A source of coherent electromagnetic illumination;
A receptacle for at least one mammalian cell wherein the coherent electromagnetic illumination may impinge, upon said at least one mammalian cell;
- At least one detector capable of receiving scattering information arising from the reflection, transmission, refraction and diffraction of coherent electromagnetic radiation through said at least one mammalian cell;
- Means to convert scattering information from said at least one detectors to a digital signal; and - A digital computation device in digital communication with said means to convert scattering information from at least one detectors to a digital signal, wherein said digital computation device fixes an agent population for the salient image features; clusters agents into connected groups; removes image noise and erroneous groupings and joins groups into band-like super groups - 2. A method to simulate the scattering patterns of biological cells comprising an algorithmic method that considers the scattering contributions of a three-dimensional array of scatterers and uses this information to formulate detailed large-angle two-dimensional scattering plots.
- 3. The method of claim 2 wherein the scatterers are cellular organelles present in a mammalian cell.
- 4. The method of claim 3 wherein the organelles are mitochondria.
- 5. A method for analyzing laser scattering images from a wide-angle, two dimensional cytometer comprising the fixing of an agent population for the salient image features; clustering of agents into connected groups, removing image noise and erroneous groupings and joining groups into band-like super groups.
- 6. A method for the analysis of scattering profiles arising from the reflection, transmission, refraction and diffraction of coherent electromagnetic radiation through a mammalian cell comprising the steps of directing electro-magnetic radiation at said cell;
receiving the reflected, transmitted, refracted and diffracted radiation, and applying computer processing methods to the received radiation intensity patterns..
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US87647806P | 2006-12-22 | 2006-12-22 | |
US60/876,478 | 2006-12-22 | ||
PCT/CA2007/002321 WO2008077245A1 (en) | 2006-12-22 | 2007-12-24 | Novel methods for cellular image analysis and simulation |
Publications (1)
Publication Number | Publication Date |
---|---|
CA2707015A1 true CA2707015A1 (en) | 2008-07-03 |
Family
ID=39186186
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CA2707015A Abandoned CA2707015A1 (en) | 2006-12-22 | 2007-12-24 | Novel methods for cellular image analysis and simulation |
Country Status (2)
Country | Link |
---|---|
CA (1) | CA2707015A1 (en) |
WO (1) | WO2008077245A1 (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8873827B2 (en) | 2012-06-29 | 2014-10-28 | General Electric Company | Determination of spatial proximity between features of interest in biological tissue |
CN112184696B (en) * | 2020-10-14 | 2023-12-29 | 中国科学院近代物理研究所 | Cell nucleus and organelle counting and area calculating method and system thereof |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6067157A (en) * | 1998-10-09 | 2000-05-23 | University Of Washington | Dual large angle light scattering detection |
-
2007
- 2007-12-24 CA CA2707015A patent/CA2707015A1/en not_active Abandoned
- 2007-12-24 WO PCT/CA2007/002321 patent/WO2008077245A1/en active Application Filing
Also Published As
Publication number | Publication date |
---|---|
WO2008077245A1 (en) | 2008-07-03 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Berry et al. | Functional summaries of persistence diagrams | |
Buchner | Collaborative nested sampling: Big data versus complex physical models | |
Novaković et al. | Evaluation of classification models in machine learning | |
Cogan et al. | Jet-images: computer vision inspired techniques for jet tagging | |
Theera-Umpon et al. | Morphological granulometric features of nucleus in automatic bone marrow white blood cell classification | |
Sabino et al. | A texture approach to leukocyte recognition | |
Wu et al. | Application of a self-organizing map to identify the turbulent-boundary-layer interface in a transitional flow | |
US11747258B2 (en) | Holographic characterization of protein aggregates | |
Lynch et al. | Parameter estimation of neuron models using in-vitro and in-vivo electrophysiological data | |
Pala et al. | Fractal dimension-based viability analysis of cancer cell lines in lens-free holographic microscopy via machine learning | |
Gupta et al. | Bat-inspired algorithm for feature selection and white blood cell classification | |
Saturi et al. | Histopathology breast cancer detection and classification using optimized superpixel clustering algorithm and support vector machine | |
Guttorp et al. | Bayesian inference for non-Markovian point processes | |
EP2332073B1 (en) | Shape parameter for hematology instruments | |
CA2707015A1 (en) | Novel methods for cellular image analysis and simulation | |
Mukherjee et al. | A deep learning framework for classifying microglia activation state using morphology and intrinsic fluorescence lifetime data | |
KR101913952B1 (en) | Automatic Recognition Method of iPSC Colony through V-CNN Approach | |
Jeffries et al. | Analysis of flow cytometry data using an automatic processing tool | |
Ribeiro et al. | Breast cancer diagnosis using a neural network | |
EP3528057A1 (en) | Portable diagnosis test and use thereof | |
CN116642819B (en) | Method and device for identifying cell population | |
Dahal et al. | Detection of the Diabetic Retinopathy by Classical-Quantum Transfer Learning | |
Dong et al. | Object identification by marked point process | |
Pouli et al. | Markov Random Fields | |
Robison | Imaging White Blood Cells using a Snapshot Hyper-Spectral Imaging System |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
FZDE | Discontinued |