WO2022229638A1

WO2022229638A1 - Method & apparatus for processing microscope images of cells and tissue structures

Info

Publication number: WO2022229638A1
Application number: PCT/GB2022/051072
Authority: WO
Inventors: Anthony SINADINOS; Aarash SALEH; Kyriel PINEAULT; Uta Griesenbach; Eric Alton
Original assignee: Imperial College Innovations Limited
Priority date: 2021-04-27
Filing date: 2022-04-27
Publication date: 2022-11-03
Also published as: GB202106025D0

Abstract

A computer implemented method of identifying cell boundaries in a microscope image of DAPI stained cells, the method comprising: obtaining digital image data comprising a first colour channel of a microscope image of cells; identifying, based on the digital image data, locations of cell nuclei in the microscope image; determining, based on the locations of cell nuclei, a distance map corresponding to the microscope image and indicating a distance from each location in the map to a nearest nucleus; identifying, based on the distance map, for each of the locations of the cell nuclei a region surrounding the each location comprising pixels which are closer to the corresponding nucleus than to any other, thereby to provide a region map comprising a plurality of such regions; determining, for each region in the region map and based on the distance map, a cell boundary surrounding said corresponding nucleus; and, modifying the cell boundaries based on autofluorescence data corresponding to the microscope image.

Description

Method & Apparatus for Processing Microscope Images of Cells and Tissue

Structures Technical Field

The present disclosure relates to immunofluorescence and immunohistochemistry, and still more particularly to the analysis of microscope images of cells, such as those found in tissue samples. The present disclosure provides digital image processing techniques having particular application in the analysis of pulmonary tissue, such as that taken from the human lung.

Background

DAPI, or 4',6-diamidino-2-phenylindole, is a fluorescent stain that binds strongly to adenine- thymine-rich regions in DNA. It is used extensively in fluorescence microscopy. As DAPI can pass through an intact cell membrane, it can be used to stain both live and fixed cells, though it passes through the membrane less efficiently in live cells and therefore provides a marker for membrane viability.

DAPI is a popular nuclear counterstain for use in multicolor fluorescent techniques. Its blue fluorescence stands out in vivid contrast to green, yellow or red fluorescent probes of other structures. DAPI stains nuclei specifically, with little or no cytoplasmic labeling. DAPI may be used as a counterstain for immunofluorescence when green (FITC) or red (Texas Red) fluorescent marker is used.

Immunofluorescence and immunohistochemistry involve the use of antibodies to detect and localise proteins and other antigens in biological samples. Tissues and other samples are prepared by a process call “fixation”. As a result, cells or tissues retain cellular antigen distribution and preserve its cellular morphology.

Samples are subjected to a permeabilizing process exposing the antigens which are usually not accessible, thus allowing antigen detection within cells and tissue structures. Protein localization and quantification methods are widely used in research and diagnostics for, but no limited to, respiratory diseases.

Whole-slide images of tissue samples are rich in information and are used as typical diagnostic tools in immunohistochemistry and immunocytochemistry. Tissue image analysis, when performed correctly, can result in the generation of tissue-derived readouts that are precise and highly reproducible. Such analysis may require cell counting, and an identification and segmentation both of individual cells and structures within tissues such as epithelia in lung tissue.

However, the process of manual identification and counting of cells could easily be influenced by visual bias which leads to human error, and the counting process can be laborious. The use of digital images and computer implemented methods may enable relevant information to be extracted in an objective fashion, leading to results of improved precision and reproducibility. This may reduce or even potentially eliminate human bias.

It is a problem to measure pulmonary tissue samples in a high-throughput and an unbiased manner.

Summary

Vectors can be used for introducing genes into cells to treat disease. Non-viral vectors can be used for this purpose. A non-viral vector has been used to treat cystic fibrosis by introducing a corrected gene into pulmonary cells. Cystic fibrosis is a good candidate for such treatment because it is believed to arise from a single point mutation. The main difficulty in such treatments is the reliability of getting the relevant gene into the relevant cells. Non-viral vector systems have been found to have some issues.

The purpose of a vector is to take genes to a particular cell type. Viral vectors protect and shuttle genes into target cells. They use displayed surface proteins to recognise specific molecules on their target cell. So called “pseudo-typing” relates to the modification of the displayed protein to change the target cell type for a particular virus - in other words, to cause the virus to bind to a particular chosen cell type. It has been found that particular pulmonary cells can be targeted using a particular pseudo-type. Advantageously, it may also be possible to put different genes into cells to treat other diseases. These may relate to diseases of the lungs, but it is possible also to use pulmonary cells to generate proteins to treat diseases arising in other areas of the body.

The present disclosure finds particular application in the monitoring, validation, and administration of such treatments. Embodiments of the disclosure may thus provide computer implemented methods of monitoring uptake of gene therapy in pulmonary cells, and tissue structures of the lung. Such methods may comprise segmenting tissue structures and/or identifying cell boundaries according to the methods described herein thereby to provide quantitative technical measures by which the progress/implementation of such therapies can be assessed. In addition to the cell types described herein, other types of cells, and other tissue structures may also be monitored using the methods and apparatus of the present disclosure.

The present disclosure relates to a computer image processing method that aims to detect and demarcate from each other single cells that have been spun onto a microscope slide which have been stained for immunohistochemistry/immunofluorescence (antibody-based protein detection).

The present disclosure also relates to a computer image processing method that aims to detect and demarcate tissue (epithelial) regions in thin tissue cross-sections, which have been stained for immunohistochemistry/immunofluorescence (antibody-based protein detection).

In an aspect there is provided a computer implemented method of identifying cell boundaries in a microscope image of DAPI stained cells, the method comprising: obtaining digital image data comprising a first colour channel of a microscope image of cells; identifying, based on the digital image data, locations of cell nuclei in the microscope image; determining, based on the locations of cell nuclei, a distance map corresponding to the microscope image and indicating a distance from each location in the map to a nearest nucleus; identifying, based on the distance map, for each of the locations of the cell nuclei a region surrounding each location, said region comprising pixels which are closer to the corresponding nucleus than to any other, thereby to provide a region map comprising a plurality of such regions; determining, for each region in the region map and based on the distance map, a cell boundary surrounding said corresponding nucleus; and, modifying the cell boundaries based on autofluorescence data corresponding to the microscope image.

There is also provided a computer implemented method of identifying airway epithelium in a microscope image of DAPI stained cells pulmonary tissue samples, the method comprising: obtaining digital image data comprising a first colour channel of a microscope image of cells; identifying, based on the digital image data, locations of cell nuclei in the microscope image; determining, based on the locations of cell nuclei, a distance map corresponding to the microscope image and indicating a distance from each location in the map to a nearest nucleus; identifying, based on the distance map, for each of the locations of the cell nuclei a region surrounding the each location comprising pixels which are closer to the corresponding nucleus than to any other, thereby to provide a region map comprising a plurality of such regions; determining, for each region in the region map and based on the distance map, a cell boundary surrounding said corresponding nucleus; determining, based on a second colour channel of the microscope image, a total cell area, and restricting the cell boundaries to the total cell area.

It will be appreciated in the context of the present disclosure that a map corresponding to the microscope image may comprise a plurality of pixels, each pixel of the map corresponding to a pixel, or contiguous group of pixels, in the microscope image (and microscope image data).

For example, the distance map may comprise pixels each of which spatially correspond to a respective corresponding pixel of a mask - such as a pixel at that same location in the mask. The mask may be generated from the image data and may identify the nuclei locations as the “foreground” areas of the mask. The pixels of the distance map may each comprise a value indicating a distance from the corresponding pixel in the mask to the nearest nucleus in the corresponding mask, for example this may comprise the distance from the nearest edge of the nearest nucleus. Likewise, the region map may spatially correspond to the microscope image and/or the mask identifying nuclei locations in the same way.

Identifying nuclei in the above aspects may comprise applying a threshold to generate a mask in which the nuclei are distinguishable from the rest of the mask - e.g. they are “foreground” pixels, and the rest is background. The mask may be binary. The threshold which generates the mask may be based on one of (a) a measure of entropy and (b) a measure of fuzziness in the image. One such threshold is a so-called Huang threshold. Determining a cell boundary surrounding said corresponding nucleus may comprise watershedding the distance map in each region of the region map. The cell nuclei locations may be used as the seed locations for this watershedding, and the watershedding in each region may be constrained by the region boundary. In other words, constrained such that the watershed line cannot be further from the nucleus than the region boundary. The cell boundary obtained from said watershedding may be dilated to obtain a maximal estimate of the cell boundary location, for example this may provide an estimate of the largest possible cell boundary associated with the nucleus.

This dilating may comprise a gradient flood. The dilating may be done by expanding the boundary by a selected margin - e.g. a selected distance from the nearest nucleus. This may be done based on the distance map, for example based on a gradient representation of the distance map.

The auto fluorescence data may be obtained from a different colour channel of the microscope image than the first colour channel, for example the first colour channel may comprise the blue colour channel, and the auto fluorescence data may be obtained from the green colour channel or from the total intensity data or from any other indication of autofluorescence.

Modifying the cell boundaries based on autofluorescence data may comprise applying a mask generated from image intensity in the green channel.

The second colour channel may comprise a channel configured to show contrast from a stain such as fast-red. The total cell area may be determined by thresholding of such a fast-red colour channel.

The present disclosure also provides a computer implemented method of identifying airway epithelium in a microscope image of DAPI stained cells pulmonary tissue samples, the method comprising: obtaining digital image data comprising a first colour channel of a microscope image of cells; identifying, based on the digital image data, locations of cell nuclei in the microscope image; identifying, based on the locations of cell nuclei in the microscope image a plurality of regions of interest, ROI, determining, for each ROI, a measure of the number density of nuclei in the ROI; obtaining, for each ROI, autofluorescence data for the ROI; and determining whether the each ROI comprises airway epithelium based on the measure of number density and the auto fluorescence data. The measure of number density may comprise an indication of where regions of relatively greater density of nuclei may be located in the image, for example to identify regions which are more dense with nuclei than some threshold, which may be based on a measure of the nuclei density in the image as a whole or at least a part thererof. The method may comprise merging regions of a mask identifying said locations of cell nuclei to cause formation of contiguous regions from neighbouring nuclei, which are closer to each other than a minimum selected distance. This may be implemented using morphological operations, and/or filtering operations having a structuring element or filter kernel adapted to provide merging of nuclei which are closer to each other than this selected minimum distance. Once neighbouring nuclei have been merged, ROIs may be formed, and the method may comprise selecting band ROIs. Each band ROI may comprise an area encompassing a prospective (e.g. candidate) epithelial structure formed from merged nuclei. At this stage, the foreground pixels within each ROI represent nuclei-associated pixels, over several area- coverage thresholds - for example, within the main processing loop (starting and looping over the initially identified ROIs), a first logical check may select for ROIs that have a mean pixel value over a selected threshold value, such as 95 (on an 8-bit scale, the maximum mean equals 255, when all pixels within the region are white).

The method may comprise expanding (e.g. increasing the thickness of) each such an ROI to provide a band of at least a selected minimum thickness.

Obtaining the digital image data in the methods described herein may further comprise operating a microscope, and a digital image capture device coupled to the optics of the microscope to capture the microscope image.

Embodiments provide a computer program product comprising program instructions configured to program a processor to perform the method of any preceding claim.

Embodiments provide a computer system, configured to perform any one or more of the methods described and/or claimed herein.

An embodiment provides a computer implemented method of digital image processing configured to segment areas and quantify molecular signals from typical experimental pulmonary samples. Embodiments provide novel algorithms regarding pixel manipulations.

Other embodiments are envisaged as will be appreciated by the skilled addressee in the context of the present disclosure.

Brief Description of Drawings Embodiments of the disclosure will now be described in detail with reference to the accompanying drawings, in which:

Figure 1 shows a computer apparatus according to the present disclosure;

Figure 2 comprises a first flow chart illustrating a computer implemented method of identifying cell boundaries in a microscope image of DAPI stained tissue;

Figure 3 is a second flow chart illustrating a computer implemented method of identifying and segmenting cell nuclei in a microscope image, which method may be used in implementations of the method illustrated in Figure 1;

Figure 4 is a series of images indicating the processing flow associated with the method of Figure 3;

Figure 5 shows an example of some digital microscope images obtained from a cytospin slide, and the subsequent image data and processing results obtained in methods of the present disclosure;

Figure 6 is a series of images indicating the processing flow of a method for use in identifying epithelia in images of pulmonary tissue samples; and

Figure 7 comprises a third flow chart illustrating a computer implemented method of identifying epithelial regions, e.g. in whole-tissue cross-sectional images of pulmonary tissue samples.

Specific Description Figure 1 shows a biological imaging apparatus, for obtaining digital image data corresponding to a microscope slide.

As illustrated in Figure 1 , this apparatus comprises a microscope 200 having a microscope stage 202 for supporting a slide 203 in a field of view of the microscope optics 204. Coupled to the microscope optics for capturing a digital image of the microscope slide, is image capture device 206, such as a digital camera. Communicatively coupled to the image capture device is data interface 208, and communication interface 212 such as a network or serial BUS or other appropriate communication means. The communication interface 212 connects the data interface 208 to a data store 210, and to the computer 214. The microscope 200, is configured to capture microscope images of a slide on the stage 202. The image capture device 206 comprises electronic image capture devices arranged to receive optical signals from the microscope optics, and to convert these signals into digital image data. The data interface 208 provides the digital image data to the communication interface 212, from where it can be stored, temporarily or permanently (e.g. in non-volatile storage) in the data store 210, or in a volatile memory of the computer such as RAM or a cache . The computer 214 is thus operable to obtain the image data either locally, or from the data store 210, or via the data interface from the microscope 200. Any one or more of such methods may be used.

In operation, the computer 214 may operate the microscope apparatus to capture microscope images to obtain corresponding digital image data, e.g. via the communication interface 212. The computer 214 then implements one or more of the methods described below with reference to the accompanying drawings.

It will be appreciated in the context of the present disclosure that the methods described herein may serve to improve both the processing speed, and reliability and accuracy of the quantitative image processing results provided by the computer 214 based on the microscope images captured by the microscope 200.

Figure 2 illustrates a flow chart of a computer implemented method of identifying cell boundaries in a microscope image of DAPI stained cells.

The method operates on digital image data, obtained 10 from a resource such as the data store 210 or the digital camera 206 coupled to a microscope objective for obtaining digital images of cells. The cells typically comprise pulmonary cells, such as those obtained from a cytospin preparation in which a cell suspension is centrifuged to isolate cells of interest for analysis. The cells of interest may be concentrated from the suspension by such techniques and deposited onto an area of a microscope slide for imaging. Typically a single colour channel is selected for initial processing. The example illustrated in Figure 5-A shows a cytospin preparation of human nasal epithelial cells, which was used to validate the methods of the present disclosure. In this micrograph, nuclei are stained with DAPI, vector-RNA is labelled with Fast-red chromogen, and the green channel is used for background (inc. autofluorescence) and, optionally, GFP detection.

In the methods described herein, once the digital image data has been obtained 10, the computer identifies 12 the locations of cell nuclei in the image. Most often, particularly in DAPI stained images, the blue colour channel is used for identifying cell nuclei as illustrated in Figure 5-A. As a preliminary step, the computer applies a noise reduction filter to reduce image noise if necessary. Suitable noise reduction filters may also be configured to preserve edges in the original image. Examples of such denoising filters include rolling ball filters and/or median filters, but other techniques to reduce background noise power may also be used. To identify the cell nuclei the computer then applies a threshold to the denoised data to provide a binary map using a threshold selected so that the cell nuclei are in the suprathreshold regions. One example of an appropriate method is to choose a threshold based on the fuzziness or entropy of the image. One type of fuzziness based thresholding is known as a Huang threshold (see Image Thresholding by Minimizing the Measures of Fuzziness; Liang-Kai Huang and Mao- Jiun J. Wang; Pattern Recognition, Vol 28, No 1, pp41-51, 1995).

The thresholded data can then be used to identify 12 the locations of the cell nuclei based on this binary map, for example based on the locations of supra threshold regions. Selection of the suprathreshold regions to identify 12 nuclei locations may be done in a variety of different ways. One useful approach is described below with reference to Figure 3, but other methods may be used. The result is a digital map which corresponds to the original image data.

The next step is that, for each pixel in this map, a distance is calculated indicating the distance from that pixel to the nearest cell nucleus in the map. This provides a distance map.

The distance map is then processed by the computer to provide 16 a region map. This region map comprises a plurality of regions, each surrounding a corresponding one of the cell nuclei locations. Each region in the map comprises only the pixels which are closer to the cell nucleus in that region than to any of the other cell nuclei. These regions may be referred to as Voronoi regions, and the region map may be referred to as a Voronoi tessellation. Methods of determining such regions will be apparent to the skilled addressee having read the present disclosure.

The computer then steps through each of these regions, and taking each region in turn determines 18, within each region, the position of a maximal cell boundary - that is to say a boundary within the region and centred on the nucleus of that region. To do this, an initial estimate of the cell boundary position is determined by a gradient flood watershed of the distance map. As will be appreciated by the skilled addressee in the context of the present disclosure, such a method may first determine a gradient map based on the (e.g. Euclidean) distance map, e.g. a gradient representation of the distance of each background pixel from a nucleus (e.g. in the case of an 8-bit integer scaling 1 pixel away from a nucleus = value of 254 [255-1] of course a different scaling may be used).. The computer then takes each nucleus as a seed for the watershed of this gradient map (the locations of the nuclei having been identified by the computer in the preceding step 12 of the method). The resulting watershed lines represent the segmentations between nuclei. For adjacent (or even originally joined) nuclei, this boundary may split the nuclei (if they have been individually seeded in the preceding steps). Because they divide adjacent nuclei even though they are not a true estimate of the cell boundary in a strict sense, these watershed lines may be referred to as an initial estimate of cell boundary.. The computer then dilates the boundaries obtained by watershedding the nuclei mask. This may comprise dilating the watershed lines by a selected dilation factor to generate the maximal cell boundary. This dilation factor may be chosen based on the data and/or based on characteristics of the tissue. A distance threshold (which may be heuristically defined) is used to highlight any background that is up to a selected distance (e.g. 20 pixels) away from a nucleus (remembering that the gradient map has allocated new values to background pixel based on their linear distance from a nearest nucleus).. The resultant dilated nucleus-mask is the “maximal cell boundary”. This may provide an indicator of the largest possible cell boundary which could reasonably be associated with each nucleus. Advantageously, constraining the watershed using the regions generated from the distance map (e.g. Voronoi regions, see above) may prevent the dilated watershed lines from adjacent regions from merging.

The computer then generates a mask from the original microscope image, optionally using a different colour channel than the channel used to identify the nuclei - for example the green channel may be used, or the combined intensity of all channels, to provide an indication of the autofluorescence. A mask generated from such autofluorescence data corresponding to the original microscope image is combined 20 with the maximal cell boundary data to identify regions which are both (a) within a maximal cell boundary, and (b) suprathreshold in the autofluorescence mask.

It has been found that this provides an accurate identification of the cell boundaries for detecting transduced cells. The computer may then apply the cell boundary masks to measure the intensity of the image in another colour channel. For example, the computer may determine the intensity of the fast-red colour channel (or optionally GFP) in each cell. This may enable the computer to identify, and to count, cells in which a supra-threshold intensity is measures in such a colour channel and/or those in which a sub-threshold intensity is measured. It will be appreciated in the context of the present disclosure that more sophisticated processing may be applied, not just a binary threshold. For example a number of different thresholds may be applied, for example to group cells into “high”, “medium” and “low” groups or other groups. Other approaches for cell classification and counting may be applied once the boundaries have been identified in this way. A pictorial representation of the process flow associated with Figure 2 is provided in Figure 5-B. The results of such processing are indicated in Figure 5-C. As noted above, the computer may determine the locations of the cell nuclei based on a binary map, which may be obtained from a thresholding method such as a Huang threshold.

Figure 3 indicates one method by which the computer may automatically segment cell nuclei. The input to this method is a binary image, which may be generated 122 from the thresholded blue colour channel data as described above (e.g. using a Huang threshold or other appropriate threshold). Other types of nucleus map may also be used. This may be pre- processed 120 to reduce background noise (e.g. using a median filter as also explained above). The computer first applies 124 an operation to the mask data to remove small groups of pixels from the “foreground” data - e.g. suprathreshold regions of the mask. This may be done by a morphological “opening” operation. As will be appreciated by the skilled addressee in the context of the present disclosure, an opening operation typically comprises an erosion followed by a dilation, and in most cases the same structuring element is used for both the erosion and the dilation. In the present case the structuring element used for this opening operation may comprise a nearest-neighbour threshold set to 2 pixels, such that the erosion (changing white pixel to black) will only be applied when there are 2 or more white pixels surrounding the pixel and the dilation (changing a black pixel to white) will only be applied when the pixel is surrounded by 2 or more black pixels. The structuring element is important here to effectively shape the outcome of the opening operation. In this case, as well as effectively separating barely touching nuclei, the opening parameters help to ‘round’ harsh edges of each nucleus, which lends itself well to the subsequent mask subtractions.. This provides an “opened” version of the input mask, labelled item A in the process flow shown in Figure 4.

The computer then applies an erosion configured to protect ‘straighter’ structural elements of the image. This may comprise applying a morphological filter having a straight elongate structuring element - e.g. a nearest-neighbour threshold set to 4, and repeated 7 times. Advantageously such straighter elements tend to represent boundary points between nuclei. One way to do this is to erode 126 the “opened mask”, in this second erode operation the nearest neighbour count used in the erode operation may be higher than the nearest neighbour count used in the erode operation of the of the “opening” of the preceding step 124. This provides an eroded mask, labelled item B, in the process flow shown in Figure 4.

The computer then subtracts 128 the eroded mask from the opened mask to provide a first edge mask. This first edge mask is labelled item C in the process flow shown in Figure 4.

The computer then applies 130 an edge detection process to the “opened” version of the input mask A. This may be done using a convolution kernel configured to identify edges or by any other appropriate edge detection algorithm such as may be apparent to the skilled addressee having read the present disclosure. This generates a second edge mask, labelled item D in the process flow shown in Figure 4.

The computer then subtracts 132 the second edge mask from the first edge mask to provide an edge difference mask, labelled item E in the process flow shown in Figure 4.

The edge difference mask is then eroded 134, this may be done using a nearest-neighbour threshold method, comprising a threshold of 8 pixels to remove isolated single pixels. The mask is subsequently dilated (e.g. using a nearest neighbour value of 2), and then skeletonised to provide a skeleton mask, labelled item F in the process flow shown in Figure 4.

The computer then sums 136 the edge difference mask and the skeleton mask, and skeletonises the sum of these two masks to provide a skeleton edge mask, which is labelled item G in the process flow shown in Figure 4. This skeletonising may be done by a convolution kernel such as a single pixel uni-directional dilation.

This skeleton edge mask is then subtracted 138 from the “opened mask” to provide an intermediate mask. The preceding steps of the method 124, 126, 128, 130, 132, 134, 136, 138 are then applied 140 to this intermediate mask to obtain a mask from which the locations of the nuclei can be identified. The map itself may provide these locations, or measures such as a “centre of mass” or centroid of the suprathreshold regions in the map may be used. Other approaches to provide such a map and/or to identify the nuclei locations in the map may be used in step 12 of the method described with reference to Figure 2.

Figure 4 provides a pictorial representation of the image data maps at each stage of the method described with reference to Figure 3 as referenced above.

Figure 5 provides a visual representation of the process as a whole, including constituent processing steps. Results have demonstrated accurate segmentation and reporter signal determination over threshold, even within highly dense cell areas.

Figure 5-A shows an an example micrograph (inset left) and accompanying montage representation of component colour channels (right) from a cytospin preparation of human nasal epithelial cells. Nuclei are stained with DAPI, vector-RNA is labelled with Fast-red chromogen, and the green channel is used for background (inc. autofluorescence) and, optionally, GFP detection.

Figure 5-B provides a pictorial representation of cell segmentation steps such as those described above with reference to Figure 2. These may include: • nuclei-channel median filter denoising;

• base thresholding;

• a seeded segmentation algorithm (see Figure 3 and accompanying description) to identify individual nuclei. This has been found to be reliable even amongst dense areas;

• a distance map based Voronoi tessellation to demarcate absolute boundaries between cells;

• a gradient fill of the distance map to an arbitrary exaggeration range, with the tessellation boundaries preventing cell area merging;

• followed by a green-channel limitation to restrict segmentation area to only cell area.

Figure 5-C shows example outputs of the tool after using the segmentation steps to measure Fast-red, or optionally GFP, in each cell. Image 1 (left), shows a binary readout indicating Fast-red negative cells surrounded by region-of-interest (ROI) and positive cells marked with a corresponding ROI. Optionally, the measured signal (Fast-red here) can be registered over a positive threshold, but the positive signal can be further stratified into ‘low’, ‘medium’, and ‘high’ levels, as shown in the same cells in image 2 (right). No scale bar depicted as segmentation and signal determination is processed independent of absolute scaling; Fast- red or GFP amount may be calculated based on the relative number (e.g. percentage) of positive cells present in the image.

Embodiments of the present disclosure may also provide methods of identifying particular structures in micrograph images of tissues, such as pulmonary tissue - e.g. mammalian pulmonary tissue.

Figure 6 illustrates a pictorial representation of one such method.

Figure 6-A shows an example micrograph (left) and accompanying montage representation of component colour channels (right) from an epithelia enclosed alveolar space from a histological preparation of mouse lung. Nuclei are stained with DAPI, vector-RNA is labelled with Fast-red chromogen, and the green channel is used for background (inc. autofluorescence) detection.

Figure 6-B shows the determination of total cell area based on thresholding of the fast-red channel data, and determination of an RNA positive signal based on a background subtraction (top-hat algorithm) and thresholding of the fast-red channel.

Figure 6C illustrates a process flow in which • a threshold is applied to the digital image data to the blue colour channel of the DAPI image to obtain a binary map;

• a seeded segmentation algorithm, such as that described with reference to Figure 3, is applied to this binary map to identify individual nuclei;

• a distance map is determined, and from that a region map such as a Voronoi tessellation to demarcate absolute boundaries between cells (e.g. as in step 14 and 16 of the method described with reference to Figure 2);

• applying a gradient flood watershed of the distance map, and dilating the obtained watershed line to obtain a maximal cell boundary (e.g. as in steps 18 and 20 of the method described with reference to Figure 2)

• the resultant area occupied by the maximal cell boundaries is then limited to a pre defined ‘total (cell) area’, which may be determined from thresholding of the fast-red channel data.

This approach has been found reliably to segment epithelial cells from micrographs of pulmonary tissues. In images segmented in this way, the pre-determined Fast-red positive signal may be measured in each segmented area to output the cell number normalised epithelia positive area %. Whilst this approach may be used to perform cell segmentation entirely automatically, in some embodiments the computer may comprise a user interface, such as a GUI, for obtaining user-identified boundaries. The algorithm summarised above may then be applied within these user-drawn boundaries. It has been found that, whilst wholly automatic segmentation is still useful even crude and quick user modifications may be even more beneficial.

Another such method will now be described with reference to Figure 7.

In this method, the computer first obtains the digital image data as explained above with reference to Figure 1 , the image data may comprise microscope images of pulmonary tissue stained with DAPI. The computer separates the image data into constituent colour channels.

The fast-red colour channel may be thresholded twice - once using a first threshold to provide a first fast-red mask indicating total cell area. The computer may also apply a background subtraction to the fast-red channel, such as a top-hat algorithm and threshold the result to provide a second fast red mask.

The computer then determines a mask to identify nuclei in the tissue. This may be done by applying a threshold to the blue colour channel to separate the blue channel data into “background” and “foreground” pixels. One way to do this is to use an adaptive thresholding method configured to increase (e.g. maximise) the inter-group variance of the pixel intensities of the foreground and background groups and/or to reduce (e.g. minimise) the intragroup variance of the foreground and background groups. One example of such a method is the so- called Otsu thresholding method. The thresholding may generate a binary mask, which the computer then denoises and processes so as to smooth the edges of structures - for example by applying a morphological “close” operation. Other smoothing may also be applied, such as a gaussian smoothing (spatial low-pass filtering with a gaussian kernel). These steps may be configured to cause neighbouring suprathreshold regions to merge together.

The computer then processes the resultant mask to identify contiguous groups of suprathreshold pixels in which adjacent bodies have been effectively denoised and merged by the preceding smoothing. Groups of pixels with values lower than a selected minimum number are then selected for the subsequent processing steps. The lower value pixels represent background elements of the image which have now been selected.

These selected groups of contiguous pixels, ROIs, are stored in memory, and for each ROI, the computer performs the following processing steps:

• enlarge the ROI by some margin - e.g. by some chosen number of pixels to abut contiguous nuclei. Remembering that, in the preceding step, we have selected background pixels as ROIs.

• Select pixels of a set distance from each ROI to define a new ROI band area; from the edge of the background body to the extension distance - to provide a group of pixels of at least a minimum width ( this group may comprise a band - e.g. a region around a group of background pixels which surrounds each cluster of merged nuclei.

• Use the band ROI to select pixels from the thresholded blue channel mask of nuclei image.

Optionally, determine a measure of the density of the cells. This may be done using intensity of the ROI pixels in the blue channel (such as a measure of central tendency, e.g. the mean). In the event that the density measure is not greater than a threshold level, the ROI may be excluded from subsequent processing.

• Determine, based on the pixels in the ROI a measure of the number and/or spatial density of nuclei present in the ROI (e.g. a band of pixels). This measure may be based on the integrated intensity (e.g. the sum of all the pixel intensities in the ROI) from the blue channel data. It may also be based on a measure of the percent area, or the total epithelial area that is occupied by the signal being measured, represented as a % of the total area.

• Determine, based on the number and/or spatial density whether the ROI is a candidate epithelial region. • In the event that the band ROI is not a candidate epithelial region, the computer selects the next ROI, enlarges it, ‘bands’ it, and uses the number and/or spatial density to determine whether it is a candidate.

• In the event that the band ROI is a candidate epithelial region, the computer enlarges the ROI and disregards all pixels outside the ROI to generate a modified ROI.

• The modified ROI is then used to determine a local threshold, for the ROI, based on the pixel intensities in the blue channel data.

• The computer then applies the local threshold to refine the ROI, and uses the refined ROI to select corresponding pixel intensity values from the blue channel data.

• The computer then determines based on these selected pixels a measure of the number and/or spatial density of nuclei present in the refined ROI, whether to reclassify the ROI as epithelia.

• In the event that this indicates that the refined ROI is to be reclassified as epithelia, the autofluorescence data in the refined ROI is used to determine whether or not the ROI comprises epithelia.

Histogram stretching and other contrast enhancing post processing may then be applied.

It will be appreciated in the context of the present disclosure that the structures identified in these methods may be used to select from the other colour channels, regions for further analysis such as cell counting and/or measurement of signal intensity. These measures of signal intensity may provide a measure of positive molecular signal within airway-epithelial areas-of- interest, across a whole cross-section of lung.

Any feature of any one of the examples disclosed herein may be combined with any selected features of any of the other examples described herein. For example, features of methods may be implemented in suitably configured hardware, and the configuration of the specific hardware described herein may be employed in methods implemented using other hardware.

It will be appreciated from the discussion above that the embodiments shown in the Figures are merely exemplary, and include features which may be generalised, removed or replaced as described herein and as set out in the claims. With reference to the drawings in general, it will be appreciated that schematic functional block diagrams are used to indicate functionality of systems and apparatus described herein. It will be appreciated however that the functionality need not be divided in this way, and should not be taken to imply any particular structure of hardware other than that described and claimed below. The function of one or more of the elements shown in the drawings may be further subdivided, and/or distributed throughout apparatus of the disclosure. In some embodiments the function of one or more elements shown in the drawings may be integrated into a single functional unit.

In some examples the controllers/processors which implement the present disclosure and the functionality described and claimed herein may be provided by any programmable processor or other such control logic. Examples include a general purpose processor, which may be configured to perform a method according to any one of those described herein. In some examples such a controller may comprise digital logic, such as field programmable gate arrays, FPGA, application specific integrated circuits, ASIC, a digital signal processor, DSP, or by any other appropriate hardware.

In some examples, one or more memory elements can store data and/or program instructions used to implement the operations described herein. Embodiments of the disclosure provide tangible, non-transitory storage media comprising program instructions operable to program a processor to perform any one or more of the methods described and/or claimed herein and/or to provide data processing apparatus as described and/or claimed herein. The controllers/ processors which implement the present disclosure may comprise an analogue control circuit which provides at least a part of this control functionality. An embodiment provides an analogue control circuit configured to perform any one or more of the methods described herein.

The above embodiments are to be understood as illustrative examples. Further embodiments are envisaged. It is to be understood that any feature described in relation to any one embodiment may be used alone, or in combination with other features described, and may also be used in combination with one or more features of any other of the embodiments, or any combination of any other of the embodiments.

Furthermore, equivalents and modifications not described above may also be employed without departing from the scope of the invention, which is defined in the accompanying claims, interpreted in accordance with Article 2 of the Protocol on the Interpretation of Article 69 EPC.

Claims

Claims:

1. A computer implemented method of identifying cell boundaries in a microscope image of DAPI stained cells, the method comprising: obtaining digital image data comprising a first colour channel of a microscope image of cells; identifying, based on the digital image data, locations of cell nuclei in the microscope image; determining, based on the locations of cell nuclei, a distance map corresponding to the microscope image and indicating a distance from each location in the map to a nearest nucleus; identifying, based on the distance map, for each of the locations of the cell nuclei a region surrounding the each location comprising pixels which are closer to the corresponding nucleus than to any other, thereby to provide a region map comprising a plurality of such regions; determining, for each region in the region map and based on the distance map, a cell boundary surrounding said corresponding nucleus; and, modifying the cell boundaries based on autofluorescence data corresponding to the microscope image.

2. A computer implemented method of identifying airway epithelium in a microscope image of DAPI stained cells pulmonary tissue samples, the method comprising: obtaining digital image data comprising a first colour channel of a microscope image of cells; identifying, based on the digital image data, locations of cell nuclei in the microscope image; determining, based on the locations of cell nuclei, a distance map corresponding to the microscope image and indicating a distance from each location in the map to a nearest nucleus; identifying, based on the distance map, for each of the locations of the cell nuclei a region surrounding the each location comprising pixels which are closer to the corresponding nucleus than to any other, thereby to provide a region map comprising a plurality of such regions; determining, for each region in the region map and based on the distance map, a cell boundary surrounding said corresponding nucleus; determining, based on a second colour channel of the microscope image, a total cell area, and restricting the cell boundaries to the total cell area.

3. The method of claim 1 or 2 wherein identifying nuclei comprises applying a threshold which is based on a measure of one of (a) entropy and (b) fuzziness.

4. The method of claim 3 wherein the threshold comprises a Huang threshold.

5. The method of any preceding claim wherein determining, for each region in the region map and based on the distance map, a cell boundary surrounding said corresponding nucleus comprises watershedding the distance map.

6. The method of claim 5 wherein the watershedding comprises using the cell nuclei as seed locations for the watershedding.

7. The method of claim 5 or 6 further comprising dilating the cell boundary obtained from said watershedding to obtain a maximal estimate of the cell boundary location.

8. The method of claim 7 wherein the dilating is performed by a gradient fill for example comprising the expansion of nuclei regions by a set number of pixels along the gradient representation of the distance of each background pixel from the nearest nucleus.

9. The method of claim 7 or 8 wherein the dilating comprises expanding the cell boundary by a selected margin.

10. The method of any of claims 7 to 9 wherein the dilation is constrained not to expand the cell boundary beyond the corresponding region of the region map.

11. The method of any preceding claim wherein the region map comprises a Voronoi tessellation.

12. The method of claim 1 or any preceding claim as dependent thereon wherein the auto fluorescence data is obtained from a different colour channel of the microscope image than the first colour channel, for example wherein the first colour channel comprises the blue colour channel, for example wherein the auto fluorescence data is obtained from the green colour channel.

13. The method of any preceding claim wherein the cells comprise pulmonary cells.

14. The method of claim 1 or any preceding claim as dependent thereon wherein modifying the cell boundaries based on autofluorescence data comprises applying a mask generated from image intensity in the green channel.

15. The method of claim 2, or any preceding claim as dependent thereon wherein the second colour channel comprises the fast red colour channel, for example wherein the total cell area is determined by thresholding of the fast-red colour channel.

16. A computer implemented method of identifying airway epithelium in a microscope image of DAPI stained cells pulmonary tissue samples, the method comprising: obtaining digital image data comprising a first colour channel of a microscope image of cells; identifying, based on the digital image data, locations of cell nuclei in the microscope image; identifying, based on the locations of cell nuclei in the microscope image a plurality of regions of interest, ROI, determining, for each ROI, a measure of the number density of nuclei in the ROI; obtaining, for each ROI, autofluorescence data for the ROI; and determining whether the each ROI comprises airway epithelium based on the measure of number density and the auto fluorescence data.

17. The method of claim 16 wherein identifying the ROIs comprises merging regions of a mask identifying said locations of cell nuclei to cause formation of contiguous regions from closely neighbouring nuclei.

18. The method of claim 16 or 17 wherein identifying the ROIs comprises expanding an elongate ROI to provide a band of at least a selected minimum thickness.

19. The method of any preceding claim wherein obtaining the digital image data comprises operating a microscope to capture the microscope image.

20. A computer program product comprising program instructions configured to program a processor to perform the method of any preceding claim.