METHOD FOR STORING AND RETRIEVING LARGE IMAGES VIA DICOM
Related Application
[01] The present application claims priority to United States provisional patent application serial number 60/806,438 filed on June 30, 2006, which is incorporated herein by reference in its entirety.
Background
1. Field of the Invention
[02] The present invention is generally related to digital pathology and more particularly related to
2. Related Art
[03] The Digital Imaging and Communications in Medicine ("DICOM") standard is maintained by the National Electrical Manufacturers Association ("NEMA"), and is supported by large image management systems called Picture Archiving and Communication Systems ("PACS"). PACS systems are used in hospitals and labs to store, archive, retrieve, search, and manage images used for clinical and research purposes in medicine, most typically for Radiology images such as radiography (e.g., X-Rays), computed tomography ("CT") scans, positron emission tomography ("PET"), and magnetic resonance imaging ("MRI"), but also for other modalities such as Ultrasonography, Cardiology, Endoscopy, and Mammography. A large number of clinical and laboratory instruments support DICOM-standard messaging as a means to communicate image information and store it in PACS systems.
[04] The field of pathology is undergoing a transformation in which digital imaging is becoming increasingly important. This transformation is fueled by the commercial availability of instruments for digitizing microscope slides, such as the Aperio ScanScope® described in U.S. Patent Number 6,711,283, which is incorporated herein by reference in its entirety. The whole-slide images ("WSI" or "digital slides") made by digitizing microscope slides at diagnostic resolution are very large. Frequently multiple images are created at varying resolutions to facilitate rapid panning and zooming for pathologists. The set of images comprising a single scanned digital slide is typically stored as a single Tagged Image File Format ("TIFF") file. (TIFF is an openly published standard.) In addition to the size of digital slides, the access characteristics of these images differ from other images presently stored in PACS systems. Pathologists need the ability to rapidly pan and zoom when viewing images.
[05] Unfortunately, there are several limitations of the DICOM standard which impact storage of digital slides. First, DICOM uses signed 16-bit integers to store the pixel dimensions of images, so the maximum image dimensions which can be stored are 32K x 32K pixels. This is considerably smaller than a typical digital slide image, and two orders of magnitude smaller than an extremely large digital slide image. Second, DICOM uses signed 32-bit integers to store the object size of images, so the maximum compressed size of an image is 2GB. In actual practice many PACS systems are not capable of handling individual images this large: because these PACS systems frequently decompress image data in memory, which limits the maximum uncompressed size of an image to 2GB, the practical limit on compressed image size is considerably smaller. Third, while DICOM provides the capability of accessing individual images in a series, as well as an entire series or an entire study, it does not provide the capability of accessing subregions of an individual image. As noted above, the capability to access subregions is important to provide rapid panning and zooming.
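As a rough illustration of the dimension and object-size limits, the following Python sketch checks whether an image would fit in a single DICOM image, taking the limits exactly as this paragraph states them. The constant and function names are hypothetical and are not part of the DICOM standard or of any PACS product.

```python
# Limits as characterized in the paragraph above; names are illustrative only.
MAX_DIMENSION = 2**15 - 1      # largest signed 16-bit integer (32,767)
MAX_OBJECT_BYTES = 2**31 - 1   # largest signed 32-bit integer (about 2GB)


def fits_in_single_image(width, height, compressed_bytes):
    """Return True if the image respects both limits described above."""
    dimensions_ok = width <= MAX_DIMENSION and height <= MAX_DIMENSION
    size_ok = compressed_bytes <= MAX_OBJECT_BYTES
    return dimensions_ok and size_ok


# A hypothetical 80,000 x 40,000 pixel slide image compressed to 500 MB
# fails on its dimensions alone, even though its byte size is acceptable.
print(fits_in_single_image(80_000, 40_000, 500 * 1024**2))  # False

# A 500 x 500 pixel image of 750,000 bytes fits within both limits.
print(fits_in_single_image(500, 500, 750_000))              # True
```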
[06] Furthermore, the DICOM standard does not make provision for large two-dimensional images such as the digital slides being created for pathology, nor does it incorporate a way to handle images that are logically divided into sub-regions, nor does it incorporate a way to handle multiple images at varying resolutions. The process for evolving the DICOM standard is well-defined but slow moving, and even after the standard is enhanced to support digital slides it will be years before PACS and instrument vendors implement the enhanced standard. In the meantime, a means of using the currently implemented DICOM standard for storing digital slides is needed. Therefore, what is needed is a system and method that facilitates use of the currently implemented DICOM standard for storing and viewing digital slides.
Summary
[07] Accordingly, to provide a solution for storing and retrieving large images via DICOM and to facilitate integration of digital pathology into hospitals and laboratories, described herein are systems and methods that acquire digital slides and store these images into commercially available PACS systems using DICOM-standard messaging. Once a digital slide is stored in the PACS system, the PACS capabilities for storing, archiving, retrieving, searching, and managing images are leveraged for these new types of images. Additionally, because any particular case or experiment may comprise images from multiple modalities, including radiology and pathology, all the images for a case or experiment could be managed together in a PACS system. Other features and advantages of the present invention will become more readily apparent to those of ordinary skill in the art after reviewing the following detailed description and accompanying drawings.
Brief Description of the Drawings
[08] The details of the present invention, both as to its structure and operation, may be gleaned in part by study of the accompanying drawings, in which like reference numerals refer to like parts, and in which:
[09] Figure 1 is a network diagram illustrating an example system for storing and retrieving large images via DICOM according to an embodiment of the present invention;
[10] Figure 2 is a block diagram illustrating an example image capture device according to an embodiment of the present invention;
[11] Figure 3 is a flow diagram illustrating an example process for image capture and storage according to an embodiment of the present invention;
[12] Figure 4 is a flow diagram illustrating an example process for image retrieval and viewing according to an embodiment of the present invention;
[13] Figure 5 is a block diagram illustrating a data mapping from a digital slide to a DICOM study according to an embodiment of the present invention;
[14] Figure 6 is a block diagram illustrating an example computer system that may be used in connection with various embodiments described herein.
Detailed Description
[15] Certain embodiments as disclosed herein provide for storing and retrieving large images via DICOM. After reading this description it will become apparent to one skilled in the art how to implement the invention in various alternative embodiments and alternative applications. However, although various embodiments of the present invention will be described herein, it is understood that these embodiments are presented by way of example only, and not limitation. As such, this detailed description of various alternative embodiments should not be construed to limit the scope or breadth of the present invention as set forth in the appended claims.
[16] Fig. 1 is a network diagram illustrating an example system for storing and retrieving large images via DICOM according to an embodiment of the present invention. In the illustrated embodiment, the system comprises an image capture device 20 configured with a data storage area 25, a PACS system 30 configured with a data storage area 35, and a viewer station 40 configured with a data storage area 45. Each of these devices in the illustrated embodiment may be connected to the other by way of network 50 or by way of a direct connection, represented by the dashed lines in the figure.
[17] As will be understood by those skilled in the art, the various devices can be implemented as physically separate devices or they may be combined in alternative ways such that certain functions are performed at one device while other functions are performed at a second device or a third device. These implementation choices are contemplated by the broad scope of the invention and the following discussion of a particular embodiment is therefore presented by way of example only and is not to be construed as limiting the claims in any fashion.
[18] The image capture device 20 can be any of a variety of digital slide image capture devices. The function of the image capture device is to scan a high resolution image of a microscope slide and also scan or otherwise create images of the same slide at lower resolutions. The digital slides may be stored in data storage area 25.
[19] The PACS system 30 can be any of a variety of PACS systems. The function of the PACS system is to store imagery data using the DICOM standard. DICOM stores images as two-dimensional arrays of pixels. Multiple images may be part of a series, and multiple series may be part of a study. From there studies may be part of a case, and multiple cases may be stored for a given patient and laboratory. This organization has its root in Radiology imaging. Radiology modalities like CT-scans and MRI capture studies which comprise multiple series of images separated in space and/or time. The individual images are typically small and manageable, although in aggregate an entire study comprising tens of series and hundreds of images may be much larger.
[20] The viewer station 40 is configured to allow a user to view images from the PACS system 30 or the image capture device 20. The viewer station 40 may be connected directly to the image capture device 20 and the PACS system 30 or it may be connected indirectly, for example through network 50. The viewer station is configured with a data storage area 45.
[21] Fig. 2 is a block diagram illustrating an example image capture device 20 according to an embodiment of the present invention. In the illustrated embodiment, the device 20 comprises an image capture module 100, a DICOM module 110, and an image viewer module 120. These modules can be implemented in a single device as shown or in separate devices as desired.
[22] The image capture module 100 is configured to manage the image capture process in order to scan a physical microscope slide and create a digital slide that may have one or more levels of resolution. The DICOM module 110 is configured to logically arrange a digital slide into many smaller individual regions, and store each individual region as a DICOM image. A plurality of DICOM images that are all at the same resolution are identified together by the DICOM module 110 as a DICOM series. If a digital slide is translated into multiple DICOM series (e.g., if it includes multiple images at varying resolutions), then the digital slide is stored as a DICOM study, and each set of DICOM images at a particular resolution is stored together as a separate DICOM series in the DICOM study. The following table shows this mapping:
Pathology object                 DICOM object
Digital slide                    DICOM study
Image at varying resolution      DICOM series
2D subregion of image            DICOM image
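The mapping in this table can be pictured as a simple nested data structure. The following Python sketch is illustrative only: the class and field names are hypothetical, and it does not represent the actual DICOM object model or any particular PACS interface.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DicomImage:
    """One 2D subregion of an image at a given resolution."""
    x: int        # left edge of the region within its image, in pixels
    y: int        # top edge of the region within its image, in pixels
    width: int
    height: int


@dataclass
class DicomSeries:
    """All regions of the digital slide at one resolution."""
    microns_per_pixel: float
    images: List[DicomImage] = field(default_factory=list)


@dataclass
class DicomStudy:
    """One digital slide: a collection of series, one per resolution."""
    slide_id: str
    series: List[DicomSeries] = field(default_factory=list)


# Example: a slide with two resolutions, each holding a single region.
study = DicomStudy(
    slide_id="slide-001",
    series=[
        DicomSeries(0.25, [DicomImage(0, 0, 500, 500)]),
        DicomSeries(1.0, [DicomImage(0, 0, 500, 500)]),
    ],
)
print(len(study.series))  # 2
```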
[23] In one embodiment, a digital slide is captured under the control of the image capture module 100, for example by scanning a region of a microscope slide with dimensions 20mm x 10mm, at a resolution of 0.25 microns/pixel. The resulting image will have dimensions 80,000 x 40,000 pixels. Another image with a resolution of 1 micron/pixel may be created from this with dimensions 20,000 x 10,000 pixels, and another image with a resolution of 4 microns/pixel may be created with dimensions 5,000 x 2,500 pixels, and finally a fourth image with a resolution of 16 microns/pixel may be created with dimensions 1,250 x 625 pixels.
[24] The DICOM module 110 then takes each of these images (each is at a different resolution) and logically arranges each image into regions of dimension 500 x 500 pixels, which is a typical dimension for DICOM images (e.g., as created for various Radiology modalities). The following table shows the resultant DICOM objects:
[25] Note that when the dimensions of an image are not an even multiple of the region size, the DICOM module 110 truncates the regions at the right and/or bottom edge of the image. For example, on the 16 micron/pixel image the regions at the right edge are 250 x 500 pixels, the regions on the bottom edge are 500 x 125 pixels, and the region in the bottom right corner is 250 x 125 pixels. The DICOM standard allows images in a series to have varying dimensions.
[26] The image viewer module 120 is configured to allow a user to view digital slide image data that is retrieved from a PACS system. The image viewer module 120 may be integral with the image capture device as shown or it may be located on a viewer station 40, or portions of the image viewer module may reside on one device (e.g., an image server) while other portions may reside on a separate device (e.g., a viewing station). The image viewer module 120 may comprise in part a viewing application, algorithm processing application, digital slide management system, or the like.
[27] The TIFF file format typically used for storing digital slides in a single file enables specification of individual image dimensions and other information used to reassemble the image for later viewing or processing. Advantageously, the DICOM standard allows a similar function. In one embodiment, the DICOM module 110 is configured to store image attribute information (e.g., dimensions of regions, number of regions, resolution, magnification, etc.) in the header for each DICOM series, as a DICOM series corresponds to one resolution in a digital slide per the first table. Upon retrieval, this image attribute information from the header is used by the image viewer module 120 (e.g., at the image capture device 20, on the viewer station 40, or at an image server (not shown)) to facilitate reassembly of the digital slide image from the various DICOM images stored as the series. Similarly, the DICOM module 110 may also store digital slide attribute information in the header for a DICOM study. DICOM series and DICOM image attribute information are typically stored in various DICOM headers as provided in the current version of the specification.
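The division of an image into 500 x 500 pixel regions, with truncation at the right and bottom edges, can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the DICOM module's actual implementation: the function name is hypothetical, and the 1,100 x 700 pixel image in the second example is invented purely to show truncated edge regions.

```python
def tile_regions(width, height, tile=500):
    """Yield (x, y, w, h) for each region, row by row.

    Regions on the right and bottom edges are truncated so that they
    never extend past the image.
    """
    for y in range(0, height, tile):
        for x in range(0, width, tile):
            yield (x, y, min(tile, width - x), min(tile, height - y))


# The 4 micron/pixel image (5,000 x 2,500 pixels) divides evenly.
even = list(tile_regions(5_000, 2_500))
print(len(even))    # 50 regions (10 columns x 5 rows)

# A hypothetical 1,100 x 700 pixel image needs truncated edge regions.
ragged = list(tile_regions(1_100, 700))
print(len(ragged))  # 6 regions (3 columns x 2 rows)
print(ragged[2])    # right edge of the first row: (1000, 0, 100, 500)
print(ragged[-1])   # bottom right corner: (1000, 500, 100, 200)
```

The offsets and sizes produced here are the kind of per-region attribute information that a header would need to carry for later reassembly.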
[28] Using this technique, a pathology digital slide or other large two-dimensional image may be stored in a PACS system and viewed from a PACS system as if it were a CT-Scan, MRI, or other radiology image. Each digital slide is a DICOM study, each DICOM study comprises one or more DICOM series, each corresponding to a resolution "level" of the digital slide, and each DICOM series comprises DICOM images corresponding to regions of the original image. A digital slide stored as a DICOM study may be associated with a DICOM case like any other study from another modality, and a DICOM case may comprise many studies from many different modalities. All study-oriented functions of the PACS system will work with these studies as they do for any other.
[29] Each DICOM image may be retrieved and viewed separately, but in order to view the pathology digital slide as a whole, the entire study must be retrieved and each DICOM series reassembled from its component DICOM images. For some applications it may not be necessary to access the pathology image at its highest resolution, in which case only the series corresponding to the required resolution level need be retrieved.
[30] The image data in the stored DICOM images which comprise the regions of the digital slide may have varying color channels and depths. The data storage and retrieval technique is independent of the number and size of the color channels of each pixel. Typical pixel organizations include 12-bit or 16-bit grayscale, and 3 x 8 = 24-bit and 4 x 8 = 32-bit color.
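To give a sense of scale for these pixel organizations, the short Python sketch below computes uncompressed image sizes. It assumes, purely for illustration, that each pixel is padded to a whole number of bytes (so a 12-bit pixel occupies two bytes); the function name is hypothetical.

```python
def uncompressed_bytes(width, height, bits_per_pixel):
    """Uncompressed size in bytes, padding each pixel to whole bytes."""
    bytes_per_pixel = (bits_per_pixel + 7) // 8
    return width * height * bytes_per_pixel


# One 500 x 500 region of 24-bit color.
print(uncompressed_bytes(500, 500, 24))        # 750000

# An 80,000 x 40,000 pixel image of 24-bit color, far above 2GB.
print(uncompressed_bytes(80_000, 40_000, 24))  # 9600000000
```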
[31] The image data in the stored DICOM images which comprise the regions of the digital slide may be compressed. The techniques described herein are independent of the type and quality of the compression applied. Images may be stored with no compression ("raw"), or with lossless compression such as LZW (Lempel-Ziv-Welch), or with lossy compression such as JPEG or JPEG2000.
[32] Fig. 3 is a flow diagram illustrating an example process for image capture and storage according to an embodiment of the present invention. Initially, in step 200 the microscope slide is scanned to create an image. Next, in step 210 the image is logically divided into regions, each of which is created as a DICOM image. Header information is included in the DICOM image header to orient the image data with respect to the other regions of the digital slide. All of the DICOM images that have the same resolution are then collected together and identified as a DICOM series, as shown in step 220. Then in step 230 a plurality of DICOM series are assembled to create a DICOM study, which is an equivalent structure to a digital slide. The DICOM study is then stored in the PACS or other data storage area as desired.
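Steps 220 and 230 amount to grouping region records by resolution and wrapping the groups in a single study record. The following Python sketch shows one way this could look. The dictionary keys and the function name are hypothetical stand-ins for DICOM header fields, and no actual PACS storage call is made.

```python
from collections import defaultdict


def build_study(slide_id, regions):
    """Group region records by resolution (step 220) and wrap the
    resulting series in a study record (step 230).

    Each region is a dict with at least a "microns_per_pixel" key; the
    remaining keys stand in for the header information of step 210.
    """
    by_resolution = defaultdict(list)
    for region in regions:
        by_resolution[region["microns_per_pixel"]].append(region)

    # One series per resolution, ordered from finest to coarsest.
    series = [
        {"microns_per_pixel": res, "images": images}
        for res, images in sorted(by_resolution.items())
    ]
    return {"slide_id": slide_id, "series": series}


regions = [
    {"microns_per_pixel": 0.25, "x": 0, "y": 0},
    {"microns_per_pixel": 0.25, "x": 500, "y": 0},
    {"microns_per_pixel": 1.0, "x": 0, "y": 0},
]
study = build_study("slide-001", regions)
print([len(s["images"]) for s in study["series"]])  # [2, 1]
```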
[33] Fig. 4 is a flow diagram illustrating an example process for image retrieval and viewing according to an embodiment of the present invention. Initially, in step 300, a server receives a viewing request identifying a DICOM study or portion thereof that is desired for viewing. Next, in step 310 the corresponding DICOM image data is obtained from the PACS system. This may be done by obtaining the entire DICOM study or if the request identified a subset of the DICOM study, then the header information in the DICOM study, series, and images can be consulted to identify the appropriate data to return in response to the request. Once the image data have been returned, the alignment information in the header can be consulted to orient the image data relative to other image data, as seen in step 320 and then in step 330 the image data is displayed on the viewer station or other viewing apparatus or application to which the image data has been directed.
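The orientation of step 320 can be illustrated with a small Python sketch that pastes retrieved regions into a blank canvas at the offsets recorded for them. It is a simplified sketch under stated assumptions: the "x", "y", and "pixels" keys are hypothetical stand-ins for the alignment information in the header and the decoded image data, and plain nested lists are used in place of a real image library.

```python
def reassemble(regions, width, height, background=0):
    """Paste regions into a width x height canvas using their offsets.

    Each region is a dict with "x" and "y" (top-left offset) and
    "pixels" (a list of rows, each a list of pixel values).
    """
    canvas = [[background] * width for _ in range(height)]
    for region in regions:
        for dy, row in enumerate(region["pixels"]):
            y = region["y"] + dy
            # Overwrite the matching span of this canvas row.
            canvas[y][region["x"]:region["x"] + len(row)] = row
    return canvas


# Four toy 2 x 2 regions reassembled into a 4 x 4 image.
regions = [
    {"x": 0, "y": 0, "pixels": [[1, 1], [1, 1]]},
    {"x": 2, "y": 0, "pixels": [[2, 2], [2, 2]]},
    {"x": 0, "y": 2, "pixels": [[3, 3], [3, 3]]},
    {"x": 2, "y": 2, "pixels": [[4, 4], [4, 4]]},
]
image = reassemble(regions, 4, 4)
print(image[0])  # [1, 1, 2, 2]
print(image[3])  # [3, 3, 4, 4]
```

Because each region carries its own offset, the regions may be pasted in any order, which suits a viewer that receives them as they arrive.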
[34] Fig. 5 is a block diagram illustrating a data mapping from a digital slide to a DICOM study according to an embodiment of the present invention. In the illustrated embodiment, the digital slide data structure is represented by the pyramid 450, which has multiple levels that each correspond to a different resolution. The baseline image 480 is the native resolution of the scanner while the intermediate levels and thumbnail level have reduced resolution.
[35] In one embodiment, each level of a digital slide can be logically divided into separate regions, for example regions that are 500 x 500 pixels. Each of these regions can then be stored as a single DICOM image and collectively, all of the regions at the same resolution level in the digital slide pyramid (i.e., data structure) are identified in the PACS system as being in the same DICOM series. As shown in the illustrated embodiment, there are four resolution levels for this digital slide so there are also four DICOM series in the DICOM study that is stored in the PACS system. This correlation and the header information that is stored in the PACS system advantageously allow a digital slide to be reconstructed on a viewing platform after retrieval of imagery data from a PACS system.
[36] Fig. 6 is a block diagram illustrating an example computer system 550 that may be used in connection with various embodiments described herein. For example, the computer system 550 may be used in conjunction with an image capture device, PACS system, or viewer station as described with respect to Fig. 1. However, other computer systems and/or architectures may be used, as will be clear to those skilled in the art.
[37] The computer system 550 preferably includes one or more processors, such as processor 552. Additional processors may be provided, such as an auxiliary processor to manage input/output, an auxiliary processor to perform floating point mathematical operations, a special-purpose microprocessor having an architecture suitable for fast execution of signal processing algorithms (e.g., digital signal processor), a slave processor subordinate to the main processing system (e.g., back-end processor), an additional microprocessor or controller for dual or multiple processor systems, or a coprocessor. Such auxiliary processors may be discrete processors or may be integrated with the processor 552.
[38] The processor 552 is preferably connected to a communication bus 554. The communication bus 554 may include a data channel for facilitating information transfer between storage and other peripheral components of the computer system 550. The communication bus 554 further may provide a set of signals used for communication with the processor 552, including a data bus, address bus, and control bus (not shown). The communication bus 554 may comprise any standard or non-standard bus architecture such as, for example, bus architectures compliant with industry standard architecture ("ISA"), extended industry standard architecture ("EISA"), Micro Channel Architecture ("MCA"), peripheral component interconnect ("PCI") local bus, or standards promulgated by the Institute of Electrical and Electronics Engineers ("IEEE") including IEEE 488 general-purpose interface bus ("GPIB"), IEEE 696/S-100, and the like.
[39] Computer system 550 preferably includes a main memory 556 and may also include a secondary memory 558. The main memory 556 provides storage of instructions and data for programs executing on the processor 552. The main memory 556 is typically semiconductor-based memory such as dynamic random access memory ("DRAM") and/or static random access memory ("SRAM"). Other semiconductor-based memory types include, for example, synchronous dynamic random access memory ("SDRAM"), Rambus dynamic random access memory ("RDRAM"), ferroelectric random access memory ("FRAM"), and the like, including read only memory ("ROM").
[40] The secondary memory 558 may optionally include a hard disk drive 560 and/or a removable storage drive 562, for example a floppy disk drive, a magnetic tape drive, a compact disc ("CD") drive, a digital versatile disc ("DVD") drive, etc. The removable storage drive 562 reads from and/or writes to a removable storage medium 564 in a well-known manner. Removable storage medium 564 may be, for example, a floppy disk, magnetic tape, CD, DVD, etc.
[41] The removable storage medium 564 is preferably a computer readable medium having stored thereon computer executable code (i.e., software) and/or data. The computer software or data stored on the removable storage medium 564 is read into the computer system 550 as electrical communication signals 578.
[42] In alternative embodiments, secondary memory 558 may include other similar means for allowing computer programs or other data or instructions to be loaded into the computer system 550. Such means may include, for example, an external storage medium 572 and an interface 570. Examples of external storage medium 572 may include an external hard disk drive, an external optical drive, or an external magneto-optical drive.
[43] Other examples of secondary memory 558 may include semiconductor-based memory such as programmable read-only memory ("PROM"), erasable programmable read-only memory ("EPROM"), electrically erasable programmable read-only memory ("EEPROM"), or flash memory (block oriented memory similar to EEPROM). Also included are any other removable storage units 572 and interfaces 570, which allow software and data to be transferred from the removable storage unit 572 to the computer system 550.
[44] Computer system 550 may also include a communication interface 574. The communication interface 574 allows software and data to be transferred between computer system 550 and external devices (e.g., printers), networks, or information sources. For example, computer software or executable code may be transferred to computer system 550 from a network server via communication interface 574. Examples of communication interface 574 include a modem, a network interface card ("NIC"), a communications port, a PCMCIA slot and card, an infrared interface, and an IEEE 1394 FireWire interface, just to name a few.
[45] Communication interface 574 preferably implements industry promulgated protocol standards, such as Ethernet IEEE 802 standards, Fibre Channel, digital subscriber line ("DSL"), asymmetric digital subscriber line ("ADSL"), frame relay, asynchronous transfer mode ("ATM"), integrated services digital network ("ISDN"), personal communications services ("PCS"), transmission control protocol/Internet protocol ("TCP/IP"), serial line Internet protocol/point to point protocol ("SLIP/PPP"), and so on, but may also implement customized or non-standard interface protocols as well.
[46] Software and data transferred via communication interface 574 are generally in the form of electrical communication signals 578. These signals 578 are preferably provided to communication interface 574 via a communication channel 576. Communication channel 576 carries signals 578 and can be implemented using a variety of wired or wireless communication means including wire or cable, fiber optics, conventional phone line, cellular phone link, wireless data communication link, radio frequency (RF) link, or infrared link, just to name a few.
[47] Computer executable code (i.e., computer programs or software) is stored in the main memory 556 and/or the secondary memory 558. Computer programs can also be received via communication interface 574 and stored in the main memory 556 and/or the secondary memory 558. Such computer programs, when executed, enable the computer system 550 to perform the various functions of the present invention as previously described.
[48] In this description, the term "computer readable medium" is used to refer to any media used to provide computer executable code (e.g., software and computer programs) to the computer system 550. Examples of these media include main memory 556, secondary memory 558 (including hard disk drive 560, removable storage medium 564, and external storage medium 572), and any peripheral device communicatively coupled with communication interface 574 (including a network information server or other network device). These computer readable media are means for providing executable code, programming instructions, and software to the computer system 550.
[49] In an embodiment that is implemented using software, the software may be stored on a computer readable medium and loaded into computer system 550 by way of removable storage drive 562, interface 570, or communication interface 574. In such an embodiment, the software is loaded into the computer system 550 in the form of electrical communication signals 578. The software, when executed by the processor 552, preferably causes the processor 552 to perform the inventive features and functions previously described herein.
[50] Various embodiments may also be implemented primarily in hardware using, for example, components such as application specific integrated circuits ("ASICs"), or field programmable gate arrays ("FPGAs"). Implementation of a hardware state machine capable of performing the functions described herein will also be apparent to those skilled in the relevant art. Various embodiments may also be implemented using a combination of both hardware and software.
[51] Furthermore, those of skill in the art will appreciate that the various illustrative logical blocks, modules, circuits, and method steps described in connection with the above described figures and the embodiments disclosed herein can often be implemented as electronic hardware, computer software, or combinations of both. To clearly illustrate this interchangeability of hardware and software, various illustrative components, blocks, modules, circuits, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the overall system. Skilled persons can implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the invention. In addition, the grouping of functions within a module, block, circuit or step is for ease of description. Specific functions or steps can be moved from one module, block or circuit to another without departing from the invention.
[52] Moreover, the various illustrative logical blocks, modules, and methods described in connection with the embodiments disclosed herein can be implemented or performed with a general purpose processor, a digital signal processor ("DSP"), an ASIC, FPGA or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein. A general-purpose processor can be a microprocessor, but in the alternative, the processor can be any processor, controller, microcontroller, or state machine. A processor can also be implemented as a combination of computing devices, for example, a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration.
[53] Additionally, the steps of a method or algorithm described in connection with the embodiments disclosed herein can be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module can reside in RAM memory, flash memory, ROM memory, EPROM memory, EEPROM memory, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium including a network storage medium. An exemplary storage medium can be coupled to the processor such that the processor can read information from, and write information to, the storage medium. In the alternative, the storage medium can be integral to the processor. The processor and the storage medium can also reside in an ASIC.
[54] The above description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles described herein can be applied to other embodiments without departing from the spirit or scope of the invention. Thus, it is to be understood that the description and drawings presented herein represent a presently preferred embodiment of the invention and are therefore representative of the subject matter which is broadly contemplated by the present invention. It is further understood that the scope of the present invention fully encompasses other embodiments that may become obvious to those skilled in the art and that the scope of the present invention is accordingly limited by nothing other than the appended claims.