CN115206498A - Data stream processing method of digital pathological image - Google Patents

Publication number
CN115206498A
Authority
CN
China
Prior art keywords
image
sub
slice
data stream
byte
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210891998.3A
Other languages
Chinese (zh)
Inventor
陈李粮
常亮亮
熊迪
单玲政
汪进
陈睿
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Severson Guangzhou Medical Technology Service Co ltd
Original Assignee
Severson Guangzhou Medical Technology Service Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Severson Guangzhou Medical Technology Service Co ltd
Priority to CN202210891998.3A
Publication of CN115206498A

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H30/00 ICT specially adapted for the handling or processing of medical images
    • G16H30/20 ICT specially adapted for the handling or processing of medical images for handling medical images, e.g. DICOM, HL7 or PACS
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/51 Indexing; Data structures therefor; Storage structures
    • G06F16/53 Querying
    • G06F16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Library & Information Science (AREA)
  • Software Systems (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Radiology & Medical Imaging (AREA)
  • Epidemiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Primary Health Care (AREA)
  • Public Health (AREA)
  • Medical Treatment And Welfare Office Work (AREA)

Abstract

The present disclosure describes a data stream processing method for digital pathology images, which provides a computing system with data streams of a plurality of digital pathology images. The method includes: preparing a storage environment; acquiring a digital pathology image and storing it in a first storage unit of an image storage unit, where the first storage unit reads metadata of the digital pathology image, partitions the image in byte order, based on the metadata, into a plurality of sub-blocks that each contain image sub-slices of adjacent regions, stores the sub-blocks, and generates sub-slice information for each image sub-slice, the metadata and the sub-slice information being stored in a metadata unit; and, after receiving a data request for a data stream, acquiring the data stream corresponding to at least one image sub-slice based on the metadata, the byte offset of the image sub-slice, and its byte size. Digital pathology images can thereby be stored and shared as data streams quickly and at low cost.

Description

Data stream processing method of digital pathological image
The present application is a divisional application of the application filed on November 8, 2021, with application number 2021113158009, entitled "Processing system for a data stream of digital pathological images".
Technical Field
The present disclosure relates generally to a method of data stream processing of digital pathology images.
Background
At present, digital pathology image analysis is becoming increasingly scenario-specific and specialized, and pathology analysis systems based on ultra-high-definition digital pathology images and artificial intelligence technologies such as deep learning are gradually becoming tools that assist clinical pathology analysis. A digital pathology image is obtained by scanning a traditional pathology slide with a digital scanner to capture high-resolution images of local regions, and then stitching and integrating those regional images. Compared with traditional pathology slides, digital pathology images have many advantages in storage management, teaching, remote diagnosis, image reproducibility and the like, and they solve the problems that traditional slides are easily damaged, fade, get lost, and are difficult to copy and retrieve.
At present, digital pathology images are often stored in a specific format in a hospital's internal storage or on the scanner's control host workstation and viewed with matching client software; alternatively, the image sub-slices of the digital pathology images are stored by level in an open-source distributed storage system, with a cache designed to accelerate access to the image sub-slices.
However, digital pathology images carry a large amount of information and a single file is large. Storing them inside the hospital or on the scanner's control host workstation hinders data sharing and makes capacity expansion inconvenient. The other approach, storing tens of thousands to hundreds of thousands of image sub-slices by level, tends to consume substantial hardware resources and time, and reading and writing so many small files places high performance demands on the hard disk, so that relatively expensive solid-state drives are usually required. In addition, access is often highly random when digital pathology images are viewed online, while artificial intelligence analysis often needs the data of the entire digital pathology image, so the cache hit rate is low. Storage and sharing of digital pathology images therefore remain a significant challenge.
Disclosure of Invention
The present disclosure has been made in view of the above circumstances, and an object thereof is to provide a data stream processing system for digital pathology images, which can store and share digital pathology images based on a data stream quickly and at low cost.
To this end, the present disclosure provides a processing system for a data stream of digital pathology images, which provides a computing system with data streams of a plurality of digital pathology images. The computing system has a plurality of computing tasks; it caches the data stream corresponding to each digital pathology image so that the computing tasks can perform computational analysis on each image in parallel by multiplexing the data stream, and it deletes the data stream from the cache after the computational analysis of that image is complete. The processing system includes: an acquisition unit for acquiring the digital pathology image and storing it in an image storage unit; the image storage unit, which comprises a first storage unit configured to read metadata of the digital pathology image, obtain the byte size of the image sub-slices of the digital pathology image from the metadata, partition the digital pathology image at byte level, in byte order and based on that byte size, into a plurality of sub-blocks, store the sub-blocks, and generate sub-slice information for each image sub-slice, wherein each sub-block comprises a plurality of image sub-slices of adjacent regions, the byte size of a sub-block is larger than the byte size of the data stream, and the sub-slice information comprises the sub-block in which the image sub-slice is located and its byte offset relative to that sub-block; a metadata unit for recording the metadata and the sub-slice information; and a parsing unit configured to obtain from the metadata unit the sub-slice information of each of the at least one image sub-slice corresponding to the data stream, then read from the first storage unit, in a single read and through a shared handle opened on the sub-block, the data stream corresponding to the at least one image sub-slice, based on the byte offset in each piece of sub-slice information and the byte size of the image sub-slice, and return the data stream to the computing system.
In this case, because the digital pathology image is partitioned at byte level in byte order, image encoding and decoding operations are avoided, which reduces the consumption of hardware resources; the generation of massive numbers of small files is avoided, which reduces the difficulty of file addressing and in turn lowers the performance requirement on the storage medium; and because the data stream corresponding to an image sub-slice is obtained from its byte offset, a data block of the sub-slice's byte size at that offset can be read without having to open the sub-block in its entirety, which further reduces the load. Therefore, the digital pathology images can be stored and shared based on the data stream quickly and at low cost. In addition, image sub-slices with similar pixel regions are stored in the same sub-block as far as possible, so that random reads are locally converted into sequential reads and the number of times the sub-blocks are opened is reduced. Thereby, data capacity, performance and cost can be reasonably balanced.
In the processing system according to the present disclosure, it is preferable that the first storage unit sequentially divides the data stream corresponding to the image sub-slice into the plurality of sub-blocks in byte order when the digital pathology image is partitioned, and fills a remaining space of one sub-block with blank data when the remaining space is insufficient to store the data stream corresponding to one image sub-slice. In this case, subsequent single image sub-slices can be acquired from one sub-partition, and the partitioning process can be simplified.
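The byte-order partitioning with blank-data padding described above can be sketched as follows. This is a minimal illustration and not the disclosed implementation: the names `SubSliceInfo` and `partition`, the use of zero bytes as blank data, and the choice to leave the final sub-block unpadded are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class SubSliceInfo:
    tile_index: int  # position of the image sub-slice in byte order
    sub_block: int   # index of the sub-block that holds the sub-slice
    offset: int      # byte offset relative to the start of that sub-block


def partition(tiles: List[bytes], block_size: int) -> Tuple[List[bytes], List[SubSliceInfo]]:
    """Pack raw sub-slice bytes into fixed-capacity sub-blocks, in order."""
    blocks: List[bytes] = []
    infos: List[SubSliceInfo] = []
    current = bytearray()
    for index, tile in enumerate(tiles):
        if len(tile) > block_size:
            raise ValueError("a sub-block must be able to hold one sub-slice")
        if len(current) + len(tile) > block_size:
            # Not enough room left: pad with blank bytes (assumed to be
            # zeros here) and start a new sub-block.
            current.extend(b"\x00" * (block_size - len(current)))
            blocks.append(bytes(current))
            current = bytearray()
        infos.append(SubSliceInfo(index, len(blocks), len(current)))
        current.extend(tile)
    if current:
        blocks.append(bytes(current))  # the last sub-block is left unpadded here
    return blocks, infos


# Three 4-byte sub-slices packed into 10-byte sub-blocks.
blocks, infos = partition([b"aaaa", b"bbbb", b"cccc"], block_size=10)
```

Because a sub-slice never straddles two sub-blocks, each one can later be located with a single (sub-block, offset) pair.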
In addition, in the processing system according to the present disclosure, optionally, the image storage unit further includes a second storage unit configured to store a full map of the digital pathology image, and the sub-slice information further includes a byte offset from the full map of the digital pathology image. Thus, the image sub-slices can be subsequently conveniently read from the full map of the digital pathology image according to the byte offset.
In addition, in the processing system according to the present disclosure, optionally, the parsing unit determines the at least one image sub-slice that needs to be read according to the data request; if the digital pathological image is stored in the first storage unit, acquiring a storage path of a sub-block where the image sub-slice is located and a byte offset relative to the sub-block based on sub-slice information of each image sub-slice, and then reading a data block with a byte size of the image sub-slice under the byte offset from the storage path to serve as a data stream corresponding to the image sub-slice; if the digital pathological image is stored in the second storage unit, acquiring a storage path of a full graph of the digital pathological image and a byte offset relative to the full graph of the digital pathological image, and then reading a data block with a byte size of the image sub-slice under the byte offset from the storage path to serve as a data stream corresponding to the image sub-slice; and responding to the data request with a data stream corresponding to the at least one image sub-slice. This enables a data stream corresponding to an image sub-slice to be acquired based on the byte offset.
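The two read paths described above (from a sub-block in the first storage unit, or from the full map in the second storage unit) both reduce to reading a fixed-size byte range at a known offset. The sketch below illustrates this under assumptions: the record layout of `SubSliceInfo` and the function names `read_range` and `fetch` are hypothetical, and local files stand in for the storage units.

```python
import os
import tempfile
from dataclasses import dataclass
from typing import Optional


@dataclass
class SubSliceInfo:
    byte_size: int
    block_path: Optional[str] = None    # sub-block file (first storage unit)
    block_offset: Optional[int] = None  # offset relative to the sub-block
    full_path: Optional[str] = None     # full-map file (second storage unit)
    full_offset: Optional[int] = None   # offset relative to the full map


def read_range(path: str, offset: int, size: int) -> bytes:
    """Read exactly `size` bytes starting at `offset`, without loading the file."""
    with open(path, "rb") as handle:
        handle.seek(offset)
        data = handle.read(size)
    if len(data) != size:
        raise EOFError("byte range extends past the end of the file")
    return data


def fetch(info: SubSliceInfo) -> bytes:
    """Return the data stream of one image sub-slice from whichever unit holds it."""
    if info.block_path is not None:
        return read_range(info.block_path, info.block_offset, info.byte_size)
    if info.full_path is None:
        raise ValueError("sub-slice information has no storage path")
    return read_range(info.full_path, info.full_offset, info.byte_size)


# Usage with a throwaway file standing in for a stored sub-block / full map.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "block_0.bin")
    with open(path, "wb") as out:
        out.write(b"0123456789")
    from_block = fetch(SubSliceInfo(byte_size=4, block_path=path, block_offset=3))
    from_full = fetch(SubSliceInfo(byte_size=2, full_path=path, full_offset=8))
```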
In addition, in the processing system related to the present disclosure, optionally, the metadata unit is further configured to record a blocking policy and blocking information, where the blocking information includes a blocking number and a storage path of each sub-block. Therefore, the blocking strategy can be conveniently adjusted and the blocking information can be conveniently managed.
In addition, in the processing system according to the disclosure, optionally, the processing system further includes a service registration unit configured to receive service registration requests and to register and manage services; after the image storage unit and the metadata unit start, they send a service registration request to the service registration unit to register themselves. Therefore, the blocking strategy can be conveniently adjusted and the blocking information can be conveniently managed.
In addition, in the processing system according to the present disclosure, optionally, the metadata includes the color channels of the digital pathology image, the image levels, and the pixel width and height of each image sub-slice of each image level; the byte size of an image sub-slice is its pixel width multiplied by its pixel height and by the number of color channels; and the sub-slice information further includes the byte size of the image sub-slice and the image level to which the image sub-slice belongs. In this case, information of the image sub-slice corresponding to each image level can be acquired quickly, and further, the image sub-slice corresponding to each image level can be acquired.
In addition, in the processing system according to the present disclosure, optionally, the first storage unit is further configured to acquire and store a macro map, a tag map, and a thumbnail of the digital pathology image, and the processing system further includes a data interface configured to receive the data request, where the data interface includes at least one of a deep zoom interface configured to acquire an image hierarchy that matches a size of a field of view of a display device, a metadata interface configured to acquire the metadata, a macro map interface configured to acquire the macro map, a thumbnail interface configured to acquire the thumbnail, a tag map interface configured to acquire the tag map, a tile interface configured to acquire the image sub-slice, and a target area interface configured to acquire a data stream corresponding to at least one image sub-slice or at least one image sub-slice in a target area. In this case, it is possible to easily acquire the relevant data of the digital pathology image in different usage scenes.
In addition, in the processing system related to the present disclosure, optionally, the processing system further includes a protocol adaptation unit, where the protocol adaptation unit adapts a reusable or non-reusable request manner based on the usage scenario of the data interface, where the reusable request manner includes HTTP/2-based gRPC protocol requests and socket-protocol-based requests, and the non-reusable request manner includes HTTP requests and RESTful API requests. In this case, the serialization requirements in computational analysis can be satisfied, and the overhead of connection establishment can be reduced. In addition, the method can be applied to random access for retrieval, browsing, zooming and view change.
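One way to picture the protocol adaptation is as a lookup from usage scenario to request manner. The sketch below is illustrative only: the scenario label `computational_analysis`, the protocol labels, and the rule that every other scenario falls back to the non-reusable manner are assumptions, since the disclosure does not spell out the mapping.

```python
from enum import Enum


class RequestManner(Enum):
    REUSABLE = "reusable"          # long-lived connection shared by many requests
    NON_REUSABLE = "non-reusable"  # one connection per request


# Protocol families named in the text, keyed by request manner.
PROTOCOLS = {
    RequestManner.REUSABLE: ("grpc-over-http2", "socket"),
    RequestManner.NON_REUSABLE: ("http", "restful-api"),
}

# Hypothetical scenario label: batch or streaming reads during computational
# analysis benefit from avoiding repeated connection setup.
STREAMING_SCENARIOS = {"computational_analysis"}


def adapt(scenario: str) -> RequestManner:
    """Pick a request manner for a usage scenario (assumed mapping)."""
    if scenario in STREAMING_SCENARIOS:
        return RequestManner.REUSABLE
    # Random access such as retrieval, browsing, zooming and view changes.
    return RequestManner.NON_REUSABLE


analysis_manner = adapt("computational_analysis")
browsing_manner = adapt("browsing")
```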
In addition, in the processing system according to the present disclosure, optionally, when the parsing unit reads the digital pathology image, the handle opened by the digital pathology image is shared and a data stream corresponding to the at least one image sub-slice is read at a time. In this case, the data stream corresponding to at least one image sub-slice can be read quickly and the number of times the full map is opened can be reduced.
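Sharing an opened handle across several sub-slice reads can be sketched by grouping the requested byte ranges by file and opening each file once. This is a simplified illustration under assumptions: `read_batch` and its `(path, offset, size)` request tuples are hypothetical names, and a local file stands in for a sub-block or full map.

```python
import os
import tempfile
from collections import defaultdict
from typing import Dict, List, Tuple


def read_batch(requests: List[Tuple[str, int, int]]) -> List[bytes]:
    """Read many (path, offset, size) ranges, opening each file only once."""
    by_path: Dict[str, List[Tuple[int, int, int]]] = defaultdict(list)
    for position, (path, offset, size) in enumerate(requests):
        by_path[path].append((offset, size, position))
    results: List[bytes] = [b""] * len(requests)
    for path, ranges in by_path.items():
        with open(path, "rb") as handle:  # one shared handle per file
            # Ascending offsets keep the reads as sequential as possible.
            for offset, size, position in sorted(ranges):
                handle.seek(offset)
                results[position] = handle.read(size)
    return results


# Usage: three sub-slices requested out of order from the same file.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "block_0.bin")
    with open(path, "wb") as out:
        out.write(b"AAAABBBBCCCC")
    streams = read_batch([(path, 8, 4), (path, 0, 4), (path, 4, 4)])
```

The results are returned in request order even though the reads themselves are issued in offset order.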
According to the present disclosure, a data stream processing system for digital pathology images is provided that is capable of storing and sharing digital pathology images based on data streams quickly and at a low cost.
Drawings
The disclosure will now be explained in further detail by way of example only with reference to the accompanying drawings, in which:
Fig. 1 is a schematic diagram illustrating a digital pathology image to which an example of the present disclosure relates.
Fig. 2 is a schematic diagram illustrating an environment of a processing system for data flow of digital pathology images to which examples of the present disclosure relate.
Fig. 3 is a block diagram illustrating an example of a processing system to which examples of the present disclosure relate.
Fig. 4 is a block diagram showing an image storage unit according to an example of the present disclosure.
Fig. 5 is a block diagram illustrating another example of a processing system to which examples of the present disclosure relate.
Fig. 6 is a flowchart illustrating an example of a data stream processing method of a digital pathology image according to an example of the present disclosure.
Detailed Description
Hereinafter, preferred embodiments of the present disclosure will be described in detail with reference to the accompanying drawings. In the following description, the same components are denoted by the same reference numerals, and redundant description thereof is omitted. The drawings are schematic and the ratio of the dimensions of the components and the shapes of the components may be different from the actual ones.
It is noted that the terms "comprises," "comprising," and "having," and any variations thereof, as used in this disclosure are intended to cover a non-exclusive inclusion; for example, a process, method, system, article, or apparatus that comprises or has a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include or have other steps or elements not expressly listed or inherent to such process, method, article, or apparatus. All methods described in this disclosure can be performed in any suitable order unless otherwise indicated herein or otherwise clearly contradicted by context.
The processing system of the data stream of the digital pathological image can store and share the digital pathological image based on the data stream quickly and with low cost. The system for processing a data stream of digital pathology images to which the present disclosure relates may also be sometimes referred to as a processing system, a data stream server, an image processing system, or an image sharing system. The processing system to which the present disclosure relates may be stateless, may operate on a stand-alone basis and may be scalable to accommodate changes in storage requirements or data retrieval requirements. The data retrieval may be by remotely or locally obtaining a full or partial image of the digital pathology image to review the digital pathology image or by processing the digital pathology image (e.g., computational analysis using a data stream of the digital pathology image). The processing system to which the present disclosure relates is particularly suited for the invocation of batch or streaming data streams in the artificial intelligence processing and/or computational analysis of digital pathology images.
The processing system to which the present disclosure relates may provide a data stream of a plurality of digital pathology images to a computing system. The computing system may be an artificial intelligence technology-based system, and may perform computational analysis on the digital pathology image to obtain a lesion region and/or a classification result of the digital pathology image. In some examples, the computing system may have a plurality of computing tasks, the computing system may cache a data stream corresponding to each digital pathological image such that the plurality of computing tasks perform the computational analysis on each digital pathological image in parallel by multiplexing the data stream, and after the computational analysis of each digital pathological image is completed, the data stream may be deleted from the cache. In general, digital pathology images are very large, and computing systems may involve training of a large number of samples of digital pathology images or simultaneous recognition of multiple digital pathology images. In particular, it also often involves multiple computational tasks to perform computational analysis on the full or partial images of each digital pathology image in parallel. In this case, the computing system is provided with a data stream of the digital pathology image, which can be conveniently cached to avoid opening handles of the digital pathology image or the sub-blocks multiple times, which in turn can reduce the overhead of the data provider (e.g., processing system). Thereby, highly concurrent data stream requests can be supported.
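The cache-then-multiplex behaviour of the computing system described above might look like the following sketch. It is an assumption-laden illustration: `StreamCache`, `analyze`, and the stand-in fetch function are hypothetical, and threads are used only to show several tasks consuming one cached stream.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


class StreamCache:
    """Holds the data stream of each image while its analysis is running."""

    def __init__(self, fetch: Callable[[str], bytes]) -> None:
        self._fetch = fetch
        self._data: Dict[str, bytes] = {}
        self.fetch_count = 0  # how often the data provider was actually asked

    def get(self, image_id: str) -> bytes:
        if image_id not in self._data:
            self._data[image_id] = self._fetch(image_id)
            self.fetch_count += 1
        return self._data[image_id]

    def evict(self, image_id: str) -> None:
        self._data.pop(image_id, None)

    def __contains__(self, image_id: str) -> bool:
        return image_id in self._data


def analyze(image_id: str, cache: StreamCache, tasks: List[Callable[[bytes], int]]) -> List[int]:
    stream = cache.get(image_id)  # fetched once, before the tasks start
    with ThreadPoolExecutor() as pool:
        # Every task reads the same cached bytes in parallel.
        results = list(pool.map(lambda task: task(stream), tasks))
    cache.evict(image_id)  # drop the stream once the analysis is complete
    return results


# Usage: a stand-in fetch function and three toy "computing tasks".
cache = StreamCache(lambda image_id: bytes(range(10)))
results = analyze("slide-001", cache, [len, sum, max])
```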
Fig. 1 is a schematic diagram illustrating a digital pathology image to which an example of the present disclosure relates.
The digital pathology image to which the present disclosure relates may be a pathology image obtained by scanning a pathology slide by a digital scanner. In some examples, the digital pathology image may be at least one of a black and white image, a grayscale image, and a color image (e.g., an RGB image).
In addition, the digital pathology image may be a pyramid image having different resolutions (i.e., the digital pathology image may include images of multiple resolutions). Fig. 1 shows a schematic of such a digital pathology image. As shown in Fig. 1, the digital pathology image may have a plurality of image levels. For example, the number of image levels may be n, and the n image levels may include an image level L1, an image level L2, an image level L3, ..., an image level Ln-2, an image level Ln-1, and an image level Ln.
In addition, the resolution of the images at each image level may be different. In practical applications, images of suitable image levels can be acquired according to the use scene for processing and/or displaying. In some examples, the digital pathology image may also be an image of one image level. The pathology analysis system according to the present disclosure may be applied to a digital pathology image having images in a plurality of image levels, and may also be applied to a digital pathology image having an image in one image level.
In some examples, the images of the respective image levels have a plurality of image sub-slices (image sub-slices may also be referred to as tiles). That is, each image level may include a plurality of image sub-slices. The image sub-slices may typically be tens of KB (kilobytes) or several KB in size. In some examples, the size of the image sub-slices in the digital pathology image may be obtained by extracting metadata from the digital pathology image. In some examples, the file header of the digital pathology image may be parsed with a toolkit provided by the vendor of the imaging device to obtain the metadata. The metadata may be used to describe attributes of the digital pathology image, thereby enabling convenient positioning of image sub-slices in the digital pathology image.
In some examples, the metadata may include at least one of color channels of the digital pathology image, image levels, and pixel widths and heights of individual image sub-slices of the image levels. In some examples, the image levels of the digital pathology image and the size of the image sub-slices of the respective image levels may be obtained using the metadata. In some examples, the pixel width and height of an image sub-slice in the metadata may be multiplied by the number of color channels to obtain the byte size of the image sub-slice. Thereby, the byte size of the image sub-slice can be acquired based on the metadata.
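As a worked example of the byte-size rule above, the sketch below computes the size of one image sub-slice and how many such sub-slices fit in a sub-block. The function names, the 128 x 128 RGB tile, the 1,000,000-byte sub-block, and the default of one byte per channel value are assumptions chosen for illustration.

```python
def sub_slice_byte_size(width_px: int, height_px: int, channels: int,
                        bytes_per_channel: int = 1) -> int:
    """Byte size of one image sub-slice: width x height x color channels.

    bytes_per_channel=1 assumes 8-bit channel values; the text only
    specifies the width x height x channels product.
    """
    return width_px * height_px * channels * bytes_per_channel


def sub_slices_per_block(block_size: int, sub_slice_size: int) -> int:
    """How many whole sub-slices fit in one sub-block (the rest is padding)."""
    return block_size // sub_slice_size


# A 128 x 128 RGB sub-slice occupies 128 * 128 * 3 = 49152 bytes (48 KB).
size = sub_slice_byte_size(128, 128, 3)
# A hypothetical 1,000,000-byte sub-block holds 20 such sub-slices.
count = sub_slices_per_block(1_000_000, size)
```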
In some examples, the metadata may further include at least one of a file name, a file size, and a scanning time (i.e., an imaging time) of the digital pathology image. In some examples, the metadata may further include at least one of a scanner manufacturer, a scan magnification, a total number of image sub-slices, a unit pixel area, an image compression rate, an image quality, a color coding, a pixel width and height of the entire image, a pixel width and height of the non-empty region of the entire image, the overlapping pixels between image sub-slices, and the number of image sub-slices of each image level. The scan magnification is expressed as a numerical magnification factor; for example, the scan magnification may be 40x. The unit pixel area may be the size of the area of the body part corresponding to one pixel. The color coding may correspond to the color channels; for example, if the color channels are the three channels R, G, and B, the color coding may be RGB.
In addition, digital pathology images such as Whole Slide Images (WSI) are typically very large; the size of a WSI may be 600 MB to 10 GB, so conventional image storage and sharing methods are generally not suitable for digital pathology images. For example, the storage of digital pathology images often places high demands on the performance of the storage medium, and the sharing of digital pathology images is also easily limited by the efficiency of reading and transmitting the digital pathology images.
Fig. 2 is a schematic diagram illustrating an environment of a processing system for data flow of digital pathology images to which examples of the present disclosure relate. In addition, the environment described in the examples of the present disclosure is for more clearly illustrating the technical solutions of the present disclosure, and does not constitute a limitation on the technical solutions provided by the present disclosure.
In some examples, the processing system to which the present disclosure relates may be applied in an environment 100 as shown in fig. 2. The environment 100 may include a client device 102, a storage device 104, a metadata device 106, and an interface device 108 communicatively coupled to each other via a network. The computing devices that may be used to implement the client device 102, the storage device 104, the metadata device 106, and the interface device 108 may be configured in various ways. In some examples, the computing device may include, but is not limited to, a mobile device, a laptop computer, a desktop computer, and the like. In some examples, the computing device may be a server. For example, the computing device may be a cloud server. Additionally, the computing device may represent a plurality of different devices. For example, the computing device may represent a plurality of cloud servers.
In the environment 100, the acquisition unit 110 of the client device 102 may be used to acquire a digital pathology image (see fig. 2) and store the digital pathology image to the storage device 104. The image storage unit 120 of the storage device 104 reads the metadata at the time of the digital pathology image storage, may perform byte-level blocking on the digital pathology image based on the metadata and in byte order to obtain a plurality of sub-blocks (see fig. 2), wherein each sub-block includes a plurality of image sub-slices of adjacent regions, then stores the plurality of sub-blocks and generates sub-slice information of each image sub-slice, and stores the metadata and the sub-slice information to the metadata unit 130 of the metadata device 106.
Additionally, in environment 100, data retrieval may be performed on stored digital pathology images. Specifically, after the interface device 108 receives the data request, the parsing unit 140 may obtain the sub-slice information of the at least one image sub-slice from the metadata device 106 based on the data request, and obtain the at least one image sub-slice and/or the data stream corresponding to the at least one image sub-slice from the storage device 104 based on the sub-slice information.
The processing system to which the present disclosure relates takes the application scenarios of digital pathology images into account. When the computing system performs computational analysis, digital pathology images are typically read in order: for example, starting from the starting point of a specific zoom magnification (i.e., a specific image level) of the digital pathology image, portions of the image are read one by one at a certain size until the whole image has been read. In the medical field, by contrast, scenarios such as slide reading and consultation mainly involve random data reads, which place very high performance demands on the underlying storage. Accordingly, when storing a digital pathology image, the processing system partitions the image at byte level, in byte order and based on the metadata, and stores it so that image sub-slices with similar pixel regions (i.e., a plurality of image sub-slices of adjacent regions) are stored in the same sub-block as far as possible. In this case, random reads are locally converted into sequential reads. Thereby, capacity, performance and cost can be balanced against one another.
Fig. 3 is a block diagram illustrating an example of a processing system to which examples of the present disclosure relate.
As shown in Fig. 3, in some examples, the processing system may include an acquisition unit 110, an image storage unit 120, a metadata unit 130, and a parsing unit 140. In some examples, the number of image storage units 120 may be one or more. A plurality of image storage units 120 may constitute a distributed file storage system. In some examples, the number of metadata units 130 may be one or more. A plurality of metadata units 130 may constitute a distributed storage system. In some examples, the number of parsing units 140 may be one or more. Multiple parsing units 140 may be deployed in a load-balanced manner. In this case, data requests can be distributed across the respective parsing units 140. This can improve the load capacity.
In some examples, as shown in fig. 3, the processing system may include an acquisition unit 110. The acquisition unit 110 may be configured to acquire and store a digital pathology image.
In some examples, the manner in which the acquisition unit 110 acquires the digital pathology image may include at least one of: upload by a client program that monitors files, upload through a shared path, upload through a visualization interface (e.g., through a browser or client), upload by file copy (e.g., copying from a USB flash drive or portable hard drive), and upload from a scanner. Therefore, the digital pathology image can be acquired in various ways. In the monitored upload, the client program may monitor changes in the file system through file I/O operations to acquire the digital pathology image. Specifically, file upload may be implemented by recursively monitoring file events under a specific storage path; both changes to files and changes to file directories may be monitored. In some examples, by restricting the file extension (which may also be referred to as a suffix), only changes to digital pathology images are monitored. This can improve efficiency. In the scanner upload, a scanner may communicate with the processing system to upload the digital pathology images it obtains.
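The monitored upload described above can be illustrated with a minimal sketch. It assumes a polling approach (comparing two recursive directory snapshots) in place of operating-system file-event notifications, and the extension set `WSI_EXTENSIONS` and the function names are hypothetical; the disclosure does not name specific file formats or an implementation.

```python
import os

# Hypothetical extension filter; the disclosure does not list specific formats.
WSI_EXTENSIONS = {".svs", ".tiff", ".ndpi"}


def snapshot(root, extensions=WSI_EXTENSIONS):
    """Recursively map each matching file under root to (size, mtime_ns)."""
    seen = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Only files whose extension is in the filter are tracked.
            if os.path.splitext(name)[1].lower() in extensions:
                path = os.path.join(dirpath, name)
                stat = os.stat(path)
                seen[path] = (stat.st_size, stat.st_mtime_ns)
    return seen


def changed_files(previous, current):
    """Return paths that are new, or whose size or mtime changed."""
    return sorted(p for p, sig in current.items() if previous.get(p) != sig)
```

A caller would take a snapshot periodically and pass each path returned by `changed_files` to the upload step; files with other extensions never enter the comparison.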
In some examples, the acquisition unit 110 may acquire digital pathology images of different disease species. In some examples, the disease species may include, but are not limited to, cervical fluid-based cytology, thyroid cytology, urothelial cytology, histopathology, and pleural effusion cytology, among others.
In some examples, the acquisition unit 110 may store the digital pathology image to the image storage unit 120 (described later). The image storage unit 120 may include a first storage unit 121 and/or a second storage unit 122. In some examples, the acquisition unit 110 may store the digital pathology image to the first storage unit 121 and/or the second storage unit 122 of the image storage unit 120.
In some examples, the acquisition unit 110 may store the digital pathology image under a shared path of the image storage unit 120. In some examples, the shared path may be implemented through a shared mount. For example, the shared path may be mounted under a corresponding directory of the image storage unit 120. In some examples, the shared mount may be implemented by, but is not limited to, SAMBA software (free software that implements the SMB, i.e., Server Message Block, protocol on Linux and UNIX systems), the NFS (Network File System) protocol, the iSCSI (Internet Small Computer System Interface) protocol, and the like. In some examples, the shared path may be a mount directory of a shared access path and/or of distributed file storage within an organization such as a hospital.
In some examples, as shown in fig. 3, the processing system may include an image storage unit 120. The image storage unit 120 may be used to store digital pathology images. In some examples, the image storage unit 120 may be used to store a full map and/or sub-blocks of the digital pathology image. In some examples, the image storage unit 120 may store the digital pathology image in blocks and/or directly. In this case, multiple storage modes are supported: based on these two modes, the digital pathology image can be stored in blocks while remaining compatible with a third-party storage system (for example, an existing storage system in a hospital). Therefore, the storage cost of the third-party storage system can be taken into account while ever-increasing storage requirements are met.
In some examples, the storage location of the digital pathology image may be maintained by a metadata unit 130 (described later). Specifically, the metadata unit 130 may record a storage medium (e.g., the first storage unit 121 and/or the second storage unit 122) of the digital pathology image, and a storage path.
Fig. 4 is a block diagram illustrating the image storage unit 120 according to examples of the present disclosure.
In some examples, as shown in fig. 4, the image storage unit 120 may include a first storage unit 121 and/or a second storage unit 122. In some examples, the image storage unit 120 may support distributed deployment. That is, the first storage unit 121 and/or the second storage unit 122 in the image storage unit 120 may each comprise a plurality of storage systems that collectively provide a storage service. It is to be noted that the following description of the image storage unit 120 applies equally to the first storage unit 121 and the second storage unit 122 unless there is a contradiction.
In some examples, the image storage unit 120 may store the digital pathology image based on metadata of the digital pathology image. In some examples, the metadata may be read by the image storage unit 120, for instance when the digital pathology image is written. In some examples, once storage of the digital pathology image is complete, the image storage unit 120 may generate an image number that uniquely identifies the digital pathology image. In some examples, the image number may be calculated by combining the file name, file size (i.e., byte volume), and imaging time of the digital pathology image given in the metadata. In some examples, the image number is irreversible. For example, the file name, file size, and imaging time of a digital pathology image cannot be inferred from the image number.
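One way to obtain an image number that combines these three fields and cannot be reversed is a cryptographic hash. The sketch below assumes SHA-256 and a unit-separator delimiter; the disclosure does not specify the algorithm, and the function name `image_number` is hypothetical.

```python
import hashlib


def image_number(file_name: str, file_size: int, imaging_time: str) -> str:
    """Derive a fixed-length identifier from name, size, and imaging time.

    Assumption: SHA-256 stands in for the unspecified combining function.
    """
    # A delimiter that cannot appear in ordinary file names keeps the
    # three fields from running into one another before hashing.
    payload = "\x1f".join([file_name, str(file_size), imaging_time])
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

The same inputs always yield the same number, while changing any one field yields a different one, and the inputs cannot be read back from the digest.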
In some examples, the first storage unit 121 may be a native distributed file storage system, and the second storage unit 122 may be a storage system mounted via the NFS, SAMBA, or iSCSI protocol (which may also be referred to as a third-party storage system). In this case, the storage requirements of digital pathology images in different environments and at different stages can be met.
As described above, in some examples, the image storage unit 120 may include the first storage unit 121. In some examples, the first storage unit 121 may be used to store the digital pathology image in blocks. In addition, in the blocking storage, the digital pathology image may be blocked to obtain a plurality of sub-blocks, and then the plurality of sub-blocks may be stored.
In some examples, the first storage unit 121 may store the digital pathology image in blocks according to a blocking policy. In some examples, the blocking policy may include the byte size of the sub-blocks. In this case, the digital pathology image can be blocked at a given byte size. In some examples, the byte size of the sub-blocks may be set according to the byte size of the digital pathology image. For example, the byte size of the sub-blocks may be positively correlated with the byte size of the digital pathology image. This can further reduce the number of sub-blocks. In some examples, the blocking policy may also include a blocking order. For example, the blocking order may include at least one of forward order, reverse order, by image level, or custom. In some examples, the blocking policy may also include a blocking range. The blocking range may specify the data blocks in the digital pathology image that are available for blocking.
As described above, the image levels of the digital pathology image and the byte size of each image sub-slice at each image level can be acquired from the metadata. In some examples, when the digital pathology image is written, the first storage unit 121 may perform byte-level blocking of the digital pathology image based on the byte size of the image sub-slices to acquire a plurality of sub-blocks, and then store the plurality of sub-blocks. In this case, because the digital pathology image is blocked at the byte level, image encoding and decoding operations are avoided and the consumption of hardware resources (e.g., CPU) can be reduced. In addition, this avoids generating a massive number of small files and reduces the difficulty of file addressing, which in turn lowers the performance requirements on the storage medium and allows higher-performance data storage and retrieval on ordinary mechanical storage media.
In some examples, the first storage unit 121 may block the digital pathology image by assigning each image sub-slice to a corresponding sub-block. In some examples, during blocking, the byte offset of each image sub-slice relative to the full map of the digital pathology image may be obtained based on the file size of the digital pathology image, the image levels, the number of image sub-slices at each image level, and the byte size of each image sub-slice. The data segment corresponding to the image sub-slice (i.e., the data stream corresponding to the image sub-slice) is then obtained based on the byte size of the image sub-slice and its byte offset relative to the full map, and this data segment is assigned to a sub-block. This enables the digital pathology image to be blocked.
In some examples, the first storage unit 121 may block the digital pathology image at the byte level in byte order. Specifically, the data streams corresponding to the image sub-slices may be sequentially divided into a plurality of sub-blocks in byte order. In some examples, the byte order may be forward or reverse.
In some examples, when blocking the digital pathology image, the first storage unit 121 may sequentially divide the data streams corresponding to the image sub-slices into a plurality of sub-blocks in byte order and, when the remaining space of a sub-block is not enough to store the data stream of one image sub-slice, fill the remaining space with blank data. That is, every image sub-slice in each sub-block may be complete. In this case, a single image sub-slice can subsequently be acquired from one sub-block, and the blocking process can be simplified. In other examples, when the remaining space of a sub-block is not enough to store one image sub-slice, the remaining portion of the image sub-slice may instead be stored in another sub-block. In this case, a single image sub-slice can correspond to one or more sub-blocks. Thus, storage space can be saved.
In other examples, the blocking may not follow byte order. The byte streams corresponding to multiple image sub-slices at specific locations, e.g., at the same locations in each image level, may be merged to generate a plurality of sub-blocks. For example, the 1st to 50th image sub-slices of each image level may be merged. It should be noted that the specific locations are not particularly limited and depend on the blocking policy in use. In this case, if the blocking policy is adjusted according to data retrieval patterns, sub-blocks that better match the data retrieval requirements can be generated.
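The offset calculation can be sketched as follows. The layout is an assumption made for illustration only: a header occupies the start of the file, and the image sub-slices follow contiguously, level by level, so the header length is the file size minus the total sub-slice payload. Real scanner formats differ, and the function name `slice_offsets` is hypothetical.

```python
def slice_offsets(file_size, slices_per_level, slice_size_per_level):
    """Map (level, index) to (byte offset in the full image, byte size).

    Assumed layout: [header][level 0 slices][level 1 slices]...
    """
    payload = sum(n * s for n, s in zip(slices_per_level, slice_size_per_level))
    offset = file_size - payload  # assumed header length
    if offset < 0:
        raise ValueError("sub-slices do not fit in the stated file size")
    table = {}
    for level, (count, size) in enumerate(
        zip(slices_per_level, slice_size_per_level)
    ):
        for index in range(count):
            table[(level, index)] = (offset, size)
            offset += size  # the next sub-slice starts where this one ends
    return table
```

Each entry gives the data segment of one image sub-slice as an (offset, size) pair, which is what the blocking step needs in order to assign the segment to a sub-block.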
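The two placement variants in this paragraph can be compared with a short sketch. The function names and the tuple layouts are hypothetical, and sub-slice sizes are assumed to be positive.

```python
def pack_whole(slice_sizes, block_size):
    """Place each sub-slice entirely inside one sub-block (padding variant).

    Returns one (block_no, offset_in_block) pair per sub-slice.
    """
    placements = []
    block_no, used = 0, 0
    for size in slice_sizes:
        if size > block_size:
            raise ValueError("sub-slice larger than sub-block")
        if used + size > block_size:
            # The unused tail of the current sub-block is left as padding.
            block_no += 1
            used = 0
        placements.append((block_no, used))
        used += size
    return placements


def pack_spanning(slice_sizes, block_size):
    """Let a sub-slice continue into the next sub-block (no padding).

    Returns one (first_block, offset_in_first_block, last_block) per sub-slice.
    """
    placements = []
    position = 0
    for size in slice_sizes:
        first, offset = divmod(position, block_size)
        last = (position + size - 1) // block_size
        placements.append((first, offset, last))
        position += size
    return placements
```

With three 40-byte sub-slices and 100-byte sub-blocks, the padding variant moves the third sub-slice to the second sub-block and wastes 20 bytes, while the spanning variant splits it across the first and second sub-blocks.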
In some examples, the first storage unit 121 may generate sub-slice information of each image sub-slice after the plurality of sub-blocks are completed. In some examples, the sub-slice information may be recorded to a metadata unit 130 (described later). Thus, the image sub-slice can be subsequently conveniently acquired based on the sub-slice information.
Specifically, when the digital pathology image is written, the first storage unit 121 may read the metadata, obtain the byte size of the image sub-slices by multiplying the pixel width, the pixel height, and the number of color channels of the image sub-slices given in the metadata, perform byte-level blocking of the digital pathology image in byte order based on that byte size to acquire a plurality of sub-blocks, and then store the plurality of sub-blocks and generate sub-slice information for each image sub-slice. Thereby, the digital pathology image can be blocked at the byte level based on the metadata, and the sub-slice information can be generated.
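The write path just described can be sketched end to end on an in-memory byte string. The sketch assumes one byte per color channel (as implied by the width x height x channels product), equal-sized sub-slices, and sub-blocks trimmed to a whole number of sub-slices; `sub_slice_byte_size` and `block_image` are hypothetical names.

```python
def sub_slice_byte_size(width, height, channels):
    """Byte size of one sub-slice, assuming one byte per channel."""
    return width * height * channels


def block_image(data, slice_size, block_size):
    """Cut data into sub-blocks holding whole sub-slices, in byte order.

    Returns (blocks, info) where info has one (block_no, offset, size)
    tuple per sub-slice, in the order the sub-slices appear in data.
    """
    per_block = block_size // slice_size
    if per_block == 0:
        raise ValueError("sub-block smaller than one sub-slice")
    stride = per_block * slice_size  # bytes of sub-slice data per sub-block
    blocks, info = [], []
    for block_no, start in enumerate(range(0, len(data), stride)):
        chunk = data[start:start + stride]
        blocks.append(chunk)
        for offset in range(0, len(chunk), slice_size):
            size = min(slice_size, len(chunk) - offset)
            info.append((block_no, offset, size))
    return blocks, info
```

No pixel data is decoded at any point: the function only slices bytes, which is the property the paragraph above relies on to avoid encoding and decoding cost.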
In other examples, the sub-slice information may not be generated at the time of storage, and may be obtained based on metadata and a blocking policy at the time of requesting acquisition of the image sub-slices.
As described above, the digital pathology image may be blocked to obtain a plurality of sub-blocks. In some examples, each sub-block may include a plurality of image sub-slices. In this case, the number of sub-blocks and the blocking time can be reduced, and the storage paths of the sub-blocks can be simplified, thereby improving the efficiency of storing and reading the sub-blocks. In addition, the performance requirements on the storage medium are low, and the storage cost can be reduced.
In some examples, each sub-block may include a plurality of image sub-slices of adjacent regions, and the byte size of the sub-block may be greater than the byte size of a data stream provided to a data consumption system (e.g., a computing system). In this case, image sub-slices with similar pixel regions can be stored in the same sub-block as far as possible, so that random reads are locally converted into sequential reads. Thereby, the number of times a sub-block is opened can be reduced.
In some examples, the byte sizes of the individual sub-blocks may be the same. This can simplify the blocking process. In some examples, the byte size of each sub-block may be equal to or greater than a preset size. For example, the preset size may be 16M, 32M, 64M, or 128M. In this case, the byte size of each sub-block is much larger than the size of the data stream requested by the computing system. Thus, the data stream corresponding to one request can, as far as possible, be read from a single sub-block.
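As an illustration of a sub-block size that grows with the image size, the sketch below picks the smallest of the preset sizes that keeps the number of sub-blocks under a cap. The cap `max_blocks`, the reading of "16M" as 16 MiB, and the function name are all assumptions; the disclosure only states that the two sizes may be positively correlated.

```python
MIB = 1024 * 1024
# Preset sizes from the text, read here as MiB (an assumption).
PRESET_SIZES = (16 * MIB, 32 * MIB, 64 * MIB, 128 * MIB)


def choose_block_size(file_size, max_blocks=64):
    """Pick the smallest preset that yields at most max_blocks sub-blocks."""
    for size in PRESET_SIZES:
        blocks_needed = -(-file_size // size)  # ceiling division
        if blocks_needed <= max_blocks:
            return size
    # Very large images fall back to the largest preset.
    return PRESET_SIZES[-1]
```

Under these assumptions a 100 MiB image uses 16 MiB sub-blocks and a 2 GiB image uses 32 MiB sub-blocks, so larger images never receive smaller sub-blocks.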
In some examples, each sub-partition has corresponding partition information. In some examples, the blocking information may be generated after the first storage unit 121 stores the plurality of sub-blocks. In some examples, the partition information may include partition numbers and storage paths of the respective sub-partitions. In this case, subsequently after determining the sub-partition in which the image sub-slice is located, the corresponding sub-partition can be read according to the storage path.
In some examples, the first storage unit 121 may be further configured to acquire and store at least one of a macro map, a label map, and a thumbnail of the digital pathology image. The macro map may be an image that reflects the actual appearance of the pathology slide (i.e., the image seen by the human eye). The label map may be an image bearing a label, such as a two-dimensional code, that identifies a pathology number. The macro map and the label map may be acquired from a specific location within the full map of the digital pathology image or directly from a separately stored location. How the macro map and the label map are obtained depends on the manufacturer of the scanner.
As described above, in some examples, the image storage unit 120 may include the second storage unit 122. In some examples, the second storage unit 122 may be used to directly store the digital pathology image. In addition, in the direct storage, the entire map of the digital pathology image may be stored. In some examples, the sub-slice information may be generated after storing the full map of the digital pathology image. In other examples, the sub-slice information may not be generated at the time of storage, and may be obtained based on metadata at the time of the request to acquire the image sub-slices.
As described above, in some examples, the sub-slice information may be generated when storing the digital pathology image. In some examples, if the digital pathology image is stored in blocks by the first storage unit 121, the sub-slice information may include the sub-block in which the image sub-slice is located and the byte offset relative to that sub-block. Therefore, the image sub-slice and/or its corresponding data stream can be read from the sub-block according to the byte offset. In some examples, if the digital pathology image is stored directly by the second storage unit 122, the sub-slice information may include the byte offset relative to the full map of the digital pathology image. Therefore, the image sub-slice and/or its corresponding data stream can be read from the full map according to the byte offset. In some other examples, the sub-slice information may include the sub-block in which the image sub-slice is located, the byte offset relative to that sub-block, and the byte offset relative to the full map of the digital pathology image. Thereby, image sub-slices and/or their corresponding data streams can be acquired from both the sub-blocks and the full map of the digital pathology image.
In some examples, the sub-slice information may also include a byte size of the image sub-slice. Thus, the image sub-slice can be conveniently acquired subsequently according to the byte size and the byte offset.
In some examples, the sub-slice information may also include an image level to which the image sub-slice belongs. In this case, information of the image sub-slice corresponding to each image level can be acquired quickly, and further, the image sub-slice corresponding to each image level can be acquired.
In some examples, as shown in fig. 3, the processing system may include a metadata unit 130. The metadata unit 130 may be used to record data other than the file stream of the digital pathology image. In some examples, metadata unit 130 may support a distributed deployment. In some examples, the metadata unit 130 may be used to record metadata read by the image storage unit 120 (i.e., the first storage unit 121 and/or the second storage unit 122). In this case, the image sub-slices can subsequently be acquired based on the recorded metadata, and the number of reads of the digital pathology image or sub-blocks can be reduced.
In some examples, the metadata unit 130 may also be used to record sub-slice information. In this case, subsequent reading of image sub-slices can be facilitated and adjustment of the blocking strategy can be accommodated. In some examples, for the sub-slice information corresponding to the first storage unit 121, the metadata unit 130 may record a primary key, a secondary key, a sub-partition number where the image sub-slice is located, and an offset amount with respect to the sub-partition, wherein the primary key may be an image number of the digital pathology image, and the secondary key may be composed of an image level and a number corresponding to the image sub-slice. In this way, the sub-slice information of one image sub-slice can be uniquely specified from the primary key and the secondary key.
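The primary-key and secondary-key lookup can be sketched with an in-memory dictionary standing in for the metadata unit. The `"level:number"` key format, the `SubSliceInfo` fields, and the function names are hypothetical; the disclosure does not fix a storage engine or an encoding.

```python
from typing import Dict, NamedTuple, Tuple


class SubSliceInfo(NamedTuple):
    block_no: int  # number of the sub-block holding the sub-slice
    offset: int    # byte offset relative to that sub-block
    size: int      # byte size of the sub-slice


def secondary_key(level: int, slice_no: int) -> str:
    """Compose the secondary key from image level and sub-slice number."""
    return f"{level}:{slice_no}"


def record(index: Dict[Tuple[str, str], SubSliceInfo],
           image_no: str, level: int, slice_no: int,
           info: SubSliceInfo) -> None:
    """Store sub-slice information under (primary key, secondary key)."""
    index[(image_no, secondary_key(level, slice_no))] = info


def lookup(index: Dict[Tuple[str, str], SubSliceInfo],
           image_no: str, level: int, slice_no: int) -> SubSliceInfo:
    """Fetch the unique sub-slice information for the given keys."""
    return index[(image_no, secondary_key(level, slice_no))]
```

Because the pair of keys is unique per image sub-slice, a single lookup yields the sub-block number and offset needed for the subsequent read.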
As described above, in some examples, the sub-slice information may also include a byte size of the image sub-slice. The metadata unit 130 may also be used to record the byte size of the image sub-slices.
In some examples, the metadata unit 130 may also be used to record node information of the image storage unit 120. In some examples, the node information may include access addresses, access protocols, and replica information. In this case, the data stored in the image storage unit 120 can be acquired based on the access protocol of the image storage unit 120. In addition, the access address may include an IP address or domain name, and a port. In addition, the copy information may include a location where a copy of the data stored in the image storage unit 120 is located. In some examples, the node information may further include a type of the storage medium (i.e., the first storage unit 121 or the second storage unit 122). Thus, the corresponding setting can be made according to the performance of the storage medium. In some examples, the metadata unit 130 may also be used to record a blocking policy and blocking information. Therefore, the blocking strategy can be conveniently adjusted and the blocking information can be conveniently managed. In some examples, the metadata unit 130 may also be used to record the storage path of the full map of digital pathology images.
In some examples, as shown in fig. 3, the processing system may include a parsing unit 140. Parsing unit 140 may be configured to parse the received data request to determine the data requested by the data request.
In some examples, the data requested by the data request may be data related to a digital pathology image. The related data of the digital pathology image may be data acquired based on a file stream of the digital pathology image (e.g., a data stream of the digital pathology image).
In some examples, the related data of the digital pathology image may include at least one of metadata, a macro map, a thumbnail, a label map, a tile map (i.e., an image sub-slice), and target region data. The format of the related data may include, but is not limited to, numbers, text, pictures, or data streams. The macro map, the label map, and the thumbnail may be extracted and stored when the image storage unit 120 (e.g., the first storage unit 121) stores the digital pathology image. The target region data may be data composed of at least one image sub-slice within the target region. The target region may be a region in the digital pathology image and may be determined according to the actual usage scenario. For example, when the digital pathology image is browsed, the target region may be the region within the field of view, and when computational analysis is performed, the target region may be the region of the digital pathology image to be analyzed. In some examples, the target region may be a region of interest in the digital pathology image or a region cropped from the digital pathology image.
In some examples, the data request may include an identification of the digital pathology image (e.g., an image number or a storage path of the digital pathology image). Thus, the requested data can be determined to be image sub-slices of one digital pathology image. In some examples, the data request may include an identification of the digital pathology image and an image level. Thereby, the requested data can be determined to be at least one image sub-slice of the corresponding image level. In some examples, the data request may include an identification of the digital pathology image, an image level, and a location of the image sub-slice. Thereby, the requested data can be determined to be at least one image sub-slice at the corresponding location. In some examples, the data request may include an identification of the digital pathology image, an image level, and information on the target region. The information on the target region may include the start coordinates, width, height, and the like of the target region. Thereby, the requested data can be determined to be at least one image sub-slice corresponding to the target region.
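Resolving a target region to image sub-slices can be sketched as below. The sketch assumes square sub-slices of side `tile` pixels numbered in row-major order from 0 within an image level; neither the numbering scheme nor the function name `slices_in_region` comes from the disclosure.

```python
def slices_in_region(x, y, width, height, tile, level_width, level_height):
    """Return the numbers of the sub-slices that overlap a target region.

    (x, y) is the start coordinate of the region in level pixels.
    """
    cols = -(-level_width // tile)   # sub-slices per row (ceiling)
    rows = -(-level_height // tile)  # sub-slice rows (ceiling)
    first_col, first_row = x // tile, y // tile
    # Clamp to the last column and row so regions that run past the
    # edge of the level do not produce out-of-range numbers.
    last_col = min((x + width - 1) // tile, cols - 1)
    last_row = min((y + height - 1) // tile, rows - 1)
    return [
        row * cols + col
        for row in range(first_row, last_row + 1)
        for col in range(first_col, last_col + 1)
    ]
```

For a 1024 x 1024 level with 256-pixel sub-slices, a 100 x 100 region starting at (200, 200) straddles four sub-slices, numbered 0, 1, 4, and 5 under this scheme.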
In some examples, the parsing unit 140 may retrieve at least one image sub-slice from the image storage unit 120 if the data requested by the data request includes at least one image sub-slice (i.e., the data request is a request including retrieving at least one image sub-slice).
In some examples, at least one image sub-slice may be acquired from the sub-blocks stored by the first storage unit 121. In some examples, at least one image sub-slice may be acquired from the full map of the digital pathology image stored by the second storage unit 122. In some examples, at least one image sub-slice may be retrieved from the first storage unit 121 by a distributed file storage client. In some examples, at least one image sub-slice may be acquired from second storage unit 122 via the NFS, SAMBA, or ISCSI protocols.
In some examples, the parsing unit 140 may obtain a data stream corresponding to at least one image sub-slice from the image storage unit 120. In some examples, the parsing unit 140 may read the full map or a sub-block of the digital pathology image from the image storage unit 120 to obtain the data stream corresponding to the at least one image sub-slice. In some examples, when the full map or a sub-block of the digital pathology image is read from the image storage unit 120, the handle opened for the full map or sub-block may be shared, and the data stream corresponding to the at least one image sub-slice may be read in a single pass. In this case, the data stream corresponding to the at least one image sub-slice can be read quickly, and the number of times the full map or sub-block is opened can be reduced.
In some examples, parsing unit 140 may obtain sub-slice information for at least one image sub-slice from metadata unit 130 and obtain a data stream corresponding to the at least one image sub-slice from first storage unit 121 and/or second storage unit 122 based on the sub-slice information. This enables a data stream corresponding to an image sub-slice to be easily acquired.
In some examples, the parsing unit 140 may obtain the sub-slice information of the at least one image sub-slice from the metadata unit 130 and then, based on the byte offset in the sub-slice information and the byte size of the image sub-slice, obtain the data stream corresponding to the at least one image sub-slice from the digital pathology image stored by the image storage unit 120. In this case, the data block of the image sub-slice, of the corresponding byte size and at the corresponding byte offset, can be acquired without reading the full map or sub-block of the digital pathology image in its entirety. This can reduce the load. In some examples, the byte size of the image sub-slices may be obtained from the metadata, or may be recorded in the metadata unit 130 as part of the sub-slice information when the digital pathology image is stored.
In some examples, the parsing unit 140 may open a handle to the full map or a sub-block of the digital pathology image to obtain a file stream pointer, move the pointer to the location given by the byte offset, and read, starting at that location, a number of bytes equal to the byte size of the image sub-slice. This enables the data stream corresponding to an image sub-slice to be read. In some examples, a total byte offset corresponding to at least one image sub-slice may be calculated, and the data stream corresponding to the at least one image sub-slice may be obtained based on the total byte offset.
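The open, seek, and read sequence maps directly onto ordinary file I/O. The sketch below uses a local file path for illustration; the function name `read_sub_slice` is hypothetical, and access through a distributed file storage client or a network mount is outside its scope.

```python
def read_sub_slice(path, offset, size):
    """Read exactly size bytes starting at offset from a full map or sub-block."""
    with open(path, "rb") as handle:
        handle.seek(offset)        # move the file pointer to the byte offset
        data = handle.read(size)   # read only the sub-slice's bytes
    if len(data) != size:
        raise EOFError("file ends before the sub-slice is complete")
    return data
```

Only `size` bytes are transferred, regardless of how large the full map or sub-block is.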
In some examples, when providing data streams of a plurality of digital pathology images to the computing system, the parsing unit 140 may obtain from the metadata unit 130 the sub-slice information of each of the at least one image sub-slice corresponding to the data stream. Then, based on the byte offset and the byte size in each piece of sub-slice information, and sharing the handle opened for the sub-block or full map of the digital pathology image, the parsing unit 140 may read the data stream corresponding to the at least one image sub-slice from the sub-block or full map in the first storage unit 121 or the second storage unit 122 in a single pass and return it to the computing system. In this case, the data block of each image sub-slice, of the corresponding byte size and at the corresponding byte offset, can be acquired without reading the full map or sub-block of the digital pathology image in its entirety. This can reduce the load. In some examples, the data stream may correspond to an image region. That is, after receiving a data request from the computing system to acquire a data stream, the parsing unit 140 may, based on the target region corresponding to the data stream, obtain from the metadata unit 130 the sub-slice information of each of the at least one image sub-slice corresponding to the target region, and then, based on that sub-slice information, acquire the data stream corresponding to those image sub-slices and return it to the computing system.
In some examples, for the first storage unit 121, the parsing unit 140 may obtain the sub-slice information of at least one image sub-slice from the metadata unit 130, and then obtain the data stream corresponding to the at least one image sub-slice from the sub-blocks stored by the first storage unit 121 based on the byte offset relative to the sub-block given in the sub-slice information and the byte size of the image sub-slice.
In some examples, for the second storage unit 122, the parsing unit 140 may obtain the sub-slice information of the at least one image sub-slice from the metadata unit 130, and then obtain the data stream corresponding to the at least one image sub-slice from the full map of the digital pathology image stored by the second storage unit 122 based on the byte offset relative to the full map given in the sub-slice information and the byte size of the image sub-slice. In some examples, the byte offset relative to the full map of the digital pathology image may be obtained from the metadata, or may be recorded in the metadata unit 130 as part of the sub-slice information when the digital pathology image is stored.
In other examples, only the metadata may be stored when storing the digital pathology image, without generating the sub-slice information. The byte offset (i.e., the byte offset of an image sub-slice relative to the full map and/or sub-block of the digital pathology image) and the byte size of the image sub-slice may be obtained from the metadata, and the data stream corresponding to at least one image sub-slice may then be obtained from the digital pathology image stored in the image storage unit 120 based on that byte offset and byte size.
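Sharing one handle across several sub-slice reads can be sketched as follows. Requests are grouped by file so that each sub-block or full map is opened once; sorting the ranges by offset within a file is an added assumption intended to keep the reads as sequential as possible. The function name `read_many` and the request tuple layout are hypothetical.

```python
from collections import defaultdict


def read_many(requests):
    """Read several (path, offset, size) ranges, opening each file once.

    Returns the byte strings in the same order as the requests.
    """
    by_path = defaultdict(list)
    for position, (path, offset, size) in enumerate(requests):
        by_path[path].append((offset, size, position))
    results = [b""] * len(requests)
    for path, ranges in by_path.items():
        with open(path, "rb") as handle:  # one shared handle per file
            for offset, size, position in sorted(ranges):
                handle.seek(offset)
                results[position] = handle.read(size)
    return results
```

If the sub-slices of a target region were stored in the same sub-block, the whole request is served with a single open and a forward-moving sequence of reads.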
In some examples, the parsing unit 140 may respond to the data request with the data stream corresponding to at least one image sub-slice. In this case, computational analysis can be performed directly on the data stream. This can improve the efficiency of the computational analysis. In other examples, the data stream corresponding to at least one image sub-slice may be converted into an image. For example, the image may be a JPEG image or a PNG image. This facilitates viewing.
In some examples, if the digital pathology image is stored in the first storage unit 121, the parsing unit 140 may obtain, based on the sub-slice information of each image sub-slice, the storage path of the sub-block in which the image sub-slice is located and the byte offset relative to that sub-block, and then read from the storage path, at the byte offset, a data block of the byte size of the image sub-slice as the data stream corresponding to the image sub-slice; if the digital pathology image is stored in the second storage unit 122, the parsing unit 140 may obtain the storage path of the full map of the digital pathology image and the byte offset relative to the full map, and then read from the storage path, at the byte offset, a data block of the byte size of the image sub-slice as the data stream corresponding to the image sub-slice; and the parsing unit 140 responds to the data request with the data stream corresponding to the at least one image sub-slice. This enables the data stream corresponding to an image sub-slice to be acquired based on the byte offset.
Fig. 5 is a block diagram illustrating another example of a processing system to which examples of the present disclosure relate.
As described above, the parsing unit 140 may be used to parse the data request. As shown in fig. 5, in some examples, the processing system may also include a data interface 150. The data interface 150 may be used to receive data requests. In some examples, the data interface 150 may send the received data request to the parsing unit 140, and the parsing unit 140 parses the data request. In this case, coupling can be reduced through the data interface 150, and thus expansion can be easily performed to support load balancing. Thereby, large data volume and highly concurrent access can be supported.
In some examples, the data interface 150 may include at least one of a DeepZoom interface, a metadata interface, a macro map interface, a thumbnail interface, a label map interface, a tile interface, and a target region interface. In this case, the relevant data of the digital pathology image can easily be acquired in different usage scenarios.
In addition, the DeepZoom interface may be used to acquire an image level that matches the size of the field of view of the display device. In this case, an image level matching the display device can be acquired, and image sub-slices of the appropriate image level can be acquired for display. This can improve the display effect. In some examples, the image level matching the size of the field of view of the display device may be determined from the image size of the image level having the largest resolution and the screen size of the current display device. Specifically, the larger of the width and height of the image size may be repeatedly divided by 2 until the result is closest to the larger of the width and height of the screen size, and the number of divisions by 2 may then be taken as the image level.
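The level selection described above can be illustrated with a short sketch. It assumes that level 0 is the image level with the largest resolution and that each further level halves the longer side; the function name `select_image_level` and the interpretation of "closest" as the smallest absolute difference are assumptions made for illustration.

```python
def select_image_level(image_size: tuple[int, int], screen_size: tuple[int, int]) -> int:
    """Count how many halvings bring the image's longer side closest to the screen's."""
    image_long = max(image_size)    # larger of width and height at full resolution
    screen_long = max(screen_size)  # larger of width and height of the screen
    if image_long <= 0 or screen_long <= 0:
        raise ValueError("sizes must be positive")
    level = 0
    current = float(image_long)
    # Halve again only while the next halving lands strictly closer to the screen size.
    while abs(current / 2 - screen_long) < abs(current - screen_long):
        current /= 2
        level += 1
    return level
```

For an 80000 x 60000 image on a 1920 x 1080 screen, five halvings give a longer side of 2500, which is closer to 1920 than the next halving (1250), so the sketch returns level 5.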
In addition, the metadata interface may be used to obtain the metadata, the macro map interface may be used to acquire the macro map, the thumbnail interface may be used to retrieve the thumbnail, and the label map interface may be used to obtain the label map. The tile interface may be used to obtain at least one image sub-slice or a data stream corresponding to at least one image sub-slice. The target region interface may be used to acquire at least one image sub-slice, or a data stream corresponding to at least one image sub-slice, within the target region.
As shown in fig. 5, in some examples, the processing system may also include a protocol adaptation unit 160. The protocol adaptation unit 160 may adapt a reusable or non-reusable request mode based on the usage scenario of the data interface 150. For example, an interface involved in computational analysis may adopt a reusable request mode, and an interface involved in review may adopt a non-reusable request mode. In this case, the data interface 150 can provide different request modes for different scenarios, the data retrieval requirements of different usage scenarios can be satisfied, and the overhead of the number of connections to the data interface 150 can be reduced.
In some examples, the protocol adaptation unit 160 may be used to distinguish whether the usage scenario of a data retrieval is a browsing-type service or a computing-type service. In some examples, browsing-type services can include online slide diagnosis, consultation, annotation, and the like, and computing-type services can include data analysis, training, inference, and the like.
In some examples, after distinguishing the usage scenario of the data retrieval, the protocol adaptation unit 160 may convert the request mode of a data request from a browsing-type service into a non-reusable request mode, and convert the request mode of a data request from a computing-type service into a reusable request mode. This can reduce the overhead of the number of connections to the data interface 150. In some examples, the usage scenario of the data retrieval may be distinguished based on user-agent information agreed upon in the request header of the data request. In some examples, data can be requested in a reusable or non-reusable request mode by modifying the user-agent information.
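A minimal sketch of the user-agent based distinction is given below. The disclosure only states that the user-agent information is agreed upon in advance, so the prefix `pathology-compute/`, the `RequestMode` enum, and the function `choose_request_mode` are hypothetical and serve only to illustrate the dispatch.

```python
from enum import Enum


class RequestMode(Enum):
    REUSABLE = "reusable"          # long-lived connection shared by many requests
    NON_REUSABLE = "non-reusable"  # one connection per request


# Hypothetical agreed user-agent prefixes that identify computing-type services.
COMPUTING_AGENTS = ("pathology-compute/",)


def choose_request_mode(headers: dict[str, str]) -> RequestMode:
    """Map the user-agent header of a data request to a request mode."""
    agent = ""
    for name, value in headers.items():
        if name.lower() == "user-agent":  # header names are case-insensitive
            agent = value
            break
    if agent.startswith(COMPUTING_AGENTS):
        return RequestMode.REUSABLE
    # Anything else, including a missing header, is treated as a browsing-type service.
    return RequestMode.NON_REUSABLE
```

Treating unknown agents as browsing-type is a design choice of this sketch; a deployment could equally reject requests whose user-agent is not in the agreed list.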
In some examples, reusable request modes may include, but are not limited to, requests based on the gRPC protocol and requests based on a socket protocol (i.e., requests made over an established gRPC connection and requests made over an established socket connection). gRPC is designed on the HTTP/2 protocol standard. In some examples, a data stream for computational analysis may be obtained in a reusable request mode. In this case, the serialization requirements of computational analysis can be satisfied, and the overhead of connection establishment can be reduced. In some examples, the target region interface may support a reusable request mode. Therefore, the requirements of computational analysis of digital pathology images can be met. In some examples, non-reusable request modes may include HTTP requests and RESTful API requests. In this case, the requests are suited to random access such as slide retrieval, browsing, zooming, and view changes. In some examples, the DeepZoom interface, the metadata interface, the macro map interface, the thumbnail interface, the label map interface, and the tile interface may support a non-reusable request mode. In this case, interfaces that are accessed less frequently, or that are mostly used for random access, support a non-reusable request mode, and the overhead of the number of connections to the data interface 150 can be reduced.
As shown in fig. 5, in some examples, the processing system may also include a service registration unit 170. The service registration unit 170 may be configured to receive service registration requests and to register and manage services. In some examples, the image storage unit 120 and the metadata unit 130 may, when started, send a service registration request to the service registration unit 170 in order to register with it. In this case, the image storage unit 120 can be perceived (i.e., discovered) by the acquisition unit 110 (e.g., a storage client) and by the metadata unit 130. In addition, the metadata unit 130 can be perceived by the image storage unit 120. In some examples, the service registration unit 170 may support at least one of a primary/standby + VIP mode, a Raft mode, and a Paxos mode.
In some examples, the processing system may also include a security module (not shown). The security module may be used for security management of the data retrieval of digital pathology images. Therefore, the security of sharing digital pathology images can be improved. In some examples, the security module may be used to authenticate and authorize data requests based on token (ticket) authentication. In this case, data related to the digital pathology image can be requested only by an authorized source. In some examples, the security module may also be used to record an access log. This facilitates subsequent security audits and can improve security.
Hereinafter, a data stream processing method of a digital pathology image according to an example of the present disclosure will be described with reference to the drawings. Fig. 6 is a flowchart illustrating an example of a data stream processing method of a digital pathology image according to an example of the present disclosure. In some examples, as shown in fig. 6, the data stream processing method may include preparing a storage environment (step S110), acquiring a digital pathology image and storing the digital pathology image in blocks or directly (step S120), and receiving a data request and returning relevant data of the digital pathology image (step S130).
In some examples, in step S110, a storage environment may be prepared. Specifically, the node information of the image storage unit 120 that stores the digital pathology image may be managed by the metadata unit 130, and the metadata unit 130 and the image storage unit 120 are registered with the service registration unit 170. Thereby, the image storage unit 120 can be perceived by the acquisition unit 110, and the metadata unit 130 can be perceived by the image storage unit 120.
In some examples, in step S120, a digital pathology image may be acquired and stored in blocks or directly. Specifically, the acquisition unit 110 may acquire a digital pathology image and store the digital pathology image to the perceived image storage unit 120, and when writing the digital pathology image, the image storage unit 120 may extract metadata and generate an image number based on the metadata, and then store the digital pathology image based on the metadata, wherein the digital pathology image may be stored in the first storage unit 121 and/or the second storage unit 122 of the image storage unit 120. In some examples, the digital pathology image may be selectively stored in the first storage unit 121 or the second storage unit 122 according to an actual storage requirement. For details, reference is made to the description of the image storage unit 120.
In some examples, the first storage unit 121 may block-store the digital pathology image according to a blocking policy and generate blocking information and sub-slice information, and then store the metadata, the blocking information, and the sub-slice information to the metadata unit 130. In some examples, the first storage unit 121 may perform byte-level blocking of the digital pathology image based on a byte size of an image sub-slice to acquire a plurality of sub-blocks and then store the plurality of sub-blocks at the time of writing of the digital pathology image. For details, reference is made to the related description of the first storage unit 121 and the metadata unit 130.
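The byte-level blocking can be sketched as below. The sketch assumes fixed-size sub-blocks, that a sub-slice never spans two sub-blocks, and that leftover space is padded with zero bytes, in line with the padding described in the claims. The helper names `sub_slice_byte_size` and `pack_sub_slices` are hypothetical, and the byte-size formula assumes one byte per color channel sample.

```python
def sub_slice_byte_size(width: int, height: int, channels: int) -> int:
    """Byte size of one sub-slice: pixel width x pixel height x number of color channels."""
    return width * height * channels


def pack_sub_slices(
    sub_slices: list[bytes], block_size: int
) -> tuple[list[bytes], list[tuple[int, int, int]]]:
    """Pack sub-slice byte streams, in order, into fixed-size sub-blocks.

    Returns (blocks, index), where index[i] = (block_number, byte_offset, byte_size)
    is the sub-slice information recorded for the i-th sub-slice.
    """
    blocks: list[bytes] = []
    index: list[tuple[int, int, int]] = []
    current = bytearray()
    for data in sub_slices:
        if len(data) > block_size:
            raise ValueError("a sub-slice must fit inside one sub-block")
        if len(current) + len(data) > block_size:
            # Not enough room left: pad the remainder with blank bytes and start a new block.
            current.extend(b"\x00" * (block_size - len(current)))
            blocks.append(bytes(current))
            current = bytearray()
        index.append((len(blocks), len(current), len(data)))
        current.extend(data)
    if current:
        # Pad and flush the final, partially filled sub-block.
        current.extend(b"\x00" * (block_size - len(current)))
        blocks.append(bytes(current))
    return blocks, index
```

Because the raw bytes are copied as they are, no image encoding or decoding takes place, and the recorded `(block_number, byte_offset, byte_size)` triples are enough to locate any sub-slice later.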
In some examples, the second storage unit 122 may directly store the digital pathology image and store the metadata and the storage path of the full map of the digital pathology image to the metadata unit 130. In some examples, the second storage unit 122 may also generate sub-slice information for the image sub-slices when storing the digital pathology image, and store the sub-slice information to the metadata unit 130. For details, reference is made to the related description of the second storage unit 122 and the metadata unit 130.
In some examples, in step S130, a data request may be received and relevant data of the digital pathology image may be returned. As described above, in some examples, the relevant data of the digital pathology image may include at least one of metadata, a macro map, a thumbnail, a label map, a tile map (i.e., an image sub-slice), and target region data. Taking a tile map as an example, a data request may be received through the data interface 150 and sent to the parsing unit 140. After the parsing unit 140 determines that the requested data is at least one image sub-slice, the storage path of the full map or the sub-block of the digital pathology image and the node information of the corresponding storage unit are queried from the metadata unit 130, the full map or the sub-block is read from the storage path according to the access address and access protocol in the node information, and the at least one image sub-slice, or the data stream corresponding to it, is then obtained from the full map or the sub-block. In some examples, the parsing unit 140 may acquire the sub-slice information of at least one image sub-slice from the metadata unit 130, and then acquire the data stream corresponding to the at least one image sub-slice from the digital pathology image stored in the image storage unit 120 based on the byte offset in the sub-slice information and the byte size of the image sub-slice. The process of acquiring the data stream corresponding to an image sub-slice differs depending on whether the digital pathology image is stored in the first storage unit 121 or the second storage unit 122. For details, refer to the related description of the parsing unit 140.
The processing system of the present disclosure provides data streams of a plurality of digital pathology images to a computing system. It acquires a digital pathology image and extracts metadata; performs byte-level blocking on the digital pathology image, in byte order, based on the byte size of the image sub-slices obtained from the metadata, to obtain a plurality of sub-blocks, each containing the image sub-slices of a plurality of adjacent regions; stores the plurality of sub-blocks; and records the metadata and the byte offset of each image sub-slice relative to the sub-block in which it is located. Upon receiving a data request for a data stream, the processing system shares the handle opened on a sub-block and, using the recorded byte offsets relative to the sub-block and the byte sizes of the image sub-slices, reads the data stream corresponding to at least one image sub-slice from the sub-block in a single pass. In this case, because the digital pathology image is blocked at the byte level in byte order, image encoding and decoding operations are avoided and the consumption of hardware resources can be reduced; a massive number of small files is avoided, which reduces the difficulty of file addressing and in turn the performance requirements on the storage medium; and because the data stream corresponding to an image sub-slice is obtained based on the byte offset, a data block of the byte size of the image sub-slice at the corresponding byte offset can be obtained without loading the sub-block as a whole, which can further reduce the load. Thus, digital pathology images can be stored and shared as data streams quickly and at low cost.
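The shared-handle read summarized above can be sketched as follows. The sketch assumes each request is a hypothetical `(sub_block_path, byte_offset, byte_size)` triple taken from the recorded sub-slice information and that sub-blocks are local files; the function name `read_sub_slices` is likewise an illustration only.

```python
from collections import defaultdict


def read_sub_slices(requests: list[tuple[str, int, int]]) -> list[bytes]:
    """Read many sub-slices, opening each sub-block only once.

    Each request is (sub_block_path, byte_offset, byte_size); the returned
    data streams are in the same order as the requests.
    """
    by_path: dict[str, list[tuple[int, int, int]]] = defaultdict(list)
    for position, (path, offset, size) in enumerate(requests):
        by_path[path].append((offset, size, position))

    results = [b""] * len(requests)
    for path, entries in by_path.items():
        # One handle is shared by every read that targets this sub-block.
        with open(path, "rb") as handle:
            # Ascending offsets turn scattered reads into a mostly sequential pass.
            for offset, size, position in sorted(entries):
                handle.seek(offset)
                results[position] = handle.read(size)
    return results
```

Grouping by sub-block is what lets neighboring sub-slices stored together be served with a single open, which is the effect the disclosure attributes to keeping adjacent regions in the same sub-block.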
In addition, image sub-slices of nearby pixel regions can be stored in the same sub-block as far as possible, so that random reads can be converted locally into sequential reads and the number of times sub-blocks are opened can be reduced. Thereby, data capacity, performance, and cost can be balanced reasonably. In addition, the processing system also supports storing the full map of the digital pathology image and acquiring the data stream corresponding to at least one image sub-slice by an offset relative to the full map of the digital pathology image. In this case, even if the full map of the digital pathology image is stored, the data stream corresponding to an image sub-slice can be read quickly.
While the present disclosure has been described in detail in connection with the drawings and examples, it should be understood that the above description is not intended to limit the disclosure in any way. Those skilled in the art can make modifications and variations to the present disclosure as needed without departing from the true spirit and scope of the disclosure, which fall within the scope of the disclosure.

Claims (10)

1. A method for processing a data stream of digital pathological images, applied to a processing system for providing a data stream of a plurality of digital pathological images to a computing system having a plurality of computing tasks for performing computational analysis on the respective digital pathological images in parallel by multiplexing the data stream, the method comprising:
managing node information of an image storage unit of the processing system through a metadata unit of the processing system, and registering the metadata unit and the image storage unit to a service registration unit of the processing system;
acquiring the digital pathological image through an acquisition unit of the processing system and storing the digital pathological image in a first storage unit of an image storage unit perceived by the acquisition unit;
the first storage unit reads metadata of the digital pathology image, acquires byte sizes of image sub-slices of the digital pathology image based on the metadata, performs byte-level blocking on the digital pathology image based on the byte sizes of the image sub-slices and in byte order to acquire a plurality of sub-blocks, stores the plurality of sub-blocks and generates sub-slice information of each image sub-slice, and stores the metadata and the sub-slice information to the metadata unit, wherein each sub-block comprises a plurality of image sub-slices of an adjacent area, and the sub-slice information comprises the sub-block where the image sub-slice is located and a byte offset relative to that sub-block; and
a parsing unit of the processing system acquires, from the metadata unit, the sub-slice information of each of at least one image sub-slice corresponding to the data stream, reads the data stream corresponding to the at least one image sub-slice from the first storage unit based on the byte offset in the sub-slice information of each of the at least one image sub-slice and the byte size of the image sub-slice, and returns the data stream to the computing system.
2. The method of claim 1, wherein:
if the digital pathological image is stored in the first storage unit, the parsing unit acquires a storage path of the sub-block where the image sub-slice is located and a byte offset relative to the sub-block based on the sub-slice information of each image sub-slice, and then reads, from the storage path and at the byte offset, a data block having the byte size of the image sub-slice to serve as the data stream corresponding to the image sub-slice; if the digital pathological image is stored in a second storage unit used for storing the full map of the digital pathological image, the parsing unit acquires a storage path of the full map of the digital pathological image and a byte offset relative to the full map of the digital pathological image, and then reads, from the storage path and at the byte offset, a data block having the byte size of the image sub-slice to serve as the data stream corresponding to the image sub-slice.
3. The method of claim 2, wherein:
when the parsing unit reads the digital pathological image, the handle opened on the digital pathological image is shared and the data stream corresponding to the at least one image sub-slice is read at one time; and/or
when the parsing unit reads the sub-blocks, the handle opened on the sub-blocks is shared and the data stream corresponding to the at least one image sub-slice is read from the sub-blocks at one time.
4. The method of claim 1, wherein:
the byte order is either forward order or reverse order.
5. The method of claim 1, wherein:
the byte size of each sub-block is the same, when the digital pathological image is partitioned, the first storage unit sequentially divides the data stream corresponding to the image sub-slices into the plurality of sub-blocks according to the byte order, and when the residual space of one sub-block is not enough to store the data stream corresponding to one image sub-slice, blank data is used for filling the residual space.
6. The method of claim 1, wherein:
the computing system caches the data streams corresponding to the digital pathological images so that the plurality of computing tasks perform the computational analysis on the digital pathological images in parallel by multiplexing the data streams, and deletes the data streams from the cache after completing the computational analysis on the digital pathological images.
7. The method of claim 1, wherein:
the sub-block has a byte size greater than a byte size of the data stream.
8. The method of claim 1, wherein:
the byte size of the sub-block is positively correlated with the byte size of the digital pathology image.
9. The method of claim 1, wherein:
the metadata includes color channels of the digital pathology image, an image level, and the pixel width and height of the respective image sub-slices of the image level, the byte size of an image sub-slice being the pixel width and height of the image sub-slice multiplied by the number of the color channels, and the sub-slice information further includes the byte size of the image sub-slice and the image level to which it belongs.
10. The method of claim 1, wherein:
a plurality of the parsing units are deployed in a load-balanced manner, and data requests are distributed among the parsing units to improve load capacity.
CN202210891998.3A 2021-11-08 2021-11-08 Data stream processing method of digital pathological image Pending CN115206498A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210891998.3A CN115206498A (en) 2021-11-08 2021-11-08 Data stream processing method of digital pathological image

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111315800.9A CN114038541B (en) 2021-11-08 2021-11-08 System for processing a data stream of digital pathology images
CN202210891998.3A CN115206498A (en) 2021-11-08 2021-11-08 Data stream processing method of digital pathological image

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
CN202111315800.9A Division CN114038541B (en) 2021-11-08 2021-11-08 System for processing a data stream of digital pathology images

Publications (1)

Publication Number Publication Date
CN115206498A true CN115206498A (en) 2022-10-18

Family

ID=80143464

Family Applications (2)

Application Number Title Priority Date Filing Date
CN202111315800.9A Active CN114038541B (en) 2021-11-08 2021-11-08 System for processing a data stream of digital pathology images
CN202210891998.3A Pending CN115206498A (en) 2021-11-08 2021-11-08 Data stream processing method of digital pathological image

Family Applications Before (1)

Application Number Title Priority Date Filing Date
CN202111315800.9A Active CN114038541B (en) 2021-11-08 2021-11-08 System for processing a data stream of digital pathology images

Country Status (1)

Country Link
CN (2) CN114038541B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115423690A (en) * 2022-11-04 2022-12-02 之江实验室 High-resolution liver cancer pathological image display method and system based on image pyramid
CN116403684A (en) * 2023-06-08 2023-07-07 杭州医策科技有限公司 Digital pathological image loading method and device

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115942003A (en) * 2022-12-26 2023-04-07 杭州医策科技有限公司 Pathological picture reading method based on browser server architecture

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105357304B (en) * 2015-11-16 2016-11-16 广州华银医学检验中心有限公司 Remote pathological diagnosis section Digital Image Processing and transmission technology
JP6956591B2 (en) * 2017-10-30 2021-11-02 株式会社Nobori Telepathology diagnosis system and telepathology diagnosis method
CN108735284A (en) * 2018-05-16 2018-11-02 南京图思灵智能科技有限责任公司 The shared method of pathological image and the shared platform equipped with database server
JP7458328B2 (en) * 2018-05-21 2024-03-29 コリスタ・エルエルシー Multi-sample whole-slide image processing via multi-resolution registration
CN110288613B (en) * 2019-06-12 2022-09-02 中国科学院重庆绿色智能技术研究院 Tissue pathology image segmentation method for ultrahigh pixels
CN110570953A (en) * 2019-09-09 2019-12-13 杭州憶盛医疗科技有限公司 Automatic analysis method and system for digital pathology panoramic slice image
CN111222064B (en) * 2019-12-25 2023-08-01 宁波市科技园区明天医网科技有限公司 Cloud storage method for digital pathological section
CN113488144B (en) * 2021-07-14 2023-11-07 内蒙古匠艺科技有限责任公司 Slice image processing method

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115423690A (en) * 2022-11-04 2022-12-02 之江实验室 High-resolution liver cancer pathological image display method and system based on image pyramid
US12112027B2 (en) 2022-11-04 2024-10-08 Zhejiang Lab System and method for displaying high-resolution liver cancer pathological image based on image pyramid
CN116403684A (en) * 2023-06-08 2023-07-07 杭州医策科技有限公司 Digital pathological image loading method and device
CN116403684B (en) * 2023-06-08 2023-08-11 杭州医策科技有限公司 Digital pathological image loading method and device

Also Published As

Publication number Publication date
CN114038541A (en) 2022-02-11
CN114038541B (en) 2022-07-22


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination