WO2023126246A1 - Screening for subtle condition sign detection - Google Patents

Screening for subtle condition sign detection

Info

Publication number
WO2023126246A1
Authority
WO
WIPO (PCT)
Prior art keywords
image data
medical image
processor
machine
objects
Application number
PCT/EP2022/086911
Other languages
French (fr)
Inventor
Joël Valentin STADELMANN
Nicole Schadewaldt
Heinrich Schulz
Original Assignee
Koninklijke Philips N.V.
Application filed by Koninklijke Philips N.V.
Publication of WO2023126246A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/0002 Inspection of images, e.g. flaw detection
    • G06T 7/0012 Biomedical image inspection
    • G06T 5/60
    • G06T 5/94
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/10 Segmentation; Edge detection
    • G06T 7/11 Region-based segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10116 X-ray image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/30 Subject of image; Context of image processing
    • G06T 2207/30004 Biomedical image processing
    • G06T 2207/30008 Bone

Definitions

  • the present invention relates to medical imaging, and in particular to an image processing apparatus, to a medical imaging system, to an image processing method, to a computer program element, and to a computer-readable data carrier.
  • an image processing apparatus comprising an input, a processor, and an output.
  • the input is configured to receive medical image data of a patient.
  • the processor is configured to detect a plurality of objects in the medical image data.
  • the plurality of objects comprises one or more body parts.
  • the processor is further configured to perform a search in a model library and select one or more machine-learning models matching the plurality of detected objects.
  • the model library comprises a plurality of machine-learning models and each machine learning model has been trained for a respective object.
  • the processor is further configured to apply each selected machine-learning model on the respective body part to identify a clinical finding in the medical image data.
  • the output is configured to provide the identified clinical finding.
  • the image processing apparatus detects one or more body parts in the received medical image data, e.g. using an object detection model such as faster R-CNN (Region Based Convolutional Neural Networks), YOLO (You Only Look Once), or SSD (Single Shot Detector).
  • Knowledge of a particular body part allows machine-learning models trained for different applications to be re-used in subsequent steps, e.g. machine-learning models trained for identifying an anatomy of interest in the particular body part. Therefore, if e.g. the shoulder and arm are identified in the chest X-ray image, two machine-learning models may be used, including one machine-learning model trained for identifying an anatomy of interest in the shoulder and the other machine-learning model trained for identifying an anatomy of interest in the arm.
  • a segmentation neural network trained for skeletal traumatology may be applied to segment bones and cartilage in the shoulder, and an abnormality detection neural network trained for skeletal traumatology may be applied to detect abnormality in the shoulder.
  • the image processing method as described herein uses anatomical part detection before applying a machine-learning model that was specifically trained for the detected anatomical part to crops of the original chest X-ray. This may increase the chance of detecting early signs of a disease in the medical image and also the chance of starting the treatment at the earliest stage.
  • the input may be configured to receive information about a device.
  • the processor may be configured to perform a search in the model library to select a machine learning model matching the device and apply the selected machine-learning model on the medical image data to identify the device.
  • the image processing apparatus as described herein may also be used in the assessment of devices.
  • the tip of such devices can only be visualized clearly by carefully adapting the visualization settings.
  • fully automated optimization of the visualization in the tip region may greatly improve reading the image.
  • each selected machine learning model may comprise a joint segmentation module and a normality assessment module.
  • the joint segmentation module is configured to segment objects of interest in the medical image data.
  • the normality assessment module is configured to compare the segmented objects of interest with training data to determine a probability of an occurrence of an abnormality in the medical image data.
  • the joint segmentation module may comprise fully convolutional neural networks (FCNs), U-Net, or generative adversarial networks (GANs).
  • the normality assessment module may comprise deep perceptual autoencoders.
  • the processor may be configured to perform an automated contrast adaption in a region of interest that comprises the identified clinical finding.
  • the image processing method as described herein may highlight discrepancies on the image and notify a medical doctor for an additional review by adapting the contrast for the type of tissue visualized.
  • the processor may be configured to perform an automated contrast adaption in a region of interest that comprises the device.
  • the processor may be further configured to process the medical image data to enhance the region of interest in a displayed image resulting from the medical image data.
  • the processor may be configured to generate and provide, via the output, a notification to notify a healthcare practitioner regarding the clinical finding and prompt a responsive action with respect to a patient associated with the medical image data.
  • the image processing apparatus may notify the radiologist for an additional review of this region.
  • the medical imaging system comprises an imaging apparatus (e.g. an X-ray imaging apparatus such as a chest X-ray imaging apparatus) configured to acquire medical image data of a patient, and an apparatus according to the first aspect and any associated example.
  • an image processing method comprising: receiving medical image data acquired using an imaging apparatus; detecting a plurality of objects in the medical image data, wherein the plurality of objects comprises one or more body parts; performing a search in a model library and selecting one or more machine-learning models matching the plurality of detected objects, wherein the model library comprises a plurality of machine-learning models and each machine learning model has been trained for a respective object; and applying each selected machine-learning model on a respective body part to identify a clinical finding in the medical image data; and providing the clinical finding.
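Expressed as code, the claimed method is a short pipeline. The following is a minimal Python sketch of these steps; all names (process_image, detect_body_parts, model_library) and the box format are hypothetical illustrations, not taken from the patent:

```python
# Minimal sketch of the claimed method; names and data formats are assumed.

def process_image(image, detect_body_parts, model_library):
    """Detect body parts, select matching models from the library,
    apply each model to its body-part crop, and return the findings."""
    findings = []
    # Detect a plurality of objects (body parts) in the image (block 120).
    for body_part, (x0, y0, x1, y1) in detect_body_parts(image):
        # Search the model library for a model trained for this object
        # (block 130); each model is labelled with its respective object.
        model = model_library.get(body_part)
        if model is None:
            continue  # no model has been trained for this body part
        # Apply the selected model to a crop of the detected body part.
        finding = model(image[y0:y1, x0:x1])
        if finding is not None:
            findings.append((body_part, (x0, y0, x1, y1), finding))
    # Provide the identified clinical findings via the output.
    return findings
```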
  • the image processing method may be at least partly computer-implemented, and may be implemented in software or in hardware, or in software and hardware. Further, the method may be carried out by computer program instructions running on means that provide data processing functions.
  • the data processing means may be a suitable computing means, such as an electronic control module etc., which may also be a distributed computer system.
  • the data processing means or the computer, respectively, may comprise one or more processors, a memory, a data interface, or the like.
  • a computer program product comprising instructions which, when the program is executed by one or more processors, cause the one or more processors to carry out the steps of the method according to the second aspect and any associated example.
  • a computer-readable data carrier having stored thereon the computer program product.
  • the term “patient” may refer to a human or an animal.
  • Fig. 1 illustrates a flow chart describing an exemplary image processing method.
  • Figs. 2A-2C show an effect of intensity windowing on shade of grey discernibility.
  • Fig. 3 illustrates a flow chart describing a further exemplary image processing method.
  • Fig. 4 schematically shows an example of chest X-rays with locally enhanced contrast to highlight bone structures.
  • Fig. 5 illustrates an exemplary image processing apparatus.
  • Fig. 6 illustrates an exemplary medical imaging system.
  • chest X-ray is arguably the most prescribed radiological study because of its versatility and comparatively low cost. It is routinely used to identify or rule out cardiopulmonary conditions, verify the position of devices, and evaluate fractures.
  • the interpretation of chest X-rays may be biased by a patient’s specific indication.
  • a primary focus on the actual reason for the exam may easily result in parts of the image information being unused. For instance, a patient in cardiac distress would have his chest X-ray reviewed for heart symptoms, but markers of skeletal tuberculosis in shoulders could be unnoticed, even though they are present on the image.
  • FIG. 1 illustrates a flow chart describing an exemplary image processing method 100 according to an embodiment of the present disclosure.
  • the image processing method as described herein may enhance the information, which is present on a chest X-ray, but too faint to be noticed by eye.
  • the image processing method as described herein may be applied to the screening of skeletal tuberculosis. This method may be modified for the screening of other lesions. For instance, spleen cysts and liver fat appear darker on CT scans. It is possible that those lesions could be picked up on chest or abdominal X-rays, in which case the image processing method as described herein could be applied to the early detection of hepatitis or drug-related damage to the liver. Accordingly, the following described examples are set forth without any loss of generality to, and without imposing limitations upon, the claimed invention.
  • the image processing method 100 may be implemented as a device, module or related component in a set of logic instructions stored in a non-transitory machine- or computer-readable storage medium such as random access memory (RAM), read only memory (ROM), programmable ROM (PROM), firmware, flash memory, etc., in configurable logic such as, for example, programmable logic arrays (PLAs), field programmable gate arrays (FPGAs), complex programmable logic devices (CPLDs), in fixed-functionality hardware logic using circuit technology such as, for example, application specific integrated circuit (ASIC), complementary metal oxide semiconductor (CMOS) or transistor-transistor logic (TTL) technology, or any combination thereof.
  • computer program code to carry out operations shown in the image processing method 100 may be written in any combination of one or more programming languages, including object-oriented programming languages and conventional procedural programming languages, such as the JAVA, SMALLTALK, C++, Python, or "C" programming languages or similar programming languages.
  • the exemplary image processing method may be implemented as an image processing apparatus 10 shown in Figs. 5 and 6.
  • medical image data of a patient is received.
  • the medical image data of the patient may be received from an imaging apparatus, such as an X-ray imaging system, a CT scanner, or an MRI scanner, which is configured to acquire a medical image of the patient.
  • the medical image data of the patient may be received from a database, such as PACS (Picture Archiving and Communication System).
  • the medical image data may comprise a two-dimensional image comprising image pixel data and/or a three-dimensional image comprising image voxel data.
  • the medical image data may be acquired from any body part of the patient, such as joints, neck, spine, limbs, chest, or other parts of the body. For the purposes of illustration, the following approach is described in relation to the chest of the patient.
  • a plurality of objects is detected in the medical image data.
  • the plurality of objects comprises one or more body parts, such as head, arm, leg, torso, shoulder, hand, feet, etc.
  • body parts like shoulder, torso, and arm may be detected.
  • the plurality of objects may be detected using an object detection model, which may propose regions of an image that have specific contents, such as body parts.
  • examples of the object detection model may include, but are not limited to, faster R-CNN, YOLO, and SSD, which will be briefly explained below.
  • the faster R-CNN is a deep convolutional network used for object detection.
  • Faster R-CNN is structured with three different parts: the Feature Extraction Network (FEN), the Region Proposal Network (RPN), and the Classification Network.
  • in faster R-CNN, the input medical image data is first fed into the FEN, which is typically composed of a pre-trained CNN model without fully-connected output layers.
  • the FEN simply utilizes its pre-trained model to generate the feature map of the input image, which is subsequently fed into the RPN and the Classification Network.
  • the RPN is used to generate proper region proposals of the target object on the feature map.
  • In the RPN, a convolution filter first slides through the feature map, and the center point of the sliding window is mapped back to a point on the original input image. The point mapped back is called an anchor. Subsequently, several fixed-size bounding boxes are generated around the anchor on the original input image. Via this anchor generating process, bounding boxes with different anchors are evenly and sufficiently distributed among each region in the original image. After the anchor’s bounding boxes are generated on the original input image, these bounding boxes are labelled, according to their IoU (Intersection over Union) over the ground truths, as positive (IoU > 0.7) or negative (IoU < 0.3).
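For illustration, the IoU criterion used above to label anchors can be computed as in the following sketch (boxes are assumed to be (x0, y0, x1, y1) corner tuples with positive area; this is not code from the patent):

```python
def iou(a, b):
    """Intersection over Union of two boxes given as (x0, y0, x1, y1)."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix1 - ix0) * max(0.0, iy1 - iy0)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def label_anchor(anchor_box, gt_box):
    """Label an anchor box against a ground-truth box as described above;
    anchors between the two thresholds are typically ignored in training."""
    v = iou(anchor_box, gt_box)
    return "positive" if v > 0.7 else "negative" if v < 0.3 else "ignored"
```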
  • the labelled bounding boxes are subsequently used to train the two following parts: a softmax classifier that can distinguish positive anchors, and a bounding box regression part that moves the bounding box coordinates closer to the ground truths.
  • after positive anchor bounding boxes are selected and tuned by the regression part of the RPN, region proposals are generated and fed to the following Classification Network.
  • the Classification Network takes the feature map generated by the FEN and the region proposals generated by the RPN as input.
  • a fully-connected layer and a softmax layer, which applies the softmax function to normalize the output to a probability distribution, are utilized to classify the object in the region proposals, for example via thresholding.
  • bounding box regression is utilized to adjust the region proposal for more precise bounding boxes.
  • after classification and bounding box regression, the object detection result of Faster R-CNN is generated.
  • a standard softmax function $\sigma : \mathbb{R}^K \to (0,1)^K$ is defined when $K > 1$ by the formula $\sigma(z)_i = e^{z_i} / \sum_{j=1}^{K} e^{z_j}$, $i = 1, \dots, K$, where $z = (z_1, \dots, z_K)$ is the input vector of $K$ real numbers.
  • K may be seen as the number of different classes in the classifier.
  • it applies the standard exponential function to each element $z_i$ of the input vector $z$ and normalizes these values by dividing by the sum of all these exponentials; this normalization ensures that the components of the output vector $\sigma(z)$ sum to 1.
  • instead of $e$, a different base $b > 0$ may be used.
  • the softmax function is often preferred over other normalization methods since it finds the maximum likelihood estimate and minimizes the cross-entropy between predictions and ground-truth labels. A person skilled in the art will, however, see that different normalization methods may also be used.
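The definition above translates directly into code. A minimal NumPy sketch follows; the subtraction of the maximum is a standard numerical-stability step (it leaves the result unchanged), and the base argument illustrates the remark that a base other than $e$ may be used:

```python
import numpy as np

def softmax(z, base=np.e):
    """Normalize a vector z of K real numbers to a probability distribution."""
    z = np.asarray(z, dtype=float)
    # base**(z - z.max()) equals base**z up to a common factor that cancels.
    powers = base ** (z - z.max())
    return powers / powers.sum()

print(softmax([1.0, 2.0, 3.0]))  # [0.090, 0.245, 0.665]; components sum to 1
```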
  • YOLO is an algorithm that utilizes a single CNN to implement end-to-end object detection tasks.
  • the input image is first resized to a fixed size, and then fed into the CNN. Finally, the detection result is generated by processing the CNN output.
  • compared with Faster R-CNN, the YOLO design enables end-to-end training and real-time speeds while maintaining high average precision. A simple workflow of YOLO is described below.
  • the CNN of YOLO first divides the input image into S x S grid cells, each cell being assigned to detect the objects whose center point is located in that cell. Each cell predicts several bounding boxes with confidence values.
  • the confidence value contains two parts: the first part is whether a ground-truth object is contained in the cell, and the second part is the precision of the bounding box, which can be calculated as the IoU between the ground truth and the predicted bounding box. For the generated bounding boxes, the NMS (Non-Maximum Suppression) method is then used to remove overlapping bounding boxes with lower confidence. For classification, if the center of a ground-truth object falls into a cell, that cell is responsible for generating a bounding box and classifying the bounded object. Each cell also predicts multiple conditional class probabilities according to the object class types. In the end, the bounding boxes with the class of highest confidence value are output by YOLO.
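The greedy NMS step described above can be sketched as follows, re-using the iou helper from the earlier sketch (again an illustration, not the patent's code):

```python
def nms(boxes, scores, iou_threshold=0.5):
    """Greedy Non-Maximum Suppression: repeatedly keep the highest-scoring
    box and drop remaining boxes that overlap it above the threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[i], boxes[best]) <= iou_threshold]
    return keep  # indices of the retained bounding boxes
```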
  • SSD is a one-stage object detection algorithm.
  • the original SSD network is refitted from a pre-trained VGG-16 model. It removes the fully-connected layers at the tail of VGG-16 and adds four extra feature layers to perform the object detection tasks. Unlike YOLO, which uses a fully-connected layer to generate detection results, SSD uses convolutional layers to generate bounding boxes directly on the feature maps and maps them back to the input image later.
  • a search is performed in a model library and one or more machine-learning models matching the plurality of detected objects are selected.
  • the model library comprises a plurality of machine-learning models and each machine-learning model has been trained for a respective object.
  • each machine-learning model in the model library may be labelled with a respective object for which the machine-learning model has been specifically trained.
  • for example, after detecting the shoulder joint using an object detection model (e.g. YOLO, SSD, faster R-CNN, or the like), a machine-learning model labelled with skeletal traumatology may be applied to bones and cartilage in the shoulder.
  • each selected machine-learning model may comprise a joint segmentation module and a normality assessment module.
  • the joint segmentation module is configured to segment objects of interest in the medical image data.
  • the normality assessment module is configured to compare the segmented objects of interest with training data to determine a probability of an occurrence of an abnormality in the medical image data.
  • the block 130 may further comprise block 130a and block 130b.
  • at block 130a, a search may be performed in a library of AI (Artificial Intelligence) segmentation modules and one or more joint segmentation modules may be selected that match the plurality of detected objects. For instance, after detecting the shoulder joint using an object detection model, a segmentation neural network trained for skeletal traumatology may be selected to be applied to segment bones and cartilage in the shoulder.
  • examples of the joint segmentation module may include, but are not limited to, fully convolutional neural networks (FCNs), U-Net, or generative adversarial networks (GANs), which will be briefly discussed below.
  • An FCN is derived from a CNN-based segmentation network. It is trained end-to-end, pixels-to-pixels, on digital input images for a given segmentation task.
  • the idea of an FCN is to build convolutional layers without any fully connected layers and to produce an output size that corresponds to the input.
  • the input data feature map is encoded and then decoded using transposed convolution to attain an output of the same size as the input.
  • the skip connection sums pre-extracted feature maps to recover the spatial information during pooling operations.
  • a U-Net is an FCN that relies on data augmentation and is aimed at precise localization in biomedical image segmentation.
  • the U-Net architecture includes multiple up-sampling layers, skip connections that concatenate feature maps, and learnable weight filters. The results show outstanding performance in both biomedical image segmentation and crack detection.
  • GAN-based segmentation models can be considered as a two-player game between a generator, which learns how to generate samples resembling real data, and a discriminator, which learns how to discriminate between real and generated data. Both the generator and the discriminator cost functions are minimized simultaneously. The iterative minimization of cost functions eventually leads to a Nash equilibrium where neither can further unilaterally minimize its cost function. In the end, the GAN discriminator provides an abstract unsupervised representation of the input images.
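To make the encoder-decoder-with-skip-connection idea behind FCNs and U-Net concrete, here is a toy PyTorch sketch; real segmentation networks are far deeper, so this illustrates the structure rather than the patent's network:

```python
import torch
import torch.nn as nn

class TinySegNet(nn.Module):
    """Toy FCN/U-Net-style network: encode, decode with a transposed
    convolution, and concatenate a skip connection (illustrative only)."""
    def __init__(self, in_ch=1, n_classes=2):
        super().__init__()
        self.enc = nn.Sequential(nn.Conv2d(in_ch, 16, 3, padding=1), nn.ReLU())
        self.down = nn.MaxPool2d(2)
        self.mid = nn.Sequential(nn.Conv2d(16, 32, 3, padding=1), nn.ReLU())
        self.up = nn.ConvTranspose2d(32, 16, 2, stride=2)  # restore input size
        self.head = nn.Conv2d(32, n_classes, 1)  # per-pixel class scores

    def forward(self, x):
        e = self.enc(x)             # full-resolution features
        m = self.mid(self.down(e))  # encoded at half resolution
        u = self.up(m)              # decoded back to full resolution
        return self.head(torch.cat([u, e], dim=1))  # skip connection

logits = TinySegNet()(torch.randn(1, 1, 64, 64))  # shape (1, 2, 64, 64)
```

A softmax over the class dimension would turn these per-pixel scores into the probability maps discussed next.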
  • the output of the joint segmentation module, such as a segmentation CNN, U-Net, FCN, or GAN, usually comprises a softmax layer that is condensed into labelled regions by means of thresholding.
  • the output of a softmax layer can be interpreted as probabilities that a region belongs to a given label. These probabilities may then be thresholded to classify a region to a single class. For instance, the probabilities may be represented as heatmaps to highlight class-specific regions of images.
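A minimal sketch of condensing per-class softmax probability maps into labelled regions by thresholding, assuming the maps come as an array of shape (K, H, W):

```python
import numpy as np

def probabilities_to_labels(prob_maps, threshold=0.5):
    """prob_maps: softmax output of shape (K, H, W), one per-pixel
    probability map per class label; returns an (H, W) label map."""
    labels = prob_maps.argmax(axis=0)              # most probable class per pixel
    confident = prob_maps.max(axis=0) > threshold  # keep only confident pixels
    return np.where(confident, labels, -1)         # -1 marks unlabelled pixels
```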
  • at block 130b, a search is performed in a library of AI normality modules and one or more normality assessment modules may be selected that match the plurality of detected objects. For instance, in the above-described example of the shoulder joint, a normality assessment module that was developed to verify skeletal traumatology images may be selected to be applied for normality assessment of the shoulder joint.
  • the probability maps generated by the joint segmentation module(s) at block 130a may be used as input for the normality assessment module(s) at block 130b.
  • normality assessment modules may be used for abnormality detection.
  • normality assessment may be carried out using a CNN model, such as SqueezeNet, VGG network, InceptionV3, and DenseNet.
  • normality assessment may be carried out using deep perceptual autoencoders. Those autoencoders are used in order to verify that the image is similar to a training set and can thus be analyzed reliably by an analysis neural network.
  • each selected machine-learning model is applied on the respective body part to identify a clinical finding in the medical image data.
  • a segmentation neural network trained for skeletal traumatology may be applied to segment bones and cartilage in the shoulder.
  • a deep perceptual autoencoder that was developed to verify skeletal traumatology images may be applied for abnormality detection in the shoulder.
  • the normality assessment module may provide an abnormality score for each object. If the abnormality score given by the normality assessment module exceeds a given threshold, the associated subpart of the image may be labelled as a potential clinical finding.
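One common way to realize such an abnormality score is the reconstruction error of an autoencoder trained on normal images only; a deep perceptual autoencoder computes this error in a deep feature space, but plain pixel-wise MSE is used below for brevity. A minimal sketch, with the autoencoder assumed to be pre-trained:

```python
import numpy as np

def abnormality_score(region, autoencoder):
    """Regions unlike the normal training data reconstruct poorly,
    so a high reconstruction error suggests an abnormality."""
    reconstruction = autoencoder(region)
    return float(np.mean((region - reconstruction) ** 2))

def potential_findings(regions, autoencoder, threshold):
    """Flag image subparts whose score exceeds the given threshold
    as potential clinical findings, as described above."""
    scored = [(r, abnormality_score(r, autoencoder)) for r in regions]
    return [(r, s) for r, s in scored if s > threshold]
```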
  • the identified clinical finding is provided.
  • the identified clinical finding may be provided to a display for review by a user (e.g. radiologist).
  • the image processing apparatus shown in Fig. 5 or 6 may generate and provide, via the output, a notification to notify a healthcare practitioner regarding the clinical finding and prompt a responsive action with respect to a patient associated with the medical image data.
  • the medical image data may be processed to enhance the region of interest in a displayed image resulting from the medical image data.
  • an automated contrast adaption may be performed in a region of interest that comprises the identified clinical finding. This will be explained below.
  • Figs. 2A-2C show an effect of intensity windowing on shade of grey discernibility.
  • different pixel intensities are indicated with different fill patterns. As shown in Fig. 2A, values close to 0 are usually black (indicated with dense diagonal lines) and values close to 1 are usually white (indicated with no diagonal lines).
  • in Fig. 2A, the contrast settings are set in order to show all grey values in the [0, 1] range.
  • in Fig. 2B, the contrast settings saturate to white all values over 0.4.
  • in Fig. 2C, the contrast settings saturate to black all values under 0.6.
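The windowing shown in Figs. 2A-2C amounts to a linear rescaling with clipping. A minimal sketch, assuming pixel intensities normalized to [0, 1]:

```python
import numpy as np

def window(image, low, high):
    """Linearly map [low, high] to [0, 1]; values outside the window
    saturate to black (0) or white (1)."""
    return np.clip((image - low) / (high - low), 0.0, 1.0)

# Fig. 2A: full range shown:           window(img, 0.0, 1.0)
# Fig. 2B: values over 0.4 go white:   window(img, 0.0, 0.4)
# Fig. 2C: values under 0.6 go black:  window(img, 0.6, 1.0)
```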
  • Fig. 3 shows a flow chart describing a further exemplary image processing method according to an example of the present disclosure.
  • the exemplary image processing method 100 begins with the reception of medical image data of a patient.
  • the medical image data is chest X- ray medical image data.
  • This exemplary image processing method 100 comprises the following two branches.
  • In branch 1, an AI module, e.g. a CNN dedicated to chest X-ray findings, may be applied at block 190.
  • This AI module may be trained for findings that are typical to chest X-ray analysis, for instance atelectasis, pneumothorax, etc., and does not analyse specific image regions for incidental findings.
  • the image processing method described with respect to Fig. 1 is added to the second branch, i.e. branch 2.
  • the second branch begins at block 120 with a detection of body parts, for instance using a YOLO neural network, an SSD, a faster R-CNN, or any other suitable object detection model.
  • Those neural networks propose regions of an image that have specific contents.
  • at block 130a, a segmentation neural network (e.g. FCN, U-Net, GAN, or any other suitable segmentation neural network) may then be applied to the detected body parts.
  • the output of segmentation neural networks such as U-Net usually consists of a softmax layer that is condensed into labelled regions by means of thresholding. However, the softmax layer can be interpreted as probabilities that a region belongs to a given label.
  • the probability maps can be used as input for a normality assessment module at block 130b.
  • Deep perceptual autoencoders are an example of the normality assessment module. Those autoencoders may be used in order to verify that the image is similar to a training set and thus can be analyzed reliably by an analysis neural network. Since the shoulder joint has been detected by e.g. YOLO, it is again possible to re-use a deep perceptual autoencoder that was developed to verify skeletal traumatology images.
  • the contrast settings in the region of interest may be automatically improved at block 160 according to the above-described automated contrast adaptation approach, depending on the type of tissue present in the region.
  • the body part may be highlighted at block 170, for instance using a bounding box such as the one sketched in Fig. 4.
  • the radiologist may be notified to perform a more thorough review of the region at block 180.
  • the image processing method as described herein may be used in the assessment of devices.
  • the tip of such devices can only be visualized clearly by carefully adapting the visualization settings. It is also possible to perform an automated contrast adaption in a region of interest that comprises the device. In this way, fully automated optimization of the visualization in the tip region can greatly improve reading the image.
  • Fig. 5 illustrates an exemplary image processing apparatus 10.
  • the image processing apparatus 10 comprises an input 12, one or more processors 14, and an output 16.
  • the image processing apparatus 10 may comprise various physical and/or logical components for communicating and manipulating information, which may be implemented as hardware components (e.g., computing devices, processors, logic devices), executable computer program instructions (e.g., firmware, software) to be executed by various hardware components, or any combination thereof, as desired for a given set of design parameters or performance constraints.
  • the image processing apparatus 10 may be embodied as, or in, a device or apparatus, such as a server, workstation, or mobile device.
  • the image processing apparatus 10 may comprise one or more microprocessors or computer processors, which execute appropriate software.
  • the processor 14 of the image processing apparatus 10 may be embodied by one or more of these processors.
  • the software may have been downloaded and/or stored in a corresponding memory, e.g., a volatile memory such as RAM or a non-volatile memory such as flash.
  • the software may comprise instructions configuring the one or more processors to perform the functions as described herein.
  • the image processing apparatus 10 may be implemented with or without employing a processor, and also may be implemented as a combination of dedicated hardware to perform some functions and a processor (e.g., one or more programmed microprocessors and associated circuitry) to perform other functions.
  • the functional units of the image processing apparatus 10, e.g., the input 12, the one or more processors 14, and the output 16 may be implemented in the device or apparatus in the form of programmable logic, e.g., as a Field-Programmable Gate Array (FPGA).
  • each functional unit of the apparatus may be implemented in the form of a circuit.
  • the image processing apparatus 10 may also be implemented in a distributed manner.
  • some or all units of the image processing apparatus 10 may be arranged as separate modules in a distributed architecture and connected in a suitable communication network, such as a 3rd Generation Partnership Project (3GPP) network, a Long Term Evolution (LTE) network, Internet, LAN (Local Area Network), Wireless LAN (Local Area Network), WAN (Wide Area Network), and the like.
  • the processor(s) 14 may execute instructions to perform the image processing method as described herein.
  • the input 12 and the output 16 may include hardware and/or software to enable the image processing apparatus 10 to receive a data input, and to communicate with other devices and/or a network.
  • the input 12 may receive the data input via a wired connection or via a wireless connection.
  • the output 16 may also provide cellular telephone communications, and/or other data communications for the image processing apparatus 10.
  • Fig. 6 schematically shows an exemplary X-ray imaging system 200.
  • the exemplary X-ray imaging system is a chest radiography imaging system 200.
  • the chest radiography imaging system 200 comprises an image processing apparatus 10, an X-ray imaging device 210, and a system console 220.
  • the X-ray imaging device 210 comprises an X-ray source 212 and an X-ray detector 214.
  • the X-ray detector 214 is spaced from the X-ray source 212 to accommodate a patient PAT to be imaged.
  • a collimated X-ray beam emanates from the X-ray source 212, passes through the patient PAT at a region of interest (ROI), experiences attenuation by interaction with matter therein, and the attenuated beam then strikes the surface of the X-ray detector 214.
  • the density of the organic material making up the ROI determines the level of attenuation; in the chest radiography imaging examination, that is the rib cage and lung tissue. High density material (such as bone) causes higher attenuation than less dense materials (such as lung tissue).
  • the registered digital values for the X-ray are then consolidated into an array of digital values forming an X-ray projection image for a given acquisition time and projection direction.
  • the system console 220 may be coupled to a screen or monitor 230 on which the acquired X-ray images or imager settings may be viewed or reviewed.
  • An operator, such as a medical lab technician, can control an image acquisition run via the system console 220 by releasing individual X-ray exposures, for example by actuating a joystick, pedal, or other suitable input means coupled to the system console 220.
  • the patient PAT stands facing a flat surface behind which is the X-ray detector 214.
  • in other examples, the X-ray imaging device 210 is of the C-arm type, and the patient PAT lies on an examination table instead of standing.
  • the image processing apparatus 10 may be any computing device, including desktop and laptop computers, smartphones, tablets, etc.
  • the image processing apparatus 10 may be a general-purpose device or a device with a dedicated unit of equipment suitable for providing the functionality as described herein.
  • the components of the image processing apparatus 10 are shown as integrated in one single unit. However, in alternative examples, some or all components may be arranged as separate modules in a distributed architecture and connected in a suitable communication network.
  • the image processing apparatus 10 and its components may be arranged as dedicated FPGAs or as hardwired standalone chips, such as the image processing apparatus shown in Fig. 6. In some examples (not shown), the image processing apparatus 10 or some of its components may be resident in the system console 220, running as software routines.
  • an X-ray image acquired by the X-ray imaging device 210 is provided to the image processing apparatus 10.
  • the image processing apparatus 10 then automatically searches the acquired X-ray image for early signs of diseases according to the image processing method as described herein.
  • the image processing apparatus 10 may analyze pixel values that display changes in intensity too faint to be noticed with the current display settings. Upon detection of possible symptoms, the image processing apparatus may notify the radiologist for an additional review of this region and/or automatically improve the contrast settings in the region of interest, depending on the type of tissue present in the region. This in turn would allow diseases to be detected at their earliest stages, which would allow for lighter forms of treatment.
  • a computer program or a computer program element is provided that is characterized by being adapted to execute the method steps of the method according to one of the preceding embodiments, on an appropriate system.
  • the computer program element might therefore be stored on a computer, which might also be part of an embodiment of the present invention.
  • This computer may be adapted to perform or induce a performing of the steps of the method described above. Moreover, it may be adapted to operate the components of the above-described apparatus.
  • the computer can be adapted to operate automatically and/or to execute the orders of a user.
  • a computer program may be loaded into a working memory of a data processor. The data processor may thus be equipped to carry out the method of the invention.
  • This exemplary embodiment of the invention covers both a computer program that uses the invention right from the beginning and a computer program that, by means of an update, turns an existing program into a program that uses the invention.
  • a computer readable medium, such as a CD-ROM, may be provided, wherein the computer readable medium has a computer program element stored on it, which computer program element is described by the preceding section.
  • a computer program may be stored and/or distributed on a suitable medium, such as an optical storage medium or a solid-state medium supplied together with or as part of other hardware, but may also be distributed in other forms, such as via the internet or other wired or wireless telecommunication systems.
  • the computer program may also be presented over a network like the World Wide Web and can be downloaded into the working memory of a data processor from such a network.
  • a medium for making a computer program element available for downloading is provided, which computer program element is arranged to perform a method according to one of the previously described embodiments of the invention.

Abstract

The present invention relates to medical imaging. In order to improve the detection of early signs of a disease in a medical image, an image processing apparatus is provided that comprises an input, a processor, and an output. The input is configured to receive medical image data of a patient. The processor is configured to detect a plurality of objects in the medical image data. The plurality of objects comprises one or more body parts. The processor is further configured to perform a search in a model library and select one or more machine-learning models matching the plurality of detected objects. The model library comprises a plurality of machine-learning models and each machine learning model has been trained for a respective object. The processor is further configured to apply each selected machine-learning model on the respective body part to identify a clinical finding in the medical image data. The output is configured to provide the identified clinical finding.

Description

SCREENING FOR SUBTLE CONDITION SIGN DETECTION
FIELD OF THE INVENTION
The present invention relates to medical imaging, and in particular to an image processing apparatus, to a medical imaging system, to an image processing method, to a computer program element, and to a computer-readable data carrier.
BACKGROUND OF THE INVENTION
Early stages of diseases often do not cause discomfort to the patients. Their manifestation on medical images (e.g. chest X-rays) may be subtle, and thus not be noticed if a medical doctor (e.g. radiologist) is not specifically looking for them. Therefore, those early signs may often be missed. This may prevent the early treatment of the condition, which usually implies heavier treatments at the disease’s later stages.
SUMMARY OF THE INVENTION
There may, therefore, be a need to improve the detection of early signs of a disease in a medical image.
The invention is defined by the independent claims. The dependent claims define advantageous embodiments.
According to a first aspect of the present invention, there is provided an image processing apparatus. The image processing apparatus comprises an input, a processor, and an output. The input is configured to receive medical image data of a patient. The processor is configured to detect a plurality of objects in the medical image data. The plurality of objects comprises one or more body parts. The processor is further configured to perform a search in a model library and select one or more machine-learning models matching the plurality of detected objects. The model library comprises a plurality of machine-learning models and each machine-learning model has been trained for a respective object. The processor is further configured to apply each selected machine-learning model on the respective body part to identify a clinical finding in the medical image data. The output is configured to provide the identified clinical finding.
The image processing apparatus as described herein detects one or more body parts in the received medical image data, e.g. using an object detection model such as faster R-CNN (Region Based Convolutional Neural Networks), YOLO (You Only Look Once), or SSD (Single Shot Detector). Knowledge of a particular body part allows machine-learning models trained for different applications to be re-used in subsequent steps, e.g. machine-learning models trained for identifying an anatomy of interest in the particular body part. Therefore, if e.g. the shoulder and arm are identified in the chest X-ray image, two machine-learning models may be used, including one machine-learning model trained for identifying an anatomy of interest in the shoulder and the other machine-learning model trained for identifying an anatomy of interest in the arm. For instance, after detecting the shoulder joint using YOLO, a segmentation neural network trained for skeletal traumatology may be applied to segment bones and cartilage in the shoulder, and an abnormality detection neural network trained for skeletal traumatology may be applied to detect abnormality in the shoulder. In other words, instead of directly using a machine-learning model trained for e.g. chest X-ray to recognize chest X-ray findings, the image processing method as described herein uses anatomical part detection before applying a machine-learning model that was specifically trained for the detected anatomical part to crops of the original chest X-ray. This may increase the chance of detecting early signs of a disease in the medical image and also the chance of starting the treatment at the earliest stage.
According to an example of the present disclosure, the input may be configured to receive information about a device. The processor may be configured to perform a search in the model library to select a machine learning model matching the device and apply the selected machine-learning model on the medical image data to identify the device.
In other words, the image processing apparatus as described herein may also be used in the assessment of devices. For example, while e.g. guidewires of catheters can be easily depicted visually, the tip of such devices can only be visualized clearly by carefully adapting the visualization settings. Here fully automated optimization of the visualization in the tip region may greatly improve reading the image.
According to an example of the present disclosure, each selected machine-learning model may comprise a joint segmentation module and a normality assessment module. The joint segmentation module is configured to segment objects of interest in the medical image data. The normality assessment module is configured to compare the segmented objects of interest with training data to determine a probability of an occurrence of an abnormality in the medical image data.
This will be explained in detail hereinafter, particularly with respect to Fig. 3. According to an example of the present disclosure, the joint segmentation module may comprise fully convolutional neural networks (FCNs), U-Net, or generative adversarial networks (GANs).
According to an example of the present disclosure, the normality assessment module may comprise deep perceptual autoencoders.
According to an example of the present disclosure, the processor may be configured to perform an automated contrast adaption in a region of interest that comprises the identified clinical finding.
In other words, the image processing method as described herein may highlight discrepancies on the image and notify a medical doctor for an additional review by adapting the contrast for the type of tissue visualized. According to an example of the present disclosure, the processor may be configured to perform an automated contrast adaption in a region of interest that comprises the device.
According to an example of the present disclosure, the processor may be further configured to process the medical image data to enhance the region of interest in a displayed image resulting from the medical image data.
According to an example of the present disclosure, the processor may be configured to generate and provide, via the output, a notification to notify a healthcare practitioner regarding the clinical finding and prompt a responsive action with respect to a patient associated with the medical image data.
In other words, upon detection of possible symptoms, the image processing apparatus may notify the radiologist for an additional review of this region.
According to a second aspect of the present invention, there is provided a medical imaging system. The medical imaging system comprises an imaging apparatus (e.g. an X-ray imaging apparatus such as a chest X-ray imaging apparatus) configured to acquire medical image data of a patient, and an apparatus according to the first aspect and any associated example.
According to a third aspect of the present invention, there is provided an image processing method, comprising: receiving medical image data acquired using an imaging apparatus; detecting a plurality of objects in the medical image data, wherein the plurality of objects comprises one or more body parts; performing a search in a model library and selecting one or more machine-learning models matching the plurality of detected objects, wherein the model library comprises a plurality of machine-learning models and each machine learning model has been trained for a respective object; and applying each selected machine-learning model on a respective body part to identify a clinical finding in the medical image data; and providing the clinical finding.
The image processing method may be at least partly computer-implemented, and may be implemented in software or in hardware, or in software and hardware. Further, the method may be carried out by computer program instructions running on means that provide data processing functions. The data processing means may be a suitable computing means, such as an electronic control module etc., which may also be a distributed computer system. The data processing means or the computer, respectively, may comprise one or more processors, a memory, a data interface, or the like.
According to another aspect of the present invention, there is provided a computer program product comprising instructions which, when the program is executed by one or more processors, cause the one or more processors to carry out the steps of the method according to the second aspect and any associated example.
According to a further aspect of the present invention, there is provided a computer-readable data carrier having stored thereon the computer program product. As used herein, the term “patient” may refer to a human or an animal.
These and other aspects of the invention will be apparent from and elucidated with reference to the embodiments described hereinafter.
BRIEF DESCRIPTION OF THE DRAWINGS
Fig. 1 illustrates a flow chart describing an exemplary image processing method.
Figs. 2A-2C show an effect of intensity windowing on shade of grey discernibility.
Fig. 3 illustrates a flow chart describing a further exemplary image processing method.
Fig. 4 schematically shows an example of chest X-rays with locally enhanced contrast to highlight bone structures.
Fig. 5 illustrates an exemplary image processing apparatus.
Fig. 6 illustrates an exemplary medical imaging system.
DETAILED DESCRIPTION OF EMBODIMENTS
The manifestation of early signs of diseases on medical images (e.g. X-ray images, MRI images, or images of other imaging modalities) may be subtle. For example, chest X-ray is arguably the most prescribed radiological study because of its versatility and comparatively low cost. It is routinely used to identify or rule out cardiopulmonary conditions, verify the position of devices, and evaluate fractures.
However, the interpretation of chest X-rays may be biased by a patient’s specific indication. A primary focus on the actual reason for the exam may easily result in parts of the image information being unused. For instance, a patient in cardiac distress would have his chest X-ray reviewed for heart symptoms, but markers of skeletal tuberculosis in shoulders could be unnoticed, even though they are present on the image.
Moreover, it is common practice when reviewing an X-ray, to window the pixel intensity in order to obtain the best possible contrast of the tissues of interest, while saturating other tissues. Thus, radiologists reviewing bones would have soft tissue shown as a dark-grey haze, whereas the review of soft tissue would show bones as uniform white bars. For example, if the intensity windowing is chosen in a manner showing even bone mineralization, the bronchial structure, however, may be invisible. On the other hand, if the lung internal structure is made clear, the bone structures may be harder to discern.
Consequently, even though early signs of a disease may be present on a chest X-ray, their subtleness can cause them to go unnoticed during the review. This may prevent the early treatment of the condition, which usually implies heavier treatment at the disease’s later stages.
Towards this end, an image processing method is provided. Fig. 1 illustrates a flow chart describing an exemplary image processing method 100 according to an embodiment of the present disclosure. It should be noted that although the following detailed description refers to chest X-ray images for the purposes of illustration, anyone of ordinary skill in the art will appreciate that the method and apparatus described above and below can be adapted to other medical imaging modalities including, but not limited to, MRI (magnetic resonance imaging), and to other body parts including, but not limited to, joints, neck, spine, limbs, and other parts of the body. For example, as will be explained hereinafter, the image processing method as described herein may enhance information which is present on a chest X-ray but too faint to be noticed by eye. As foreseen, the image processing method as described herein may be applied to the screening of skeletal tuberculosis. This method may be modified for the screening of other lesions. For instance, spleen cysts and liver fat appear darker on CT scans. It is possible that those lesions could be picked up on chest or abdominal X-rays, in which case the image processing method as described herein could be applied to the early detection of hepatitis or drug-related damage to the liver. Accordingly, the following described examples are set forth without any loss of generality to, and without imposing limitations upon, the claimed invention.
The image processing method 100 may be implemented as a device, module or related component in a set of logic instructions stored in a non-transitory machine- or computer-readable storage medium such as random access memory (RAM), read only memory (ROM), programmable ROM (PROM), firmware, flash memory, etc., in configurable logic such as, for example, programmable logic arrays (PLAs), field programmable gate arrays (FPGAs), complex programmable logic devices (CPLDs), in fixed-functionality hardware logic using circuit technology such as, for example, application specific integrated circuit (ASIC), complementary metal oxide semiconductor (CMOS) or transistor-transistor logic (TTL) technology, or any combination thereof. For example, computer program code to carry out operations shown in the image processing method 100 may be written in any combination of one or more programming languages, including object-oriented programming languages and conventional procedural programming languages, such as the JAVA, SMALLTALK, C++, Python, or "C" programming languages or similar programming languages. For example, the exemplary image processing method may be implemented as an image processing apparatus 10 shown in Figs. 5 and 6.
At block 110, medical image data of a patient is received. In some examples, the medical image data of the patient may be received from an imaging apparatus, such as an X-ray imaging system, a CT scanner, or an MRI scanner, which is configured to acquire a medical image of the patient. In some examples, the medical image data of the patient may be received from a database, such as a PACS (Picture Archiving and Communication System). The medical image data may comprise a two-dimensional image comprising image pixel data and/or a three-dimensional image comprising image voxel data.
The medical image data may be acquired from any body part of the patient, such as joints, neck, spine, limbs, chest, or other parts of the body. For the purposes of illustration, the following approach is described in relation to the chest of the patient. At block 120, a plurality of objects is detected in the medical image data. The plurality of objects comprises one or more body parts, such as head, arm, leg, torso, shoulder, hand, feet, etc. In a chest X-ray image, for example, body parts like shoulder, torso, and arm may be detected.
The plurality of objects may be detected using an object detection model, which may propose regions of an image that have specific contents, such as body parts. Examples of the object detection model may include, but are not limited to, faster R-CNN, YOLO, and SSD, which will be briefly explained below.
The faster R-CNN is a deep convolutional network used for object detection. Faster R-CNN is structured with three different parts: the Feature Extraction Network (FEN), the Region Proposal Network (RPN), and the Classification Network. In faster R-CNN, the input medical image data is first fed into the FEN, which is typically composed of a pre-trained CNN model without fully-connected output layers. The FEN simply utilizes its pre-trained model to generate the feature map of the input image, which is subsequently fed into the RPN and the Classification Network. The RPN is used to generate proper region proposals of the target object on the feature map. In the RPN, a convolution filter first slides through the feature map, and the center point of the sliding window is mapped back to a point on the original input image. The point mapped back is called an anchor. Subsequently, several fixed-size bounding boxes are generated around the anchor on the original input image. Via this anchor generating process, bounding boxes with different anchors are evenly and sufficiently distributed among each region in the original image. After the anchor’s bounding boxes are generated on the original input image, these bounding boxes are labelled, according to their IoU (Intersection over Union) over the ground truths, as positive (IoU > 0.7) or negative (IoU < 0.3). The labelled bounding boxes are subsequently used to train the two following parts: a softmax classifier that can distinguish positive anchors, and a bounding box regression part that moves the bounding box coordinates closer to the ground truths. After positive anchor bounding boxes are selected and tuned by the regression part of the RPN, region proposals are generated and fed to the following Classification Network. The Classification Network takes the feature map generated by the FEN and the region proposals generated by the RPN as input. In the Classification Network, a fully-connected layer and a softmax layer, which applies the softmax function to normalize the output to a probability distribution, are utilized to classify the object in the region proposals, for example via thresholding. Meanwhile, bounding box regression is utilized to adjust the region proposals for more precise bounding boxes. After classification and bounding box regression, the object detection result of Faster R-CNN is generated.
A standard softmax function \(\sigma : \mathbb{R}^K \to (0,1)^K\) is defined for \(K > 1\) by the formula

\[ \sigma(z)_i = \frac{e^{z_i}}{\sum_{j=1}^{K} e^{z_j}}, \qquad i = 1, \dots, K, \]
where \(z\) is the input vector of \(K\) real numbers. Here, \(K\) may be seen as the number of different classes in the classifier. In simple words, the function applies the standard exponential function to each element \(z_i\) of the input vector \(z\) and normalizes these values by dividing by the sum of all the exponentials; this normalization ensures that the components of the output vector \(\sigma(z)\) sum to 1. Instead of \(e\), a different base \(b > 0\) may also be used. In practice, the softmax function is often preferred over other normalization methods because it yields the maximum likelihood estimate and minimizes the cross-entropy between predictions and targets. A person skilled in the art will, however, recognize that other normalization methods may be used.
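As a minimal illustration, a numerically stable softmax can be written as follows; the optional base b corresponds to the generalization mentioned above.

import numpy as np

def softmax(z, b=np.e):
    """Softmax with an arbitrary base b; stable for b > 1 (e.g. e)."""
    z = np.asarray(z, dtype=float)
    # Subtracting the maximum leaves the result unchanged but avoids overflow.
    shifted = z - z.max()
    exps = b ** shifted
    return exps / exps.sum()

probs = softmax([2.0, 1.0, 0.1])
print(probs, probs.sum())  # the components sum to 1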
YOLO is an algorithm that uses a single CNN for end-to-end object detection. The input image is first resized to a fixed size and then fed into the CNN, and the detection result is generated by processing the CNN output. Compared with Faster R-CNN, the YOLO design enables end-to-end training and real-time speeds while maintaining high average precision. A simple workflow of YOLO is described below. The CNN of YOLO first divides the input image into S x S grid cells, each cell being responsible for detecting the objects whose center point lies in that cell. Each cell predicts several bounding boxes with confidence values. The confidence value contains two parts: whether a ground-truth object is contained in the cell, and the precision of the bounding box, which can be calculated as the IoU between the ground truth and the predicted bounding box. For the generated bounding boxes, NMS (Non-Maximum Suppression) is then used to discard overlapping bounding boxes with lower confidence. For classification, if the center of a ground-truth object falls into a cell, that cell is responsible for generating a bounding box and classifying the bounded object. Each cell also predicts multiple conditional class probabilities according to the object class types. In the end, YOLO outputs the bounding boxes with the class of highest confidence.
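To make the IoU and NMS notions concrete, the following self-contained sketch implements both for axis-aligned boxes; it is an illustration of the concepts, not the YOLO implementation itself.

import numpy as np

def iou(a, b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, threshold=0.5):
    """Keep the highest-scoring box, drop overlapping lower-scoring ones."""
    order = np.argsort(scores)[::-1]
    keep = []
    while len(order) > 0:
        best = order[0]
        keep.append(best)
        order = np.array([i for i in order[1:]
                          if iou(boxes[best], boxes[i]) <= threshold])
    return keep

boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 140)]
print(nms(boxes, np.array([0.9, 0.8, 0.7])))  # -> [0, 2]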
SSD is a one-stage object detection algorithm. The original SSD network is a refitted network based on a pre-trained VGG-16 model: it removes the fully-connected layers at the tail of VGG-16 and adds four extra feature layers to perform the object detection tasks. Unlike YOLO, which uses a fully-connected layer to generate detection results, SSD uses convolutional layers to generate bounding boxes directly on the feature maps and later maps them back to the input image.
Knowledge of the body part(s) allows machine-learning modules trained for different applications to be re-used in subsequent steps.
At block 130, a search is performed in a model library and one or more machine-learning models matching the plurality of detected objects are selected. The model library comprises a plurality of machine-learning models, and each machine-learning model has been trained for a respective object.
For example, each machine-learning model in the model library may be labelled with a respective object for which the machine-learning model has been specifically trained. For example, after detecting the shoulder joint using an object detection model (e.g. YOLO, SSD, faster R-CNN, or the like), a machine-learning model labelled with skeletal traumatology may be applied to bones and cartilage in the shoulder.
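A minimal sketch of such a label-based lookup is given below; the library contents, the load_model helper, and the label strings are hypothetical and serve only to illustrate the matching step.

from typing import Callable, Dict, List

def load_model(name: str) -> object:
    """Stand-in for deserializing a trained model from the library."""
    return f"<model trained for {name}>"

# Hypothetical registry: each entry maps a detectable object label to a
# loader for the machine-learning model trained for that object.
MODEL_LIBRARY: Dict[str, Callable[[], object]] = {
    "shoulder": lambda: load_model("skeletal_traumatology"),
    "torso": lambda: load_model("chest_findings"),
}

def select_models(detected_objects: List[str]) -> Dict[str, object]:
    """Return one trained model per detected object that has a match."""
    return {obj: MODEL_LIBRARY[obj]() for obj in detected_objects
            if obj in MODEL_LIBRARY}

# select_models(["shoulder", "torso", "arm"]) loads the shoulder and torso
# models and skips "arm", for which no model is registered in this sketch.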
In some examples, each selected machine-learning model may comprise a joint segmentation module and a normality assessment module. The joint segmentation module is configured to segment objects of interest in the medical image data. The normality assessment module is configured to compare the segmented objects of interest with training data to determine a probability of an occurrence of an abnormality in the medical image data. For example, as shown in Fig. 1, the block 130 may further comprise block 130a and block 130b.
At block 130a, a search may be performed in a library of AI (Artificial Intelligence) segmentation modules and one or more joint segmentation modules may be selected that match the plurality of detected objects. For instance, after detecting the shoulder joint using an object detection model, a segmentation neural network trained for skeletal traumatology may be selected and applied to segment bones and cartilage in the shoulder.
Examples of the joint segmentation module may include, but are not limited to, fully convolutional neural networks (FCN), U-Net, or generative adversarial networks (GAN), which will be briefly discussed below.
An FCN is derived from a CNN-based segmentation network. It is trained end-to-end, pixels-to-pixels, on digital input images for a given segmentation task. The idea of an FCN is to build convolutional layers without any fully-connected layers and to produce an output whose size corresponds to the input. The input feature map is encoded and then decoded using transposed convolutions to attain an output of the same size. As the network decodes, skip connections sum in pre-extracted feature maps to recover the spatial information lost during pooling operations.
A U-Net is an FCN that relies on data augmentation and is aimed at precise localization in biomedical image segmentation. The U-Net architecture includes multiple up-sampling layers, skip connections that concatenate feature maps, and learnable weight filters. U-Nets have shown outstanding performance in biomedical image segmentation as well as in other domains such as crack detection.
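The encoder-decoder-with-skip pattern shared by FCNs and U-Nets can be sketched as follows; this toy module has a single skip connection and is far shallower than a real segmentation network, serving only to show the structure.

import torch
import torch.nn as nn

class TinyUNet(nn.Module):
    """Minimal encoder-decoder with one skip connection, U-Net style."""

    def __init__(self, in_ch=1, num_classes=2):
        super().__init__()
        self.enc = nn.Sequential(nn.Conv2d(in_ch, 16, 3, padding=1), nn.ReLU())
        self.pool = nn.MaxPool2d(2)
        self.mid = nn.Sequential(nn.Conv2d(16, 32, 3, padding=1), nn.ReLU())
        self.up = nn.ConvTranspose2d(32, 16, 2, stride=2)  # up-sampling layer
        # 32 channels in: 16 up-sampled + 16 from the concatenated skip.
        self.dec = nn.Sequential(nn.Conv2d(32, 16, 3, padding=1), nn.ReLU(),
                                 nn.Conv2d(16, num_classes, 1))

    def forward(self, x):
        e = self.enc(x)                      # encoder features
        m = self.mid(self.pool(e))           # bottleneck at half resolution
        u = self.up(m)                       # back to full resolution
        d = self.dec(torch.cat([u, e], 1))   # skip connection by concatenation
        return torch.softmax(d, dim=1)       # per-pixel class probabilities

seg = TinyUNet()
probs = seg(torch.rand(1, 1, 64, 64))  # -> (1, 2, 64, 64) probability maps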
GAN-based segmentation models can be considered as a two-player game between a generator, which learns to generate samples resembling real data, and a discriminator, which learns to discriminate between real and generated data. The generator and discriminator cost functions are minimized simultaneously, and the iterative minimization eventually leads to a Nash equilibrium at which neither player can further unilaterally reduce its cost. In the end, the GAN discriminator provides an abstract unsupervised representation of the input images.
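In the standard notation (not taken from the original disclosure), this two-player game is summarized by the minimax objective

\[ \min_G \max_D \; \mathbb{E}_{x \sim p_{\text{data}}}[\log D(x)] + \mathbb{E}_{z \sim p_z}[\log(1 - D(G(z)))] , \]

where \(G\) is the generator, \(D\) the discriminator, \(p_{\text{data}}\) the distribution of real images, and \(p_z\) the noise prior; at the Nash equilibrium the generated distribution matches \(p_{\text{data}}\) and \(D\) outputs 1/2 everywhere.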
The output of the joint segmentation module, such as a segmentation CNN, U-Net, FCN, or GAN, usually comprises a softmax layer that is condensed into labelled regions by means of thresholding. The output of a softmax layer can be interpreted as probabilities that a region belongs to a given label. These probabilities may then be thresholded to assign a region to a single class. For instance, the probabilities may be rendered as heatmaps to highlight class-specific regions of the image.
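The condensation step might look as follows; the 0.5 threshold and the two-class example are illustrative choices.

import numpy as np

def condense(prob_maps, threshold=0.5):
    """Turn per-class probability maps (C, H, W) into a label map (H, W).

    A pixel gets the most probable class, unless no class exceeds the
    threshold, in which case it is labelled 0 (background).
    """
    labels = prob_maps.argmax(axis=0)
    confident = prob_maps.max(axis=0) >= threshold
    return np.where(confident, labels, 0)

# Two-class example: a 4x4 "bone" probability map and its complement.
bone = np.random.rand(4, 4)
label_map = condense(np.stack([1 - bone, bone]))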
At block 130b, a search is performed in a library of AI normality modules and one or more normality assessment modules may be selected that match the plurality of detected objects. For instance, in the above-described example of the shoulder joint, a normality assessment module developed to verify skeletal traumatology images may be selected and applied for normality assessment of the shoulder joint. The probability maps generated by the joint segmentation module(s) at block 130a may be used as input for the normality assessment module(s) at block 130b.
Various normality assessment modules may be used for abnormality detection. In some examples, normality assessment may be carried out using a CNN model, such as SqueezeNet, a VGG network, InceptionV3, or DenseNet. In some examples, normality assessment may be carried out using deep perceptual autoencoders. Such autoencoders are used to verify that an image is similar to the training set and can thus be analyzed reliably by an analysis neural network. In the above-described example, since the shoulder joint has been detected by an object detection model (e.g. YOLO), it is again possible to re-use a deep perceptual autoencoder that was developed to verify skeletal traumatology images.
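A simplified sketch of autoencoder-based scoring is shown below. It uses plain pixel-space reconstruction error as the abnormality score; a deep perceptual autoencoder would instead compare deep feature representations, and the score threshold is an illustrative assumption.

import torch
import torch.nn as nn

class SmallAutoencoder(nn.Module):
    """Convolutional autoencoder; reconstruction error yields the score."""

    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 8, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(8, 16, 3, stride=2, padding=1), nn.ReLU())
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(16, 8, 2, stride=2), nn.ReLU(),
            nn.ConvTranspose2d(8, 1, 2, stride=2), nn.Sigmoid())

    def forward(self, x):
        return self.decoder(self.encoder(x))

def abnormality_score(model, roi):
    """Mean reconstruction error: high when the ROI is unlike training data."""
    with torch.no_grad():
        return torch.mean((model(roi) - roi) ** 2).item()

model = SmallAutoencoder()  # would be trained on normal shoulder images
score = abnormality_score(model, torch.rand(1, 1, 64, 64))
SCORE_THRESHOLD = 0.05      # illustrative value; tuned on validation data
is_potential_finding = score > SCORE_THRESHOLD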
At block 140, each selected machine-learning model is applied on the respective body part to identify a clinical finding in the medical image data. For example, a segmentation neural network trained for skeletal traumatology may be applied to segment bones and cartilage in the shoulder. A deep perceptual autoencoder that was developed to verify skeletal traumatology images may be applied for abnormality detection in the shoulder.
The normality assessment module may provide an abnormality score for each object. If the abnormality score given by the normality assessment module exceeds a given threshold, the associated subpart of the image may be labelled as a potential clinical finding.
At block 150, the identified clinical finding is provided.
In some examples, the identified clinical finding may be provided to a display for review by a user (e.g. radiologist). For example, the image processing apparatus shown in Fig. 5 or 6 may generate and provide, via the output, a notification to notify a healthcare practitioner regarding the clinical finding and prompt a responsive action with respect to a patient associated with the medical image data.
In some examples, the medical image data may be processed to enhance the region of interest in a displayed image resulting from the medical image data. For example, an automated contrast adaption may be performed in a region of interest that comprises the identified clinical finding. This will be explained below.
On a digital image, shades of grey are coded as numbers. Because of intensity windowing, faint differences in pixel intensities may be impossible to notice by eye. For example, Figs. 2A-2C show the effect of intensity windowing on the discernibility of shades of grey. In Figs. 2A-2C, different pixel intensities are indicated with different fill patterns. As shown in Fig. 2A, values close to 0 are usually rendered black (indicated with dense diagonal lines) and values close to 1 are usually rendered white (indicated with no diagonal lines). In order to ease the reading of a chest X-ray, it is common practice to modify the thresholds for black and white, with the result that values exceeding the thresholds are saturated to black or white. For instance, in Fig. 2A, the contrast settings are set to show all grey values in the [0, 1] range. In Fig. 2B, the contrast settings saturate to white all values over 0.4. In Fig. 2C, the contrast settings saturate to black all values under 0.6. By measuring the minimal and maximal pixel values in a region of interest, such as the body part that comprises the identified clinical finding, it is possible to improve the contrast in this region, for instance using the following relation for each pixel:

\[ I_{\text{opt}} = \frac{I - I_{\min}}{I_{\max} - I_{\min}} , \]

where \(I_{\max}\) and \(I_{\min}\) are respectively the maximum and minimum intensity values in the region, \(I\) is a pixel value, and \(I_{\text{opt}}\) is the value substituted at the pixel's position. It will be appreciated by a skilled person that more complex optimization schemes are possible.
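A direct NumPy transcription of this min-max stretch, applied only inside a bounding-box region of interest, might look as follows; it is a sketch, and a real viewer would clamp values and handle flat regions more carefully.

import numpy as np

def stretch_roi(image, x1, y1, x2, y2):
    """Rescale intensities inside the ROI to the full [0, 1] range."""
    out = image.astype(float).copy()
    roi = out[y1:y2, x1:x2]
    i_min, i_max = roi.min(), roi.max()
    if i_max > i_min:  # avoid division by zero on flat regions
        out[y1:y2, x1:x2] = (roi - i_min) / (i_max - i_min)
    return out

# Faint structure spanning [0.48, 0.52] becomes full-contrast after stretching.
img = 0.5 + 0.02 * np.random.randn(128, 128).clip(-1, 1)
enhanced = stretch_roi(img, 32, 32, 96, 96)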
Fig. 3 shows a flow chart describing a further exemplary image processing method according to an example of the present disclosure. The exemplary image processing method 100 begins with the reception of medical image data of a patient; in this example, the medical image data is chest X-ray medical image data. This exemplary image processing method 100 comprises the following two branches:

Branch 1
In the first branch, i.e. branch 1 shown in Fig. 3, an AI module (e.g. a CNN) dedicated to chest X-ray findings may be applied at block 190. This AI module may be trained for findings that are typical of chest X-ray analysis, for instance atelectasis or pneumothorax, and does not analyse specific image regions for incidental findings.
Branch 2
The image processing method described with respect to Fig. 1 forms the second branch, i.e. branch 2. The second branch begins at block 120 with a detection of body parts, for instance using a YOLO neural network, an SSD, a faster R-CNN, or any other suitable object detection model. These neural networks propose regions of an image that have specific contents.
Knowledge of the body part(s) allows machine-learning modules trained for different applications to be re-used in subsequent steps. For instance, after detecting the shoulder joint using an object detection model (e.g. YOLO, SSD, faster R-CNN), a segmentation neural network (e.g. FCN, U-Net, GAN, or any other suitable segmentation neural network) trained for skeletal traumatology may be selected from a library of AI segmentation modules and applied to segment bones and cartilage in the shoulder at block 130a. The output of segmentation neural networks, such as U-Net, usually consists of a softmax layer that is condensed into labelled regions by means of thresholding. The softmax layer can, however, be interpreted as probabilities that a region belongs to a given label.
The probability maps can be used as input for a normality assessment module at block 130b. Deep perceptual autoencoders are an example of such a normality assessment module. These autoencoders may be used to verify that the image is similar to the training set and can thus be analyzed reliably by an analysis neural network. Since the shoulder joint has been detected by e.g. YOLO, it is again possible to re-use a deep perceptual autoencoder that was developed to verify skeletal traumatology images.
If the abnormality score given by a deep perceptual autoencoder exceeds a given threshold, the contrast settings in the region of interest may be automatically improved at block 160 according to the above-described automated contrast adaptation approach, depending on the type of tissue present in the region.
Afterwards, the body part may be highlighted at block 170, for instance using a bounding box such as the one sketched in Fig. 4.
Finally, the radiologist may be notified to perform a more thorough review of the region at block 180.
As a further example, the image processing method as described herein may be used in the assessment of devices. For example, while the guidewires of catheters can easily be seen, the tip of such devices can only be visualized clearly by carefully adapting the visualization settings. It is also possible to perform an automated contrast adaption in a region of interest that comprises the device. In this way, a fully automated optimization of the visualization in the tip region can greatly improve the reading of the image.
Fig. 5 illustrates an exemplary image processing apparatus 10. The image processing apparatus 10 comprises an input 12, one or more processors 14, and an output 16.
In general, the image processing apparatus 10 may comprise various physical and/or logical components for communicating and manipulating information, which may be implemented as hardware components (e.g., computing devices, processors, logic devices), executable computer program instructions (e.g., firmware, software) to be executed by various hardware components, or any combination thereof, as desired for a given set of design parameters or performance constraints. Although Fig. 5 may show a limited number of components by way of example, it can be appreciated that a greater or a fewer number of components may be employed for a given implementation.
In some implementations, the image processing apparatus 10 may be embodied as, or in, a device or apparatus, such as a server, workstation, or mobile device. The image processing apparatus 10 may comprise one or more microprocessors or computer processors, which execute appropriate software. The processor 14 of the image processing apparatus 10 may be embodied by one or more of these processors. The software may have been downloaded and/or stored in a corresponding memory, e.g., a volatile memory such as RAM or a non-volatile memory such as flash. The software may comprise instructions configuring the one or more processors to perform the functions as described herein.
It is noted that the image processing apparatus 10 may be implemented with or without employing a processor, and also may be implemented as a combination of dedicated hardware to perform some functions and a processor (e.g., one or more programmed microprocessors and associated circuitry) to perform other functions. For example, the functional units of the image processing apparatus 10, e.g., the input 12, the one or more processors 14, and the output 16 may be implemented in the device or apparatus in the form of programmable logic, e.g., as a Field-Programmable Gate Array (FPGA). In general, each functional unit of the apparatus may be implemented in the form of a circuit.
In some implementations, the image processing apparatus 10 may also be implemented in a distributed manner. For example, some or all units of the image processing apparatus 10 may be arranged as separate modules in a distributed architecture and connected in a suitable communication network, such as a 3rd Generation Partnership Project (3GPP) network, a Long Term Evolution (LTE) network, Internet, LAN (Local Area Network), Wireless LAN (Local Area Network), WAN (Wide Area Network), and the like.
The processor(s) 14 may execute instructions to perform the image processing method as described herein.
The input 12 and the output 16 may include hardware and/or software to enable the image processing apparatus 10 to receive a data input and to communicate with other devices and/or a network. The input 12 may receive the data input via a wired connection or via a wireless connection. The output 16 may also provide cellular telephone communications, and/or other data communications, for the image processing apparatus 10.
Fig. 6 schematically shows an example of an X-ray imaging system 200. In this example, the exemplary X-ray imaging system is a chest radiography imaging system 200. The chest radiography imaging system 200 comprises an image processing apparatus 10, an X-ray imaging device 210, and a system console 220.
The X-ray imaging device 210 comprises an X-ray source 212 and an X-ray detector 214. The X-ray detector 214 is spaced from the X-ray source 212 to accommodate a patient PAT to be imaged.
In general, during an image acquisition, a collimated X-ray beam (indicated with arrow P) emanates from the X-ray source 212, passes through the patient PAT at a region of interest (ROI), experiences attenuation by interaction with matter therein, and the attenuated beam then strikes the surface of the X-ray detector 214. The density of the organic material making up the ROI, in a chest radiography examination the rib cage and lung tissue, determines the level of attenuation: high-density material (such as bone) causes higher attenuation than less dense material (such as lung tissue). The registered digital values are then consolidated into an array of digital values forming an X-ray projection image for a given acquisition time and projection direction.
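Although the original disclosure does not state it explicitly, the attenuation described above is conventionally modelled by the Beer-Lambert law,

\[ I = I_0 \, e^{-\mu d} , \]

where \(I_0\) is the incident beam intensity, \(\mu\) the linear attenuation coefficient of the traversed material, and \(d\) the path length through it; bone has a larger \(\mu\) than lung tissue, which is why it attenuates the beam more strongly.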
Overall operation of the X-ray imaging device 210 may be controlled by an operator from the system console 220. The system console 220 may be coupled to a screen or monitor 230 on which the acquired X-ray images or imager settings may be viewed or reviewed. An operator, such as a medical lab technician, can control an image acquisition run via the system console 220 by releasing individual X-ray exposures, for example by actuating a joystick, pedal, or other suitable input means coupled to the system console 220. In the example of Fig. 6, the patient PAT stands facing a flat surface behind which is the X-ray detector 214. According to a different example (not shown), the X-ray imaging device 210 is of the C-arm type, and the patient PAT lies on an examination table instead of standing.
As described above, the image processing apparatus 10 may be any computing device, including desktop and laptop computers, smartphones, tablets, etc. The image processing apparatus 10 may be a general-purpose device or a device with a dedicated unit of equipment suitable for providing the functionality as described herein. In the example of Fig. 6, the components of the image processing apparatus 10 are shown as integrated in one single unit. However, in alternative examples, some or all components may be arranged as separate modules in a distributed architecture and connected via a suitable communication network. The image processing apparatus 10 and its components may be arranged as dedicated FPGAs or as hardwired standalone chips, such as the image processing apparatus shown in Fig. 6. In some examples (not shown), the image processing apparatus 10 or some of its components may be resident in the system console 220, running as software routines.
In operation, an X-ray image acquired by the X-ray imaging device 210 is provided to the image processing apparatus 10. The image processing apparatus 10 then automatically searches the acquired X-ray image for early signs of disease according to the image processing method as described herein. The image processing apparatus 10 may analyze pixel values whose changes in intensity are too faint to be noticed with the current display settings. Upon detection of possible symptoms, the image processing apparatus may notify the radiologist to additionally review the region and/or automatically improve the contrast settings in the region of interest, depending on the type of tissue present in the region. This in turn allows diseases to be detected at their earliest stages, enabling lighter forms of treatment.
In another exemplary embodiment of the present invention, a computer program or a computer program element is provided that is characterized by being adapted to execute the method steps of the method according to one of the preceding embodiments, on an appropriate system.
The computer program element might therefore be stored on a computer, which might also be part of an embodiment of the present invention. This computer may be adapted to perform or induce a performing of the steps of the method described above. Moreover, it may be adapted to operate the components of the above-described apparatus. The computer can be adapted to operate automatically and/or to execute the orders of a user. A computer program may be loaded into a working memory of a data processor. The data processor may thus be equipped to carry out the method of the invention.
This exemplary embodiment of the invention covers both a computer program that from the beginning uses the invention, and a computer program that by means of an update turns an existing program into a program that uses the invention.
Further on, the computer program element might be able to provide all necessary steps to fulfil the procedure of an exemplary embodiment of the method as described above. According to a further exemplary embodiment of the present invention, a computer readable medium, such as a CD-ROM, is presented wherein the computer readable medium has a computer program element stored on it which computer program element is described by the preceding section.
A computer program may be stored and/or distributed on a suitable medium, such as an optical storage medium or a solid-state medium supplied together with or as part of other hardware, but may also be distributed in other forms, such as via the internet or other wired or wireless telecommunication systems.
However, the computer program may also be presented over a network like the World Wide Web and can be downloaded into the working memory of a data processor from such a network.
According to a further exemplary embodiment of the present invention, a medium for making a computer program element available for downloading is provided, which computer program element is arranged to perform a method according to one of the previously described embodiments of the invention.
It has to be noted that embodiments of the invention are described with reference to different subject matters. In particular, some embodiments are described with reference to method type claims whereas other embodiments are described with reference to the device type claims. However, a person skilled in the art will gather from the above and the following description that, unless otherwise notified, in addition to any combination of features belonging to one type of subject matter also any combination between features relating to different subject matters is considered to be disclosed with this application. However, all features can be combined providing synergetic effects that are more than the simple summation of the features.
While the invention has been illustrated and described in detail in the drawings and foregoing description, such illustration and description are to be considered illustrative or exemplary and not restrictive. The invention is not limited to the disclosed embodiments. Other variations to the disclosed embodiments can be understood and effected by those skilled in the art in practicing a claimed invention, from a study of the drawings, the disclosure, and the dependent claims.
In the claims, the word “comprising” does not exclude other elements or steps, and the indefinite article “a” or “an” does not exclude a plurality. A single processor or other unit may fulfil the functions of several items recited in the claims. Measures recited in mutually different dependent claims may advantageously be combined. Any reference signs in the claims should not be construed as limiting the scope.

Claims

Claim 1. An image processing apparatus (10), comprising: an input (12) configured to receive medical image data of a patient (PAT); a processor (14) configured to: detect a plurality of objects in the medical image data, wherein the plurality of objects comprises one or more body parts, perform a search in a model library and select one or more machine-learning models matching the plurality of detected objects, wherein the model library comprises a plurality of machine-learning models and each machine-learning model has been trained for a respective object, and apply each selected machine-learning model on the respective body part to identify a clinical finding in the medical image data; and an output (16) configured to provide the identified clinical finding.
Claim 2. The apparatus (10) according to claim 1, wherein the input (12) is further configured to receive information about a device; and wherein the processor (14) is further configured to perform a search in the model library to select a machine-learning model matching the device and apply the selected machine-learning model on the medical image data to identify the device.
Claim 3. The apparatus (10) according to claim 1 or 2, wherein each selected machine-learning model comprises a joint segmentation module and a normality assessment module; wherein the joint segmentation module is configured to segment objects of interest in the medical image data; and wherein the normality assessment module is configured to compare the segmented objects of interest with training data to determine a probability of an occurrence of an abnormality in the medical image data.
Claim 4. The apparatus (10) according to claim 3, wherein the joint segmentation module comprises: a fully convolutional neural network; or a generative adversarial network.
Claim 5. The apparatus (10) according to claim 3, wherein the normality assessment module comprises deep perceptual autoencoders.
Claim 6. The apparatus (10) according to any one of the preceding claims, wherein the processor (14) is further configured to process the medical image data to enhance a region of interest in a displayed image resulting from the medical image data.
Claim 7. The apparatus (10) according to claim 6, wherein the processor (14) is further configured to perform an automated contrast adaption in a region of interest that comprises the identified clinical finding.
Claim 8. The apparatus (10) according to claim 6 or 7, wherein the processor (14) is further configured to perform an automated contrast adaption in a region of interest that comprises the device.
Claim 9. The apparatus (10) according to any one of the preceding claims, wherein the processor (14) is further configured to generate and provide, via the output (16), a notification to notify a healthcare practitioner regarding the clinical finding and prompt a responsive action with respect to a patient (PAT) associated with the medical image data.
Claim 10. A medical imaging system (200), comprising: an imaging apparatus (210) configured to acquire medical image data of a patient (PAT); and an apparatus (10) according to any one of the preceding claims.
Claim 11. An image processing method (100), comprising: receiving (110) medical image data of a patient (PAT); detecting (120) a plurality of objects in the medical image data, wherein the plurality of objects comprises one or more body parts; performing (130) a search in a model library and selecting one or more machine-learning models matching the plurality of detected objects, wherein the model library comprises a plurality of machine-learning models and each machine-learning model has been trained for a respective object; applying (140) each selected machine-learning model on a respective body part to identify a clinical finding in the medical image data; and providing (150) the clinical finding.
Claim 12. A computer program product comprising instructions for a processor to carry out the method of claim 11.
Claim 13. A computer-readable data carrier having stored thereon the computer program product of claim 12.
Kind code of ref document: A1