CN116547710A - Method and system for segmenting and identifying at least one tubular structure in a medical image


Info

Publication number
CN116547710A
CN116547710A
Authority
CN
China
Prior art keywords
image
body part
segmentation
tubular structure
medical
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202180072219.0A
Other languages
Chinese (zh)
Inventor
Adrien Heitz
Julien Weinzorn
Luc Soler
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Visible Patient SAS
Original Assignee
Visible Patient SAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Visible Patient SAS
Publication of CN116547710A
Legal status: Pending


Classifications

    All within GPHYSICS → G06 COMPUTING; CALCULATING OR COUNTING → G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL:
    • G06T7/11 Region-based segmentation
    • G06T7/0012 Biomedical image inspection
    • G06T3/40 Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T7/13 Edge detection
    • G06T7/136 Segmentation; Edge detection involving thresholding
    • G06T7/174 Segmentation; Edge detection involving the use of two or more images
    • G06T2207/10081 Computed x-ray tomography [CT]
    • G06T2207/10088 Magnetic resonance imaging [MRI]
    • G06T2207/10108 Single photon emission computed tomography [SPECT]
    • G06T2207/10136 3D ultrasound image
    • G06T2207/20081 Training; Learning
    • G06T2207/20084 Artificial neural networks [ANN]
    • G06T2207/20221 Image fusion; Image merging
    • G06T2207/30056 Liver; Hepatic
    • G06T2207/30061 Lung
    • G06T2207/30101 Blood vessel; Artery; Vein; Vascular

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Radiology & Medical Imaging (AREA)
  • Quality & Reliability (AREA)
  • Apparatus For Radiation Diagnosis (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

A computer-implemented method is proposed for segmenting and identifying, in a medical image showing a volume of interest of a subject containing at least one body part, at least one tubular structure that has a 3D layout and is located in the body part, and for providing a labeled 3D image of the structure. The method essentially comprises the steps of: providing a set of 2D medical images corresponding to mutually different cross-sectional views, the planes of the images being all perpendicular to a given direction or all intersecting each other at a given straight line; segmenting the visible section of the relevant body part present in each 2D medical image and creating a corresponding 2D body part mask image, the visible section comprising in particular the complete linear contour or outer boundary of the body part visible in the 2D image under consideration; preprocessing each 2D medical image by applying the corresponding body part mask image to it; segmenting the tubular structure in the resulting preprocessed images; performing the aforementioned steps with at least one other set of 2D medical images corresponding to other, mutually different cross-sectional views; and merging the results of the tubular structure segmentations of the different sets of preprocessed images.

Description

Method and system for segmenting and identifying at least one tubular structure in a medical image
Technical Field
The present invention relates to the field of data processing, more particularly to the processing and analysis of images, in particular the segmentation and labeling of medical images, and to a computer-implemented method for segmenting and identifying at least one tubular structure having a 3D tree layout and located in at least one body part of a subject in a medical image.
Background
A three-dimensional image produced by a medical imaging device, such as a scanner of MRI, ultrasound, CT or SPECT type, is composed of a set of voxels, which are the basic units of a 3D image. A voxel is the 3D extension of a pixel, the basic unit of a 2D image. Each voxel is associated with a gray level or density, which can be considered the result of a 2D function F(x, y) or a 3D function F(x, y, z), where x, y and z denote spatial coordinates (see fig. 1).
In a 3D image, voxels can be viewed in 2D along various planes (slices). The three main planes in a medical image are the axial, sagittal and frontal (coronal) planes (fig. 2). However, an infinite number of planes can be created, either as planes parallel to each other along an axis perpendicular to the axial, sagittal or frontal plane, or by rotation about an intersection line (e.g., the intersection of the sagittal and coronal planes, see fig. 2), each plane having a different angulation.
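Purely by way of illustration, such slice extraction can be expressed in a few lines of Python (a minimal sketch assuming a NumPy volume indexed (z, y, x); shapes and indices are illustrative and not taken from the patent):

```python
import numpy as np

# Illustrative CT volume indexed (z, y, x); shape and indices are
# assumptions for the example, not values from the patent.
volume = np.random.rand(128, 512, 512)

axial    = volume[64, :, :]   # plane perpendicular to the z direction
coronal  = volume[:, 256, :]  # plane perpendicular to the y direction
sagittal = volume[:, :, 256]  # plane perpendicular to the x direction
```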
Typically, 2D or 3D medical images contain a set of anatomical and pathological structures (organs, bones, tissues, ...) or artificial elements (stents, implants, instruments, ...) that a clinician must delineate in order to evaluate the situation and to define and plan a treatment strategy. In this regard, organs and pathologies must be identified in the image, which means labeling (e.g., coloring) each pixel of a 2D image or each voxel of a 3D image. This process is called segmentation.
The segmentation of the bronchial tree and pulmonary arteries and veins in CT scans can play an important role in patient care during both diagnostic and therapeutic phases. The same applies to other organs or body parts of the subject, such as the liver or kidneys, and their vascular tree. However, extracting these tubular structures using manual or interactive segmentation tools is time consuming and prone to error due to the complexity of the arterial and vascular trees.
Indeed, the tubular structures define the lung and liver maps (as non-limiting examples of target organs) and allow understanding of lobar vascularization. Thus, the arterial and vascular tree structures in the case of the lung, and the vascular structures in the case of the liver, can provide relevant information to plan interventions and also provide insight for diagnosis.
To perform these tasks in the case of the lung, the bronchial tree and pulmonary artery and vein segmentations should be reliable and accurate. Manually obtaining a complete tubular structure segmentation of the lung is tedious due to the complex tree structure. Experts often need to rely on their knowledge to infer the presence of blood vessels, or to decide whether they belong to the arterial or the venous tree. Finally, manually extracting the lung trees is time consuming and prone to inter-rater and intra-rater variability. There is therefore a need for methods capable of handling such complex structures.
There are many known methods for performing segmentation, in particular automatic methods using algorithms, in particular AI algorithms.
In this context, many variants of neural networks have been used in the prior art, generally all based on standard, non-specific architectures, which leads to wasted, ill-suited resources and to an overall lack of efficiency and accuracy.
Thus, several research groups have previously proposed methods for automatically performing bronchial tree or vessel tree segmentation for the lungs. For bronchial tree extraction, the EXACT'09 challenge (see reference [1] detailed below) allowed a comparison of methods based primarily on region growing. More recently, different deep learning methods (see references [2] to [8]) have been applied to the bronchus segmentation task and outperform the previous methods. Most are based on a 3D version of the U-Net architecture (see reference [9]). For example, Qin et al. (see reference [3]) propose transforming the binary segmentation of bronchi into a 26-class connectivity prediction task, which reduces fragmentation. Zhao et al. (see reference [5]) combine a 2D U-Net for detecting horizontal bronchial branches with a 3D U-Net for the other branch orientations. Although bronchial tree segmentation is usually performed using 3D networks, it remains unclear whether 3D networks are superior to 2D methods in terms of computational efficiency and expressivity (see references [8], [10] and [11]).
Disclosure of Invention
The main object of the present invention is to propose a new method to overcome the above-mentioned limitations, not only concerning tubular structures in the lungs, but also concerning other organs and body parts, with the aim of using limited resources and obtaining reliable and accurate results in a limited time frame.
To this end, the invention proposes a computer-implemented method for segmenting and identifying at least one tubular structure having a 3D tree layout and being located at a body part of an object in a medical image showing a volume of interest region of the object containing said body part and for providing a labeled 3D image of said structure,
the method mainly comprises the following steps:
providing a set of 2D medical images corresponding to respective cross-sectional views of the region of interest containing the body part, which are different from each other, the planes of the medical images all being perpendicular to a given direction or all intersecting each other at a given straight line,
segmenting visible segments of the relevant body part present in each 2D medical image, and creating a corresponding 2D body part masking image, the visible segments comprising in particular a complete linear contour or an outer boundary of the body part visible in the 2D image under consideration,
Preprocessing each 2D medical image by applying a corresponding body part masking image to each 2D medical image, thereby producing a processed image containing only image data of the original 2D image related to the body part,
segmenting the tubular structure in the resulting preprocessed images, possibly by segmenting different kinds of tubular structures in differentiated segmentation processes,
the aforementioned steps are performed with at least one other set of 2D medical images, wherein the at least one other set of 2D medical images corresponds to different respective other cross-sectional views along other planes parallel to each other or intersecting each other of the same volume of interest region containing the same body part,
the results of the segmentation of the tubular structure of the different sets of preprocessed images are combined to provide a labeled 3D image of one tubular structure or a different type of tubular structure.
The basic principle of the present invention is therefore to provide a fully automated method for performing segmentation and recognition tasks based on a hierarchical approach and relying on a 2-step cascade of computation and data processing procedures, which method only processes relevant data and is therefore more efficient in terms of computational resource and processing time usage.
To increase the reliability of the automatic segmentation, it is possible to provide that the step of segmenting the tubular structure of the preprocessed image under consideration also takes into account image data of at least one other preprocessed cross-sectional view of the same set of images, adjacent to the preprocessed image under consideration, for example the image data of at least the nearest cross-sectional views located on each side of it.
According to a preferred embodiment of the invention, the body part segmentation step is performed by using a 2D neural network, preferably a U-Net neural network, and the tubular structure segmentation step is performed by using a 2.5D neural network.
Advantageously, a dedicated neural network, previously trained on data marked by an expert or a set of experts, is used for the body part segmentation and preprocessing steps of each set of 2D medical images, the dedicated and trained neural network also being used to segment the tubular structures in each set of preprocessed images, possibly one specific neural network for each tubular structure.
More precisely, it is conceivable that in the preliminary preparation phase of the invention, the final training parameter values of the neural network or of a set of neural networks intended to process a first set of medical images are used as starting parameter values for training at least one other neural network or a set of neural networks intended to process another set of medical images in the preprocessing phase and/or the segmentation phase of the tubular structure.
According to a conventional slice of the 3D medical image, the first set of medical images may correspond to an axial view of the object, and the second and third sets of medical images may correspond to a sagittal view and a coronal view, respectively.
Advantageously, with respect to the preprocessing of the 2D medical images, the step of segmenting the relevant body part in each 2D medical image and creating the corresponding body part mask image comprises: the outline, boundary and/or interior region of the body part is determined and the connection site of the interior tubular structure with at least one other body part or body component that is or is not part of the masking image is located and identified.
In order to be able to process the image data in a similar way, even though they may originate from various sources or in different formats, prior to the step of segmenting the relevant body part in each of the given set of 2D medical images and creating the corresponding body part mask image, the images (in particular CT scan images) are subjected to an initial processing workflow comprising at least a re-sizing operation and a resizing operation, and possibly also a normalization operation.
In addition, during the preprocessing stage, the modified preprocessed image resulting from the application of the body part mask image to the original 2D medical image is submitted to an isotropic resampling operation and a rescaling operation prior to the step of applying the segmented tubular structure.
In order to use computing resources in an optimized manner, the present invention may provide: in view of further processing, any medical images in which the corresponding body-part masking image in any of the sets of medical images is empty and any pre-processed images that do not display any part of the body-part under consideration are ignored.
According to a most preferred application of the invention, the medical image is an image of a chest cavity of a human subject, wherein the relevant body part is a lung, wherein the segmentation step further comprises identifying a left lung and a right lung, and wherein at least some of the body part mask images further comprise representations of the trachea and connection areas between pulmonary arteries and veins and the heart.
In this case, a practical embodiment of the invention may provide a fully automated method performing lung, bronchus and pulmonary artery and vein segmentation relying on cascaded convolutional neural networks, the method comprising a first part dedicated to right and left lung segmentation (using, for example, a slightly modified 2D U-Net architecture) and a second part that relies on a three-way 2.5D fully convolutional network along axial, coronal and sagittal slices, fed with the preprocessed 2D images (masked with the convex hull of the lungs) and focused on tubular structure and component extraction and labeling.
In a first alternative, with respect to lung segmentation, two different types of tubular structures, namely a bronchial tree and a pulmonary artery vessel tree, are segmented in parallel, wherein arteries and veins are marked in the vessel tree in a further step.
In a second alternative, three different types of tubular structures, namely the bronchial tree, the pulmonary artery tree and the pulmonary vein tree, are segmented in parallel.
With respect to another application of the invention, the medical image is an abdominal image, wherein the relevant body part is the liver, and wherein the vascular system to be segmented and marked comprises the portal vein, hepatic vein and hepatic artery.
According to another aspect of the invention, the final merging step is performed by means of a fusion operation comprising operation types selected among fusion by union, fusion by majority voting, fusion by logarithmic average, fusion by neural network and fusion by simultaneous truth and performance level estimation.
The invention also comprises an image processing system capable of performing organ and internal tubular structure segmentation and labeling, in particular lung, bronchi and pulmonary artery and vein segmentation, fully automatically, said system being dependent on a cascade convolutional neural network, characterized in that it comprises a first part dedicated to organ segmentation, for example right and left lung segmentation, and based on a slightly modified 2D U-Net architecture, and a second part based on a three-way 2.5D full convolutional network along axial, coronal and sagittal slices, fed with a preprocessed 2D image of the first part, and configured to perform tubular structure and element extraction and labeling.
Drawings
The invention will be better understood using the following description, which refers to at least one preferred embodiment, given by way of non-limiting example and explained with reference to the accompanying drawings, in which:
FIG. 3 schematically and broadly illustrates the hierarchical workflow steps (here three steps) of the method of the present invention by way of example of a 2D axial view (CT scan image) of the lung;
FIGS. 4A and 4B show, in a more detailed representation, two alternatives (here with four or five steps) of the hierarchical workflow of the method of the invention, applied to the segmentation and labeling of lungs, bronchi and pulmonary arteries and veins, based on the same input image as in FIG. 3;
FIG. 5 illustrates two ways to achieve vessel segmentation starting from multiple axial views similar to those of FIGS. 3 and 4;
FIG. 6 schematically illustrates a possible initial processing workflow that an input image undergoes before the hierarchical workflow steps are applied;
FIG. 7 illustrates various processing steps applied to an input image to obtain a preprocessed image in accordance with the present invention;
FIG. 8 is a symbolic representation of an example of a 2.5D U-Net architecture for tubular structure and element segmentation in accordance with the present invention;
Fig. 9 is a more accurate schematic representation of the cascading and layering framework of the method and system architecture of the present invention, as related to the embodiment shown in fig. 4A, and,
FIG. 10 schematically illustrates an example of a fusion workflow and architecture according to the present invention;
FIG. 11 is a representation, similar to that of FIG. 4A, of the hierarchical workflow steps (here three steps) of the method of the invention applied to the segmentation and labeling of arteries and veins of the liver in a 2D image, according to another embodiment of the method of the invention;
fig. 12 to 20 are gray scale versions of the objects represented in color in fig. 1 to 5 and 8 to 10, respectively.
Detailed Description
Fig. 1 to 11 show, at least in part, a computer-implemented method for segmenting and identifying at least one tubular structure having a 3D tree layout and located in at least one body part of a subject, in a medical image showing a volume of interest of the subject containing said body part, and for providing a labeled 3D image of the structure.
According to the invention, the method essentially comprises the following steps:
providing a set of 2D medical images corresponding to respective mutually different cross-sectional views across the region of interest containing the body part, the planes of the medical images all being perpendicular to a given direction or all intersecting each other at a given straight line,
Segmenting visible segments of the relevant body part present in each of the 2D medical images, and creating a corresponding 2D body part masking image, the visible segments comprising in particular a complete linear contour or an outer boundary of the body part visible in the 2D image under consideration,
preprocessing each 2D medical image by applying a corresponding body part masking image to each 2D medical image, thereby producing a processed image containing only image data of the original 2D image related to the body part,
segmenting the tubular structure in the resulting preprocessed images, possibly by segmenting different kinds of tubular structures in differentiated segmentation processes,
the aforementioned steps are performed with at least one other set of 2D medical images, the at least one other set of 2D medical images corresponding to other respective different cross-sectional views along other mutually parallel or intersecting planes of the same volume of interest region containing the same body part,
the results of the segmentation of the tubular structure of the different sets of preprocessed images are combined to provide labeled 3D images of the tubular structure of one or different kinds.
The basic idea of the invention is to propose a procedure organized around a hierarchical workflow step approach and utilizing a 2-step cascade framework.
Although the invention is described herein primarily with respect to the lungs, one skilled in the art will readily recognize that the invention may be applied to any organ including tubular structures, particularly those of the vascularization (vascular) type. This is especially true for the liver.
A first rough illustration of a first example of one of the core features of the inventive method is given in fig. 3, which shows that the workflow can be broken down into three main steps (a schematic sketch of the corresponding data flow is given after this list):
the first step involves segmenting the right and left lungs directly from the CT scan,
the second step involves extracting bronchi and vessels jointly, as one class, on the lung-masked CT image. This step thus gives a pre-segmentation of the bronchial and vascular trees to be used in the final step. The only processing applied to the lung segmentation is the selection of the largest connected component for each lung,
the third and final step takes the CT scan together with the pre-segmentation of bronchi and vessels as input to a neural network. The network is thus provided with already-detected tubular structures and only has to classify them.
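The chaining of these three steps can be sketched as follows (hypothetical Python; the three callables stand in for the trained networks, and only the data flow between the steps is taken from the description above):

```python
import numpy as np

def cascade_pipeline(ct, segment_lungs, presegment_tubular, classify_tubular):
    """Hypothetical orchestration of the three workflow steps above.

    The three callables are placeholders for trained networks; only the
    chaining of the steps is taken from the description."""
    # Step 1: segment the right and left lungs directly from the CT scan.
    lung_mask = segment_lungs(ct) > 0

    # Step 2: joint bronchus/vessel pre-segmentation on the lung-masked CT.
    preseg = presegment_tubular(ct * lung_mask)

    # Step 3: feed the CT scan together with the pre-segmentation to a
    # network that only has to classify the detected tubular structures.
    return classify_tubular(np.stack([ct, preseg]))
```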
A more accurate description of the foregoing core features of the present invention may be made with reference to fig. 4A and 4B.
In fact, as shown in these figures, the hierarchically organized process advantageously consists of four main steps.
The first step is lung segmentation. The goal of this processing step is to segment the left and right lungs as two classes. Once the lungs are segmented, post-processing is applied to them. The post-processing includes selecting the largest connected component for each lung (which gives the final lung segmentation), and then computing the convex hull of the combined lungs. This last operation creates a lung mask that still contains the trachea and the connections between the pulmonary arteries and veins and the heart. This central part of the lung region is mandatory to obtain a complete segmentation.
The mask is then applied to the original CT scan image to produce a preprocessed image, and slices along the z-axis that do not contain the mask are discarded.
After this step, internal structure segmentation can be performed. Each structure can be segmented independently and can therefore be computed in parallel. In one step (fig. 4A) or in two consecutive steps (fig. 4B), one part is directed to the segmentation of the bronchial tree and the other part to the joint segmentation of the pulmonary arteries and veins.
These segments will be described in detail below.
The lung segmentation consists of three basic steps.
The first basic step involves a pre-processing of the incoming CT scan volume. The preprocessing workflow is depicted in fig. 6.
The input volume values are first clipped between two thresholds to limit the HU value range. These values (vmin and vmax) are set to -1000 and 1000, respectively, to capture structures from bone to the air contained in the lungs. Since the network takes slices of fixed shape, each slice is resized to match a 512 x 512 shape. Finally, a normalization step is applied, consisting of mean centering and setting the variance to one. This normalization was later replaced by a rescaling operator that puts the values between 0 and 1: normalization was discarded because it did not improve convergence during training, whereas for numerical stability the values should always lie between 0 and 1, which explains the replacement by rescaling. Finally, a volume with a fixed slice shape and rescaled values is obtained.
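This preprocessing chain can be sketched as follows (a minimal Python sketch assuming a NumPy volume and the thresholds mentioned above; function and parameter names are illustrative):

```python
import numpy as np
from skimage.transform import resize

def preprocess_lung_ct(volume, vmin=-1000.0, vmax=1000.0, shape=(512, 512)):
    """Clip HU values, resize each axial slice to a fixed shape, and
    rescale the result into [0, 1]; names are illustrative."""
    v = np.clip(volume.astype(np.float32), vmin, vmax)
    # The network takes fixed-size slices, so resize each axial slice.
    v = np.stack([resize(s, shape, order=1, preserve_range=True) for s in v])
    # Rescaling to [0, 1] replaces the earlier mean/variance normalization.
    return (v - vmin) / (vmax - vmin)
```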
The second basic lung segmentation step, which is the core step, is the segmentation step using a neural network. The neural network model is based on a modified 2D U-Net (see O. Ronneberger et al., "U-Net: Convolutional Networks for Biomedical Image Segmentation", Lecture Notes in Computer Science, pp. 234-241, Springer International Publishing). The architecture used differs on two points. The number of filters is divided by two in order to reduce network complexity, as the full complexity is not required to segment the lungs. In addition, to speed up convergence, a batch normalization layer is placed after each convolution. The final 1 x 1 convolutional layer outputs 3 classes (background, right lung and left lung). The loss used during training is a weighted cross-entropy loss. This loss is sufficient for the lungs, as the lungs represent approximately 15% of the total volume in the training set. The optimizer used is Adam, with a learning rate of 3e-4. The network is trained for 50 epochs and the model with the best validation loss is kept.
Finally, the last block or third basic step of lung segmentation consists of two post-processing branches. The first branch, which generates the final lung segmentation, involves selecting the largest connected component for each lung, followed by morphological hole filling; the first operation avoids including outlier detections, while the second fills the small holes that may occur inside the lungs. The result of the second branch is fed to the bronchial tree and pulmonary artery and vein segmentations (see fig. 4A or fig. 4B). Its first operation (largest connected component selection) is common with the branch generating the final segmentation, but the hole filling is replaced by the computation of the convex hull of the lungs (which are therefore regrouped into one class). This final mask is then used to mask the incoming CT scan so as to focus the network's attention only on the interior of the lung region. The use of the convex hull preserves the area between the lungs, which is mandatory to segment the trachea and the junctions between the heart and the pulmonary arteries and veins.
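These two post-processing branches can be sketched as follows (an illustrative Python sketch using SciPy and scikit-image; it assumes binary right/left lung masks and that convex_hull_image accepts 3D arrays, as in recent scikit-image versions):

```python
import numpy as np
from scipy import ndimage
from skimage.morphology import convex_hull_image

def largest_component(mask):
    """Largest connected component of a binary mask."""
    labeled, n = ndimage.label(mask)
    if n == 0:
        return mask.astype(bool)
    sizes = ndimage.sum(mask, labeled, range(1, n + 1))
    return labeled == (int(np.argmax(sizes)) + 1)

def lung_postprocess(right, left):
    """The two post-processing branches described above."""
    right, left = largest_component(right), largest_component(left)
    # Branch 1: final lung segmentation (per-lung hole filling).
    final = (ndimage.binary_fill_holes(right).astype(np.uint8)
             + 2 * ndimage.binary_fill_holes(left).astype(np.uint8))
    # Branch 2: mask for the tubular stage. Both lungs are regrouped into
    # one class and replaced by their convex hull, preserving the trachea
    # and the heart connections between the lungs. (convex_hull_image
    # handles 3D arrays in recent scikit-image versions.)
    hull = convex_hull_image(right | left)
    return final, hull
```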
Bronchial tree segmentation is more complex than lung segmentation because it involves the use of multiple views, but this part can again be divided into three steps.
The first step includes preprocessing of the input volume. The first three preprocessing operations presented are common to every view, and only the last two are view-specific. The preprocessing workflow is shown in fig. 7.
The first step of the preprocessing chain corresponds to masking the input (original 2D image) with the result of the lung segmentation. Again, this operation limits the information to the lungs and their central region only. Isotropic resampling is then applied; this gives every volume equal voxel spacing in all directions. This operation is important because, in the case of the bronchi, not only axial views but also coronal and sagittal views are fed to the neural network. The last operation common to every view is to rescale the volume values between 0 and 1. From this last operation, an axially oriented preprocessed volume is obtained. Regarding the orientation block, if the network used is trained on axial slices, the orientation operation acts as an identity. For sagittal and coronal training, however, the volume is rotated so that the sagittal or coronal planes are aligned with the axial planes, respectively. The final operation suppresses the slices/views that are not part of the lung. It is performed because slices outside the lungs carry no useful information, and because the segmentation of the bronchial tree can always be restricted to the lungs. Furthermore, this cropping operation reduces computation time, both for the prediction itself and for the post-processing at prediction time.
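The view-specific part of this chain can be sketched as follows (illustrative Python; the (z, y, x) axis convention and the transposition choices are assumptions):

```python
import numpy as np
from scipy.ndimage import zoom

def resample_isotropic(volume, spacing, target=1.0):
    """Resample a (z, y, x) volume to isotropic voxel spacing; `spacing`
    is the original voxel size in mm (an assumed convention)."""
    factors = tuple(s / target for s in spacing)
    return zoom(volume, factors, order=1)

def orient(volume, view):
    """Rotate the volume so coronal/sagittal planes take the place of the
    axial ones; for 'axial' the operation is an identity."""
    if view == "axial":
        return volume
    if view == "coronal":
        return np.transpose(volume, (1, 0, 2))
    if view == "sagittal":
        return np.transpose(volume, (2, 0, 1))
    raise ValueError(view)
```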
The neural network component for bronchial segmentation is slightly different compared to the system/method component for pulmonary segmentation.
First, the number of channels is now equal to the number used in the previously cited publication (O. Ronneberger et al., "U-Net: Convolutional Networks for Biomedical Image Segmentation", Lecture Notes in Computer Science, pp. 234-241, Springer International Publishing) and, most importantly, the input is now 2.5D. Fig. 8 shows, in an exemplary manner, an architecture for bronchial tree segmentation when implementing the method according to the invention.
In practice, each input consists of the considered slice (preprocessed 2D image) to be segmented and its neighboring slices (on both sides of the considered slice). Adding such adjacent slices helps the network learn the local 3D context around the considered cutting plane through the relevant organ or body part (in this case the lung). Such local 3D context is important when segmenting tubular structures and improves the quality of the results.
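The construction of such a 2.5D input can be sketched as follows (illustrative Python; the value of k is an assumption, the text only specifying adjacent slices on each side):

```python
import numpy as np

def stack_2_5d(volume, index, k=1):
    """Build the 2.5D input for one slice: the slice to segment plus its
    k neighbors on each side, stacked as (2k+1, H, W) channels. Border
    indices are clamped; the value of k is an assumption."""
    z = volume.shape[0]
    neighbors = [min(max(index + d, 0), z - 1) for d in range(-k, k + 1)]
    return np.stack([volume[i] for i in neighbors], axis=0)
```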
According to the experience of the inventors, the training process should be adapted in this case. The first training phase involves training the neural network fed by axial slices; this network is trained from scratch for 50 epochs. Second, the networks fed with sagittal or coronal slices are trained by transfer learning: the final weights of the axial network are reused as the starting point for the other view networks. This transfer allows faster convergence and lower losses. Furthermore, from a sagittal or coronal perspective, the tubular structures are not always connected, and the use of transfer learning helps mitigate this drawback. The sagittal and coronal networks are trained for 20 epochs. The Dice loss is used for training, due to the higher imbalance between background and bronchial tree compared to the lungs. Adam is also used to train each view, with a learning rate of 3e-4.
At this point, three networks trained on axial, sagittal and coronal views are available. To obtain the final volume, the three results are preferably combined or aggregated using a fusion procedure according to the method of the invention.
A possible fusion framework is shown by way of example in fig. 10.
As merging or aggregation methods, the inventors mainly tested the following fusion procedures: fusion by union, fusion by majority voting, fusion by logarithmic mean (log averaging), fusion by "Simultaneous Truth and Performance Level Estimation" (STAPLE) (S. K. Warfield et al., "Simultaneous truth and performance level estimation (STAPLE): an algorithm for the validation of image segmentation", IEEE Transactions on Medical Imaging, vol. 23, pp. 903-921), and fusion by neural network.
First, fusion by union keeps every voxel marked as part of the bronchial tree by at least one of the three networks. This fusion allows more parts of the bronchial tree to be recovered than each view can find independently.
Quantitatively, it has been noted that this property is reflected in a high recall compared to all other fusion methods and to the individual views. This characteristic indicates that each network finds useful information that may be missed by the other networks. The main weakness of this fusion is that its precision is lower than that of the other approaches. This is due to the fact that each view captures false positives (which may, for example, be artifacts) and the fusion keeps everything; the bad detections encountered by each view are thus inherited through union fusion.
In summary, it can be noted that union fusion allows recovery of all elements detected by each neural network, and will therefore give the most detailed results. However, since all detections are kept, the false positives found by each network also accumulate, resulting in a loss of precision.
Fusion by majority voting works as follows: only voxels labeled as part of the bronchial tree by at least two of the three networks are kept. This approach is the opposite of fusion by union in terms of precision and recall. In practice, majority voting achieves high precision by filtering out artifacts detected by only one view. On the other hand, if one view is more effective on a portion of the bronchial tree, true positives detected by that view alone are also filtered out. This property results in a lower recall than fusion by union; however, majority voting achieves a recall comparable to what each individual view can recover.
In summary, majority-voting merging has antagonistic properties compared to union merging. This type of fusion is selective and filters out false positives found by only a single neural network; on the other hand, true positives found by only a single network are lost. Finally, majority-voting merging has better precision than union merging, but it also finds fewer elements globally.
Fusion by logarithmic mean operates at the level of the (log-)probabilities output by the networks. To obtain the final segmentation, a threshold must then be applied to the averaged values. The primary choice is typically 0.5, but the threshold may be optimized after averaging. Moreover, the choice of threshold allows a compromise between fusion by union and fusion by majority voting: if the threshold is set to a low value, the merging tends toward union by selecting more voxels; choosing a high value, on the other hand, is more conservative and tends toward majority voting. This feature is interesting because it allows the final result to be adjusted according to the desired output properties.
In summary, logarithmic-mean merging sits between the two methods above. Once the values are averaged, a threshold must be applied to obtain the final segmentation, and the setting of this threshold determines whether precision or the number of recovered true positives is favored.
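The three simple fusion rules discussed above can be sketched as follows (illustrative Python operating on per-voxel foreground probabilities from the three view networks; whether the networks output probabilities or logits is an assumption):

```python
import numpy as np

def fuse(p_axial, p_sagittal, p_coronal, method="majority", threshold=0.5):
    """Union, majority-voting and log-mean fusion of per-voxel foreground
    probabilities from the three view networks (assumed inputs)."""
    p = np.stack([p_axial, p_sagittal, p_coronal])
    binary = p > 0.5
    if method == "union":       # keep voxels found by at least one view
        return binary.any(axis=0)
    if method == "majority":    # keep voxels found by at least two views
        return binary.sum(axis=0) >= 2
    if method == "log_mean":    # average in log space, then threshold
        mean_log = np.log(p + 1e-7).mean(axis=0)
        return np.exp(mean_log) > threshold   # threshold tunable (0.1-0.5)
    raise ValueError(method)
```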
The purpose of STAPLE fusion is to estimate a ground truth based on annotations of the same case made by multiple experts or automated methods. It is based on an expectation-maximization algorithm and optimizes the sensitivity (recall) and specificity of each input segmentation relative to the estimated final segmentation. With respect to the present invention, this approach achieves high precision at the expense of a loss in the recall metric compared to the other approaches. However, this method is not well suited to a small number of raters or automated methods.
In summary, fusion by STAPLE (simultaneous true and performance level estimation) will aim to reduce the number of false positives and false negatives. This type of merging requires more input than other types of merging and therefore will be more efficient when merging, for example, multiple non-orthogonal planes.
Finally, neural network fusion will be able to take into account the specificity of the problem under consideration, and will allow the fusion to be optimized with respect to the examples provided. In addition, this type of fusion may allow for mitigating a certain number of artifacts that may be introduced during transformation of the input image prior to prediction.
Orthogonal view fusion allows improved performance, independent of the fusion method used, since each view will bring meaningful information compared to the other views. Furthermore, fusion improves the most difficult case by reducing global fragmentation.
The last operation in the bronchial tree segmentation is post-processing. The inventors studied various strategies, and only two simple operations appear to help clean the final segmentation. The first post-processing is the selection of the largest connected component; it cleans the final result and removes outliers, but in some cases some branches of the bronchial tree may be disconnected and filtered out by this operation. To be less aggressive when filtering out connected components, the second method filters out only connected components having fewer than 100 voxels, preferably fewer than 50 voxels.
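The second, less aggressive post-processing can be sketched as follows (illustrative Python using SciPy; the default threshold follows the 50-voxel value preferred above):

```python
import numpy as np
from scipy import ndimage

def filter_small_components(mask, min_voxels=50):
    """Drop connected components smaller than `min_voxels` (100, or
    preferably 50, in the text above)."""
    labeled, n = ndimage.label(mask)
    if n == 0:
        return mask.astype(bool)
    sizes = ndimage.sum(mask, labeled, range(1, n + 1))
    keep = np.flatnonzero(sizes >= min_voxels) + 1
    return np.isin(labeled, keep)
```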
The pulmonary artery and vein segmentation system/method component used by the inventors is very similar to the previously described bronchial component and borrows most of its features. Only the fusion and post-processing steps differ from the methods previously proposed herein with respect to bronchial segmentation.
In fact, only majority-voting fusion is considered for the pulmonary artery and vein segmentation, as this type of fusion converts naturally from binary to multi-class segmentation. Logarithmic-mean fusion could also be used in this case, but the threshold would then have to be determined empirically.
In addition, artery and vein segmentations are more fragmented than bronchial tree segmentations, so keeping only the largest connected component could suppress a large portion of the vascular tree. Therefore, only the second method, filtering out connected components having fewer than 100 voxels (preferably fewer than 50 voxels), should be applied directly, so as not to lose part of the tree.
Comparative tests by the inventors using several known metrics (Dice score, precision, recall, AsSD, Hausdorff distance, skeleton-embedding precision and recall, and the solid sphere metric) have shown that, when using the same training cases as other 2D and 3D methods of the prior art, the method according to the invention is superior or at least equal to the prior art methods while using fewer resources and/or being faster.
A first practical example of the method according to the invention, which is applied by the inventors to the segmentation of a tubular structure of the lung, will be described in more detail below with reference to fig. 8, 9 and 10.
This first example comprises a 2-step cascade framework (abbreviated LuBrAV) for lung, bronchus, and pulmonary artery and vein segmentation. The method applied can be summarized as follows: first, right and left lung segmentation is performed using a slightly modified 2D U-Net architecture. The incoming CT scan is then masked with the convex hull of the lungs and provided to a three-path structure made up of 2.5D fully convolutional networks along axial, coronal and sagittal slices to extract the tubular components.
The proposed method is divided into two steps: lung extraction followed by tubular structure segmentation.
Fig. 9 shows a flow chart of the proposed method, i.e. a first part dedicated to lung segmentation and a second part of the tubular structure segmented using masked input.
Let X be a chest CT volume of size nz x nx x ny. The first step aims at segmenting the right and left lungs. The idea is to restrict the volume to the lungs only, as the lungs represent only about 15% of the total volume; a large part of the volume therefore contains information irrelevant to the segmentation of the tubular structures. The estimation of the lung masks may be defined as:

(ŷ_RL, ŷ_LL) = f_L(X)    (1)

where ŷ_RL and ŷ_LL denote the right-lung and left-lung prediction masks, respectively, and f_L is the operator that segments and classifies the two lungs. To restrict the volume to the region of interest, the masked volume is defined as follows:

X_L = Hull(ŷ_RL ∪ ŷ_LL) ⊙ X    (2)

where Hull(·) is the convex-hull operator and ⊙ the element-wise product. X_L therefore contains non-constant values only inside and between the lungs. This operation restricts X to the lungs and their central portion, which contains the trachea and the cardiac connections; these are mandatory to correctly understand the vascularization of both lungs. After the amount of irrelevant information is reduced, two operators f_B and f_AV are applied to perform bronchial and artery-vein segmentation:

ŷ_B = f_B(X_L)    (3)
(ŷ_A, ŷ_V) = f_AV(X_L)    (4)

where ŷ_B, ŷ_A and ŷ_V denote the bronchial, arterial and venous tree prediction masks, respectively.
The lung component consists of two parts: left and right lung extraction, and CT scan masking using the convex hull of the extracted lungs. Lung extraction is performed using a 2D U-Net network. The architecture is based on [9], with an additional batch normalization layer after each convolution and up-sampling block [15]. This modification improves convergence. During training, the weighted cross-entropy loss is used as the loss function to mitigate class imbalance. The largest connected component of each class is kept to filter possible artifacts. A convex hull operation is then applied to the merged left and right lungs. Finally, the result of this operation is used to mask the incoming CT and limit the information to the lung region only (equation (2)).
For the tubular structure component, a method based on a combination of 2.5D U-Nets applied to axial, sagittal and coronal views is presented. The 2.5D architecture used is shown in fig. 8 (C denotes the number of classes).
In addition to the current slice to be segmented, the 2.5D U-Net used also considers the k previous and k subsequent neighboring slices. Adjacent slices allow the collection of 3D local information that helps to segment the tubular structures. The number of final output channels C depends on the task under consideration: two for bronchial segmentation, or three for artery and vein segmentation. This architecture is then used along multiple views (see fig. 10).
As shown in fig. 10, three parallel 2.5D U-nets are used to learn the tubular structure representation along the orthogonal view and are combined into a single output.
Using orthogonal views allows further enrichment of the 3D context information. In practice, each view may capture different relevant information and may find tubular structures missed by the other two views. The segmentations derived from each view are then combined to obtain the final output. For example, for bronchi, the output of this part is defined as:
f_B(X_L) = F[f_B^a(X_L), f_B^s(X_L), f_B^c(X_L)]    (5)
where F is the fusion operator and f_B^a, f_B^s and f_B^c are the segmentation operators along the axial, sagittal and coronal views, respectively, each using a 2.5D U-Net. Depending on the number of classes to be segmented, union, majority voting (MV) and logarithmic averaging (LA) can be considered as fusion operators. In the case of tubular structures, to mitigate the imbalance between the tubular classes and the background, the generalized Dice loss [16] is used as the training loss for each view. Because the lung tubular structures represent a tiny fraction of the total volume compared to the lungs, the Dice loss is more appropriate than the weighted cross-entropy.
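The generalized Dice loss [16] mentioned above can be sketched as follows (an illustrative PyTorch sketch; the tensor shapes and the epsilon value are assumptions):

```python
import torch

def generalized_dice_loss(probs, target_onehot, eps=1e-7):
    """Generalized Dice loss: classes are weighted by the inverse squared
    volume of their reference region, which mitigates the strong
    tubular-class/background imbalance. Shapes assumed (B, C, H, W)."""
    dims = (0, 2, 3)                                  # batch + spatial dims
    w = 1.0 / (target_onehot.sum(dims) ** 2 + eps)    # per-class weights
    intersection = (probs * target_onehot).sum(dims)
    denominator = (probs + target_onehot).sum(dims)
    return 1.0 - 2.0 * (w * intersection).sum() / ((w * denominator).sum() + eps)
```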
To evaluate the method of the present invention, the inventors collected 193 chest CT scans from several hospitals. These data were thus obtained on scanners from different manufacturers and according to various protocols, reflecting the wide variability of daily routine clinical data. The slice spacing of these CT volumes varies from 0.625 to 2.0 mm. No selection was made among the collected CT scans, meaning that the dataset contains diseased lungs (tumors, funnel chest, ..., post-operative lungs). 185 of the images were fully annotated by six expert radiologists delineating the five previously mentioned classes. All six experts annotated the same eight scans, to compare the performance of the method against inter-rater variability. The 185-case dataset was divided into three sets containing 118, 30 and 37 cases, corresponding to the training, validation and test sets, respectively.
To evaluate segmentation quality, several complementary metrics are considered, namely the Dice score, the average symmetric surface distance (AsSD) and the Hausdorff distance. The skeleton-embedding method proposed in [5] is also considered: it allows calculation of the precision and recall of the centerline of the tubular structure. Centerline assessment avoids the influence of vessel diameter and thus evaluates thin and thick vessels equally. Finally, the proposed method is evaluated on the eight cases annotated by every expert. For comparison with multiple experts, the intersection (resp. union) of all expert segmentations is used to compute the recall (resp. precision) by skeleton embedding. Computing the recall against the intersection of all expert segmentations evaluates whether the method can at least detect the structures agreed upon by all experts; the precision is computed against the union of expert segmentations to avoid counting as false positives detections that merely reflect inter-rater variability.
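One plausible reading of this skeleton-embedding metric can be sketched as follows (illustrative Python; the dilation tolerance is an assumption, and recent scikit-image versions handle 3D in skeletonize, older ones exposing skeletonize_3d):

```python
import numpy as np
from scipy.ndimage import binary_dilation
from skimage.morphology import skeletonize  # handles 3D in recent versions

def skeleton_precision_recall(pred, gt, tolerance=1):
    """Precision: fraction of the prediction's centerline inside the
    (dilated) reference; recall: fraction of the reference centerline
    recovered by the (dilated) prediction."""
    pred_skel = skeletonize(pred.astype(bool))
    gt_skel = skeletonize(gt.astype(bool))
    precision = ((pred_skel & binary_dilation(gt, iterations=tolerance)).sum()
                 / max(pred_skel.sum(), 1))
    recall = ((gt_skel & binary_dilation(pred, iterations=tolerance)).sum()
              / max(gt_skel.sum(), 1))
    return precision, recall
```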
Two experiments were performed. First, the proposed method was evaluated with various fusion strategies and compared with standard 2D U-Net and 3D U-Net baselines (table below). The 2D U-Net takes a 2D axial slice as input and outputs the five classes considered. The 3D U-Net is equivalent to the one set forth in [5].
In the table above, the scores presented are mean ± standard deviation over the test set (distances given in mm). The metrics are: Dice score, average symmetric surface distance (AsSD), Hausdorff distance (Haus), skeleton-embedding precision (Prec) and skeleton-embedding recall (Reca). The logarithmic average is thresholded, the value giving the best Dice score on the validation set being found by a grid search between 0.1 and 0.5.
The lung segmentation is not evaluated in the above table, as the presented framework and a multi-class 2D U-Net achieve similarly high Dice scores (> 0.98), low AsSDs (< 1 mm) and Hausdorff distances below 20 mm. The fusion methods give the final segmentation different properties: fusion by union gives high recall at the expense of lower precision; majority-voting fusion provides high precision and lower recall; and the choice of the threshold for the logarithmic average provides a compromise between recall and precision. For arteries and veins, only majority-voting fusion was evaluated, as this fusion transfers naturally to multi-class problems. The proposed method largely outperforms the 2D and 3D baselines, especially on arteries and veins. The baseline approaches lack precision and recall compared to the proposed framework. This difference can be explained by the reduction of X to X_L and by the use of multiple views. In addition, the 2D U-Net typically produces inconsistent connections between the heart and the lung entries; the proposed method remains consistent there thanks to the multiple views and the reduced space to be segmented. Furthermore, the use of majority-voting fusion reduces class confusion between pulmonary arteries and veins.
Finally, the performance of the method according to the invention was also compared against the union (precision) and the intersection (recall) of the six expert segmentations (tables below):
(a) Bronchi
(b) Pulmonary arteries and veins
The results are given as mean ± standard deviation, in percent, for each metric.
The performance obtained on the multi-rater dataset is better than that obtained on the test set annotated by only one expert per image. This difference is due to the fact that different experts find different tubular structures at the periphery of the lungs. This second experiment was therefore performed to mitigate inter-rater variability during testing and to give a more accurate estimate of precision and recall. In every case, fusion helps improve precision and recall in terms of both mean and standard deviation. These experiments demonstrate the ability of the present method and system to segment tubular structures of the lung with high accuracy. Fig. 4 shows a 3D reconstruction obtained from the reference axial network and from majority-voting fusion; majority-voting fusion of arteries and veins avoids some confusion between arteries and veins (see green circles).
Briefly, a 2.5D cascade method for segmenting tracheal and vascular tubular lung structures has been proposed. The two-step pipeline helps the neural networks focus on the tubular structure segmentation, and each pipeline module exploits the complementary nature of the orthogonal views through fusion. Recall is consistently better when fusing multiple views than when using the axial view alone, which shows that networks trained on different views capture relevant information that may be missed by a single view. Furthermore, the use of fusion avoids artery/vein confusion in some cases. Finally, the proposed framework can also partially handle cases where disease alters the structure of the tubular structures (vessels passing through lung nodules, funnel chest, ...).
As a second practical example, the application of the method and system of the present invention in segmenting the liver and vessels interacting with the liver will now be described.
The main classes considered here are: liver, hepatic veins, hepatic artery and portal vein. As for the previous example involving the lung, a 2-step approach is considered: first, the liver is segmented by a first part, and the incoming computed tomography (CT scan) is then limited to the liver convex hull region only; a second part then segments the blood vessels. The required adaptations of each part are given herein. The database used for abdominal structure segmentation consists of 148 cases (119 used for training and 29 for testing).
First, the inventors tested liver segmentation using the same framework presented above for the hierarchical method. The main variation between the two approaches comes from the clipping operation, in which vmin and vmax are adjusted to the abdominal region and set to -100 and 400, respectively.
Because of the common boundaries with other organs sharing overlapping Hounsfield unit (HU) ranges, segmenting the liver is generally more complex than segmenting the lungs, and two measures were therefore adopted to improve the liver segmentation. First, not only the axial plane but also the coronal and sagittal planes are considered for segmentation. This modification implies a different preprocessing pipeline, corresponding to the one shown in fig. 7 but in which the step "apply lung mask" is replaced by the "clipping" step.
On the other hand, in order to mitigate boundary leakage into adjacent organs, the segmentation of the following organs is considered in addition to the liver segmentation: colon, gall bladder, right kidney, left kidney, pancreas, spleen, stomach, heart, right lung and left lung. Since these organs may be in contact with the liver hull, leakage into them is thereby reduced.
The training phase is carried out in the same way as previously presented for the 2.5D multi-view layering method: the axial network is trained from scratch, while the sagittal and coronal networks start from the final weights of the axial network.
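In PyTorch-like terms, this warm start can be sketched as follows; the tiny stand-in network and the checkpoint name are placeholders, not taken from the disclosure:

```python
import torch
import torch.nn as nn

def build_net():
    # Stand-in for the 2.5D segmentation network; the real architecture is not shown here.
    return nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(), nn.Conv2d(8, 2, 1))

axial_net = build_net()
# ... axial training loop omitted ...
torch.save(axial_net.state_dict(), "axial_final.pt")  # hypothetical checkpoint name

sagittal_net = build_net()
sagittal_net.load_state_dict(torch.load("axial_final.pt"))  # warm start from axial weights
coronal_net = build_net()
coronal_net.load_state_dict(torch.load("axial_final.pt"))
```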
The fusion operation used is a majority vote, since multiple categories are merged. The following table contains the quantitative results obtained for each organ. Since the heart and lungs were not annotated by the experts, they are not retained for evaluation.
The scores presented are the mean ± standard deviation over the test set (distances given in mm and Dice scores in percent).
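For reference, the Dice score used here is the standard overlap measure between a prediction and the expert reference; a minimal sketch for binary masks:

```python
import numpy as np

def dice_percent(pred, gt):
    """Dice overlap between two binary masks, expressed in percent."""
    inter = np.logical_and(pred, gt).sum()
    denom = pred.sum() + gt.sum()
    return 100.0 if denom == 0 else 200.0 * inter / denom
```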
The agreement between this method and an expert can be seen to be weaker than for the lungs. However, the liver remains in high overlap with the reference. Furthermore, the standard deviation of the scores is high for each structure, reflecting the variability of the expert segmentations. For the organ segmentation, a simple post-processing is applied to each class: selection of the largest connected component, which allows outliers to be filtered out.
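The largest-connected-component post-processing can be sketched as follows (26-connectivity in 3D is our assumption; the text does not specify the connectivity used):

```python
import numpy as np
from scipy import ndimage

def largest_connected_component(mask):
    """Keep only the largest connected component of a binary mask."""
    labels, n = ndimage.label(mask, structure=np.ones((3, 3, 3)))  # 26-connectivity
    if n == 0:
        return mask
    sizes = ndimage.sum(mask, labels, index=range(1, n + 1))  # voxel count per component
    return labels == (int(np.argmax(sizes)) + 1)
```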
The liver vessels comprise the following structures: portal vein, hepatic vein and hepatic artery. The part that segments the hepatic tubular structures borrows almost everything from the pulmonary artery and vein part. The main difference is that the values vmin and vmax are set to -100 and 400, respectively.
As for the bronchial or pulmonary artery and vein parts, the input to the network is masked with the convex hull of the organ of interest, in this case the liver. Furthermore, no post-processing is applied to the resulting segmentation. The quantitative evaluation is listed in the following table:
The scores presented are the mean ± standard deviation over the test set (distances given in mm and Dice scores in percent). The metrics for the hepatic artery, hepatic veins and portal vein are computed only within the liver convex hull envelope.
For each metric and each structure, majority-vote fusion yields a better average score than the axial-only view, with nearly the same standard deviation. Again, this difference demonstrates the relevance of a fusion approach for this kind of segmentation. To expand the analysis further, the vessels were divided into two parts: venous tubular structures and arterial tubular structures.
This second example shows that, when applied to the liver and to the vessels interacting with it, the method according to the invention, with slight modifications relative to the first example, is able to provide segmentations of the following structures: colon, gall bladder, right kidney, left kidney, pancreas, spleen, stomach, heart, right and left lungs, portal vein, hepatic vein and hepatic artery.
Regarding the identification of veins and arteries, the number of false positives remains low compared to true positives.
The following references were previously cited in this specification (by the corresponding numerals in brackets), and their contents and teachings are incorporated herein as examples of prior-art knowledge:
[1] P. Lo, B. van Ginneken, J. M. Reinhardt, Y. Tarunashree, P. A. de Jong, B. Irving, C. Fetita, M. Ortner, R. Pinho, J. Sijbers, M. Feuerstein, A. Fabijanska, C. Bauer, R. Beichel, C. S. Mendoza, R. Wiemker, J. Lee, A. P. Reeves, S. Born, O. Weinheimer, E. M. van Rikxoort, J. Tschirren, K. Mori, B. Odry, D. P. Naidich, I. J. Hartmann, E. A. Hoffman, M. Prokop, J. H. Pedersen and M. de Bruijne, "Extraction of Airways From CT (EXACT'09)," IEEE Transactions on Medical Imaging, vol. 31, no. 11, pp. 2093–2107, Nov. 2012.
[2] D. Jin, Z. Xu, A. P. Harrison, K. George, and D. J. Mollura, "3D Convolutional Neural Networks with Graph Refinement for Airway Segmentation Using Incomplete Data Labels," in Machine Learning in Medical Imaging, vol. 10541, pp. 141–149. Springer International Publishing, Cham, 2017.
[3] Y. Qin, M. Chen, H. Zheng, Y. Gu, M. Shen, J. Yang, X. Huang, Y. Zhu, and G. Yang, "AirwayNet: A Voxel-Connectivity Aware Approach for Accurate Airway Segmentation Using Convolutional Neural Networks," in Medical Image Computing and Computer Assisted Intervention – MICCAI 2019, vol. 11769, pp. 212–220. Springer International Publishing, Cham, 2019.
[4] A. G. Juarez, H. A. W. M. Tiddens, and M. de Bruijne, "Automatic Airway Segmentation in chest CT using Convolutional Neural Networks," arXiv:1808.04576 [cs], Aug. 2018.
[5] T. Zhao, Z. Yin, J. Wang, D. Gao, Y. Chen, and Y. Mao, "Bronchus Segmentation and Classification by Neural Networks and Linear Programming," in Medical Image Computing and Computer Assisted Intervention – MICCAI 2019, vol. 11769, pp. 230–239. Springer International Publishing, Cham, 2019.
[6] J. Yun, J. Park, D. Yu, J. Yi, M. Lee, H. J. Park, J. Lee, J. B. Seo, and N. Kim, "Improvement of fully automated airway segmentation on volumetric computed tomographic images using a 2.5 dimensional convolutional neural net," Medical Image Analysis, vol. 51, pp. 13–20, Jan. 2019.
[7] Q. Meng, H. R. Roth, T. Kitasaka, M. Oda, J. Ueno, and K. Mori, "Tracking and Segmentation of the Airways in Chest CT Using a Fully Convolutional Network," in Medical Image Computing and Computer-Assisted Intervention – MICCAI 2017, vol. 10434, pp. 198–207. Springer International Publishing, Cham, 2017.
[8] C. Wang, Y. Hayashi, M. Oda, H. Itoh, T. Kitasaka, A. F. Frangi, and K. Mori, "Tubular Structure Segmentation Using Spatial Fully Connected Network with Radial Distance Loss for 3D Medical Images," in Medical Image Computing and Computer Assisted Intervention – MICCAI 2019, vol. 11769, pp. 348–356. Springer International Publishing, Cham, 2019.
[9] O. Ronneberger, P. Fischer, and T. Brox, "U-Net: Convolutional Networks for Biomedical Image Segmentation," in Lecture Notes in Computer Science, pp. 234–241. Springer International Publishing, 2015.
[10] Y. Wang, Y. Zhou, W. Shen, S. Park, E. Fishman, and A. Yuille, "Abdominal multi-organ segmentation with organ-attention networks and statistical fusion," Medical Image Analysis, vol. 55, pp. 88–102, 2019.
[11] M. Perslev, E. B. Dam, A. Pai, and C. Igel, "One Network to Segment Them All: A General, Lightweight System for Accurate 3D Medical Image Segmentation," in Medical Image Computing and Computer Assisted Intervention – MICCAI 2019, vol. 11765, pp. 30–38. Springer International Publishing, Cham, 2019.
[12] H. Cui, X. Liu, and N. Huang, "Pulmonary Vessel Segmentation Based on Orthogonal Fused U-Net++ of Chest CT Images," in Medical Image Computing and Computer Assisted Intervention – MICCAI 2019, vol. 11769, pp. 293–300. Springer, 2019.
[13] P. Nardelli, D. Jimenez-Carretero, D. Bermejo-Pelaez, M. J. Ledesma-Carbayo, F. N. Rahaghi, and R. Estepar, "Deep-learning strategy for pulmonary artery-vein classification of non-contrast CT images," in 2017 IEEE 14th International Symposium on Biomedical Imaging (ISBI 2017), Melbourne, Australia, Apr. 2017, pp. 384–387.
[14] P. Nardelli, D. Jimenez-Carretero, D. Bermejo-Pelaez, G. R. Washko, F. N. Rahaghi, M. J. Ledesma-Carbayo, and R. Estepar, "Pulmonary Artery-Vein Classification in CT Images Using Deep Learning," IEEE Transactions on Medical Imaging, vol. 37, no. 11, pp. 2428–2440, Nov. 2018.
[15] S. Ioffe and C. Szegedy, "Batch normalization: Accelerating deep network training by reducing internal covariate shift," in Proceedings of the 32nd International Conference on Machine Learning – Volume 37, ICML'15, pp. 448–456. JMLR.org, 2015.
[16] C. Sudre, W. Li, T. Vercauteren, S. Ourselin, and J. Cardoso, "Generalised Dice Overlap as a Deep Learning Loss Function for Highly Unbalanced Segmentations," in Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support, pp. 240–248. Springer, 2017.
Of course, the invention is not limited to the embodiments described and represented in the drawings. Modifications remain possible, in particular as regards the composition of the various elements or by substitution of technical equivalents, without thereby departing from the scope of the invention.

Claims (17)

1. A computer-implemented method for segmenting and identifying at least one tubular structure having a 3D layout and located in a body part of an object, in a medical image showing a volume of interest region of the object containing the at least one body part, and for providing a labeled 3D image of the structure,
the method mainly comprises the following steps:
providing a set of 2D medical images corresponding to respective cross-sectional views, different from each other, of the region of interest containing the body part, the planes of the medical images all being perpendicular to a given direction or all intersecting each other at a given straight line,
segmenting the visible segment of the relevant body part present in each 2D medical image and creating a corresponding 2D body part mask image, the visible segment comprising in particular the complete linear contour or outer boundary of the body part visible in the 2D image under consideration,
preprocessing each 2D medical image by applying the corresponding body part mask image to it, thereby producing a processed image comprising only the image data related to the body part in the original 2D image,
the tubular structures in the resulting preprocessed image are segmented as much as possible by segmenting different kinds of tubular structures in a differential segmentation process,
performing the aforementioned steps with at least one other set of 2D medical images, the at least one other set corresponding to different respective other cross-sectional views, along other mutually parallel or mutually intersecting planes, of the same volume of interest region containing the same body part,
combining the results of the tubular structure segmentation of the different sets of preprocessed images so as to provide a labeled 3D image of one tubular structure or of different types of tubular structures.
2. The method according to claim 1, wherein the step of segmenting the tubular structures of a considered preprocessed image comprises also taking into account the image data of at least one other preprocessed cross-sectional view of the same set of images that is adjacent to the considered preprocessed image, for example the image data of at least the nearest adjacent cross-sectional view on each side of the considered preprocessed image.
3. The method according to claim 1 or 2, wherein the body part segmentation step is performed by using a 2D neural network, preferably a U-Net neural network, and the tubular structure segmentation step is performed by using a 2.5D neural network.
4. A method according to any one of claims 1 to 3, wherein a dedicated neural network previously trained on expert-labeled data is used for the body part segmentation and preprocessing steps of each set of 2D medical images, and dedicated, trained neural networks are also used for segmenting the tubular structures in each set of preprocessed images, possibly one specific neural network per tubular structure.
5. The method according to claim 4, wherein, in a preliminary preparation phase, the final training parameter values of the neural network intended to process the first set of medical images are used as starting parameter values for training at least one other neural network intended to process another set of medical images, in the preprocessing phase and/or the tubular structure segmentation phase.
6. The method of any of claims 1 to 5, wherein a first set of medical images corresponds to an axial view of the object, and wherein a second set of medical images and a third set of medical images correspond to a sagittal view and a coronal view, respectively.
7. The method according to any one of claims 1 to 6, wherein the step of segmenting the relevant body part in each 2D medical image and creating a corresponding body part mask image comprises: a contour and/or an interior region of the body part is determined and a connection site of the interior tubular structure with at least one other body part or body component that is or is not part of the masking image is located and identified.
8. The method according to any one of claims 1 to 7, wherein, prior to the step of segmenting the relevant body part in each 2D medical image of a given set and creating the corresponding body part mask image, the images are subjected to an initial processing workflow comprising at least a resampling operation and a resizing operation, and possibly further comprising a normalization operation.
9. The method according to any one of claims 1 to 8, wherein, prior to the tubular structure segmentation step, the modified preprocessed images resulting from applying the body part mask images to the original 2D medical images are submitted, during the preprocessing stage, to an isotropic resampling operation and a rescaling operation.
10. The method according to any one of claims 1 to 9, wherein any medical image of any of the sets whose corresponding body part mask image is empty, and any preprocessed image that does not display any part of the body part under consideration, are ignored for further processing.
11. The method of any of claims 1 to 10, wherein the medical image is an image of a chest of a human subject, wherein the relevant body part is a lung, wherein the segmenting step further comprises identifying a left lung and a right lung, and wherein at least some body part masking images further comprise representations of a trachea and connection areas between pulmonary arteries and veins and the heart.
12. The method according to claim 11, wherein two different types of tubular structures, namely a bronchial tree and a pulmonary vascular tree, are segmented in parallel, the arteries and the veins being labeled within the vascular tree in a further step.
13. The method of claim 11, wherein three different types of tubular structures, namely a bronchial tree, a pulmonary artery tree, and a pulmonary vein tree, are segmented in parallel.
14. The method according to any one of claims 1 to 10, wherein the medical image is an abdominal image, wherein the relevant body part is a liver, and wherein the vascular system to be segmented and marked comprises portal vein, hepatic vein and hepatic artery.
15. The method according to any one of claims 1 to 14, wherein the final merging step is performed by means of a fusion operation of a type selected among: fusion by union, fusion by majority vote, fusion by logarithmic average, fusion by neural network, and fusion by simultaneous truth and performance level estimation (STAPLE).
16. An image processing system capable of performing organ and internal tubular structure segmentation and labeling fully automatically, in particular segmentation of the lungs, bronchi and pulmonary arteries and veins, the system relying on cascaded convolutional neural networks, characterized in that it comprises a first part and a second part, the first part being dedicated to organ segmentation, for example right and left lung segmentation, and being based on a slightly modified 2D U-Net architecture, and the second part being based on a three-way 2.5D fully convolutional network along axial, coronal and sagittal slices, the second part being fed with the preprocessed 2D images of the first part and configured to perform tubular structure and element extraction and labeling.
17. The image processing system of claim 16, wherein the image processing system is configured and arranged to perform the method of any one of claims 1 to 15.
CN202180072219.0A 2020-10-22 2021-10-19 Method and system for segmenting and identifying at least one tubular structure in a medical image Pending CN116547710A (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US202063104478P 2020-10-22 2020-10-22
US63/104,478 2020-10-22
PCT/EP2021/078899 WO2022084286A1 (en) 2020-10-22 2021-10-19 Method and system for segmenting and identifying at least one tubular structure in medical images

Publications (1)

Publication Number Publication Date
CN116547710A true CN116547710A (en) 2023-08-04

Family

ID=78332789

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202180072219.0A Pending CN116547710A (en) 2020-10-22 2021-10-19 Method and system for segmenting and identifying at least one tubular structure in a medical image

Country Status (7)

Country Link
US (1) US20230410291A1 (en)
EP (1) EP4233002A1 (en)
JP (1) JP2023548041A (en)
KR (1) KR20230092947A (en)
CN (1) CN116547710A (en)
CA (1) CA3191807A1 (en)
WO (1) WO2022084286A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20240032436A (en) * 2022-09-02 2024-03-12 아주대학교산학협력단 Apparatus and method for measuring distribution of air in the abdomen

Also Published As

Publication number Publication date
KR20230092947A (en) 2023-06-26
CA3191807A1 (en) 2022-04-28
US20230410291A1 (en) 2023-12-21
EP4233002A1 (en) 2023-08-30
WO2022084286A1 (en) 2022-04-28
JP2023548041A (en) 2023-11-15

Similar Documents

Publication Publication Date Title
Lebre et al. Automatic segmentation methods for liver and hepatic vessels from CT and MRI volumes, applied to the Couinaud scheme
Mansoor et al. A generic approach to pathological lung segmentation
Van Rikxoort et al. Automated segmentation of pulmonary structures in thoracic computed tomography scans: a review
Bellver et al. Detection-aided liver lesion segmentation using deep learning
Aykac et al. Segmentation and analysis of the human airway tree from three-dimensional X-ray CT images
Aljabri et al. A review on the use of deep learning for medical images segmentation
Graham et al. Robust 3-D airway tree segmentation for image-guided peripheral bronchoscopy
Pu et al. A computational geometry approach to automated pulmonary fissure segmentation in CT examinations
Zhang et al. Atlas-driven lung lobe segmentation in volumetric X-ray CT images
CN109478327B (en) Method for automatic detection of systemic arteries in Computed Tomography Angiography (CTA) of arbitrary field of view
EP1934940B1 (en) Image processing method for boundary extraction between at least two tissues, the boundary being found as the cost minimum path connecting a start point and an end point using the fast marching algorithm with the cost function decreases the more likely a pixel is not on the boundary
Graham et al. Robust system for human airway-tree segmentation
Abdel-massieh et al. A fully automatic and efficient technique for liver segmentation from abdominal CT images
CN113160120A (en) Liver blood vessel segmentation method and system based on multi-mode fusion and deep learning
CN115908297A (en) Topology knowledge-based blood vessel segmentation modeling method in medical image
WO2023156290A1 (en) Method and system for computer aided diagnosis based on morphological characteristics extracted from 3-dimensional medical images
Bauer et al. Graph-based airway tree reconstruction from chest CT scans: evaluation of different features on five cohorts
Maklad et al. Blood vessel‐based liver segmentation using the portal phase of an abdominal CT dataset
Zhang et al. Pathological airway segmentation with cascaded neural networks for bronchoscopic navigation
CN116547710A (en) Method and system for segmenting and identifying at least one tubular structure in a medical image
Ukil et al. Smoothing lung segmentation surfaces in 3D X-ray CT images using anatomic guidance
Dickson et al. A Dual Channel Multiscale Convolution U-Net Method for Liver Tumor Segmentation from Abdomen CT Images
Shanila et al. Segmentation of liver computed tomography images using dictionary-based snakes
Huang et al. BronchusNet: Region and Structure Prior Embedded Representation Learning for Bronchus Segmentation and Classification
Novikov et al. Automated anatomy-based tracking of systemic arteries in arbitrary field-of-view CTA scans

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination