CN114463248A - Seed relabeling for seed-based segmentation of medical images - Google Patents

Seed relabeling for seed-based segmentation of medical images

Info

Publication number
CN114463248A
Authority
CN
China
Prior art keywords
lesion
input
lesions
contour
medical image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111267702.2A
Other languages
Chinese (zh)
Inventor
Yi-Qing Wang (王轶青)
G. J. J. Palma (G·J·J·帕尔玛)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Maredif Usa
Original Assignee
International Business Machines Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US17/085,042 (US11587236B2)
Priority claimed from US17/084,875 (US11749401B2)
Application filed by International Business Machines Corp
Publication of CN114463248A


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/0002 Inspection of images, e.g. flaw detection
    • G06T 7/0012 Biomedical image inspection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/10 Segmentation; Edge detection
    • G06T 7/11 Region-based segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/60 Analysis of geometric attributes
    • G06T 7/62 Analysis of geometric attributes of area, perimeter, diameter or volume
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10072 Tomographic images
    • G06T 2207/10081 Computed x-ray tomography [CT]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20112 Image segmentation details
    • G06T 2207/20152 Watershed segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/30 Subject of image; Context of image processing
    • G06T 2207/30004 Biomedical image processing
    • G06T 2207/30056 Liver; Hepatic
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/30 Subject of image; Context of image processing
    • G06T 2207/30004 Biomedical image processing
    • G06T 2207/30096 Tumor; Lesion

Abstract

The application relates to seed relabeling for seed-based segmentation of medical images. A mechanism is provided for seed relabeling for seed-based slice-wise lesion segmentation. The mechanism receives a lesion mask for a three-dimensional medical image volume, wherein the lesion mask corresponds to detected lesions in the medical image volume and each detected lesion has a lesion contour. The mechanism generates a distance map for a given two-dimensional slice in the medical image volume based on the lesion mask. The distance map comprises, for each voxel of the given two-dimensional slice, the distance to the lesion contour. The mechanism performs local maximum identification to select a set of local maxima from the distance map, such that each local maximum has a value greater than those of its immediate neighbors. The mechanism performs seed relabeling based on the distance map and the set of local maxima to generate a set of seeds. Each seed represents the center of a different component of the lesion contour. The mechanism performs image segmentation on the lesion mask based on the set of seeds to form a split lesion mask.

Description

Seed relabeling for seed-based segmentation of medical images
Technical Field
The present application relates generally to an improved data processing apparatus and method, and more particularly to a mechanism for seed relabeling for seed-based segmentation of medical images.
Background
Liver lesions are groups of abnormal cells in the liver of a biological entity and may also be referred to as masses or tumors. Non-cancerous, or benign, liver lesions are common and do not spread to other areas of the body. Such benign liver lesions do not usually cause any health problems. However, some liver lesions develop as a result of cancer. Patients with particular medical conditions may be more likely than others to have cancerous liver lesions. These medical conditions include, for example, hepatitis B or C, cirrhosis, iron storage disease (hemochromatosis), obesity, or exposure to toxic chemicals such as arsenic or aflatoxins.
Liver lesions are typically identifiable only through medical imaging tests, such as, for example, ultrasound, Magnetic Resonance Imaging (MRI), Computerized Tomography (CT), or Positron Emission Tomography (PET) scans. Such medical imaging tests must be viewed by a human medical imaging Subject Matter Expert (SME), who must use their own knowledge, expertise, and human abilities to discern patterns in the images and determine whether the medical imaging test shows any lesions. If the human SME identifies a potentially cancerous lesion, the patient's physician may perform a biopsy to determine whether the lesion is cancerous.
Abdominal Contrast Enhanced (CE) CT is the current standard for assessing various abnormalities (e.g., lesions) in the liver. These lesions can be assessed by a human SME as malignant (hepatocellular carcinoma, cholangiocarcinoma, angiosarcoma, metastasis, and other malignant lesions) or benign (hemangioma, focal nodular hyperplasia, adenoma, cyst or lipoma, granuloma, etc.). Manual evaluation of such images by a human SME is important to guide subsequent intervention. Often, to properly assess lesions in CE CT, a multi-phase study is conducted, in which the different phases provide medical imaging at different levels of enhancement of the healthy liver parenchyma for comparison against the lesion enhancement, so that differences can be detected. The human SME can then determine a diagnosis of the lesion based on these differences.
Disclosure of Invention
This summary is provided to introduce a selection of concepts in a simplified form that are further described herein in the detailed description. This summary is not intended to identify key factors or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
In one illustrative embodiment, a method is provided in a data processing system comprising at least one processor and at least one memory, the at least one memory including instructions executable by the at least one processor to implement a trained machine learning computer model for seed relabeling for seed-based slice-wise lesion segmentation. The trained machine learning computer model performs a method comprising: receiving a lesion mask for a three-dimensional medical image volume, wherein the lesion mask corresponds to detected lesions in the medical image volume and each detected lesion has a lesion contour. The method further comprises generating a distance map for a given two-dimensional slice in the medical image volume based on the lesion mask. The distance map includes, for each voxel of the given two-dimensional slice, the distance to the lesion contour. The method also includes performing local maximum identification to select a set of local maxima from the distance map, such that each local maximum has a value greater than those of its immediate neighbors. The method further includes performing seed relabeling based on the distance map and the set of local maxima to generate a set of seeds, wherein each seed in the set of seeds represents the center of a different component of a lesion contour. The method also includes performing image segmentation on the lesion mask based on the set of seeds to form a split lesion mask. This has the benefit of providing a split lesion mask that is not overly fragmented: the method of the illustrative embodiments combines regions of the split lesion mask that are likely to be parts of the same lesion.
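For concreteness, the following is a minimal sketch of how such a seed-based splitting pipeline could be realized for a single two-dimensional slice, using standard distance-transform, peak-detection, and watershed routines from SciPy and scikit-image. The function name, variable names, and parameter choices are illustrative assumptions, not taken from the patent itself.

```python
import numpy as np
from scipy import ndimage
from skimage.feature import peak_local_max
from skimage.segmentation import watershed

def split_lesion_slice(lesion_mask_2d: np.ndarray) -> np.ndarray:
    """Split touching lesions in one binary 2D slice of a lesion mask."""
    # Distance map: for every foreground voxel, its distance to the lesion contour.
    distance = ndimage.distance_transform_edt(lesion_mask_2d)

    # Local maxima of the distance map serve as seed candidates; each maximum
    # has a value greater than those of its immediate neighbors.
    peak_coords = peak_local_max(distance, labels=lesion_mask_2d.astype(int))

    # Seed image: one integer label per local maximum.
    seeds = np.zeros(lesion_mask_2d.shape, dtype=int)
    for label, (r, c) in enumerate(peak_coords, start=1):
        seeds[r, c] = label

    # Watershed on the inverted distance map, restricted to the lesion mask,
    # yields one region per seed: the initial (possibly over-split) mask.
    return watershed(-distance, markers=seeds, mask=lesion_mask_2d)
```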
In an example embodiment, generating the distance map comprises performing Gaussian smoothing on the distance map. This has the benefit of smoothing away distance values that would otherwise produce extraneous (outlier) maxima and, in turn, excessive fragmentation of the split lesion mask. Performing Gaussian smoothing reduces the number of candidate points that can be considered local maxima, potentially reducing the number of distinct regions that must be combined during seed relabeling.
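Continuing the sketch above, the smoothing step would sit between the distance transform and the peak detection; the sigma value here is an assumption, as the patent does not specify one.

```python
from scipy.ndimage import gaussian_filter

# Smooth the distance map so spurious local maxima are suppressed before
# peak detection (sigma chosen for illustration only).
smoothed_distance = gaussian_filter(distance, sigma=2.0)
peak_coords = peak_local_max(smoothed_distance, labels=lesion_mask_2d.astype(int))
```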
In another example embodiment, performing the seed relabeling includes grouping a first local maximum and a second local maximum in response to determining that the first and second local maxima are immediate neighbors. This has the benefit of eliminating immediately neighboring local maxima as candidates for distinct lesions, since two local maxima that are this close together are unlikely to be the centers of different lesions.
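One way such neighbor grouping could be realized, continuing the sketch above (an implementation assumption, not the patent's prescribed method), is connected-component labeling of a boolean peak mask, which merges 8-connected maxima into a single seed label:

```python
# Mark the detected maxima on a boolean mask, then label connected groups:
# maxima that are immediate (8-connected) neighbors share one seed label.
peak_mask = np.zeros(smoothed_distance.shape, dtype=bool)
peak_mask[tuple(peak_coords.T)] = True
grouped_seeds, num_groups = ndimage.label(peak_mask, structure=np.ones((3, 3)))
```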
In yet another example embodiment, performing seed relabeling comprises: determining a circle centered at each local maximum, the circle having a radius equal to the distance map value at that local maximum; calculating a measure of the overlap of a first circle centered at a first local maximum and a second circle centered at a second local maximum; and grouping the first local maximum and the second local maximum if the overlap metric is greater than a predetermined threshold. This embodiment assumes that a lesion is substantially circular, like a bubble, and determines whether two local maxima are the centers of overlapping bubbles. This has the benefit of grouping local maxima together when they are likely to represent the same lesion.
In a further example embodiment, the overlap metric is calculated as:

overlap(S1, S2) = 2|S1 ∩ S2| / (|S1| + |S2|)

where |S1| represents the area of the first circle, |S2| represents the area of the second circle, and |S1 ∩ S2| represents the area of the intersection of the first circle and the second circle. In an alternative example embodiment, the overlap metric is calculated as:

overlap(S1, S2) = |S1 ∩ S2| / |S1 ∪ S2|

where |S1 ∪ S2| represents the area of the union of the first circle and the second circle. These embodiments provide the benefit of assigning a value to the metric for determining whether the local maxima define different lesions or may correspond to the same lesion, and they provide alternative formulas for calculating an overlap metric value that can be compared against a threshold.
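As a sketch, the two ratios above (reconstructed here as a Dice-style and a Jaccard-style overlap from the operand definitions, since the original equation images are not reproduced in the text) could be computed by rasterizing the two circles on a grid:

```python
import numpy as np

def circle_mask(shape, center, radius):
    """Boolean mask of a filled circle on a grid of the given shape."""
    rr, cc = np.ogrid[:shape[0], :shape[1]]
    return (rr - center[0]) ** 2 + (cc - center[1]) ** 2 <= radius ** 2

def overlap_metrics(shape, c1, r1, c2, r2):
    s1, s2 = circle_mask(shape, c1, r1), circle_mask(shape, c2, r2)
    inter = np.logical_and(s1, s2).sum()        # area of the intersection
    union = np.logical_or(s1, s2).sum()         # area of the union
    dice = 2.0 * inter / (s1.sum() + s2.sum())  # uses |S1|, |S2|, intersection
    jaccard = inter / union                     # uses intersection, union
    return dice, jaccard
```

Two maxima would then be grouped when the chosen metric exceeds the predetermined threshold; the threshold value itself is implementation-specific.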
In another example embodiment, performing image segmentation comprises performing a watershed algorithm on the lesion mask, based on the set of local maxima, to form an initial split lesion mask defining a first set of lesions. In yet another example embodiment, performing image segmentation further comprises merging lesions of the first set of lesions based on the seed relabeling results to form a revised split lesion mask. These embodiments provide the benefit of using the well-known watershed algorithm to partition the image into a split lesion mask while also merging regions that are likely parts of the same lesion, thereby avoiding excessive splitting.
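The merging step could then be a simple relabeling of the watershed output, as in this continuation of the sketch (the group_of_label mapping is assumed to come from the seed grouping above):

```python
def merge_regions(split_mask: np.ndarray, group_of_label: dict) -> np.ndarray:
    """Map each watershed label onto the seed group it belongs to, so that
    regions whose seeds were grouped together become a single lesion."""
    merged = np.zeros_like(split_mask)
    for label, group in group_of_label.items():
        merged[split_mask == label] = group
    return merged
```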
In other illustrative embodiments, a computer program product comprising a computer usable or readable medium having a computer readable program is provided. The computer readable program, when executed on a computing device, causes the computing device to perform various ones, and combinations of, the operations outlined above with regard to the method illustrative embodiment.
In yet another illustrative embodiment, a system/apparatus is provided. The system/apparatus may include one or more processors and memory coupled to the one or more processors. The memory may include instructions that, when executed by the one or more processors, cause the one or more processors to perform various ones, and combinations of, the operations outlined above with regard to the method illustrative embodiment.
These and other features and advantages of the present invention will be described in, or will become apparent to those of ordinary skill in the art in view of, the following detailed description of the exemplary embodiments of the present invention.
Drawings
The invention, as well as a preferred mode of use and further objectives and advantages thereof, will best be understood by reference to the following detailed description of an illustrative embodiment when read in conjunction with the accompanying drawings, wherein:
FIG. 1 is an example block diagram of an AI pipeline implementing multiple specially configured and trained ML/DL computer models to perform anatomy recognition and lesion detection in input medical image data in accordance with one illustrative embodiment;
FIG. 2 is an example flowchart overview of example operation of an AI pipeline in accordance with an illustrative embodiment;
FIG. 3A is an exemplary diagram illustrating an exemplary input volume of slices (medical images) of the abdomen of a human patient in accordance with one illustrative embodiment;
FIG. 3B shows another depiction of the input volume of FIG. 3A, in which a segment of slices is represented together with its corresponding axial scores s'_inf and s'_sup;
FIG. 3C is an exemplary diagram of the input volume of FIG. 3A, wherein the volume is axially divided into n fully overlapping sections;
FIGS. 4A-4C are exemplary diagrams of an ML/DL computer model configured and trained to estimate the s'_sup and s'_inf values of segments of an input volume of medical images, in accordance with one illustrative embodiment;
FIG. 5 is a flowchart outlining an example operation of liver detection and predetermined amount of anatomy determination logic of an AI pipeline in accordance with one illustrative embodiment;
FIG. 6 is an exemplary diagram of an ensemble of ML/DL computer models for performing lesion detection in an anatomical structure of interest (e.g., the liver), according to one illustrative embodiment;
fig. 7 is a flowchart outlining an example operation of liver/lesion detection logic in an AI pipeline in accordance with one illustrative embodiment;
FIG. 8 depicts a block diagram of aspects of lesion segmentation in accordance with one illustrative embodiment;
FIG. 9 depicts the results of lesion detection and slice partitioning in accordance with one illustrative embodiment;
FIGS. 10A-10D illustrate seed positioning in accordance with an illustrative embodiment;
FIG. 11A is a block diagram illustrating a mechanism for lesion splitting in accordance with one illustrative embodiment;
FIG. 11B is a block diagram illustrating a mechanism for seed relabeling in accordance with one illustrative embodiment;
FIG. 12 is a flowchart outlining an example operation of lesion splitting in accordance with one illustrative embodiment;
FIGS. 13A-13C illustrate z-direction connection of lesions in accordance with an illustrative embodiment;
FIGS. 14A and 14B illustrate the results of a trained model for z-direction lesion connection in accordance with one illustrative embodiment;
FIG. 15 is a flowchart outlining an example operation of a mechanism for connecting two-dimensional lesions along a z-axis in accordance with one illustrative embodiment;
FIG. 16 illustrates an example of contours of two lesions in the same image in accordance with one illustrative embodiment;
FIG. 17 is a flowchart outlining an exemplary operation of a mechanism for slice contour refinement in accordance with one illustrative embodiment;
FIG. 18A is an example of a ROC curve determined for patient level and lesion level operating points in accordance with one illustrative embodiment;
FIG. 18B is an example flow diagram of an operation for performing false positive removal based on patient level and lesion level operating points in accordance with one illustrative embodiment;
FIG. 18C is an example flowchart of an operation for performing voxel-wise false positive removal based on input volume level and voxel level operation points in accordance with one illustrative embodiment;
FIG. 19 is a flowchart outlining an example operation of false positive removal logic of an AI pipeline in accordance with one illustrative embodiment;
FIG. 20 is an exemplary diagram of a distributed data processing system in which aspects of the illustrative embodiments may be implemented; and
FIG. 21 is an example block diagram of a computing device in which aspects of the illustrative embodiments may be implemented.
Detailed Description
Detection of lesions, or abnormal cell groups, is a largely manual process in modern medicine. Because it is manual, the process is fraught with sources of error stemming from the limits of an individual's ability to detect the portions of digital medical images showing such lesions, especially in view of the growing need for such individuals to assess an increasing number of images in a shorter amount of time. While some automated image analysis mechanisms have been developed, there remains a need to improve such automated image analysis mechanisms to provide more efficient and accurate analysis of medical image data for detecting lesions in an imaged anatomical structure (e.g., the liver or another organ).
The illustrative embodiments are directed specifically to improved computing tools that provide automated, computer-driven artificial intelligence medical image analysis trained through machine learning/deep learning computer processes. These tools detect anatomical structures, detect lesions or other biological structures of interest in or associated with such anatomical structures, perform specialized segmentation of the detected lesions or other biological structures, perform false positive removal based on that segmentation, classify the detected lesions or other biological structures, and provide the results of the lesion/biological structure detection to downstream computing systems to perform additional computer operations. The following description of the illustrative embodiments will assume embodiments specifically trained with respect to liver lesions as the biological structures of interest, although the illustrative embodiments are not limited thereto. Rather, those of ordinary skill in the art will recognize that the machine learning/deep learning based artificial intelligence mechanisms of the illustrative embodiments may be implemented with respect to a wide variety of other types of biological structures/lesions in or associated with other anatomical structures represented in medical imaging data without departing from the spirit and scope of the present invention. Further, while the illustrative embodiments are described in terms of Computed Tomography (CT) medical imaging data, they may be implemented with digital medical imaging data from various types of medical imaging techniques including, but not limited to, Positron Emission Tomography (PET) and other nuclear medicine imaging, ultrasound, Magnetic Resonance Imaging (MRI), elastography, photoacoustic imaging, echocardiography, magnetic particle imaging, functional near-infrared spectroscopy, and various radiographic imaging techniques including fluoroscopy, and the like.
In general, the illustrative embodiments provide an improved Artificial Intelligence (AI) computer pipeline that includes a plurality of specially configured and trained AI computer tools (e.g., neural networks, cognitive computing systems, or other AI mechanisms trained on a limited data set to perform specified tasks). The configured and trained AI computer tools are each specifically configured/trained to perform a particular type of artificial intelligence processing on an input medical image volume, represented as one or more sets of data and/or metadata defining medical images captured by a medical imaging technique. Typically, these AI tools employ Machine Learning (ML)/Deep Learning (DL) computer models (or simply ML models) that, while emulating the human mental process with respect to the results generated, use different computer processes, specific to the computer tools and in particular to the ML/DL computer models, that learn patterns and relationships in data and associate them with specific results (e.g., image classifications or labels, data values, medical treatment recommendations, etc.). An ML/DL computer model is essentially a function of elements including the machine learning algorithm, the configuration settings of the machine learning algorithm, the features of the input data recognized by the ML/DL computer model, and the labels (or outputs) generated by the ML/DL computer model. The function of these elements is specifically tuned through a machine learning process to generate a specific instance of the ML/DL computer model. Different ML models can be specially configured and trained to perform different AI functions with respect to the same or different input data.
Since the Artificial Intelligence (AI) pipeline implements multiple ML/DL computer models, it should be understood that these ML/DL computer models are trained by ML/DL processes for specific purposes. Thus, as an overview of the ML/DL computer model training process, it should be understood that machine learning is concerned with the design and development of techniques that take empirical data (such as medical image data) as input and recognize complex patterns in the input data. One common pattern in machine learning techniques is the use of an underlying computer model M whose parameters are optimized to minimize a cost function associated with M, given the input data. For example, in the context of classification, the model M may be a straight line that separates the data into two classes (e.g., labels) such that M = a*x + b*y + c, and the cost function would be the number of misclassified points. The learning process then operates by adjusting the parameters a, b, c such that the number of misclassified points is minimized. After this optimization phase (or learning phase), the model M can be used to classify new data points. Often, M is a statistical model, and the cost function is inversely proportional to the likelihood of M given the input data. This is merely a simple example provided as a general explanation of machine learning training; other types of machine learning using different models, cost (or loss) functions, and optimizations may be used with the mechanisms of the illustrative embodiments without departing from the spirit and scope of the present invention.
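As a toy illustration of the learning loop just described (not a mechanism of the patent itself), the classical perceptron update adjusts the line parameters (a, b, c) whenever a point is misclassified, driving the misclassification count down:

```python
import numpy as np

def train_line(points: np.ndarray, labels: np.ndarray, epochs: int = 100):
    """Fit a separating line a*x + b*y + c = 0 to 2-D points with labels in {+1, -1}."""
    a, b, c = 0.0, 0.0, 0.0
    for _ in range(epochs):
        for (x, y), t in zip(points, labels):
            if t * (a * x + b * y + c) <= 0:       # point is misclassified
                a, b, c = a + t * x, b + t * y, c + t  # nudge the line toward it
    return a, b, c
```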
For the purpose of anatomical structure detection and/or lesion detection (where a lesion is an "anomaly" in the medical imaging data), the learning machine may construct an ML/DL computer model of a normal structural representation and detect data points in the medical image that deviate from this normal structural representation. For example, a given ML/DL computer model (e.g., a supervised, unsupervised, or semi-supervised model) may be used to generate and report an anomaly score to another device, to generate a classification output indicating one or more categories into which an input is classified, probabilities or scores associated with different categories, and/or the like. Example machine learning techniques that may be used to construct and analyze such ML/DL computer models include, but are not limited to, Nearest Neighbor (NN) techniques (e.g., k-NN models, replicator NN models, etc.), statistical techniques (e.g., Bayesian networks, etc.), clustering techniques (e.g., k-means, etc.), neural networks (e.g., reservoir networks, artificial neural networks, etc.), Support Vector Machines (SVMs), and the like.
The processor-implemented Artificial Intelligence (AI) pipeline of the illustrative embodiments typically includes one or both of Machine Learning (ML) and Deep Learning (DL) computer models. In some cases, one or the other of ML and DL may be used or implemented to achieve a particular result. Traditional machine learning may include or use algorithms such as Bayesian decision making, regression, decision trees/forests, support vector machines, or neural networks. Deep learning may be based on deep neural networks and may use multiple layers, such as convolutional layers. Such DL (e.g., using hierarchical networks) may be efficient in its implementation and may provide enhanced accuracy relative to conventional ML techniques. Conventional ML is generally distinguishable from DL in that DL models may outperform classical ML models; however, DL models may consume relatively large amounts of processing and/or power resources. In the context of the illustrative embodiments, references herein to one or the other of ML and DL should be understood to encompass one or both forms of AI processing.
With respect to the illustrative embodiments, after configuration and training by the ML/DL training process, the ML/DL computer models of the AI pipeline are executed to perform complex computerized medical imaging analysis on the input medical images (hereafter assumed to be CT medical image data). This analysis detects anatomical structures in the input medical images, generates contours that specifically identify where target biological structures of interest (hereafter assumed, for the purposes of this description, to be liver lesions) are present in the input medical images, classifies those target biological structures of interest, and outputs other information that helps a human Subject Matter Expert (SME), such as a radiologist or physician, understand the medical condition of the patient from the perspective of the captured input medical images. Further, the output can be provided to other downstream computer systems to perform additional artificial intelligence operations, such as treatment recommendations based on the classifications, contours, and other decision support operations.
Initially, the Artificial Intelligence (AI) pipeline of the illustrative embodiments receives an input volume of Computed Tomography (CT) medical imaging data and detects which portion of the body of a biological entity is depicted in the CT medical imaging data. A "volume" of medical images is a three-dimensional representation of the internal anatomical structure of a biological entity, consisting of a stack of two-dimensional slices, which may be individual medical images captured by a medical imaging technique. The stack of slices may also be referred to as a "slab" and differs from the slices themselves in that it represents a portion of the anatomical structure having a thickness; the stack of slices, or slab, thereby generates a three-dimensional representation of the anatomical structure.
For the purposes of this specification, it will be assumed that the biological entity is a human, although the invention may operate on medical images of various types of biological entities. For example, in veterinary medicine, the biological entities may be different types of small animals (e.g., pets such as dogs, cats, etc.) or large animals (e.g., horses, cattle, or other farm animals). For embodiments in which the AI pipeline is specially trained for detecting liver lesions, the AI pipeline determines whether the incoming CT medical imaging data represents an abdominal scan; if not, the operation of the AI pipeline is terminated with respect to the incoming CT medical imaging data because it does not depict the correct part or portion of the human body. It should be appreciated that there may be different AI pipelines in accordance with the illustrative embodiments, trained to process input medical images for different portions of a body and different target biological structures, and that the input CT medical images may be input to each of the AI pipelines or routed to a particular AI pipeline based on the classification of the body part depicted in the input CT medical images. For example, the classification of the input CT medical images with respect to the body part represented in them may be performed first, and then a corresponding trained AI pipeline may be selected from a plurality of trained AI pipelines of the type described herein to process the input CT medical images. For purposes of the following description, a single AI pipeline trained to detect liver lesions will be described, but it will be apparent to one of ordinary skill in the art, in view of this description, how this extends to a suite or collection of AI pipelines.
Assuming that the volume of input CT medical images comprises medical images of the abdomen of a human body (for the purpose of liver lesion detection), further processing of the input CT medical images is performed in two primary stages, which may be performed substantially in parallel and/or sequentially with respect to each other, depending on the desired implementation. The two primary stages are a phase classification stage and an anatomical structure detection stage (e.g., a liver detection stage in the case where the AI pipeline is configured to perform liver lesion detection).
The phase classification stage determines whether the volume of input CT medical images includes a single imaging phase or multiple imaging phases. Phase, in medical imaging, is an indication of contrast agent uptake. For example, in some medical imaging techniques, the phase may be defined in terms of when a contrast agent is introduced into the biological entity, which allows medical images to be captured that include the path of the contrast agent. For example, the phases may include a pre-contrast phase, an arterial contrast phase, a portal venous contrast phase, and a delayed phase, with medical images captured in any or all of these phases. The phase is generally related to the timing after injection and to the features of structural enhancement within the image. Timing information may be considered to "classify" potential phases (e.g., the delayed phase is always acquired after the portal venous phase) and to estimate the potential phase for a given image. One example of the use of this type of information to determine phase is described in commonly assigned and co-pending U.S. patent application Ser. No. 16/926,880, entitled "Method of Determining Phase of a Computerized Tomography Image," filed July 13, 2020. In addition, timing information may be used in conjunction with other information (sampling, reconstruction kernels, etc.) to pick the best representation for each phase (a given acquisition may be reconstructed in several ways).
Once the images in the input volume are assigned or classified to their corresponding phases based on the timing and/or features of enhancement, it may be determined from the phase classification whether the volume includes images of a single phase (e.g., the portal venous phase is present but no arterial phase) or a multi-phase exam (e.g., portal venous and arterial). If the phase classification indicates that there is a single phase in the volume of input CT medical images, further processing by the AI pipeline is performed as described below. If multiple phases are detected, the volume is not processed further by the AI pipeline. However, while in some illustrative embodiments such single/multi-phase volume filtering accepts only volumes with images from a single phase and rejects multi-phase volumes, in other illustrative embodiments the AI pipeline processing described herein may filter out images of the volume that are not classified into a target phase of interest, e.g., retaining the portal venous phase images in the volume while filtering out images not classified as part of the portal venous phase, thereby modifying the input volume into a modified volume having only the subset of images classified as the target phase. Furthermore, as previously discussed, different AI pipelines may be trained for different types of volumes. In some illustrative embodiments, the phase classification of images within an input volume may be used to route or distribute images of the input volume to corresponding AI pipelines trained and configured to process images of different phases, such that the input volume may be subdivided into sub-volumes and routed to their corresponding AI pipelines for processing, e.g., a first sub-volume corresponding to portal venous phase images is sent to a first AI pipeline while a second sub-volume corresponding to the arterial phase is sent to a second AI pipeline. If the volume of input CT medical images includes a single phase, or after filtering and optionally routing the sub-volumes to corresponding AI pipelines such that an AI pipeline processes the images of an input volume or sub-volume of a single phase, the volume (or sub-volume) is passed on to the next stage of the AI pipeline for further processing.
The second primary stage is the anatomical-structure-of-interest detection stage (the liver detection stage in the example embodiments), in which the volumetric portion depicting the anatomical structure of interest is identified and passed to the next downstream stage of the AI pipeline. The anatomical-structure-of-interest detection stage (hereinafter referred to as the liver detection stage in accordance with the example embodiments) includes a Machine Learning (ML)/Deep Learning (DL) computer model that is specially trained and configured to perform computerized medical image analysis to identify the portion of the input medical images corresponding to the anatomical structure of interest (e.g., the liver). Such medical image analysis may include training the ML/DL model on labeled training medical image data as input to determine whether an input medical image (a training image during training) includes the anatomical structure of interest (e.g., the liver). Based on the ground truth of the image labels, the operating parameters of the ML/DL model are adjusted to reduce the loss, or error, in the results generated by the ML/DL model until convergence is achieved (i.e., the loss is minimized). Through this process, the ML/DL model is trained to recognize patterns of medical image data indicating the presence of the anatomical structure of interest (in this example, the liver). Thereafter, once trained, the ML/DL model may be executed on new input data to determine whether the new input medical image data has a pattern indicating the presence of the anatomical structure, and if the probability is greater than a predetermined threshold, it may be determined that the medical image data includes the anatomical structure of interest.
Thus, at the liver detection stage, the AI pipeline uses the trained ML/DL computer model to determine whether the volume of input CT medical images includes images depicting the liver. The volumetric portion depicting the liver is passed, along with the results of the phase classification stage, to a determination stage of the AI pipeline that determines whether there is single-phase medical imaging and whether at least a predetermined amount of the anatomical structure of interest (e.g., the liver) is present in the volumetric portion depicting the anatomical structure. Whether a predetermined amount of the anatomical structure of interest is present may be determined based on known measurement mechanisms that determine measurements of the structure from the medical images (e.g., calculating the size of the structure from differences in pixel locations within the images). The measurements may be compared to predetermined dimensions (e.g., average dimensions) of the anatomical structure for similar patients having similar demographics, such that if the measurements represent at least a predetermined amount or portion of the anatomical structure, further processing may be performed by the AI pipeline. In one illustrative embodiment, for example, the determination establishes whether at least 1/3 of the liver is present in the portion of the volume of input CT medical images determined to depict the liver. Although 1/3 is used in the example embodiments, any predetermined amount of the structure determined to be suitable for a particular implementation may be used without departing from the spirit and scope of the present invention.
In one illustrative embodiment, to determine whether a predetermined amount of the anatomical structure of interest is present in the volume of the input CT medical image, an axial score is defined such that the slice corresponding to the first representation of the anatomical structure of interest (e.g., the liver) in the volume, i.e., the First Slice containing the Liver (FSL), is given a slice score of 0, and the Last Slice containing the Liver (LSL) has a score of 1. Assuming a human biological entity, the first and last slices are defined from the lowest slice in the volume (MISV, closest to the lower extremities, e.g., the feet) to the highest slice in the volume (MSSV, closest to the head). Liver Axial score Estimation (LAE) is defined by a pair of slice scores s_sup and s_inf, which correspond to the slice scores of the MSSV and MISV slices, respectively. As will be described in more detail below, an ML/DL computer model is specially configured and trained to determine the slice scores s_sup and s_inf for the volume of the input CT medical image. Knowing these slice scores, and knowing from the above definition that the liver extends from 0 to 1, the mechanism of the illustrative embodiments is able to determine the fraction of the liver in the field of view of the volume of the input CT medical image.
In some illustrative embodiments, the slice scores s_sup and s_inf may be found indirectly by first dividing the volume of the input CT medical image into a plurality of segments and then, for each segment, executing a configured and trained ML/DL computer model on the slices of that segment to estimate the height of each slice, thereby determining the segment-level scores s'_sup and s'_inf of the uppermost (closest to the head) and lowermost (closest to the feet) liver slices in the segment. Given s'_sup and s'_inf, s_sup and s_inf are found by extrapolation, since it is known how the segments are positioned with respect to the entire volume of the input CT medical image. The method is based on a robust estimator of the height of an arbitrary slice from the input volume (or a sub-volume associated with the target phase). Such an estimator may be obtained by learning a regression model, for example by using a deep learning model that estimates the height from a chunk (a set of consecutive slices). For example, Long Short-Term Memory (LSTM) type artificial neural network models are suitable for these tasks because of their ability to encode the ordering of slices containing the liver and abdominal anatomy. It should be noted that for each volume there will be n estimates of s_sup and s_inf, where n is the number of segments per volume. In one illustrative embodiment, the final estimate is obtained by taking an unweighted average of the n estimates; however, in other illustrative embodiments, other functions of the n estimates may be used to generate the final estimate.
Having determined s_sup and s_inf for the volume of the input CT medical image, a score of the fraction of the anatomical structure of interest (e.g., the liver) that is visible is calculated based on these values. This task is made possible by the estimation of the height of each slice. Given estimates of the height of the first liver slice (h1) and the height of the last liver slice (h2) in the input volume, and assuming that the actual heights of the first and last slices of the liver, whether or not they are contained in the input volume, are H1 and H2, the part of the liver that is visible in the input volume can be expressed as (min(h1, H1) - max(h2, H2))/(H1 - H2). This calculated score may then be compared to a predetermined threshold to determine whether a predetermined minimum amount of the anatomical structure of interest is present in the volume of the input CT medical image, e.g., at least 1/3 of the liver is present in the volume of the input CT medical image.
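A sketch of this fraction computation, using the notation above (h1 and h2 for the estimated heights of the first and last liver slices in the input volume, H1 and H2 for the actual heights of the first and last slices of the liver):

```python
def visible_liver_fraction(h1: float, h2: float, H1: float, H2: float) -> float:
    """Fraction of the liver visible in the input volume."""
    return (min(h1, H1) - max(h2, H2)) / (H1 - H2)

# The result is then compared against a threshold, e.g. 1/3, before the
# volume is passed further down the AI pipeline.
```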
If it is determined that multiple phases are present and/or that the portion of the volume of input CT medical images depicting the anatomical structure does not contain the predetermined amount of the anatomical structure of interest, further processing of the volume may be aborted. If it is determined that the volume of input CT medical images includes a single phase and at least the predetermined amount of the anatomical structure of interest (e.g., at least 1/3 of the liver is shown in the images), the portion of the volume depicting the anatomical structure is forwarded to the next stage of the AI pipeline for processing.
At the next stage of the AI pipeline, the AI pipeline performs lesion detection on the portion of the volume of input CT medical images that represents the anatomical structure of interest (e.g., the liver). This liver and lesion detection stage of the AI pipeline uses an ensemble of ML/DL computer models to detect the liver, and lesions in the liver, as represented in the volume of input CT medical images. The ensemble performs liver and lesion detection using differently trained ML/DL computer models, where the ML/DL computer models are trained with loss functions that balance false positives and false negatives in lesion detection. Further, the ensemble of ML/DL computer models is configured such that a third loss function makes the outputs of the ML/DL computer models consistent with each other.
Given that both liver detection and lesion detection are performed at this stage of the AI pipeline, a first ML/DL computer model is executed on the volume of input CT medical images to detect the presence of the liver. This ML/DL computer model may be the same one employed in the earlier anatomical-structure-of-interest detection stage of the AI pipeline and may therefore reuse the previously obtained results. Multiple (two or more) other ML/DL computer models are configured and trained to perform lesion detection in the portion of the medical images depicting the liver. The first lesion detection ML/DL computer model is configured with two loss functions. The first loss function penalizes false negative errors, i.e., classifications that incorrectly indicate the absence of a lesion (normal anatomy). The second loss function penalizes false positive errors, i.e., classifications that incorrectly indicate the presence of a lesion (abnormal anatomy). The second lesion detection ML/DL computer model is trained using an adaptive loss function that penalizes false positive errors in slices of the liver containing normal tissue and penalizes false negative errors in slices of the liver containing lesions. The detection outputs from the two ML/DL models are averaged to produce the final lesion detection.
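For illustration, such an adaptive, slice-dependent loss could take the form of a weighted binary cross-entropy; the weighting scheme below is an assumption, not the patent's specified loss:

```python
import numpy as np

def adaptive_bce(pred: np.ndarray, target: np.ndarray,
                 slice_has_lesion: bool, w: float = 2.0,
                 eps: float = 1e-7) -> float:
    """Weighted BCE: up-weight false negatives on lesion slices and
    false positives on normal-tissue slices."""
    pred = np.clip(pred, eps, 1 - eps)
    w_pos = w if slice_has_lesion else 1.0   # heavier false-negative penalty
    w_neg = 1.0 if slice_has_lesion else w   # heavier false-positive penalty
    loss = -(w_pos * target * np.log(pred)
             + w_neg * (1 - target) * np.log(1 - pred))
    return float(loss.mean())
```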
The results of the liver/lesion detection stage of the AI pipeline include one or more contours of the liver and a detection map identifying the portions of the medical imaging data elements corresponding to detected lesions, e.g., a voxel-wise map of detected liver lesions in the volume of input CT medical images. This detection map is then input to the lesion segmentation stage of the AI pipeline. As will be described in more detail below, the lesion segmentation stage uses a watershed technique to partition the map and generate image element (e.g., voxel) partitions of the input CT medical images. The liver lesion segmentation stage identifies all contours corresponding to lesions present in slices of the volume of input CT medical images based on the partitions, and performs an operation identifying which contours correspond to the same lesion in three dimensions. Lesion segmentation aggregates the associated lesion contours to generate a three-dimensional partition of each lesion. Lesion segmentation uses the lesion image elements (e.g., voxels) represented in the medical images, along with inpainting of non-liver tissue, to focus on each lesion individually and perform active contour analysis. In this way, individual lesions may be identified and processed without the analysis being biased by other lesions in the medical images or by portions of the images outside the liver.
The result of lesion segmentation is a list of lesions with their corresponding appearance, or contour, in the volume of input CT medical images. These outputs may include findings that are not actual lesions. To minimize the impact of those false positives, the output is provided to the next stage of the AI pipeline, which involves false positive removal using a trained false positive removal model. This false positive removal model of the AI pipeline acts as a classifier to identify, from the detected findings, which outputs are actual lesions and which are false positives. The input consists of the volume of interest (VOI) surrounding each detected finding, associated with the mask resulting from the lesion segmentation refinement. Data resulting from the detection/segmentation stage is used to train the false positive removal model: detections that match a ground-truth lesion are used to represent the lesion category during training, while detections that do not match any ground-truth lesion are used to represent the non-lesion (false positive) category.
To further improve overall performance, a dual operating point strategy is employed for the lesion detection and false positive models. The idea stems from noting that the output of the AI pipeline can be interpreted at different levels. First, the output of the AI pipeline can be used to discern whether an examination volume, i.e., the input volume or volume of interest (VOI), contains a lesion or not. Second, the output of the AI pipeline is intended to maximize the detection of lesions, regardless of whether they are contained in the same patient/exam/volume. For clarity, measurements taken with respect to an examination will be referred to herein as "patient level," and measurements taken with respect to a lesion will be referred to as "lesion level." Maximizing sensitivity at the lesion level will reduce specificity at the patient level (for a patient, a single detection is sufficient to say the patient contains a lesion). This may ultimately be suboptimal for clinical use, as one must choose between having poor specificity at the patient level or low sensitivity at the lesion level.
In view of this, the illustrative embodiments use a dual operating point approach for both lesion detection and false positive removal. The rationale is to first run the process using a first operating point that gives reasonable performance at the patient level. Then, for patients from the first round having at least one detected lesion, the detections are reinterpreted/processed using a second operating point. The second operating point is selected to be more sensitive. Although the specificity of this second operating point is lower than that of the first, this loss of specificity is contained at the patient level, since all patients with no lesion detected at the first operating point remain as they are, regardless of whether additional lesions would have been detected at the second operating point. Thus, patient-level specificity is determined only by the first operating point. Patient-level sensitivity falls between that of the first and second operating points taken alone (a false negative from the first operating point may be changed to a true positive by the second operating point). On the lesion side, the actual lesion-level sensitivity is improved compared to using only the first operating point. Lesion-level specificity is better than that of the less specific second operating point taken alone, since cases handled with the first operating point alone contribute no additional false positives.
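A minimal sketch of this two-pass logic, assuming per-lesion confidence scores and two thresholds t1 > t2 (the higher threshold being the more specific operating point):

```python
def dual_operating_point(lesion_scores, t1: float, t2: float):
    """Return the detections kept for one patient under the dual-threshold scheme."""
    if not any(s >= t1 for s in lesion_scores):
        return []   # patient negative at the first (more specific) operating point
    # Patient positive at t1: re-read the detections with the more sensitive t2.
    return [s for s in lesion_scores if s >= t2]
```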
While the illustrative embodiments will assume a particular configuration and use of the dual operating point method, it should be understood that the method may be used with other configurations and for other purposes where one is interested in measuring performance at both a group level (in the illustrative embodiments, the "patient level") and an element level (in the illustrative embodiments, the "lesion level"). While in the illustrative embodiments the dual operating point method applies to both lesion detection and false positive removal, it is understood that the method may extend beyond these stages of the AI pipeline. For example, the detection of lesions may be performed at the voxel level (element) versus the volume level (group), rather than at the patient level and lesion level. As another example, the voxel or lesion level may be used as the element level and slabs (sets of slices) as the group level. In yet another example, all examined volumes may be used as the group level, rather than a single volume. It should be understood that the method may also be applied to two-dimensional images (e.g., 2D X-ray images such as mammography for the breast) rather than to three-dimensional volumes. Specificity measures (such as the average number of false positives per patient/group) can be used to select an operating point. In addition, although the illustrative embodiments are described as applied to lesion detection and classification, the dual operating point approach may be applied to other structures (clips, stents, implants, etc.) and beyond medical imaging.
Detection and false positive removal based on the dual operating points result in the identification of a final filtered list of lesions to be further processed by the lesion classification stage of the AI pipeline. At the lesion classification stage, a configured and trained ML/DL computer model is executed on the lesion list and its corresponding contour data to classify each lesion into one of a plurality of predetermined lesion classifications. For example, each lesion in the final filtered list and its attributes (e.g., contour data) may be input into the trained ML/DL computer model, which then operates on this data to classify the lesion as a particular type of lesion. The classification may be performed using a classifier (e.g., a trained neural network computer model) previously trained on ground truth data in conjunction with the results of the previous processing steps of the AI pipeline. The classification task may be more or less complex, e.g., it may provide a label of benign, malignant, or indeterminate, or, in another example, the actual lesion type (e.g., cyst, metastasis, hemangioma, etc.). The classifier may be, for example, a machine learning computer model classifier (e.g., SVM, decision tree, etc.) or a deep learning computer model. The actual input to the classifier is a patch around the lesion, which in some embodiments may be enhanced with a lesion mask or contour.
After classifying the lesion through the lesion classification stage of the AI pipeline, the AI pipeline outputs a list of lesions and their classifications, along with any contour attributes of the lesion. In addition, the AI pipeline may also output liver contour information for the liver. The AI pipeline generated information may be provided to a further downstream computing system for further processing and generation of representations of the anatomical structure of interest and any detected lesions present in the anatomical structure. For example, a graphical representation of a volume of an input CT medical image may be generated in a medical image viewer or other computer application, where anatomical structures and detected lesions are superimposed or otherwise highlighted in the graphical representation using contour information generated by the AI pipeline. In other illustrative embodiments, downstream processing of AI pipeline generated information may include diagnostic decision support operations, automated medical imaging report generation based on detected lesion lists, classifications, and contours. In other illustrative embodiments, based on the classification of the lesion, different treatment recommendations may be generated for review and consideration by the practitioner.
In some illustrative embodiments, the list of lesions, their classification, and contours may be stored in a history data structure associated with the patient corresponding to the volume of the input CT medical image, such that multiple executions of the AI pipeline over different volumes of the input CT medical image associated with the patient may be stored and evaluated over time. For example, differences between a list of lesions and/or their associated classifications and contours may be determined to assess the progress of a patient's disease or medical condition, and such information is presented to a medical professional for use in assisting in the treatment of the patient.
Other downstream computing systems and processes that use the anatomical structure and lesion detection information generated by the AI mechanisms of the illustrative embodiments may be implemented without departing from the spirit and scope of the invention. For example, the output of the AI pipeline may be used by another downstream computing system to process the anatomical structure and lesion information and identify discrepancies with other sources of information (e.g., radiology reports), in order to make the clinical staff aware of potentially overlooked findings.
Thus, the illustrative embodiments provide mechanisms for providing an automated AI pipeline comprising a plurality of configured and trained ML/DL computer models that implement artificial intelligence operations at the various stages of the AI pipeline to identify anatomical structures, and lesions associated with these anatomical structures, in a volume of input medical images, determine contours associated with such anatomical structures and lesions, determine classifications of such lesions, and generate a list of such lesions together with the contours of the lesions and the anatomical structures, for further processing of the AI-generated information by downstream computing systems. The operation of the AI pipeline is automated such that there is no human intervention at any stage of the AI pipeline; rather, specially configured and trained ML/DL computer models, trained by machine learning/deep learning computer processes, are employed to perform the specified AI analysis of the various stages. The only points at which human intervention may exist are before the input of the volume of medical images (e.g., during medical imaging of the patient) and after the output of the AI pipeline (e.g., viewing an enhanced medical image presented via a computer image viewing application based on the lesion list and contours output by the AI pipeline). Thus, the AI pipeline performs operations that human beings cannot perform as mental processes, and it does not organize any human activity, as the AI pipeline is specifically directed to an improved automated computer tool implementing artificial intelligence using specified machine learning/deep learning processes that exist only within a computing environment.
Before proceeding with the discussion of the various aspects of the illustrative embodiments and the improved computer operations they perform, it should first be appreciated that throughout this specification the term "mechanism" is used to refer to elements of the invention that perform the various operations, functions, and the like. A "mechanism," as the term is used herein, may be an implementation of a function or aspect of an illustrative embodiment in the form of an apparatus, a procedure, or a computer program product. In the case of a procedure, the procedure is implemented by one or more devices, apparatus, computers, data processing systems, or the like. In the case of a computer program product, the logic represented by the computer code or instructions embodied in or on the computer program product is executed by one or more hardware devices to implement the functionality or perform the operations associated with the specified "mechanism." Thus, the mechanisms described herein may be implemented as specialized hardware; as software executing on hardware, thereby configuring the hardware to implement specialized functionality of the invention that the hardware would otherwise not be able to perform; as software instructions stored on a medium such that the instructions are readily executable by hardware, thereby specifically configuring the hardware to perform the recited functionality and the specific computer operations described herein; as a procedure or method for executing the functions; or as any combination of the above.
The description and claims may utilize the terms "a," "an," "at least one," and "one or more" with respect to particular features and elements of the illustrative embodiments. It should be understood that these terms and phrases are intended to state that there is at least one, but may also be more than one, of the particular features or elements present in a particular illustrative embodiment. That is, these terms/phrases are not intended to limit the specification or claims to the presence of a single feature/element or to the presence of a plurality of such features/elements. Rather, these terms/phrases require only at least a single feature/element, where a plurality of such features/elements are possible within the scope of the description and claims.
Moreover, it should be appreciated that the term "engine" if used herein with respect to describing embodiments and features of the invention is not intended to limit any particular implementation for implementing and/or performing the actions, steps, processes, etc. attributable to and/or performed by the engine. An engine may be, but is not limited to, software, hardware, and/or firmware, or any combination thereof, that performs the specified function including, but not limited to, any use of a general and/or special purpose processor in combination with appropriate software loaded or stored in a machine readable memory and executed by the processor. Further, unless specified otherwise, any designation associated with a particular engine is for ease of reference and is not intended to be limiting to a particular implementation. In addition, any functionality attributed to an engine can be performed equally by multiple engines, combined and/or integrated with another engine of the same or different type, or distributed across one or more engines in various configurations.
Furthermore, it should be appreciated that the following description uses numerous different examples of different elements of the illustrative embodiments to further illustrate exemplary implementations of the illustrative embodiments and to facilitate an understanding of the mechanisms of the illustrative embodiments. These examples are intended to be non-limiting and not exhaustive of the various possibilities for implementing the mechanisms of the illustrative embodiments. In view of this description, it will be apparent to those of ordinary skill in the art that there are many other alternative embodiments for these various elements that may be utilized in addition to or in place of the examples provided herein without departing from the spirit and scope of the present invention.
The present invention may be a system, method and/or computer program product. The computer program product may include a computer-readable storage medium (or media) having computer-readable program instructions embodied therewith for causing a processor to perform various aspects of the present invention.
The computer readable storage medium may be a tangible device that can retain and store instructions for use by an instruction execution device. The computer readable storage medium may be, for example, but not limited to, an electronic memory device, a magnetic memory device, an optical memory device, an electromagnetic memory device, a semiconductor memory device, or any suitable combination of the foregoing. A non-exhaustive list of more specific examples of the computer readable storage medium includes the following: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a Static Random Access Memory (SRAM), a portable compact disc read-only memory (CD-ROM), a Digital Versatile Disc (DVD), a memory stick, a floppy disk, a mechanical coding device such as punch cards or raised structures in grooves having instructions recorded thereon, and any suitable combination of the foregoing. A computer readable storage medium as used herein should not be interpreted as a transient signal per se, such as a radio wave or other freely propagating electromagnetic wave, an electromagnetic wave propagating through a waveguide or other transmission medium (e.g., optical pulses through a fiber optic cable), or an electrical signal transmitted through a wire.
The computer-readable program instructions described herein may be downloaded from a computer-readable storage medium to a corresponding computing/processing device, or to an external computer or external storage device via a network (e.g., the internet, a local area network, a wide area network, and/or a wireless network). The network may include copper transmission cables, optical transmission fibers, wireless transmissions, routers, firewalls, switches, gateway computers and/or edge servers. The network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network and forwards the computer-readable program instructions for storage in a computer-readable storage medium within the respective computing/processing device.
The computer-readable program instructions for carrying out operations of the present invention may be assembler instructions, Instruction Set Architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, or either source code or object code written in any combination of one or more programming languages, including an object-oriented programming language such as Java, Smalltalk, C++, or the like, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, electronic circuitry, including, for example, programmable logic circuitry, Field Programmable Gate Arrays (FPGA), or Programmable Logic Arrays (PLA), may execute the computer-readable program instructions by utilizing state information of the computer-readable program instructions to personalize the electronic circuitry, in order to perform aspects of the present invention.
Aspects of the present invention are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-readable program instructions.
These computer-readable program instructions may be provided to a processor of a computer or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer-readable program instructions may also be stored in a computer-readable storage medium that can direct a computer, a programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer-readable storage medium having the instructions stored therein comprises an article of manufacture including instructions which implement aspects of the function/act specified in the flowchart and/or block diagram block or blocks.
The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer, other programmable apparatus or other devices implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of an instruction, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
Summary of Lesion Detection and Classification AI Pipeline
Fig. 1 is an example block diagram of a lesion detection and classification Artificial Intelligence (AI) pipeline (referred to herein simply as an "AI pipeline") implementing a plurality of specially configured and trained ML/DL computer models to perform anatomical structure recognition and lesion detection in input medical image data in accordance with one illustrative embodiment. For illustrative purposes only, the depicted AI pipeline is described specifically for liver detection and liver lesion detection in medical image data. As described above, the illustrative embodiments are not so limited and may be applied to any anatomical structures of interest and lesions associated with such anatomical structures of interest, which may be represented in image elements of medical image data captured by medical imaging techniques and corresponding computing systems. For example, the mechanisms of the illustrative embodiments may be applied to the detection, contour recognition, classification, etc. of other anatomical structures (such as the lungs, heart, etc.), as well as lesions associated with the lungs, heart, or other anatomical structures of interest.
Further, it should be understood that the following description provides an overview of the AI pipeline at the level illustrated in FIG. 1, and that subsequent portions of this description will provide additional details regarding the various stages of the AI pipeline. In some demonstrative embodiments, each stage of the AI pipeline is implemented as a configured and trained ML/DL computer model, such as a neural network, e.g., a deep learning neural network, as represented by the symbols 103 in the various stages of the AI pipeline 100. These different ML/DL computer models are specially configured and trained to perform the specific AI operations described herein (e.g., body part identification, liver detection, phase classification, liver minimum-amount detection, liver/lesion detection, lesion segmentation, false positive removal, lesion classification, etc.). While the additional portions of the following description will set forth specific embodiments for implementing the various stages of the AI pipeline, providing the novel techniques, mechanisms, and methods for performing the AI operations of the different stages, it should be appreciated that other equivalent techniques, mechanisms, or methods may be used in the context of the AI pipeline as a whole without departing from the spirit and scope of the illustrative embodiments. Such other equivalent techniques, mechanisms, or methods will be apparent to those of ordinary skill in the art in view of this description and are intended to be within the spirit and scope of the present invention.
As shown in FIG. 1, according to one illustrative embodiment, an Artificial Intelligence (AI) pipeline 100 receives as input a volume 105 of input medical images (in the depicted example, a volume of input Computed Tomography (CT) medical images represented as one or more data structures), which is then automatically processed by the various stages of the AI pipeline 100 to ultimately generate an output 170 that includes a list of lesions and their classification and contour information, as well as contour information for the anatomical structure of interest (e.g., the liver in the depicted example). The volume 105 of input medical images may be captured by the medical imaging technology 102 using any of a number of generally known or later developed medical imaging technologies and devices that render images of the internal anatomy of a biological entity (i.e., a patient) as one or more medical image data structures. In some demonstrative embodiments, the volume 105 of input medical images includes two-dimensional slices (individual medical images) of the anatomy of a portion of the patient's body, which are combined to generate slabs (combinations of slices along an axis, providing a set of medical images having thickness along the axis), which are in turn combined to generate a three-dimensional representation, i.e., a volume, of the anatomy of the portion of the body.
In the first stage logic 110 of the AI pipeline 100, the AI pipeline 100 determines 112 the portion of the patient's body corresponding to the input volume 105 of CT medical imaging data and determines, via the body part of interest determination logic 114, whether that portion of the patient's body corresponds to the anatomical structure of interest (e.g., an abdominal scan rather than a cranial scan, a lower body scan, etc.). This evaluation operates as an initial filter so that the AI pipeline 100 operates only on volumes 105 of input CT medical imaging data (hereinafter referred to as "input volumes" 105) for which the AI pipeline 100 is specifically configured and trained to perform anatomical structure recognition and contouring as well as lesion recognition, contouring, and classification. Such detection of the body part represented in the input volume 105 may be based on metadata associated with the input volume 105, which may have a field specifying the region of the patient's body being scanned, as may be specified by the source medical imaging technology computing system 102 when performing the medical imaging scan. Alternatively, the first stage logic 110 of the AI pipeline 100 may implement a specially configured and trained ML/DL computer model for body part detection 112 that performs medical image classification with respect to particular portions of the patient's body, i.e., performs computerized pattern analysis on the medical image data of the input volume 105 and predicts a classification of the medical imaging data into one or more predetermined portions of the patient's body. In some illustrative embodiments, this evaluation may be binary (e.g., the volume is or is not an abdominal medical imaging volume), or may be a more complex multi-class evaluation (e.g., specifically identifying probabilities or scores for a plurality of different body part classifications, such as abdomen, cranium, lower extremities, etc.).
If the body-part-of-interest determination logic 114 of the first stage logic 110 of the AI pipeline 100 determines that the input volume 105 does not represent a portion of the patient's body in which the anatomical structure of interest may be found (e.g., the abdominal portion of the body in which the liver may be found), the processing of the AI pipeline 100 may be interrupted (a rejection situation). If the body part of interest determination logic 114 of the first stage logic 110 of the AI pipeline 100 determines that the input volume 105 does represent the portion of the patient's body in which the anatomical structure of interest may be found, further processing of the input volume 105 by the AI pipeline 100 is performed as described hereafter. It should be appreciated that in some demonstrative embodiments, multiple different instances of the AI pipeline 100 may be provided, each configured and trained to process input volumes 105 corresponding to different anatomical structures that may be present in different portions of a patient's body. Thus, the first stage logic 110 may be provided external to the AI pipeline 100 and may operate as routing logic to route input volumes 105 to the corresponding AI pipeline 100 that is specifically configured and trained to process input volumes 105 of a particular classification, e.g., one AI pipeline instance for liver and liver lesion detection/classification, another AI pipeline instance for lung and lung lesion detection/classification, a third AI pipeline instance for heart and heart lesion detection/classification, etc. Thus, the first stage logic 110 may include routing logic that stores a mapping of which AI pipeline instance 100 corresponds to which body part/anatomical structure of interest, and, based on the detection of the body part represented in the input volume 105, the first stage logic 110 may automatically route the input volume 105 to the corresponding AI pipeline instance 100 specifically configured and trained to process input volumes 105 corresponding to the detected body part.
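A minimal sketch of such routing logic follows, assuming hypothetical pipeline instances registered in a mapping; the names and the callable interface are illustrative, not part of this disclosure.

```python
from typing import Callable, Optional

def route_input_volume(volume,
                       metadata_body_part: Optional[str],
                       classify_body_part: Callable,
                       pipelines: dict):
    """Route an input volume to the AI pipeline instance configured and
    trained for the detected body part; return None on rejection."""
    # prefer the body-part field supplied by the imaging system's metadata,
    # falling back to the ML/DL body part detection model
    body_part = metadata_body_part or classify_body_part(volume)
    pipeline = pipelines.get(body_part)
    if pipeline is None:
        return None  # rejection case: no pipeline for this body part
    return pipeline(volume)
```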
Assuming that the input volume 105 is detected as depicting a portion of the patient's body indicative of the presence of the anatomical structure of interest (e.g., an abdominal scan is present in the input volume 105 for purposes of liver lesion detection), further processing of the input volume 105 is performed by the AI pipeline 100 in the second stage logic 120. The second stage logic 120 includes two primary substages 122 and 124 that may be executed substantially in parallel and/or sequentially with respect to each other, depending on the desired implementation (shown as parallel execution in FIG. 1 as an example). The two primary substages 122, 124 include a phase classification substage 122 and an anatomical structure detection substage 124 (e.g., a liver detection substage 124 in the case where the AI pipeline 100 is configured to perform liver lesion detection).
The phase classification substage 122 determines whether the input volume 105 includes a single imaging phase (e.g., a pre-contrast phase, an arterial contrast phase, a portal contrast phase, a delayed phase, etc.). Again, the phase classification substage 122 may be implemented as logic that evaluates metadata associated with the input volume 105, which may include a field specifying a phase of the medical imaging study to which the medical image corresponds, as may be generated by the medical imaging technology computing system 102 when performing medical imaging. Alternatively, illustrative embodiments may implement a configured and trained ML/DL computer model that is specifically trained to detect patterns of medical images indicative of different phases of a medical imaging study, and may thereby classify the medical images of the input volume 105 as to which phases they correspond. The output of the phase classification substage 122 may be binary, indicating whether the input volume 105 includes one phase or multiple phases, or may be a classification of each phase represented in the input volume 105, which may then be used to determine whether to represent a single phase or multiple phases.
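For illustration only, the following sketch reflects the two options described above: consulting a per-image metadata field when the imaging system supplied one, and otherwise falling back to a trained phase classification model; the field name "phase" and the model interface are assumptions.

```python
def classify_phases(image_metadata_list, volume=None, phase_model=None):
    """Return the set of phases represented in the input volume; a
    single-element set indicates a single-phase volume."""
    phases = {m.get("phase") for m in image_metadata_list}
    if None in phases and phase_model is not None:
        # metadata is incomplete: classify each slice with the trained model
        phases = set(phase_model.predict(volume))
    return phases

def is_single_phase(image_metadata_list, volume=None, phase_model=None):
    return len(classify_phases(image_metadata_list, volume, phase_model)) == 1
```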
If the phase classification indicates that there is a single phase in the input volume 105, the AI pipeline 100 performs further processing through the downstream stages 130-170 as described hereafter. If multiple phases are detected, the input volume 105 is not further processed by the AI pipeline 100, or, as previously described, may be filtered and/or divided into sub-volumes, each with images of a corresponding single phase, such that the AI pipeline 100 processes only the sub-volumes corresponding to the target phases and/or routes the sub-volumes to corresponding AI pipelines configured and trained to process input volumes of images of those particular phase classifications. It should be appreciated that an input volume may be rejected for several reasons (e.g., no liver is present in the images, the input volume is not a single phase input volume, an insufficient amount of the liver is present in the images, etc.). Depending on the actual root cause of the rejection, the reason for the rejection may be communicated to the user via a user interface or the like. For example, in response to a rejection, the output of the AI pipeline 100 can indicate the reason for the rejection, and this can be utilized by a downstream computing system (e.g., a viewer or additional automated processing system) to communicate the reason for the rejection through its output. For example, where no liver is detected in the input volume, the input volume may be silently ignored, i.e., the rejection is not communicated to the user; however, where the input volume contains the liver but comprises a multi-phase input volume, the rejection may be communicated to the user (e.g., a radiologist), such as by clearly stating, in a user interface generated by a downstream viewer computing system, that the input volume was not processed by the AI pipeline 100 because the input volume has images of more than one phase, e.g., so that it is not misinterpreted as an input volume that does not contain any findings.
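The rejection-reporting policy described here might be captured as follows; the enumeration values and the choice of which rejections stay silent are assumptions that mirror the liver example above.

```python
from enum import Enum
from typing import Optional

class RejectionReason(Enum):
    NO_STRUCTURE_OF_INTEREST = "no liver present in the input volume"
    MULTI_PHASE = "input volume contains images of more than one phase"
    INSUFFICIENT_STRUCTURE = "less than the minimum amount of liver is depicted"

# per the example above: non-liver volumes are ignored silently, while
# multi-phase rejections are surfaced to the radiologist via the viewer
SILENT_REJECTIONS = {RejectionReason.NO_STRUCTURE_OF_INTEREST}

def rejection_message(reason: RejectionReason) -> Optional[str]:
    return None if reason in SILENT_REJECTIONS else reason.value
```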
The second main substage 124 is a detection substage for detecting an anatomical structure of interest (a liver in the exemplary embodiment) in a portion of the input volume 105. That is, slices, slabs, etc., of the input volume 105 that specifically depict the anatomical structure of interest (liver) are identified and evaluated to determine whether a predetermined minimum amount of the anatomical structure of interest (liver) as a whole is present in those slices, slabs, or input volumes. As previously described, the detection substage 124 includes an ML/DL computer model 125, the ML/DL computer model 125 being specially trained and configured to perform computerized medical image analysis to identify portions of the input medical image that correspond to anatomical structures of interest (e.g., a human liver).
Thus, in the liver detection substage 124, the AI pipeline 100 uses the trained ML/DL computer model 125 to determine whether the volume of input CT medical images includes images depicting the liver. The portion of the volume depicting the liver is passed, along with the results of the phase classification substage 122, to a determination substage 126 of the AI pipeline 100, the determination substage 126 including single phase determination logic 127 and minimum structure amount determination logic 128, which determine whether single phase medical imaging is present 127 and whether at least a predetermined amount of the anatomical structure of interest is present 128 in the portion of the volume depicting the anatomical structure of interest (e.g., the liver). As previously described, whether a predetermined amount of the anatomical structure of interest is present may be determined based on known measurement mechanisms that determine structural measurements from the medical images, e.g., calculating the size of the structure from differences in pixel locations within the images, and comparing these measurements to one or more predetermined thresholds to determine whether a minimum amount of the anatomical structure of interest (e.g., the liver) is present in the input volume 105, e.g., at least 1/3 of the liver is depicted in the portion of the input volume 105 depicting the liver.
In one illustrative embodiment, to determine whether a predetermined amount of the anatomical structure of interest (liver) is present in the input volume 105, the portion of the anatomical structure present in the input volume 105 may be evaluated using the axial scoring mechanism previously described. As previously described, for the input volume 105, the ML/DL computer model may be configured and trained to estimate the slice scores s_sup and s_inf, which correspond to the slice scores of the MSSV and MISV slices, respectively. In some demonstrative embodiments, the slice scores s_sup and s_inf may be found indirectly by first dividing the input volume 105 into a plurality of segments and then, for each segment, executing a configured and trained ML/DL computer model on the slices of the segment to estimate the segment's s'_sup and s'_inf. Given s'_sup and s'_inf, s_sup and s_inf are found by extrapolation, since it is known how the segments are positioned relative to the entire volume of the input CT medical image. It should be noted that for each input volume 105, there will be n estimates of s_sup and s_inf, where n is the number of segments per volume. In one illustrative embodiment, the final estimate is obtained by taking an unweighted average of the n estimates; however, in other illustrative embodiments, other functions of the n estimates may be used to generate the final estimate.
Having determined s_sup and s_inf for the volume of the input CT medical image, a score for the anatomical structure of interest (e.g., the liver) is calculated based on these values. This calculated score may then be compared to a predetermined threshold to determine whether a predetermined minimum amount of the anatomical structure of interest is present in the volume of the input CT medical image, e.g., at least 1/3 of the liver is present in the volume of the input CT medical image.
If the determinations of the determination logic 127 and 128 indicate that there are multiple phases and/or that a predetermined amount of the anatomical structure of interest is not present in the portion of the input volume 105 depicting the liver, further processing of the input volume 105 by the AI pipeline 100 through stages 130-170 may be interrupted (i.e., rejection of the input volume 105). If the determinations of the determination logic 127 and 128 result in a determination that the input volume 105 has images of a single phase and depicts at least the predetermined amount of the liver, the portion of the input volume 105 depicting the anatomical structure is forwarded to the next stage 130 of the AI pipeline 100 for processing. While the exemplary illustrative embodiment forwards the sub-portion of the input volume containing the liver for further processing, in other illustrative embodiments context around the liver may also be provided, which may be accomplished by adding a predetermined amount of margin above and below the selected liver region. Depending on how much context is needed for subsequent processing operations, the margin may be increased up to completely covering the original input volume.
In the next stage 130 of the AI pipeline 100, the AI pipeline 100 performs lesion detection on the portion of the input volume 105 that represents the anatomical structure of interest (e.g., the liver). This liver/lesion detection stage 130 of the AI pipeline 100 uses an ensemble of ML/DL computer models 132-136 to detect the liver, and lesions in the liver, as represented in the input volume 105. The ensemble of ML/DL computer models 132-136 operates on the input volume 105 to generate both a representation of the liver and its contour and predictions of the lesions present in the liver, as described hereafter.
In one illustrative embodiment, the configured and trained ML/DL computer model 132 is executed on the input volume 105 to detect the presence of the liver. The ML/DL computer model 132 may be the same as the ML/DL computer model 125 employed in the previous AI pipeline stage 120 and, thus, may utilize the previously obtained results. A plurality (two or more) of other ML/DL computer models 134 and 136 are configured and trained to perform lesion detection in the portion of the liver depicted in the medical images of the input volume 105. The first ML/DL computer model 134 is configured and trained to operate directly on the input volume 105 and generate lesion predictions. The second ML/DL computer model 136 is configured with two different decoders implementing two different loss functions: one is a loss function that penalizes false negative errors (i.e., the classification incorrectly indicates the absence of a lesion (normal anatomy)), and the second is a loss function that penalizes false positive errors (i.e., the classification incorrectly indicates the presence of a lesion (abnormal anatomy)). The first decoder of the ML/DL computer model 136 is trained to recognize patterns representing a relatively large number of different lesions, at the expense of having a large number of false positives. The second decoder of the ML/DL computer model 136 is trained to be less sensitive in the detection of lesions, but the lesions it does detect are more likely to be accurately detected. A third loss function, for the ML/DL computer model ensemble as a whole, compares the results of the decoders of the ML/DL computer model 136 with each other and makes them consistent with each other. The lesion prediction results of the first and second ML/DL computer models 134, 136 are combined to generate the final lesion prediction of the ensemble, while the other ML/DL computer model 132, which generates a prediction of the liver mask, provides an output representative of the liver and its contour. An example architecture of these ML/DL computer models 132-136 will be described in more detail below with reference to FIG. 6.
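A framework-free sketch of such a three-part loss is given below; the weighting values and the mean-squared consistency term are illustrative assumptions, since the disclosure does not specify the exact loss formulation.

```python
import numpy as np

def weighted_bce(p, y, pos_w=1.0, neg_w=1.0, eps=1e-7):
    """Binary cross-entropy over voxel probability maps with separate
    weights for lesion (positive) and background (negative) voxels."""
    p = np.clip(p, eps, 1.0 - eps)
    return -np.mean(pos_w * y * np.log(p) + neg_w * (1.0 - y) * np.log(1.0 - p))

def dual_decoder_loss(p_sensitive, p_specific, y, consistency_w=1.0):
    # decoder 1: up-weight lesion voxels so that missed lesions
    # (false negatives) are penalized more heavily
    loss_fn = weighted_bce(p_sensitive, y, pos_w=4.0)
    # decoder 2: up-weight background voxels so that spurious detections
    # (false positives) are penalized more heavily
    loss_fp = weighted_bce(p_specific, y, neg_w=4.0)
    # third loss: drive the two decoders' predictions to be consistent
    loss_agree = np.mean((p_sensitive - p_specific) ** 2)
    return loss_fn + loss_fp + consistency_w * loss_agree
```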
The results of the liver/lesion detection stage 130 of the AI pipeline 100 include one or more contours (outlines) of the liver and a detection map (e.g., a voxel-wise map of liver lesions detected in the input volume 105) that identifies the portions of the medical imaging data elements that correspond to the detected lesions 135. The detection map is then input to the lesion segmentation stage 140 of the AI pipeline 100. As will be described in more detail below, the lesion segmentation stage 140 uses a watershed technique and a corresponding ML/DL computer model 142 to partition the detection map into image element (e.g., voxel) partitions of the medical images (slices) of the input volume 105. The liver lesion segmentation stage 140 provides other mechanisms (such as an ML/DL computer model 144) that identify all contours corresponding to lesions present in the slices of the input volume 105 based on the partitions, and performs an operation that identifies which contours correspond to the same lesion in three dimensions. The lesion segmentation stage 140 further provides mechanisms (such as ML/DL computer models 146) that aggregate the related lesion contours to generate a three-dimensional partition of each lesion. Lesion segmentation uses the lesion image elements (e.g., voxels) represented in the medical images, and inpainting of non-liver tissue, to focus on each lesion individually and perform active contour analysis. In this way, individual lesions may be identified and processed without the analysis being biased by other lesions in the medical images or by portions of the images outside the liver.
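As a hedged illustration of how a watershed might split a detection map into per-lesion partitions, the following sketch uses a distance-transform-based seeding heuristic; the thresholds and the seeding rule are assumptions, not the configuration described in this disclosure.

```python
from scipy import ndimage
from skimage.segmentation import watershed

def partition_detection_map(detection_map, threshold=0.5):
    """Split a voxel-wise lesion probability map into labeled partitions,
    one label per lesion candidate (illustrative thresholds)."""
    binary = detection_map > threshold
    distance = ndimage.distance_transform_edt(binary)
    # seeds: deep interior voxels of the distance transform, one per candidate
    seeds, _ = ndimage.label(distance > 0.7 * distance.max())
    return watershed(-distance, markers=seeds, mask=binary)
```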
The result of the lesion segmentation 140 is a list of the lesions 148 in the input volume 105 together with their corresponding appearance or contour. These outputs 148 are provided to the false positive removal stage 150 of the AI pipeline 100. The false positive removal stage 150 uses a configured and trained ML/DL computer model that applies a dual operating point strategy to reduce false positive lesion detections in the lesion list generated by the lesion segmentation stage 140 of the AI pipeline 100. The first operating point is selected to be sensitive to false positives, the ML/DL computer model of the false positive removal stage 150 being configured to remove as many lesions as possible. It is then determined whether a predetermined number of lesions, or fewer, remain in the list after the sensitive false positive removal. If so, a second operating point that is relatively less sensitive to false positives is used to reconsider the lesions removed from the list. The results of these two operating points identify the final filtered list of lesions to be further processed by the lesion classification stage of the AI pipeline.
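One way the dual operating point strategy might look in code is sketched below; the threshold values and the minimum-lesion count are illustrative assumptions.

```python
def remove_false_positives(lesions, lesion_scores,
                           aggressive_thr=0.8, relaxed_thr=0.5, min_lesions=3):
    """lesion_scores: per-lesion probability of being a true lesion,
    as output by the false positive removal ML/DL model."""
    # first operating point: sensitive to false positives, i.e., remove
    # as many suspect detections as possible
    kept = [l for l, s in zip(lesions, lesion_scores) if s >= aggressive_thr]
    if len(kept) <= min_lesions:
        # few lesions survived: reconsider the removed lesions at a second,
        # less aggressive operating point
        kept = [l for l, s in zip(lesions, lesion_scores) if s >= relaxed_thr]
    return kept
```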
After the false positives have been removed from the lesion list and contours generated by the lesion segmentation stage 140, the resulting filtered lesion list 155 is provided as input to the lesion classification stage 160 of the AI pipeline 100, which executes a configured and trained ML/DL computer model on the lesion list and its corresponding contour data, thereby classifying each lesion into one of a plurality of predetermined lesion classifications. For example, each lesion in the final filtered list of lesions and its attributes (e.g., contour data) may be input into the trained ML/DL computer model of the lesion classification stage 160, which then operates on this data to classify the lesion as a particular predetermined type or class of lesion.
After the lesions are classified by the lesion classification stage 160 of the AI pipeline 100, the AI pipeline 100 generates an output 170 that includes a finalized list of lesions and their classifications, as well as any contour attributes of the lesions. In addition, the output 170 of the AI pipeline 100 may also include contour information for the liver obtained from the liver/lesion detection stage 130. The output generated by the AI pipeline 100 may be provided to a further downstream computing system 180 for further processing and generation of a representation of the anatomical structure of interest and any detected lesions present in the anatomical structure. For example, a graphical representation of the input volume may be generated in a medical image viewer or other computer application of the downstream computing system 180, where the anatomical structure and detected lesions are superimposed or otherwise highlighted in the graphical representation using the contour information generated by the AI pipeline. In other illustrative embodiments, downstream processing by the downstream computing system 180 may include diagnostic decision support operations and automated medical imaging report generation based on the list, classifications, and contours of the detected lesions. In still other illustrative embodiments, based on the classification of the lesions, different treatment recommendations may be generated for review and consideration by a practitioner. In some demonstrative embodiments, the list of lesions, their classifications, and contours may be stored in association with the patient identifier in a historical data structure of the downstream computing system 180, such that the results of multiple executions of the AI pipeline 100 on different input volumes 105 associated with the same patient may be stored and evaluated over time. For example, differences between lists of lesions and/or their associated classifications and contours may be determined to assess the progression of a patient's disease or medical condition, and such information may be presented to a medical professional for use in assisting in the treatment of the patient. Other downstream computing systems 180 and processing of the anatomical structure and lesion detection information generated by the AI pipeline 100 may be implemented without departing from the spirit and scope of the present invention.
FIG. 2 is an example flowchart outlining an example operation of an AI pipeline in accordance with one illustrative embodiment. The operations outlined in FIG. 2 may be implemented by the various logic stages, including the configured and trained ML/DL computer models, shown in FIG. 1 and described above, with reference to the specific example embodiments described in the separate sections of this specification that follow. It should be appreciated that this operation is specific to an automated artificial intelligence pipeline implemented in one or more data processing systems having one or more computing devices specifically configured to implement these automated computer tool mechanisms. There is no human intervention in the operations outlined in FIGS. 1 and 2 except at the time of medical image volume creation and in the use of the output by the downstream computing systems. The present invention specifically provides improved automated artificial intelligence computing mechanisms to perform the described operations, which avoid human interaction and reduce the potential errors of previous manual processes by providing new and improved processes that are specifically distinct from any previous manual processes and specifically directed to providing the logic and data structures that allow the improved artificial intelligence computing mechanisms of the present invention to be implemented in an automated computing tool.
As shown in fig. 2, the operation begins by receiving an input volume of a medical image from a medical imaging technology computing system (e.g., a computing system providing Computed Tomography (CT) medical images) (step 210). The AI pipeline operates on the received input volume to perform body part detection (step 212) such that a determination can be performed as to whether a body part of interest is present in the received input volume (step 214). If no body part of interest is present in the input volume (e.g. abdomen in case of liver lesion detection and classification), the operation terminates. If there is a body-part of interest in the input volume, the phase classification and the minimal anatomical structure assessment are performed either sequentially or in parallel.
That is, as shown in FIG. 2, phase classification is performed on the input volume (step 216) to determine whether the input volume includes medical images (slices) of a single phase of medical imaging (e.g., pre-contrast imaging, portal contrast imaging, delayed phase, etc.) or of multiple phases. It is then determined whether the phase classification indicates a single phase or multiple phases (step 218). If the input volume includes medical images directed to multiple phases, the operation terminates; otherwise, if the input volume comprises medical images directed to a single phase, the operation continues to step 220.
In step 220, detection of the anatomical structure of interest (e.g., the liver in the depicted example) is performed in order to determine whether a minimum amount of the anatomical structure is present in the input volume to enable accurate performance of the subsequent stages of the AI pipeline operation. It is determined whether the minimum amount of the anatomical structure is present (e.g., at least 1/3 of the liver being represented in the input volume) (step 222). If the minimum amount is not present, the operation terminates; otherwise, operation continues to step 224.
In step 224, liver/lesion detection is performed to generate a contour and detection map of the lesion. The contours and detection maps are provided to lesion segmentation logic, which performs lesion segmentation based on the contours and detection maps (e.g., liver lesion segmentation in the depicted example) (step 226). Lesion segmentation results in the generation of a list of lesions and their contours, along with detection and contour information for anatomical structures (e.g., liver) (step 228). Based on this list of lesions and their contours, a false positive removal operation is performed on the lesions in the list to remove false positives and generate a list of filtered lesions and their contours (step 230).
The list of filtered lesions and their contours is provided to lesion classification logic, which performs lesion classification to generate a final list of lesions, their contours, and lesion classifications (step 232). The final list is provided to a downstream computing system (step 234) along with liver contour information, which may operate on this information to generate medical imaging views in a medical imaging viewer application, generate treatment recommendations based on classification of detected lesions, evaluate historical progression of lesions of the same patient over time based on comparison of final lesion lists generated by the AI pipeline at different points in time, and so on.
Thus, the illustrative embodiments as outlined above provide automated artificial intelligence mechanisms and ML/DL computer models that operate on an input volume of medical images and generate a list of lesions, their contours, and their classifications, while minimizing false positives. The illustrative embodiments provide an automated artificial intelligence computer tool that specifically identifies, in a given set of image voxels of an input volume, which of the voxels correspond to a portion of the anatomical structure of interest (e.g., the liver) and which of the voxels correspond to a lesion in the anatomical structure of interest (e.g., a liver lesion). The illustrative embodiments provide a significant improvement over previous methods, both manual and automated, in that the illustrative embodiments may be integrated as a fully automated computer tool in the clinician workflow. Indeed, because the early stages of the AI pipeline design of the illustrative embodiments accept only single phase input volumes of the appropriate body part (e.g., abdominal scans) and reject input volumes that do not depict the anatomical structure of interest (e.g., the liver), or that do not depict a predetermined amount of the anatomical structure of interest (e.g., too small an amount of the liver), only meaningful input volumes are processed through the automated AI pipeline, thereby preventing the radiologist from expending valuable manual resources reviewing useless or defective results from input volumes that do not contain the anatomical structure of interest (e.g., non-liver cases). In addition to preventing the flooding of the radiologist with useless information, the automated AI pipeline of the illustrative embodiments also ensures smooth information technology integration by avoiding congestion of the AI pipeline and the downstream computing systems (such as the network and the archival and review systems) with data associated with cases that do not correspond to, or cannot provide a sufficient amount of, the anatomical structure of interest. Furthermore, as described above, the automated AI pipeline of the illustrative embodiments allows for the precise detection, measurement, and characterization of lesions in a fully automated manner, which is made technically possible by the components of the automated AI pipeline structure of one or more of the illustrative embodiments and its corresponding automated ML/DL-based computer models.
ML/DL computer model for detecting the presence of a minimum amount of anatomical structures in an input volume
As previously mentioned, as part of the processing of the input volume 105, it is important to ensure that the input volume 105 represents a single phase of medical imaging and that at least a minimum amount of the anatomical structure of interest is represented in the input volume 105. To determine that a minimum amount of the anatomical structure of interest is present in the input volume 105, in one illustrative embodiment, the determination logic 128 implements a specially configured and trained ML/DL computer model that estimates slice scores used to determine the portion of the anatomical structure (e.g., the liver) present in the input volume 105. The following description provides example embodiments of such a configured and trained ML/DL computer model based on the defined axial scoring technique.
FIG. 3A is an exemplary diagram illustrating an example input volume (medical image) of the abdomen of a human patient according to one illustrative embodiment. In the depiction of FIG. 3A, a two-dimensional representation of a three-dimensional volume is shown. A slice appears as a horizontal line within the two-dimensional representation of FIG. 3A, but corresponds to a plane extending into and/or out of the page, i.e., a flat two-dimensional slice of the human body, where the stacking of these planes results in a three-dimensional image.
As shown in FIG. 3A, the illustrative embodiment defines an axial score for slices ranging from 0 to 1. The axial score is defined such that the First Slice containing the Liver (FSL) has a slice score of 0 and the Last Slice containing the Liver (LSL) has a score of 1. In the depicted example, the first slice and last slice are defined relative to the lowermost slice in the volume (MISV) and the uppermost slice in the volume (MSSV), with lowermost and uppermost determined along a given axis of the volume (e.g., the y-axis in the depicted example of FIG. 3A). Thus, in the example depicted here, the MSSV is the slice at the highest y-axis value and the MISV is the slice at the lowest y-axis value. For example, the MISV may be closest to a lower limb of the biological entity (e.g., a foot of a human subject), and the MSSV may be closest to an upper part of the biological entity (e.g., a head of a human subject). The FSL is the slice depicting the anatomical structure of interest (e.g., the liver) that is relatively closest to the MISV. The LSL is the slice depicting the anatomical structure of interest that is relatively closest to the MSSV. In one illustrative embodiment, a trained ML/DL computer model (e.g., a neural network) may assign an axial score by taking a chunk of slices as input and outputting the height (axial score) of the center slice in the chunk. The trained ML/DL computer model is trained with a cost function that minimizes the error with respect to the actual height (e.g., least squares error). This trained ML/DL computer model is then applied to all chunks covering the input volume (possibly with some overlap between chunks).
Liver axial score estimation (LAE) is defined by a pair of slice scores, s_sup and s_inf, which correspond to the slice scores of the MSSV and MISV slices, respectively. The ML/DL computer model of the determination logic 128 of FIG. 1 is specially configured and trained to determine the slice scores s_sup and s_inf of the input volume 105, and, knowing these slice scores, the mechanism of the illustrative embodiments is able to determine the fraction of the liver in the field of view of the input volume 105.
In some demonstrative embodiments, the slice scores s_sup and s_inf may be found indirectly by first dividing the input volume 105 into a plurality of segments (e.g., segments of X slices each, e.g., 20 slices) and then, for each segment, executing a configured and trained ML/DL computer model on the slices of that segment to estimate the slice scores s'_sup and s'_inf of the first and last slices in that segment, where "first" and "last" may be determined according to a direction of progression along an axis of the three-dimensional volume 105 (e.g., proceeding along the y-axis from the minimum y-axis value slice to the highest y-axis value slice). Given s'_sup and s'_inf, s_sup and s_inf are found by extrapolation, because it is known how the segments are positioned relative to the entire volume 105. It should be noted that for each volume there will be n estimates of s_sup and s_inf, where n is the number of segments per volume. In one illustrative embodiment, the final estimate is obtained by taking an unweighted average of the n estimates; however, in other illustrative embodiments, other functions of the n estimates may be used to generate the final estimate.
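Under the assumption that the axial score varies linearly with slice index, the per-segment extrapolation and unweighted averaging might be implemented as in the following sketch; the linear-extrapolation assumption and the function signature are illustrative.

```python
import numpy as np

def extrapolate_volume_scores(segment_estimates, segment_bounds, num_slices):
    """segment_estimates: list of (s'_sup, s'_inf) per segment;
    segment_bounds: list of (first_slice_index, last_slice_index) per segment;
    returns the averaged (s_sup, s_inf) estimates for the whole volume."""
    sup_list, inf_list = [], []
    for (sp_sup, sp_inf), (first, last) in zip(segment_estimates, segment_bounds):
        slope = (sp_sup - sp_inf) / (last - first)  # score change per slice
        inf_list.append(sp_inf - slope * first)                    # score at MISV
        sup_list.append(sp_sup + slope * (num_slices - 1 - last))  # score at MSSV
    # final estimate: unweighted average of the n per-segment extrapolations
    return float(np.mean(sup_list)), float(np.mean(inf_list))
```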
For example, FIG. 3B shows a segmentation of the input volume of FIG. 3A, wherein the segments of slices are shown along with their corresponding axial scores s'_inf and s'_sup. As shown in FIG. 3B, in this example a segment is defined as 20 slices separated by 5 mm. For each 20-slice segment of the volume, the slice scores s'_sup and s'_inf are estimated by the ML/DL computer model, and s_sup and s_inf are obtained by extrapolating from these s'_sup and s'_inf values along a given range (e.g., a range from 0 to 1, a range from -0.5 to 1.2, or any other desired predetermined range as appropriate for the particular implementation). In this example, assuming a predetermined range of -0.5 to 1.2, if, by applying the ML/DL computer model and extrapolating, s_sup is estimated to be about 1.2 and s_inf is estimated to be -0.5, this indicates that the entire liver is contained in the volume. Similarly, if s_sup is estimated to be 1.2 and s_inf to be 0.5, then these values indicate that about 50% of the upper axial liver extension is contained in the volume (e.g., with a coverage of (1.2 - 0.5)/(1.2 - (-0.5)) ≈ 0.41). As another example, in another illustrative embodiment, where the liver starts at -2.0 and ends at 0.8 (i.e., s_sup is estimated to be 0.8 and s_inf is estimated to be -2.0), the upper limit of the liver is below 1.2, so the liver is cut in its upper part, and the lower limit is below -0.5, so the bottom of the liver is completely covered. This indicates that approximately 80% of the lower axial liver extension is contained in the volume (i.e., the coverage is (0.8 - max(-2.0, -0.5))/(1.2 - (-0.5)) ≈ 0.76).
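The coverage arithmetic in these examples can be reproduced directly; the sketch below assumes the extended score range [-0.5, 1.2] used above and the 1/3 threshold mentioned earlier.

```python
def liver_coverage(s_sup, s_inf, lo=-0.5, hi=1.2):
    """Fraction of the extended axial score range covered by the liver,
    given the extrapolated scores of the volume's top and bottom slices."""
    return max(min(s_sup, hi) - max(s_inf, lo), 0.0) / (hi - lo)

def has_minimum_liver(s_sup, s_inf, threshold=1.0 / 3.0):
    return liver_coverage(s_sup, s_inf) >= threshold

print(round(liver_coverage(1.2, 0.5), 2))   # 0.41, as in the first example
print(round(liver_coverage(0.8, -2.0), 2))  # 0.76, as in the second example
```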
FIG. 3C is an exemplary diagram of the input volume of FIG. 3A, where the volume is axially divided into n fully overlapping sections. In the depicted example, there are 7 sections, indicated by arrows. It should be noted that in this example the last two sections (the arrows at the top of the figure) are almost identical. As before, the s'_sup and s'_inf values are estimated by the ML/DL computer model for each of these sections and used to extrapolate the s_sup and s_inf values for the MSSV and MISV slices; the s_sup and s_inf values may then be used to determine the amount of the anatomical structure of interest present in the input volume 105.
Thus, the s_sup and s_inf values of the MSSV and MISV are found indirectly by first dividing the input volume 105 into sections and then estimating, for each section, the slice scores s'_sup and s'_inf of the first and last slices in that section. Given these estimates, s_sup and s_inf are estimated by extrapolation, because it is known how each section is positioned relative to the entire input volume 105. There are n values of s_sup and s_inf, one extrapolated from each section, where n is the number of sections per volume. The final estimate may be obtained by any suitable combination function of the n estimates (e.g., an unweighted average of the n estimates, or any other suitable combination function).
FIGS. 4A-4C show example diagrams of an illustrative embodiment of an ML/DL computer model configured and trained to estimate the s'_sup and s'_inf values of a section of an input volume of medical images in accordance with one illustrative embodiment. The ML/DL computer model of FIGS. 4A-4C is only one example of an architecture for the ML/DL computer model, and many modifications may be made to the architecture (such as changing the tensor size of the input slices of the input volume, changing the number of nodes in the layers of the ML/DL computer model, changing the number of layers, etc.) without departing from the spirit and scope of the present invention. In view of this description, those of ordinary skill in the art will recognize how to modify the ML/DL computer model of the illustrative embodiments for a desired implementation.
As shown in FIG. 4A, a sequence of 20 slices representing a section 410 or "slab" of the input volume 105 is provided as input to Processing Blocks (PBs) 420-430. In the depicted illustrative embodiment, PBs 420-430 are logical blocks of mixed convolutional and LSTM layers (as shown in FIGS. 4B and 4C). Features are extracted by the convolutional layers of the PBs 420, 430 and then fed as input to the LSTM layers of the PBs 420, 430. This is an intelligent/lightweight modeling of the fact that the slices have a particular order in an anatomical region or anatomical structure of interest (e.g., abdomen/liver) driven by the anatomy (e.g., the relative positions of the liver, kidneys, heart, etc., in addition to the liver anatomy itself). In the depicted example, the tensor size of each of the 20 input slices 410 is 128x128. In this example embodiment, the first processing block 420 reduces the size of the tensor by a factor of 8 to generate a 20-slice section with slices of size 16x16x32 (it being understood that the number of slices in a section is implementation specific and may be modified without departing from the spirit and scope of the present invention), where 32 is the number of filters. The second processing block 430 converts the input section slices into a 20-slice section with slices of dimension 2x2x64, where 64 is the number of filters. A subsequent neural network 440, configured with a flatten layer, a dense layer, and a linear layer, is configured and trained to generate the s'_sup and s'_inf estimates for the input section 410 of the input volume 105. FIG. 4B shows the composition of the Processing Blocks (PBs) with respect to the convolutional and LSTM layers in accordance with one illustrative embodiment, and FIG. 4C shows an example configuration of each of the convolutional and LSTM layers of each PB in accordance with one illustrative embodiment.
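A hedged sketch of such an architecture in Keras is shown below; the kernel sizes, strides, dense width, and the use of a TimeDistributed Conv2D followed by a ConvLSTM2D layer are assumptions made to reproduce the stated tensor shapes (20x128x128 -> 20x16x16x32 -> 20x2x2x64 -> two outputs), not the exact layer configuration of this disclosure.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def processing_block(x, filters, stride):
    # convolutional feature extraction applied to every slice of the section,
    # followed by a ConvLSTM layer that models the anatomical slice ordering
    x = layers.TimeDistributed(
        layers.Conv2D(filters, 3, strides=stride, padding="same",
                      activation="relu"))(x)
    return layers.ConvLSTM2D(filters, 3, padding="same",
                             return_sequences=True)(x)

inputs = layers.Input(shape=(20, 128, 128, 1))  # one 20-slice section (slab)
x = processing_block(inputs, 32, 8)             # -> (20, 16, 16, 32)
x = processing_block(x, 64, 8)                  # -> (20, 2, 2, 64)
x = layers.Flatten()(x)
x = layers.Dense(64, activation="relu")(x)
outputs = layers.Dense(2, activation="linear")(x)  # (s'_sup, s'_inf)

model = models.Model(inputs, outputs)
model.compile(optimizer="adam", loss="mse")  # least-squares error, per the text
```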
With continued reference to FIGS. 4A-4C, during training of the ML/DL computer model, in one illustrative embodiment, medical imaging data (e.g., Digital Imaging and Communications in Medicine (DICOM) data) is assembled into input volumes, each being a three-dimensional array of float32 values of size S_i x 512 x 512 holding Hounsfield Unit (HU) values, which are standardized physical values depicting the X-ray attenuation of the material present at a given location (e.g., voxel). S_i is the number of slices in the i-th volume, where i ranges from 0 to N-1 and N is the total number of volumes. Each input volume is processed by the body part detector and an approximate region corresponding to the abdomen is extracted as described above (in the case of liver detection). The abdomen is defined as the contiguous region between axial scores -30 and 23, e.g., as given by the body part detector. Slices outside this contiguous region are rejected, and the ground truth values may be defined as the appropriately adjusted positions of the FSL and LSL. For example, assume that the input volume spans axial scores a to b; if [a, b] and [-30, 23] do not overlap, the input volume is rejected, in other words, if b < -30 or if a > 23.
The input volumes are re-sliced to a predetermined slice spacing (e.g., 5 mm), with the input sections 410 or "slabs" overlapping one another. Each input section 410 is reshaped in the x, y dimensions to 128x128, which results in N sections 410 of shape M_i x 128 x 128. This is referred to as down-sampling of the data in the input volume. Since the ordering of slices within the input volume depends on coarse information (e.g., the size of the organs), the AI pipeline still operates well on the downsampled data, and both the processing and training time of the AI pipeline are improved due to the reduced size of the downsampled data.
Input sections 410 having fewer than a predetermined number of slices (e.g., 20) or smaller than a predetermined physical extent (e.g., 55 mm) are rejected, resulting in N' sections of shape M_i x 128 x 128. Values in the sections are clipped and normalized using a linear transformation from a source range (e.g., [-1024, 2048]) to the range [0, 1]. The N' sections of shape M_i x 128 x 128, processed as described above, then constitute the training set on which the neural network 440 is trained to generate the s'_sup and s'_inf estimates for the input sections.
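The clipping and normalization step can be sketched in a few lines of Python; the HU range below is the example from the text:

    import numpy as np

    HU_MIN, HU_MAX = -1024.0, 2048.0  # example clipping range

    def clip_and_normalize(section: np.ndarray) -> np.ndarray:
        # Clip HU values, then linearly map [HU_MIN, HU_MAX] to [0, 1]
        clipped = np.clip(section, HU_MIN, HU_MAX)
        return (clipped - HU_MIN) / (HU_MAX - HU_MIN)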
With respect to performing inference with the trained neural network 440, the above operations of processing the input volume 105 through body part detection, slice selection corresponding to the body part of interest, re-slicing, reshaping, rejection of slices that do not meet the predetermined requirements, and generation of clipped and normalized sections are performed again for the new input volume 105. After generating the clipped and normalized sections, the input volume 105 is divided into R = ceil((M - 10)/10) subvolumes or sections of 20 slices each, where M is the number of slices, thereby generating a partition of the slices into overlapping chunks. For example, a volume with N' = 31 slices (slice numbers 0-30) yields three sections or subvolumes containing the following overlapping slices: 0-19, 10-29, and 11-30. These sections or subvolumes will typically have an overlap of at least about 50%.
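The following Python sketch reproduces the overlapping-chunk partition described above (e.g., 31 slices yield sections 0-19, 10-29, and 11-30); the shifting of the final section so that it ends exactly at the last slice is an assumption consistent with the example:

    import math

    def section_starts(num_slices: int, size: int = 20, stride: int = 10):
        # R = ceil((M - 10) / 10) overlapping sections of 20 slices each
        r = math.ceil((num_slices - stride) / stride)
        # clamp each start so that every section fits within the volume
        return [min(i * stride, num_slices - size) for i in range(r)]

    # section_starts(31) -> [0, 10, 11], i.e., slices 0-19, 10-29, 11-30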
Thus, an ML/DL computer model is provided, configured, and trained such that, given a defined axial score range from 0 to 1, the s_sup and s_inf values of an input volume are estimated based on the model's estimates of the s'_sup and s'_inf values for sections of the volume, each section corresponding to a predetermined number of slices (medical images). From these estimates, it may be determined whether the input volume includes medical images (slices) that collectively represent at least a predetermined amount of an anatomical structure of interest (e.g., a liver). As discussed previously, this determination may be part of the determination logic 128 of the AI pipeline 100, which is used to determine whether there is a sufficient representation of the anatomical structure in the input volume 105 to allow accurate liver/lesion detection, lesion segmentation, etc., in further downstream stages of the AI pipeline 100.
FIG. 5 is a flowchart outlining an example operation of the liver detection and predetermined-amount-of-anatomy determination logic of an AI pipeline in accordance with one illustrative embodiment. As shown in FIG. 5, the liver detection operation of the AI pipeline begins by receiving an input volume (step 510) and dividing the input volume into a plurality of overlapping sections having a predetermined number of slices per section (step 520). The slices of each section are input into a trained ML/DL computer model that estimates the axial scores of the first and last slices in each section (step 530). The axial scores of the first and last slices are used to extrapolate the scores of the most inferior slice in the volume (MISV) and the most superior slice in the volume (MSSV) (step 540). This results in multiple estimates of the axial scores of the MISV and MSSV, which are then combined by a function of the respective estimates (e.g., a weighted average, etc.) to generate final estimates of the axial scores of the MISV and MSSV for the input volume (step 550). The estimated axial scores of the MISV and MSSV are then compared to criteria for determining whether a predetermined amount of the anatomical structure of interest (e.g., liver) is present in the input volume (step 560). Thereafter, the operation terminates.
Liver/lesion detection
As previously described, assuming the input volume 105 is determined to represent a single phase and to have a predetermined amount of the anatomical structure of interest represented in its slices, liver/lesion detection is performed on the portion of the input volume 105 that includes the anatomical structure of interest. In one illustrative embodiment, the liver/lesion detection logic stage 130 of the AI pipeline 100 employs a configured and trained ML/DL computer model that operates to detect the anatomical structure of interest (e.g., the liver) in the slices of the input volume 105 (again, in some illustrative embodiments, this may be the same ML/DL computer model 125 used in stage 120 for liver detection). The liver/lesion detection logic stage 130 of the AI pipeline 100 also includes an ensemble of a plurality of other configured and trained ML/DL computer models to detect lesions in images of the anatomical structure of interest (liver).
FIG. 6 is an example diagram of an ensemble of ML/DL computer models for performing lesion detection in an anatomical structure of interest (e.g., a liver) in accordance with one illustrative embodiment. The ensemble of ML/DL computer models 600 includes a first ML/DL computer model 610 for detecting the anatomical structure of interest (e.g., the liver) and generating a corresponding mask. The ensemble 600 further comprises a second ML/DL computer model 620 configured and trained to process liver-masked inputs and to generate lesion predictions using two competing loss functions implemented in the two decoders of the second ML/DL computer model 620. One loss function is configured to penalize false positive errors (resulting in low sensitivity but high precision) and the other is configured to penalize false negative errors (resulting in high sensitivity but lower precision). An additional loss function (referred to as the consistency loss 627 in FIG. 6) is employed for the second ML/DL computer model 620, which drives the outputs generated by the two competing decoders to be similar (consistent) with each other. The ensemble further includes a third ML/DL computer model 630 configured and trained to process the input volume 105 directly and generate lesion predictions.
As shown in FIG. 6 and described above, the ensemble 600 includes a first configured and trained ML/DL computer model 610 specifically configured and trained to recognize the anatomical structure of interest in an input medical image. In some illustrative embodiments, the first ML/DL computer model 610 includes a U-Net neural network model 612 that is configured and trained to perform image analysis to detect a liver within a medical image; however, it should be understood that the illustrative embodiments are not limited to this particular neural network model and that any ML/DL computer model capable of performing segmentation may be utilized without departing from the spirit and scope of the present invention. U-Net is a convolutional neural network developed for biomedical image segmentation at the Computer Science Department of the University of Freiburg, Germany. The U-Net neural network is based on a fully convolutional network whose architecture is modified and extended to work with fewer training images and produce more accurate segmentations. U-Net is generally known in the art and thus a more detailed explanation is not provided here.
As shown in FIG. 6, in one illustrative embodiment, the first ML/DL computer model 610 may be trained to process a predetermined number of slices at a time, where the number is determined to be suitable for the desired implementation (e.g., 3 slices, determined by an empirical process to produce good results). In one illustrative embodiment, the slices of the input volume are, for example, 512x512 pixel medical images, although other implementations may use different slice sizes without departing from the spirit and scope of the illustrative embodiments. The U-Net generates a segmentation of the anatomical structures in the input slices, resulting in one or more segments corresponding to the anatomical structure of interest (e.g., the liver). As part of this segmentation, the first ML/DL computer model 610 generates a segment representing a liver mask 614. This liver mask 614 is provided as an input to at least one of the other ML/DL computer models 620 of the ensemble 600 in order to focus the processing by the ML/DL computer model 620 only on the portion of each input slice of the input volume 105 corresponding to the liver. By preprocessing the input of the ML/DL computer model with the liver mask 614, the processing may be focused on the portion of the input slice that corresponds to the anatomical structure of interest, rather than on the "noise" in the input image. Other ML/DL computer models (e.g., ML/DL computer model 630) receive the input volume 105 directly, without masking by the liver mask 614 generated by the first ML/DL computer model 610.
In the depicted illustrative embodiment of the ensemble 600, the third ML/DL computer model 630 is comprised of an encoder portion 634-636 and a decoder portion 638. The ML/DL computer model 630 is configured to receive a 9-slice slab of the input volume 105, which is then separated into groups 631-633 of 3 slices each, where each group 631-633 is input into a corresponding encoder network 634-636. Each encoder 634-636 is a convolutional neural network (CNN), such as a DenseNet. The architecture of the original DenseNet network includes many convolutional layers with skip connections that downsample a 3-slice full-resolution input into many feature channels of smaller resolution. A fully-connected head then aggregates all the features and maps them to multiple categories in the final output of the DenseNet. Because the DenseNet network is used as an encoder in the depicted architecture, this head is removed and only the downsampled features are retained. Then, in the concatenation (NHWC) logic 637, all feature channels are concatenated and passed into the decoder stage 638, which has the effect of upsampling the image until the desired output probability map resolution (e.g., 512x512) is reached.
Training of the ML/DL computer model 630 uses two different loss functions. The main loss function is an adaptive loss specifically configured to penalize false positive errors in slices without lesions in the ground truth and to penalize false negative errors in slices with lesions in the ground truth. This loss function is a modified version of the Tversky loss, as follows:
For each output slice:

    TP = sum(prediction * target)
    FP = sum((1 - target) * prediction)
    FN = sum((1 - prediction) * target)
    LOSS = 1 - (TP + 1) / (TP + 1 + α*FN + β*FP)
Where "prediction" is the output probability of the ML/DL computer model 630 and "target" is the underlying truth lesion mask. The output probability values range between 0 and 1. For each pixel in the slice, the object has either a 0 or a 1. For slices with no lesions in them, the "α" term is small (e.g., zero) and "β" is large (e.g., 10). For a slice with a lesion in it, "α" is large (e.g., 10) and "β" is small (e.g., 1).
The second loss function 639 is attached to the outputs of the encoders 634-636. Because the input to this loss comes from the middle of the ML/DL computer model 630, it is referred to as "deep supervision" 639. Deep supervision has been shown to enable the encoder neural networks 634-636 to learn a better representation of the input data during training. In one illustrative embodiment, the second loss is a simple mean squared error on a prediction of whether a slice has a lesion. Thus, the output features of the encoders 634-636 are mapped, using a mapping network, to 9 values between 0 and 1 representing the probability of a lesion being present in each of the 9 input slices. The decoder 638 generates an output specifying a probability map of the detected lesions in the input image.
The second ML/DL computer model 620 receives a 3-slice preprocessed input from the input volume, which has been preprocessed with the liver mask 614 generated by the first ML/DL computer model 610 so as to retain the portions of the 3 slices corresponding to the liver mask 614. The resulting preprocessed input slices (of size 192x192x3 in the depicted example embodiment) are provided to the second ML/DL computer model 620, which comprises a DenseNet-169 (D169) encoder 621 connected to two decoders (2D DEC, denoting a decoder consisting of 2-dimensional neural network layers). The D169 encoder 621 is a neural network feature extractor widely used in computer vision applications. It consists of a series of convolutional layers in which the features extracted by each layer are connected to every other layer in a feed-forward manner. The features extracted by the encoder 621 are passed to two independent decoders 622, 623, where each decoder 622, 623 is composed of two-dimensional convolutional layers and upsampling layers (referred to as 2D DEC in FIG. 6). Each decoder 622, 623 is trained to detect lesions (e.g., liver lesions) in the input slices. As discussed previously and below, although the two decoders 622, 623 are trained to perform the same task (i.e., lesion detection), a key difference in their training is that they each utilize a different loss function in order to drive the detection training in two competing directions. The final detection map of the second ML/DL model 620 is combined with the final detection map of the third ML/DL model 630 by an averaging operation 640. This process is applied over all input slabs of the input volume 105 to generate a final detection map (e.g., of liver lesions).
As described above, the second ML/DL computer model 620 is trained using two different loss functions that aim at opposite detection operating-point performance. That is, one of the decoders 622 is trained with a loss function that penalizes false negative lesion detection errors, resulting in high-sensitivity detection with relatively low precision, while the other decoder 623 is trained with a loss function that penalizes false positive lesion detection errors, resulting in lower-sensitivity detection but with high precision. An example of these loss functions is the focal Tversky loss (see Abraham et al., "A Novel Focal Tversky Loss Function with Improved Attention U-Net for Lesion Segmentation", arXiv:1810.07842 [cs], October 2018), where the parameters are adjusted for high or low penalties on false positives and false negatives in accordance with the illustrative embodiment. A third loss function (the consistency loss 627) is used to enforce consistency between the predicted detections of the two decoders 622, 623. The consistency loss logic 627 compares the outputs 624, 625 of the two decoders 622, 623 with each other and drives these outputs to be similar to each other. Such a loss may be, for example, a mean squared error loss between the two predicted detections, a structural similarity loss, or any other loss that enforces consistency/similarity between the compared predicted detections.
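For instance, a mean-squared-error form of the consistency loss 627 could be sketched as follows (one of several losses the text says may be used):

    import numpy as np

    def consistency_loss(pred_a, pred_b):
        # Drives the two decoders' detection maps toward agreement
        return float(np.mean((pred_a - pred_b) ** 2))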
At runtime, using these opposing operating-point decoders 622, 623, the second ML/DL computer model 620 generates two lesion outputs 624, 625, which are input to slice average (SLC AVG) logic 626 that generates an average of the lesion outputs. This average of the lesion outputs is then resampled to generate an output that is dimensionally commensurate with the output of the third ML/DL computer model 630 for combination (note that this process includes undoing the liver masking operation, so that the lesion output is computed at the original 512x512x3 resolution).
At runtime, the slice averaging (SLC AVG) logic 626 operates on the lesion prediction outputs 624 and 625 of the decoders 622, 623 to generate the final detection map of the ML/DL model 620. It should be appreciated that although the consistency loss 627 is applied during training to drive each decoder 622, 623 to learn consistent detections, this loss is no longer utilized at runtime; rather, the ML/DL model 620 outputs two detection maps that are aggregated by the SLC AVG module 626. The result of the SLC AVG logic 626 is resampled to generate an output with dimensions commensurate with the input slab (512x512x3). All detections generated by the ML/DL model 620 for each slab of the input volume 105 are combined with the detections generated by the ML/DL model 630 via volume averaging (VOL AVG) logic 640, which computes the average of the two detection masks at the voxel level. The result is a final lesion mask 650 corresponding to the lesions detected in the input volume 105.
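The aggregation at inference time reduces to two averaging steps, sketched below under the assumption that all maps have already been resampled to a common 512x512x3 resolution:

    import numpy as np

    def combine_detections(dec_a, dec_b, model_630_map):
        slc_avg = (dec_a + dec_b) / 2.0         # SLC AVG 626 over the two decoders
        return (slc_avg + model_630_map) / 2.0  # VOL AVG 640 at the voxel level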
Thus, after training the ML/DL computer models 620, 630, when a new input volume 105 is presented, the first ML/DL computer model 610 generates a liver mask 614 for preprocessing the input of the second ML/DL computer model 620, and the two ML/DL computer models 620, 630 process the input slices to generate lesion predictions that are averaged over the volume by the volume averaging logic 640. The result is a final lesion output 650 and, based on the operation of the first ML/DL computer model 610, a liver mask output 660. These outputs may be provided as the output of the liver/lesion detection logic stage 130 of the AI pipeline 100, which is provided to the lesion segmentation logic stage 140 of the AI pipeline 100, as previously discussed above and described in more detail below. Thus, the mechanisms of the illustrative embodiments provide an ensemble 600 for anatomy identification and lesion detection in the input volume 105 of medical images (slices).
With the ensemble architecture shown in FIG. 6, improved performance over using a single ML/DL computer model is achieved. That is, it has been observed that by combining the detection outputs of the ensemble's multiple ML/DL computer models, improved detection specificity is achieved at the same level of sensitivity as a single ML/DL computer model. The ML/DL models 620, 630 generate errors (false positives) at different locations, and when their detection outputs are averaged, the signal from the false positives is reduced while the signal from the true positive lesions dominates, resulting in improved performance.
FIG. 7 is a flowchart outlining an example operation of the liver/lesion detection logic in an AI pipeline in accordance with one illustrative embodiment. As shown in FIG. 7, the operation begins by receiving an input volume (step 710) and performing anatomical structure detection (e.g., liver detection) using a trained ML/DL computer model, such as a U-Net computer model configured and trained to recognize the anatomical structure (e.g., liver) (step 720). The result of the anatomy detection is a segmentation of the input volume identifying a mask of the anatomical structure (e.g., a liver mask) (step 730). The input volume is also processed by a first trained ML/DL computer model of the ensemble that is specially configured and trained to perform lesion detection (step 740). This first trained ML/DL computer model generates a first set of lesion detection prediction outputs based on its processing of the input volume (step 750).
A second trained ML/DL computer model of the ensemble receives a masked input generated by applying the anatomical mask to the input volume, thereby identifying the portion of the input volume that corresponds to medical images of the anatomical structure of interest (step 760). The second trained ML/DL computer model processes the masked input via two different decoders with two different and competing loss functions (e.g., one loss function penalizes false positive lesion detection errors and the other penalizes false negative lesion detection errors) (step 770). The result is two sets of lesion prediction outputs, which are then combined by combination logic to generate a lesion prediction output for the second ML/DL computer model (step 780). This second lesion prediction output is resampled if necessary and combined with the first lesion prediction output generated by the first ML/DL computer model of the ensemble to generate a final lesion prediction output (step 790). The final lesion prediction output is then output along with the anatomical mask (step 795), and the operation terminates.
Lesion segmentation
As previously described, the lesion prediction output is generated by the operation of the various ML/DL computer models and logic stages of the AI pipeline, including body part detection, body-part-of-interest determination, phase classification, anatomy-of-interest identification, and anatomy/lesion detection. For example, in the AI pipeline 100 shown in FIG. 1, the results of the liver/lesion detection stage 130 of the AI pipeline 100 include one or more contours (outlines) of the liver, as well as a detection map 135 (e.g., a voxel-wise map of liver lesions detected in the input volume 105) that identifies the portions of the medical imaging data elements that correspond to the detected lesions. The detection map is then input to the lesion segmentation stage 140 of the AI pipeline 100.
As previously described, the lesion segmentation logic (e.g., lesion segmentation stage 140 in FIG. 1) uses a watershed technique and a corresponding ML/DL computer model to partition the detection map and generate image element partitions of the medical images (slices) of the input volume. The liver lesion segmentation stage also provides other mechanisms (such as one or more other ML/DL computer models) that identify all contours corresponding to lesions present in the slices of the input volume based on the image element partitions, and perform operations that identify which contours correspond to the same lesion in three dimensions. The lesion segmentation stage further provides mechanisms (such as one or more additional ML/DL computer models) that aggregate the related lesion contours to generate a three-dimensional partition of each lesion.
Lesion segmentation uses inpainting of lesion image elements and of non-liver tissue represented in a medical image to focus on each lesion individually and perform active contour analysis. In this way, individual lesions may be identified and processed without the analysis being biased by other lesions in the medical image or by portions of the image outside the liver. The result of lesion segmentation is a list of lesions with their corresponding appearance or contour in the input volume.
FIG. 8 depicts a block diagram providing an overview of aspects of the lesion segmentation process performed by the lesion segmentation logic, in accordance with one illustrative embodiment. As depicted in FIG. 8, lesion segmentation includes mechanisms for slice-wise partitioning of two-dimensional detections (i.e., detections of lesions in two-dimensional slices) (block 810), linking two-dimensional lesions along the z-axis (block 820), and slice-wise contour refinement (block 830). Each of these blocks is described in more detail below with respect to subsequent figures. The segmentation process shown in FIG. 8 is implemented to identify all lesions in a given input volume under analysis and to distinguish lesions that are close to each other in the images (slices) of the input volume. For example, two lesions that appear joined at the pixel level in one or more images may need to be identified as two different regions, or different lesions, for the purposes of performing further downstream processing of the detected lesions (such as during lesion classification), as well as for separately identifying lesions in an output lesion list for downstream computing system operations (such as providing a medical viewing application, performing a treatment recommendation operation, performing a decision support operation, etc.).
As part of the slice-wise partitioning of 2D detections in block 810, the mechanisms of the illustrative embodiments use existing watershed techniques to partition the detection map from the previous lesion detection stage of the AI pipeline (e.g., the detection map 135 generated by the liver/lesion detection logic 130 of the AI pipeline 100 in FIG. 1). The watershed algorithm requires seeds to be defined in order to perform mask partitioning. The watershed algorithm splits the mask into as many regions as there are seeds, so that each region has exactly one seed located approximately at its center, as shown in FIGS. 10A and 10C. In automatic segmentation, the seeds of a mask may be obtained as the local maxima of its distance map (distance to the mask contour). However, this approach is prone to noise and may result in too many seeds, excessively splitting the mask. Therefore, the partitions need to be edited by reorganizing some of the regions. In view of the empirical observation that most lesions are bubble-shaped, the guiding principle of region reorganization is to make the resulting new regions roughly circular. For example, for the mask shown in FIG. 10C, the mechanism would merge the two regions identified by seeds 1051 and 1061, respectively, resulting in a new mask partition that includes only two substantially circular regions. Thus, a detected lesion defined in the detection map 135, such as the lesion shown on the left side of FIG. 9 (described later), may be partitioned into several bubble-like lesions, as shown on the right side of FIG. 9, which are interpreted as cross sections of 3D lesions on the slice.
Watershed segmentation is a region-based method that originates from mathematical morphology. In watershed segmentation, the image is considered as a landscape with ridges and valleys. The elevation values of the landscape are typically defined by the gray values of the respective pixels or their gradient magnitudes, so the two-dimensional image is treated as a three-dimensional surface. The watershed transform decomposes the image into "catchment basins". For each local minimum, the catchment basin comprises all points whose path of steepest descent terminates at that minimum. Watershed lines separate the basins from each other. The watershed transform thus completely decomposes the image, assigning each pixel either to a region or to a watershed line.
Watershed segmentation entails selecting at least one marker (called a "seed" point) inside each object of the image. The seed points may be selected by an operator or, in one embodiment, by an automated process that takes into account application-specific knowledge of the objects. Once the objects are marked, they can be grown using a morphological watershed transform, as described in further detail below. Lesions typically have a "bubble" shape, and the illustrative embodiments provide techniques for merging watershed-segmented regions based on this assumption.
Thereafter, in block 820, the mechanism of the illustrative embodiment aggregates the voxel partitions on each slice along the z-direction to produce a three-dimensional output. Thus, the mechanism must determine whether two sets of image elements (e.g., voxels) in different slices belong to the same lesion (i.e., whether they are aligned in three dimensions). The mechanism calculates a measure between lesions in adjacent slices based on the intersection and union of the lesions and applies a regression model to determine whether two lesions in adjacent slices are part of the same region. One may treat each lesion as a set of voxels, and the mechanism determines the intersection of two lesions as the intersection of two sets of voxels, and determines the union of two lesions as the union of two sets of voxels.
This results in a three-dimensional segmentation of the lesions; however, the contours may not fit the actual image well, and there may be over-segmented lesions. The illustrative embodiments therefore use active contours, a traditional framework for segmentation problems. Such algorithms iteratively edit a contour to fit the image data better and better, while ensuring that it retains certain desired properties (such as shape smoothness). In block 830, the mechanism of the illustrative embodiments initializes the active contours with the partitions obtained from the first stage 810 and the second stage 820, and focuses on one lesion at a time; otherwise, running an active contour or any segmentation method on nearby lesions may cause them to merge back into one contour, which would be counter-productive as it would substantially eliminate the benefits of the previous stages. The mechanism focuses on one lesion and performs "inpainting" of lesion voxels and non-liver tissue in the vicinity of the focused lesion.
The chaining of these three processing stages allows processing that is not biased by other lesions in the image or by pixels outside the liver.
Slice-wise partitioning of 2D detections
FIG. 9 depicts the results of lesion detection and slice-wise partitioning in accordance with an illustrative embodiment. As seen on the left side of FIG. 9, a lesion 910 is detected by the previous AI pipeline processing described above and may be defined in the contour and detection map output (e.g., 135 in FIG. 1) from the lesion detection logic (e.g., 130 in FIG. 1). As shown on the right side of FIG. 9, the logic of block 810 in FIG. 8 partitions the region into three lesions 911, 912, and 913 in accordance with one illustrative embodiment. The partitioning mechanism of the illustrative embodiments is based on existing watershed techniques that operate to partition the detection map from the previous lesion detection stage of the AI pipeline. Watershed algorithms are widely used in image processing for segmentation purposes. The principle behind these known watershed algorithms is that a grayscale image can be viewed as a topographic surface, with high intensities representing peaks and hills and low intensities representing valleys. The watershed technique starts by filling each isolated valley (local minimum) with differently colored water (markers). As the water rises, water from different valleys, with different colors, would start to merge depending on the nearby peaks (gradients); to avoid this, barriers are built at the locations where the waters would merge. The filling of water and building of barriers continues until all peaks are under water, at which point the barriers created give the segmentation result. Again, watershed techniques are generally known, and thus a more detailed description is not provided herein. Any known technique for slice-wise partitioning of 2D images may be used without departing from the spirit and scope of the present invention.
In the context of lesion segmentation, the empirical observation that most lesions are circular in shape strongly suggests that a partition producing a set of circular regions is likely a good partition. However, as previously mentioned, the quality of a watershed-type partition depends on the quality of the seeds: an arbitrary set of seeds need not result in a set of circular regions. For example, FIG. 10C shows a watershed partition, produced by 3 seeds, that contains only one substantially circular region. The other two regions are not circular; however, their union is again roughly circular. This configuration is referred to as over-splitting, because the sloped split in the figure divides an otherwise circular region into two smaller, non-circular regions. Therefore, it is desirable to have an algorithm that can correct for over-splitting. The seed relabeling mechanism accomplishes this by merging several over-split regions to form a coarser partition containing only circular regions. For example, for the partition in FIG. 10C, the mechanism decides to merge the two regions identified by seeds 1051 and 1061, forming a new, more circular region.
The illustrative embodiments merge regions of a partition into rounder and larger regions that are likely to correspond to physical lesions. A partition divides an area, or as described herein a mask, into smaller regions. In terms of contours, partitioning thus produces a set of smaller contours from one large contour (see FIG. 9, left to right).
The seeds are obtained by extracting local maxima from a distance map, which is computed from the input mask to be partitioned. The distance map measures, for each pixel, its Euclidean distance from the mask contour. Depending on the topology of the input mask, the local maxima derived from this distance map can lead the watershed algorithm to an over-split partition. In this case, the watershed is said to over-fragment and tends to produce regions that are not circular, which may be acceptable in some applications but is not ideal for lesion segmentation. FIG. 10C shows a synthetic input mask whose distance map has three local maxima. Watershed therefore produces a partition containing three regions, only one of which (corresponding to seed 1071) is substantially circular. The other two are not; the region with seed 1051 is only semicircular. The seed relabeling mechanism then examines all seed pairs and determines that the two regions corresponding to seeds 1051 and 1061 should be merged together, forming a more complete bubble. This operation results in a new partition containing only two regions, both approximately circular in shape.
A local maximum is a point that has the greatest distance from the contour compared to its immediate neighbors. The local maxima are points and their distance to the contour is known. Thus, the mechanism of the illustrative embodiments may draw a circle centered at this point. The radius of the circle is the distance. For two local maxima, the mechanism may thus compute the overlap of their corresponding circles. This is depicted in fig. 10A and 10B.
Seed relabeling determines whether to merge two regions as follows. Two regions whose associated seeds are directly adjacent are merged; otherwise, the mechanism bases its decision on a hypothesis testing procedure. For example, referring to FIG. 10A, the depicted example describes a case where the distance map produces two distinct local maxima, which gives rise to the hypothesis that each maximum represents the center of a different circular lesion. Note that the distance map also tells the mechanism of the illustrative embodiments how far each maximum is from the contour (boundary); this distance is represented in FIG. 10B by the dashed segment connecting the maximum and a point on the contour. Thus, if the hypothesis were true, the spatial extent of the two lesions could be inferred, since lesions are assumed to have a generally circular or "bubble" shape. This allows the mechanism of the illustrative embodiments to draw two complete circles, as shown in FIG. 10B. The mechanism then measures the overlap of the two circles (e.g., using the classical Dice metric) and compares it to a predetermined threshold. If the value of the overlap metric is greater than the threshold, the mechanism concludes that the two bubbles overlap too much for the hypothesis to hold, and the regions are merged. In other words, the mechanism of the illustrative embodiments concludes that these two local maxima correspond to two "centers" of the same lesion. Traditional watershed has no such seed (i.e., maximum) relabeling mechanism, so mask over-splitting often occurs.
Overlap can be measured in a number of ways. In one example embodiment, the mechanism uses the Dice coefficient. For the two complete circles corresponding to two local maxima, as shown in FIG. 10B, the mechanism may calculate the Dice metric of the two circles. The mechanism can then learn from a training data set the best threshold to apply in practice, such that when the Dice measure is greater than the threshold, the two local maxima are treated as centers of the same lesion.
FIGS. 10C and 10D provide an example of another lesion mask shape, which differs from FIGS. 10A and 10B in that the two partially merged circles are closer to each other in FIG. 10A than in FIG. 10C. There are three seeds in the example lesion mask shape of FIG. 10C because the distance map can be very sensitive to the mask shape. Following the above reasoning, the lesion splitting algorithm splits the lesion represented in FIG. 10C into two separate lesions, rather than the three separate lesions that a watershed technique without seed relabeling would produce.
In FIGS. 10C and 10D, seeds 1051 and 1061 represent a more extreme case than the seeds depicted in FIGS. 10A and 10B. Without the seed relabeling technique of the illustrative embodiments, a split (represented by the slanted solid line) would occur to separate them; with the seed relabeling mechanism, this undesirable result is effectively avoided. Meanwhile, since seed 1071 is sufficiently far from seeds 1051 and 1061, the same hypothesis testing procedure described above accepts the hypothesis that seed 1071 corresponds to the center of a different bubble, resulting in the vertical split shown in FIGS. 10C and 10D. Equivalently, this results in seed 1071 receiving a different label than the label assigned to seeds 1051 and 1061. Similar to the situation in FIGS. 10A and 10B, the hypothesis testing procedure of the seed relabeling technique determines that seeds 1051 and 1061 correspond to the same lesion.
FIG. 11A is a block diagram illustrating a mechanism for lesion segmentation and relabeling in accordance with one illustrative embodiment. As shown in FIG. 11A, the mechanism, which may be implemented as a computer model including one or more algorithms, machine learning computer models, or the like, executed by one or more processors of one or more computing devices and operating on an input volume of one or more medical image data structures, receives a two-dimensional lesion mask 1101 and performs a distance transform (block 1102) to generate a distance map 1111. The distance transform (block 1102) is an operation performed on a binary mask that calculates, for each point in the lesion mask, its shortest distance to the mask contour (boundary). The farther a point lies toward the interior of the lesion mask, the greater its distance from the contour (boundary). Thus, the distance transform identifies the center points of the lesion mask (i.e., the points having greater distances than other points). In one embodiment, the mechanism optionally performs Gaussian smoothing on the distance map 1111.
The mechanism then performs local maximum identification (block 1103) to generate seeds 1112. As described above, these local maxima are the points of the distance map 1111 with the greatest distances from the contour or boundary. The mechanism performs the watershed technique (block 1104) based on the seeds 1112 to generate a split lesion mask 1113. As described above, this split lesion mask 1113 may be over-split, resulting in regions that do not conform to the assumed bubble shape of lesions. Thus, the mechanism performs seed relabeling (block 1120) based on the distance map 1111, the seeds 1112, and the split 2D lesion mask 1113 to generate an updated split lesion mask 1121. Seed relabeling is described in further detail below with reference to FIG. 11B. The resulting updated split lesion mask 1121 has regions that have been merged to form regions that conform more accurately to the assumed bubble shape of lesions.
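Blocks 1102-1104 (before seed relabeling) can be sketched with standard scientific Python tooling; this is an illustrative approximation, not the patented implementation:

    import numpy as np
    from scipy import ndimage as ndi
    from skimage.feature import peak_local_max
    from skimage.segmentation import watershed

    def split_lesion_mask_2d(mask: np.ndarray) -> np.ndarray:
        # Block 1102: distance to the mask contour (optionally smoothed)
        distance = ndi.distance_transform_edt(mask)
        distance = ndi.gaussian_filter(distance, sigma=1.0)
        # Block 1103: local maxima of the distance map serve as seeds
        coords = peak_local_max(distance, labels=mask.astype(int))
        seeds = np.zeros(mask.shape, dtype=int)
        for label, (r, c) in enumerate(coords, start=1):
            seeds[r, c] = label
        # Block 1104: watershed on the inverted distance map, within the mask
        return watershed(-distance, seeds, mask=mask.astype(bool))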
FIG. 11B is a block diagram illustrating a mechanism for seed relabeling in accordance with one illustrative embodiment. As shown in FIG. 11B, the mechanism, which may be implemented as a computer model including one or more algorithms, machine learning computer models, or the like, executed by one or more processors of one or more computing devices and operating on an input volume of one or more medical image data structures, receives the distance map 1111 and the seeds 1112. More specifically, the mechanism considers each pair of seeds (seed A and seed B) in the seeds 1112. The mechanism determines whether seed A and seed B are direct neighbors (block 1151). If seed A and seed B are direct neighbors, the mechanism assigns seed A and seed B the same label (block 1155). In other words, seed A and seed B are grouped to represent a single region.
If seed A and seed B are not direct neighbors in block 1151, the mechanism performs spatial extent estimation (block 1152) based on the distance map 1111 and determines the pairwise affinity of seed A and seed B. According to the illustrative embodiment, the spatial extent estimation assumes that each region has a "bubble" shape. Thus, the mechanism assumes that each seed represents a circle whose radius is the seed's value in the distance map.
The mechanism then calculates an overlap metric for the circles represented by seed A and seed B (block 1153). In one example embodiment, the mechanism uses the following Dice metric:
DICE(A, B) = 2|A ∩ B| / (|A| + |B|)
where |A| represents the area of the circle represented by seed A, |B| represents the area of the circle represented by seed B, and |A ∩ B| represents the area of the intersection of A and B. In an alternative embodiment, the mechanism may compute the overlap metric as follows:
overlap(A, B) = |A ∩ B| / |A ∪ B|
where |A| represents the area of the circle represented by seed A, |B| represents the area of the circle represented by seed B, |A ∩ B| represents the area of the intersection of A and B, and |A ∪ B| represents the area of the union of A and B.
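The overlap test itself only needs the two seed locations and their distance-map values (radii); a sketch using the exact circle-intersection area and the Dice form of the metric follows (the closed-form intersection area is standard geometry, not taken from the patent):

    import math

    def circle_dice(c1, r1, c2, r2):
        d = math.dist(c1, c2)
        if d >= r1 + r2:
            inter = 0.0                         # disjoint circles
        elif d <= abs(r1 - r2):
            inter = math.pi * min(r1, r2) ** 2  # one circle inside the other
        else:
            # standard circle-circle intersection ("lens") area
            a1 = r1 * r1 * math.acos((d*d + r1*r1 - r2*r2) / (2*d*r1))
            a2 = r2 * r2 * math.acos((d*d + r2*r2 - r1*r1) / (2*d*r2))
            a3 = 0.5 * math.sqrt((-d + r1 + r2) * (d + r1 - r2)
                                 * (d - r1 + r2) * (d + r1 + r2))
            inter = a1 + a2 - a3
        return 2 * inter / (math.pi * r1 * r1 + math.pi * r2 * r2)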
The mechanism determines whether the overlap metric is greater than a predetermined threshold (block 1154). If the overlap metric is greater than the threshold in block 1154, the mechanism merges the corresponding regions in the split 2D lesion mask 1113 (block 1155).
If the affinity between two seeds is greater than the threshold, they are assigned the same label. Otherwise, at this stage, it is not yet known whether they should belong to the same group; this decision is left to the label propagation stage (block 1512 in FIG. 15), which is the same module used in the z-connection described below.
When there are more than two seeds, the same operation of FIG. 11B is repeated for all seed pairs before labels are propagated, which results in seed groups. For example, suppose it is determined that seed pairs (a, b) and (b, c) belong to the same group, while the pair (a, c) fails the test of FIG. 11B. Label propagation will then place a, b, and c in the same group (i.e., the regions corresponding to seeds a and c will still be merged). However, if there are seeds a, b, c, and d, and the affinity calculation (performed on a total of six pairs) shows that only (a, b) and (c, d) pass the test, label propagation will produce two groups containing (a, b) and (c, d), respectively. Thus, a seed pair failing the test means it is not known whether the seeds should be put into the same group, not that they must belong to different groups.
For example, in FIG. 10C there are 3 seed pairs (1051-1061, 1051-1071, and 1061-1071), of which only the pair 1051-1061 passes the test. The label propagation step then clusters these 3 seeds into 2 groups, the first containing only 1071 and the second containing both 1051 and 1061.
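This transitive grouping behavior is exactly what a union-find pass over the passing pairs produces; the following sketch is one plausible way to implement the label propagation described above:

    def propagate_labels(num_seeds, merged_pairs):
        parent = list(range(num_seeds))

        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]  # path compression
                x = parent[x]
            return x

        for a, b in merged_pairs:
            parent[find(a)] = find(b)  # union the two seeds' groups
        return [find(i) for i in range(num_seeds)]

    # e.g., pairs (0, 1) and (1, 2) passing puts seeds 0, 1, 2 in one group,
    # even if the pair (0, 2) failed the overlap test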
FIG. 12 is a flowchart outlining an example operation for lesion splitting in accordance with one illustrative embodiment. The operation outlined in FIG. 12 may be performed by the mechanisms described above with respect to FIGS. 11A-11B. As shown in FIG. 12, the operation starts (step 1200) and the mechanism generates a distance map of the two-dimensional lesion mask (step 1201). As described above, the distance map may be generated by performing a distance transform operation on the two-dimensional lesion mask, optionally followed by Gaussian smoothing to remove noise. The mechanism then identifies seeds using local maximum identification (step 1202) and performs watershed-based lesion splitting based on the local maxima to generate regions (step 1203). The mechanism then relabels the seeds based on their pairwise affinity computed using the distance map (step 1204) and merges the regions corresponding to seeds having the same label (step 1205). It should be appreciated that, because of the seed relabeling performed by the mechanisms of the illustrative embodiments, the split lesion mask output at step 1205 does not exhibit the over-splitting problem of conventional watershed techniques, in which separate labels are wrongly associated with data points belonging to the same lesion. Thereafter, the operation ends (step 1206).
Z-direction connection of lesions
The above-described process of lesion splitting and seed relabeling may be performed for each two-dimensional image or slice of the input volume to generate an appropriately labeled lesion mask for each lesion represented in the corresponding two-dimensional image. However, the input volume represents a three-dimensional representation of the internal anatomy of the biological entity, and when considered in three dimensions, lesions that may appear to be associated with the same lesion may actually be associated with different lesions. Thus, to be able to correctly identify individual lesions within a biological entity represented in three dimensions of an input volume, the illustrative embodiments provide a mechanism for connecting two-dimensional lesions along the z-axis (i.e., in three dimensions).
The mechanism for connecting two-dimensional lesions along the z-axis (referred to as z-connection of lesions) includes a logistic regression model applied to the split lesion output generated by the above mechanisms to determine three-dimensional lesion connectivity in the z-direction. This mechanism connects two lesions in adjacent image slices when the logistic regression model determines that they represent the same lesion. That is, for any two two-dimensional lesions on adjacent image slices (i.e., slices whose z-axis coordinates are consecutive in the three-dimensional organization of the slices), the mechanism determines whether the two-dimensional lesions belong to the same three-dimensional lesion, as described below.
FIGS. 13A-13C illustrate a process for z-connection of lesions in accordance with an illustrative embodiment. FIG. 13A depicts the lesion mask input. FIG. 13B depicts the lesions after slice-wise lesion splitting, which may employ the improved lesion splitting mechanism with relabeling of the illustrative embodiments previously described. As shown in FIGS. 13A-13B, slice 1310 has lesions 1311 and 1312, slice 1320 has lesion 1321, and slice 1330 has lesions 1331 and 1332. The z-connection mechanism (i.e., the logistic regression model) operates on the split lesion masks of each pair of adjacent slices in the input volume to compare each lesion in a given slice to each lesion in the adjacent slice. For example, the z-connection mechanism compares lesion 1311 (lesion A) in slice 1310 with lesion 1321 (lesion B) in slice 1320. For each comparison, the mechanism treats each lesion as a set of voxels and determines the intersection between lesion A (the set of voxels in lesion A) and lesion B (the set of voxels in lesion B) relative to the size of lesion A and relative to the size of lesion B. The z-connection mechanism determines whether lesion A and lesion B are connected based on the following two overlap ratios, using a logistic regression model:
r0 = |A ∩ B| / max(|A|, |B|),    r1 = |A ∩ B| / min(|A|, |B|)
where |A| represents the number of voxels in lesion A, |B| represents the number of voxels in lesion B, and |A ∩ B| represents the number of voxels in the intersection of lesion A and lesion B. The mechanism trains a logistic regression model using these two ratios as input features to determine the probability that lesion A and lesion B should be connected. That is, using a machine learning process such as previously described, the logistic regression model is trained on a set of training image volumes to generate predictions of the probability that, in each pairwise combination of adjacent slices in each training volume, a lesion in one slice is the same lesion as one represented in the adjacent slice. This prediction is compared to a ground truth indication of whether the lesions are the same or different in order to generate a loss or error. The operating parameters (e.g., coefficients or weights) of the logistic regression model are then modified to reduce this loss or error until a predetermined number of training epochs has been performed or a predetermined stopping condition is met.
Logistic regression models are widely used to solve binary classification problems. In the context of the illustrative embodiments, the logistic regression model predicts the probability that two cross sections are part of the same lesion. For this purpose, the logistic regression uses the two overlap ratios r0 and r1 mentioned above. Specifically, the logistic model learns to linearly combine the two features as follows:
p = σ(c0·r0 + c1·r1 + b), where σ(x) = 1 / (1 + e^(−x)) is the sigmoid function,
and where (c0, c1, b) are operating parameters learned from the training volumes via a machine learning training operation. The symbols r0 and r1 denote the minimum and maximum overlap ratios, respectively. The state of the operating parameters after training of the logistic regression model may be represented as (ĉ0, ĉ1, b̂).
Upon inference (i.e., after training of the logistic regression model), when processing a new input volume of images (slices), a threshold t is set such that the two cross sections are considered to belong to the same lesion if and only if

σ(ĉ0·r0 + ĉ1·r1 + b̂) ≥ t

i.e., if and only if the predicted probability is at or above the set threshold.
There are two extremes. First, when the threshold t is set to 0, the z-connection mechanism of the illustrative embodiment always determines that the lesions are the same lesion (i.e., the cross sections are always connected); both the true positive rate and the false positive rate are then 1. Second, when the threshold t is set to 1, the z-connection mechanism never connects any cross sections; both the true positive rate and the false positive rate are then 0. Thus, only when the threshold t lies in the interval (0, 1) does the logistic regression model meaningfully determine whether lesion cross sections in adjacent slices are associated with the same lesion. Under an ideal logistic regression model, the true positive rate equals 1 (all true connections are identified) while the false positive rate is 0 (no false connections are made).
Thus, once the logistic regression model is trained, new slice pairs can be evaluated by calculating the overlap ratios of each pair and inputting them as features into the trained logistic regression model to generate a prediction for that pair; if the predicted probability is equal to or greater than a predetermined threshold probability, lesions A and B are considered to belong to the same lesion in three dimensions. Appropriate relabeling of lesions across the slices may then be performed in order to associate a lesion in one two-dimensional slice with representations of the same lesion in other adjacent slices, and thereby identify the three-dimensional lesions within the input volume.
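A compact sketch of the trained model at inference time, assuming each lesion cross section is represented as a set of voxel coordinates and that (c0, c1, b) are the learned parameters, could read:

    import math

    def overlap_features(lesion_a, lesion_b):
        inter = len(lesion_a & lesion_b)
        r0 = inter / max(len(lesion_a), len(lesion_b))  # minimum overlap ratio
        r1 = inter / min(len(lesion_a), len(lesion_b))  # maximum overlap ratio
        return r0, r1

    def same_lesion(lesion_a, lesion_b, c0, c1, b, t=0.5):
        r0, r1 = overlap_features(lesion_a, lesion_b)
        p = 1.0 / (1.0 + math.exp(-(c0 * r0 + c1 * r1 + b)))
        return p >= t  # connect the cross sections iff probability >= threshold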
There is a rationale supporting the two ratio input features used to train the logistic regression model. For example, if lesions A and B are sufficiently different in size, they are unlikely to be part of the same lesion. Furthermore, if lesions A and B do not intersect (e.g., lesion 1312 in slice 1310 and lesion 1321 in slice 1320), then the features r0 and r1 will both have a value of zero. As described above, given the two feature values r0 and r1, the logistic regression model performs regression and outputs a probability value between 0 and 1 representing the likelihood that lesion A and lesion B are part of the same lesion.
FIG. 13C depicts the cross-section connections between slices in accordance with an illustrative embodiment. As shown in FIG. 13C, the mechanism determines that lesion 1311 in slice 1310 and lesion 1321 in slice 1320 are part of the same lesion by executing the trained logistic regression model of the illustrative embodiments, which predicts lesion commonality based on the overlap ratios as discussed above. The mechanism similarly determines that lesion 1321 in slice 1320 and lesion 1331 in slice 1330 are part of the same lesion. Thus, the mechanism propagates intersecting lesions along the z-axis and performs the z-axis connection of lesions.
Based on the pairwise evaluation of the slices in the input volume to identify z-connections of lesions across the two-dimensional slices, and on the trained logistic regression model's determination of whether lesions are connected along the z-axis, relabeling of lesions may be performed to ensure that the same lesion label is applied to every lesion mask, in every slice of the input volume, that the logistic regression model has determined to be associated with the same lesion A, i.e., the masks are relabeled to specify that they are part of the same lesion A. This may be performed for each lesion cross section in each slice of the input volume, thereby generating a three-dimensional association of lesion masks for each of the one or more lesions present in the input volume. Because all cross sections associated with the same lesion are then correctly labeled in the input volume, this information may be used to represent or otherwise process the lesions in three dimensions (such as in later downstream computing system operations).
FIGS. 14A and 14B illustrate the results of a trained logistic regression model in accordance with one illustrative embodiment. FIG. 14A illustrates receiver operating characteristic (ROC) curves for the combined maximum overlap ratio (r1) + minimum overlap ratio (r0) model and for the maximum-overlap-ratio-only model. An ROC curve is a graphical plot illustrating the diagnostic ability of a binary classifier system as its discrimination threshold is varied; it is created by plotting the true positive rate (TPR) against the false positive rate (FPR) at various threshold settings. FIG. 14B illustrates precision-recall curves for the same two models. A precision-recall curve plots precision (y-axis) against recall (x-axis) for different thresholds, much like an ROC curve, where precision is the fraction of relevant instances among the retrieved instances and recall (or sensitivity) is the fraction of the total number of relevant instances that are actually retrieved. As shown in these figures, the two-feature logistic model outperforms its single-feature counterpart; the two features thus bring valuable information to the prediction task.
Observing the maximum overlap ratio (r0) + minimum overlap ratio (r1) metric curve in FIG. 14A, it can be seen that, with an appropriate threshold t, the trained logistic regression model can produce a true positive rate of approximately 95% at the expense of a false positive rate of approximately 3%. Referring to FIG. 14B, the depicted graph evaluates the trained logistic regression model in terms of precision and recall and shows that both measures are able to achieve very good results through selection of an appropriate threshold t.
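Continuing the hedged sketch above (reusing the `model` from the earlier example and assuming scikit-learn's metrics API; the validation data below is invented for illustration), the ROC and precision-recall curves and a threshold t targeting roughly a 95% true positive rate might be derived as follows:

```python
# Illustrative sketch, continuing the model above: ROC and precision-recall
# curves, and a threshold t targeting roughly a 95% true positive rate.
import numpy as np
from sklearn.metrics import roc_curve, precision_recall_curve

# Hypothetical validation data (invented for illustration).
X_val = np.array([[0.90, 0.70], [0.20, 0.10], [0.60, 0.50], [0.05, 0.02]])
y_val = np.array([1, 0, 1, 0])

scores = model.predict_proba(X_val)[:, 1]            # P(same lesion)
fpr, tpr, thresholds = roc_curve(y_val, scores)      # ROC: TPR vs. FPR
precision, recall, _ = precision_recall_curve(y_val, scores)

idx = int(np.argmax(tpr >= 0.95))   # first (lowest-FPR) point with TPR >= 95%
t = thresholds[idx]
print(f"t={t:.3f}  TPR={tpr[idx]:.3f}  FPR={fpr[idx]:.3f}")
```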
FIG. 15 is a flowchart outlining an example operation of a mechanism for connecting two-dimensional lesions along the z-axis in accordance with one illustrative embodiment. As shown in FIG. 15, the operation starts (step 1500) and the mechanism selects a first image X from the input volume (step 1501) and selects a first lesion A in image X (step 1502). In some illustrative embodiments, the images or slices in the input volume may be processed using the splitting and re-labeling mechanisms previously described, although this is not required. Rather, the mechanisms of the illustrative embodiments involving z-connection of lesions may be performed with virtually any input volume in which lesion masks have been identified.
The z-connection mechanism of the illustrative embodiments then selects a first lesion B in the neighboring image Y (step 1503). The mechanism then determines the intersection between lesion A and lesion B relative to lesion A, and the intersection between lesion A and lesion B relative to lesion B (step 1504). The mechanism applies the trained logistic regression model to the r0 and r1 features for the intersection of lesion A and lesion B to generate a prediction, or probability, that lesion A and lesion B are the same lesion, and then compares the probability to a threshold probability, thereby determining whether lesion A and lesion B belong to the same lesion based on the two intersection values (step 1505). Based on the results of this determination, cross-sections of the lesions in the images may be labeled or relabeled to indicate whether they are part of the same lesion.
The mechanism determines whether lesion B in image Y is the last lesion in image Y (step 1506). If lesion B is not the last lesion, the mechanism considers the next lesion B in the neighboring image Y (step 1507) and operation returns to step 1504 to determine the intersection between lesion A and the new lesion B.
If at step 1506 lesion B is the last lesion in the adjacent slice or image Y, the mechanism determines whether lesion A is the last lesion in image X (step 1508). If lesion A is not the last lesion in image X, the mechanism considers the next lesion A in image X (step 1509) and operation returns to step 1503 to consider the first lesion B in the adjacent image Y.
If lesion A is the last lesion in image X at step 1508, the mechanism determines whether image X is the last image to consider (step 1510). If image X is not the last image, the mechanism considers the next image X (step 1511) and operation returns to step 1502 to consider the first lesion A in the new image X.
If image X is the last image to consider at step 1510, the mechanism propagates intersecting lesions between images along the z-axis, where propagating means that the labels associated with the same lesion, as determined by the above process, are set to the same value to indicate that they are part of the same lesion (step 1512). This is performed for each individual lesion identified in the input volume, such that the cross-sections in each image associated with the same lesion are appropriately labeled and, thus, a three-dimensional representation of each lesion is generated by the z-wise connection of its cross-sections. Thereafter, the operation ends (step 1513).
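One plausible realization of the FIG. 15 loop (a sketch under stated assumptions, not the claimed implementation) pairs lesions in adjacent slices, scores each pair with the trained model, and merges labels with a small union-find structure. It reuses `overlap_features` and `model` from the earlier sketches; `slices`, a list of per-slice {label: mask} dictionaries, is a hypothetical data layout:

```python
# Illustrative sketch of the FIG. 15 loop: score lesion pairs in adjacent
# slices, then merge labels with a union-find so every cross-section of the
# same 3D lesion ends up with one shared label.
def find(parent, x):
    while parent[x] != x:
        parent[x] = parent[parent[x]]   # path compression
        x = parent[x]
    return x

def connect_lesions_along_z(slices, model, t=0.5):
    # `slices` is a hypothetical list of {label: 2D boolean mask} dicts, one per slice.
    parent = {(z, lab): (z, lab) for z, sl in enumerate(slices) for lab in sl}
    for z in range(len(slices) - 1):
        for lab_a, mask_a in slices[z].items():          # lesion A in image X
            for lab_b, mask_b in slices[z + 1].items():  # lesion B in adjacent image Y
                r0, r1 = overlap_features(mask_a, mask_b)
                p = model.predict_proba([[r0, r1]])[0, 1]
                if p >= t:                               # predicted: same 3D lesion
                    root_a = find(parent, (z, lab_a))
                    root_b = find(parent, (z + 1, lab_b))
                    parent[root_a] = root_b              # union: share one root label
    # Relabeling: map every (slice, label) pair to its root 3D-lesion identifier.
    return {key: find(parent, key) for key in parent}
```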
Contour refinement
The above process yields accurate results in terms of the number and relative location of lesions, as well as in terms of connecting lesions across two-dimensional space (within an image or slice) and three-dimensional space (across images or slices in the input volume). However, the lesion contours (boundaries) are not always well defined and may need refinement. The illustrative embodiments provide mechanisms for improving lesion contour accuracy. This additional mechanism may be employed with the mechanisms described above as part of lesion segmentation, or may be employed in other illustrative embodiments that do not require the specific lesion detection, lesion splitting and re-labeling, and/or z-connection mechanisms described above.
Existing contour algorithms work well only when a lesion sits in the middle of the anatomical structure with no surrounding lesions, but do not work well under other conditions that lead to a "leakage" problem, in which the initially distinct contours of two or more nearby lesions merge into one single, fully enclosing contour, thereby completely eliminating the benefit of the earlier two-dimensional lesion mask splitting. In other cases, when a lesion is near an anatomical structure boundary (e.g., the liver boundary), the contour algorithm ends up distinguishing the pixels of that anatomical structure from the pixels of other anatomical structures (e.g., other organs) in the image, rather than distinguishing one lesion from another, because that is the distinction the contour algorithm is best able to make.
The mechanisms of the illustrative embodiments inpaint (repair) regions of the image or slice that are not of interest. FIG. 16 illustrates an example of contours for two lesions in the same image in accordance with an illustrative embodiment. On the left side of FIG. 16, an active contour algorithm is used to determine the contours 1611 and 1612 of the two lesions. Active contour algorithms are a class of algorithms that iteratively evolve a contour to better fit the image content.
According to this illustrative embodiment, the mechanism inpaints contour 1612, the pixels within contour 1612, and the healthy (non-diseased) tissue near contour 1611 but not within contour 1611, where inpainting means that these pixel values are set to a specified value so that they all have the same value. For example, the value may be the average tissue value in regions identified as not being associated with a lesion, i.e., healthy tissue of the anatomical structure (e.g., the liver).
The inpainting may be performed relative to a selected lesion contour 1611, such that the inpainting is applied to healthy tissue and to the other lesions (e.g., lesion 1612) in the image. In this way, the contour and pixels associated with the selected lesion (e.g., 1611) are considered separately from the other portions of the image when re-evaluating contour 1611. Contour 1611 may then be re-evaluated, and it may be determined whether the re-evaluation results in an improved definition of contour 1611. That is, an initial determination of the contrast and variance between pixels associated with the selected lesion contour 1611 and pixels in the vicinity of the selected lesion contour 1611 may be generated. After this contrast and variance have been calculated prior to inpainting, inpainting may be performed for the selected lesion 1611, such that pixels associated with the other lesion contours (e.g., 1612) and regions of the image representing healthy tissue of the anatomical structure are inpainted using the average pixel intensity value of healthy tissue.
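The following sketch illustrates one way such inpainting might be realized (all names here are illustrative assumptions: `image` is a 2D slice, `lesion_masks` a list of boolean masks, one per lesion, and `organ_mask` the mask of the anatomical structure, e.g., the liver):

```python
# Illustrative sketch: inpaint everything except the selected lesion with the
# mean healthy-tissue intensity before re-evaluating its contour.
import numpy as np

def inpaint_for_lesion(image, lesion_masks, selected, organ_mask):
    any_lesion = np.any(lesion_masks, axis=0)
    healthy = organ_mask & ~any_lesion
    mean_healthy = image[healthy].mean()        # average healthy-tissue value
    out = image.copy()
    for i, mask in enumerate(lesion_masks):
        if i != selected:
            out[mask] = mean_healthy            # paint over the other lesions
    out[healthy] = mean_healthy                 # flatten nearby healthy tissue
    return out
```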
The variance of a set of values is determined as follows. Consider a set of voxels comprising, for example, n voxels. First, the arithmetic mean is calculated by summing their intensity values and dividing the resulting sum by n. This is denoted A. Second, the voxel values are individually squared and the arithmetic mean of the squares is calculated. The result is denoted B. The variance is then defined as B − A×A, i.e., the difference between B and the square of A.
Thus, the variance of the set of n values {x1, …, xn} is defined as follows:

$$\mathrm{Var}(\{x_1,\ldots,x_n\}) = B - A^2 = \frac{1}{n}\sum_{i=1}^{n} x_i^2 - \left(\frac{1}{n}\sum_{i=1}^{n} x_i\right)^2$$
The variances of the voxels inside and outside a given contour are calculated. Voxels inside the contour are those voxels surrounded by the contour, and voxels outside the contour are those voxels that lie outside the contour but remain within a predetermined distance from the contour.
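A minimal sketch of these computations, assuming NumPy and SciPy (the `distance` parameter stands in for the predetermined distance mentioned above), might look as follows:

```python
# Illustrative sketch: the B - A*A variance for voxels inside a contour mask
# and for voxels outside it but within a predetermined distance.
import numpy as np
from scipy import ndimage

def variance(values):
    a = values.mean()            # A: arithmetic mean of the intensities
    b = (values ** 2).mean()     # B: arithmetic mean of the squared intensities
    return float(b - a * a)      # Var = B - A*A

def inside_outside_variance(image, mask, distance=5):
    dilated = ndimage.binary_dilation(mask, iterations=distance)
    ring = dilated & ~mask       # outside the contour, within `distance` voxels
    return variance(image[mask]), variance(image[ring])
```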
After inpainting, the mechanism recalculates the contour 1611 of the selected lesion using the active contour algorithm as previously described, and recalculates the contrast and/or variance for the new contour 1611 to determine whether these values have improved (a higher contrast value, or lower variance values inside and/or outside the lesion). If the contrast and variance have improved, the newly calculated contour 1611 is retained as the contour of the corresponding lesion. The process may then be performed for lesion 1612, with lesion 1612 treated as the selected lesion, by in turn inpainting the pixels associated with lesion 1611 and the healthy tissue in the vicinity of contour 1612. In this way, each lesion is evaluated separately to generate its contour, thereby preventing the lesions from leaking into each other.
The mechanism for calculating the contour of a lesion after inpainting may be based on the Chan-Vese segmentation algorithm, which is designed to segment objects without well-defined boundaries. The algorithm iteratively evolves a level set so as to minimize an energy defined by the sum of intensity differences from the average outside the segmented region, the sum of intensity differences from the average inside the segmented region, and a weighted term depending on the length of the boundary of the segmented region. Initialization is done using the detection map of the split lesions (which mitigates the problem of the energy converging to a poor local minimum).
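By way of a hedged example, scikit-image provides a public Chan-Vese implementation that accepts an initial level set, which could be derived from the split detection map as described (the parameter values here are illustrative, not taken from the embodiments):

```python
# Illustrative sketch: re-computing a lesion contour after inpainting with
# Chan-Vese, initialized from the existing segmentation mask.
from skimage.segmentation import chan_vese

def recompute_contour(inpainted_image, init_mask):
    # Signed initialization: positive inside the prior segmentation, negative outside.
    init = init_mask.astype(float) * 2.0 - 1.0
    return chan_vese(
        inpainted_image.astype(float),
        mu=0.25,                    # weight of the boundary-length term
        lambda1=1.0, lambda2=1.0,   # weights of the inside/outside intensity terms
        init_level_set=init,        # initialize from the split detection map
    )
```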
Once the mechanism has the segmentation, the mechanism initializes the contour using the previous estimate and determines whether the new contour is better (e.g., improves the contrast and variance of the contour). If the original contour is better, the original contour is maintained. If the new contour is better, the mechanism uses the new contour. In some illustrative embodiments, the mechanism determines which contour is better based on region homogeneity and the calculated variances: if the variance decreases both inside and outside the contour, the mechanism uses the new contour; otherwise, the mechanism uses the old contour. In another illustrative embodiment, the mechanism determines whether the contrast (the average inside the contour versus the average near the contour) is improved. Other techniques utilizing different measurements may be used to select between the old and new contours without departing from the spirit and scope of the illustrative embodiments.
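The variance-based variant of this accept/reject rule might look like the following sketch (reusing `inside_outside_variance` from the earlier example; the contrast-based test mentioned above would be an alternative predicate):

```python
# Illustrative sketch: accept the new contour only if the variance decreases
# both inside and outside; otherwise keep the old contour.
def better_contour(image, old_mask, new_mask, distance=5):
    old_in, old_out = inside_outside_variance(image, old_mask, distance)
    new_in, new_out = inside_outside_variance(image, new_mask, distance)
    return new_in < old_in and new_out < old_out

# chosen = new_mask if better_contour(img, old_mask, new_mask) else old_mask
```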
FIG. 17 is a flowchart outlining an example operation of a mechanism for contour refinement in a slice in accordance with an illustrative embodiment. As shown in FIG. 17, for a given contour in an image segmented to show a lesion, such as in the liver, the operation starts (step 1700), and the mechanism determines a first contrast and variance of the initial contour (step 1701). The mechanism inpaints pixels (or voxels) of other lesions and healthy tissue in the vicinity of the selected lesion (step 1702). The mechanism then determines the contour around the lesion (step 1703). The mechanism then determines a second contrast and variance of the new contour (step 1704). The mechanism determines whether the second contrast and variance represent an improvement over the first contrast and variance (step 1705). If the second contrast and variance represent an improvement, the mechanism represents the lesion using the updated contour (step 1706). Thereafter, the operation ends (step 1708).
If the second contrast and variance do not represent an improvement at step 1705, the mechanism reverts to the original contour (step 1707). Thereafter, the operation ends (step 1708). This process may be repeated for each lesion identified in the input slice and/or input volume in order to recalculate and refine the contours associated with each lesion present in the image/input volume.
False positive removal
After lesion segmentation is performed to generate a list of lesions and their contours, the AI pipeline 100 performs the false positive removal stage of processing 150 to remove falsely indicated lesions from the lesion list. This false positive removal stage 150 may take many forms to reduce the number of misidentified lesions in the lesion list, such as the list derived from the contours and maps 135 of FIG. 1 output by the liver/lesion detection logic 130, which contours and maps 135 are then merged by the splitting and re-labeling performed in the lesion segmentation logic 140. The following description sets forth a novel false positive removal mechanism that may be used to perform such false positive removal, although this specific false positive removal mechanism is not required. Furthermore, the false positive removal mechanism described below may be used separately from the other mechanisms described above and may be applied to any list of objects identified in images, with the illustrative embodiments specifically utilizing such false positive removal for lesions in medical images. That is, the false positive removal mechanism described in this section may be implemented separately and independently from the other mechanisms described above.
For purposes of illustration, it will be assumed that the false positive removal mechanism is implemented as part of the AI pipeline 100, specifically as the false positive removal logic 150 of the AI pipeline 100. Thus, in the false positive removal stage 150, the false positive removal mechanism described in this section operates on the lesion list generated by the liver/lesion detection logic and the lesion splitting and re-labeling, taking into account the three-dimensional nature of the input volume through the z-direction connection and contour refinement of lesions described above. This list 148 in FIG. 1 is input to the false positive removal logic stage 150, which processes the list 148 in the manner described hereafter and outputs a filtered, or modified, lesion list (in which erroneously identified lesions are minimized) to the lesion classification stage 160. The lesion classification stage then classifies the individual lesions indicated in the modified lesion list.
That is, the desire to capture all lesions in the earlier stages of the AI pipeline 100 may result in an increased sensitivity setting that causes the AI pipeline 100 to misidentify pixels that do not actually represent a lesion as being part of a lesion. Thus, there may be false positives that should be removed. The false positive removal stage 150 includes logic that operates on the list of lesions and their contours to remove false positives. It will be appreciated that such false positive removal must also balance the risk that, at the examination level (the input volume level as opposed to the lesion level), removal of false positives, if not properly done, may result in lesions not being detected at all. This can be problematic, as physicians and patients may then not be made aware of a pathology needing treatment. It should be understood that an examination may theoretically contain several image volumes for the same patient. However, because in some illustrative embodiments the AI pipeline implements single-phase detection in which images of only one volume are processed, it is assumed herein that processing is performed with respect to a single volume. For clarity, the term "patient level" is used hereafter instead of "examination level," as the question of interest to the illustrative embodiments is whether or not the patient has a lesion. It should be appreciated that in other illustrative embodiments, the operations described herein may be extended to the examination level, where multiple image volumes for the same patient may be evaluated.
For these illustrative embodiments, given the output of the prior stages of the AI pipeline 100 (slices, masks, lesions, lesion and anatomical contours, etc.) as the input 148 to the false positive removal stage 150, the false positive removal stage 150 operates at a high-specificity operating point at the patient level (input volume level) so as to allow only a few patient-level false positives (normal patients/volumes for which at least one lesion is detected). This operating point can be retrieved from an analysis of a patient-level receiver operating characteristic (ROC) curve (patient-level sensitivity versus patient-level specificity). For those volumes that yield at least some lesions at this high-specificity operating point (referred to herein as the patient-level operating point OP_PATIENT), a more sensitive operating point at the level of the lesion (referred to herein as the lesion-level operating point OP_LESION) is used. The lesion-level operating point OP_LESION can be identified from an analysis of the lesion-level ROC curve (lesion sensitivity versus lesion specificity) so as to maximize the number of lesions retained.
These two operating points (i.e., OP_PATIENT and OP_LESION) may be implemented in one or more trained ML/DL computer models. The one or more trained ML/DL computer models are trained to classify the input volume and/or its lesion list (the result of the segmentation logic) as to whether each identified lesion is a true lesion or a false lesion (i.e., a true positive or a false positive). The one or more trained ML/DL computer models may be implemented as binary classifiers, where the output indicates, for each lesion, whether it is a true positive or a false positive. The output set comprising the binary classifications of all lesions in the input lesion list may be used to filter the lesion list to remove false positives. In one illustrative embodiment, the one or more trained ML/DL computer models first implement the patient-level operating point to determine whether the results of the classification indicate that any lesions in the list of lesions are true positives, while filtering out false positives. If any true positives remain in the first filtered list of lesions after the patient-level (input volume level) filtering, the lesion-level operating point is used to filter out remaining false positives (if any). Thus, a filtered list of lesions is generated in which false positives are minimized.
The operating points may be implemented with respect to a single trained ML/DL computer model or multiple trained ML/DL computer models. For example, using a single trained ML/DL computer model, the operating point may be a dynamically switchable setting of the operating parameters of the ML/DL computer model. That is, the input to the ML/DL computer model may be processed using the patient-level operating point to generate results indicating whether the list of lesions includes true positives after classifying each lesion; if so, the operating point of the ML/DL computer model may be switched to the lesion-level operating point and the input processed again, with the false positives identified by the ML/DL computer model in each pass being removed from the final list of lesions output by the false positive removal stage. Alternatively, in some illustrative embodiments, two separate ML/DL computer models may be trained, one for the patient-level operating point and one for the lesion-level operating point, such that a result of the first ML/DL computer model indicating at least one true positive causes processing of the input through the second ML/DL computer model, and the false positives identified by both models are removed from the final lesion list output by the false positive removal stage of the AI pipeline.
Training of the ML/DL computer model(s) may involve a machine learning training operation in which the ML/DL computer model processes a training input comprising an image volume and a corresponding lesion list, where the lesion list comprises lesion masks or contours, to generate a classification for each lesion in the image as to whether it is a true positive or a false positive. The training input is further associated with ground truth information indicating whether the image includes lesions, which may then be used to evaluate the output generated by the ML/DL computer model to determine a loss, or error, and thereafter modify the operating parameters of the ML/DL computer model to reduce the determined loss/error. In this way, the ML/DL computer model learns the input features that are indicative of true/false positive lesion detections. The machine learning may be performed with respect to each of the operating points (i.e., OP_PATIENT and OP_LESION), which allows the operating parameters of the ML/DL computer model to be learned taking into account patient-level sensitivity/specificity and/or lesion-level sensitivity/specificity.
In classifying a lesion as to whether it is a true positive or a false positive, an input volume (representing a patient at the "patient level") is considered positive if it contains at least one lesion. If the input volume does not contain a lesion, the input volume is considered negative. In this regard, a true positive is defined as a positive input volume (i.e., an input volume with at least one finding classified as a lesion that actually is a lesion). A true negative is defined as a negative input volume (i.e., an input volume with no lesions and no findings classified as lesions). A false positive is defined as a negative input volume, having no lesion, for which the findings nonetheless indicate a lesion (i.e., the AI pipeline lists a lesion when no lesion is present). A false negative is defined as a positive input volume, having a lesion, for which the AI pipeline does not indicate a lesion in the findings. The trained ML/DL computer model classifies the lesions in the input as to whether they are true positives or false positives. The false positives are filtered out of the output generated by the false positive removal. The detection of false positives is performed at the patient level and at the lesion level (i.e., at the two different operating points) with different sensitivity/specificity levels.
The two different operating points for the patient level and the lesion level may be determined based on ROC curve analysis. The ROC curves may be calculated using ML/DL computer model validation data consisting of several input volumes (e.g., several input volumes corresponding to different patient examinations), which may contain some number of lesions (between 0 and K lesions per examination). The inputs to the trained ML/DL computer model, or "classifier," are the findings previously detected in the input, whether actual lesions or false positives (e.g., the output of the lesion detection and segmentation stages of the AI pipeline). The first operating point (i.e., the patient-level operating point OP_PATIENT) is defined so as to retain at least X% of the lesions identified as true positives, which means that almost all true positives are retained while some false positives are removed. The value of X may be set based on analysis of the ROC curve and may be any value suitable for a particular implementation. In one illustrative embodiment, the value of X is set to 98%, such that nearly all true positives are retained while some false positives are removed.
The second operating point (i.e., the lesion-level operating point OP_LESION) is defined such that the lesion sensitivity is higher than the lesion sensitivity obtained at the first operating point (i.e., the patient-level operating point OP_PATIENT), and such that the specificity is higher than Y%, where Y depends on the actual performance of the trained ML/DL computer model. In one illustrative embodiment, Y is set to 30%. An example of the ROC curves used for patient-level and lesion-level operating point determination is shown in FIG. 18A. As shown in FIG. 18A, the lesion-level operating point is selected along the lesion-level ROC curve such that the lesion sensitivity is higher than that of the patient-level operating point.
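As an illustrative sketch of how the two operating points might be read off such ROC analysis (assuming scikit-learn, with X = 98% and Y = 30% as in the embodiment above; the selection heuristics here are assumptions, not the claimed procedure):

```python
# Illustrative sketch: reading the two operating points off ROC analysis,
# with X = 98% retained true positives and Y = 30% minimum specificity.
import numpy as np
from sklearn.metrics import roc_curve

def pick_operating_points(y_true, scores, x_keep=0.98, y_spec=0.30):
    fpr, tpr, thr = roc_curve(y_true, scores)
    i = int(np.argmax(tpr >= x_keep))              # first point keeping >= X% of TPs
    op_patient = thr[i]                            # high-specificity OP_PATIENT
    ok = (tpr > tpr[i]) & ((1.0 - fpr) > y_spec)   # more sensitive, specificity > Y%
    op_lesion = thr[ok].min() if ok.any() else op_patient  # most sensitive OP_LESION
    return op_patient, op_lesion
```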
FIG. 18B is an example flowchart of operations for performing false positive removal based on the patient-level and lesion-level operating points, in accordance with one illustrative embodiment. As shown in FIG. 18B, the results of the segmentation stage logic of the AI pipeline are input 1810 to a first trained ML/DL computer model 1820 that implements the first operating point. The input 1810 includes an input volume (or volume of interest (VOI)) and a lesion list including lesion mask or contour data specifying the pixels or voxels corresponding to each lesion identified in the image data of the image volume, along with labels associated with these pixels specifying the lesions to which they correspond in the three-dimensional space of the input volume (i.e., the output of the splitting, z-connection, and contour refinement previously described). The input may be represented as a set S. The first trained ML/DL computer model 1820 implements the patient-level operating point in its training so as to classify features extracted from the input such that X% (e.g., 98%) of true positives are retained in the resulting filtered lesion list, while some false positives are removed. The resulting list includes a subset S+ containing the true positive lesions classified by the first ML/DL computer model 1820 and a subset S− containing the false positive lesions classified by the first ML/DL computer model 1820.
The false positive removal logic further includes true positive evaluation logic 1830 that determines whether the subset of true positives output by the first ML/DL computer model 1820 is empty. That is, the true positive evaluation logic 1830 determines whether no elements of S were classified as true lesions by the first ML/DL computer model 1820. If the subset of true positives is empty, the true positive evaluation logic 1830 causes the (empty) subset of true positives S+ to be output as the filtered lesion list 1835 (i.e., no lesions will be identified in the output sent to the lesion classification stage of the AI pipeline). If the true positive evaluation logic 1830 determines that the subset of true positives S+ is not empty, a second ML/DL computer model 1840 is executed on the input S, where the second ML/DL computer model 1840 implements the second operating point (i.e., the lesion-level operating point OP_LESION) in its training. It should be appreciated that while two ML/DL computer models 1820 and 1840 are shown for ease of explanation, as described above, the two operating points may be implemented as different sets of trained operating parameters for configuring the same ML/DL computer model, such that the second ML/DL computer model may be the same ML/DL computer model as 1820 processing the input S, but configured with the different operating parameters corresponding to the second operating point.
The second ML/DL computer model 1840 processes the input, using the trained operating parameters corresponding to the second operating point, to again generate a classification of the lesions as to whether they are true positives or false positives. The result is a subset S′+ containing the predicted lesions (true positives) and a subset S′− containing the predicted false positives. The filtered lesion list 1845 is then output as the subset S′+, thereby effectively eliminating the false positives specified in the subset S′−.
The example embodiments shown in FIGS. 18A and 18B are described in terms of patient-level and lesion-level operating points. It should be appreciated that the mechanism for false positive removal can be implemented with a variety of operating points at different levels. For example, a similar operation may be performed with input-volume-level and voxel-level operating points in a "voxel-wise" false positive removal operation. FIG. 18C is an example flowchart of operations for performing voxel-wise false positive removal based on input-volume-level and voxel-level operating points, in accordance with one illustrative embodiment. The operations in FIG. 18C are similar to those of FIG. 18B, but are performed with respect to the voxels in the input set S. With voxel-wise false positive removal, the first operating point may again be a patient-level or input-volume-level operating point, while the second operating point may be a voxel-level operating point OP_VOXEL. In this case, true positives and false positives are evaluated at the voxel level, such that a voxel is a true positive if it is indicated as being associated with a lesion and it is actually associated with a lesion, but is considered a false positive if it is indicated as being associated with a lesion when it is not actually associated with a lesion. The operating points can be set appropriately based on the corresponding ROC curves, such that a balance between sensitivity and specificity similar to that described above is achieved.
It should also be appreciated that while the above illustrative embodiments of the false positive removal mechanism assume a single input volume from a patient examination, the illustrative embodiments may be applied to any grouping of one or more images (slices). For example, false positive removal may be applied to a single slice, to a set of slices smaller than the input volume, or even to multiple input volumes from the same examination.
FIG. 19 is a flowchart outlining an example operation of the false positive removal logic of an AI pipeline in accordance with one illustrative embodiment. As shown in FIG. 19, the operation begins (step 1900) with receiving input S from a previous stage of the AI pipeline, where the input may include, for example, an input image volume and a corresponding list of lesions including masks, contours, etc. (step 1910). The input is processed by a first trained ML/DL computer model that is trained to implement a first operating point (e.g., a relatively more specific and less sensitive patient-level operating point) to generate a first set of classifications of the lesions, comprising a true positive subset and a false positive subset (step 1920). A determination is made as to whether the true positive subset is empty (step 1930). If the true positive subset is empty, the operation outputs the (empty) true positive subset as the filtered list of lesions (step 1940) and the operation terminates. If the true positive subset is not empty, the input S is processed by a second ML/DL computer model trained to implement a second operating point that is relatively more sensitive and less specific than the first operating point (step 1950). As described above, in some illustrative embodiments, the first and second ML/DL computer models may be the same model configured with different operating parameters corresponding to the different training for the different operating points. The result of the processing by the second ML/DL computer model is a second set of classifications of the lesions, comprising a second true positive subset and a second false positive subset. The second true positive subset is then output as the filtered list of lesions (step 1960) and the operation terminates.
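A compact sketch of the FIG. 19 flow follows, with a single scoring function standing in for the trained ML/DL classifier(s) and the two operating points expressed as score thresholds (all names are illustrative assumptions):

```python
# Illustrative sketch of the FIG. 19 flow: one scoring function stands in for
# the trained ML/DL classifier(s); the operating points act as thresholds.
def remove_false_positives(lesions, score, op_patient, op_lesion):
    # Stage 1: strict, high-specificity patient-level operating point.
    s_plus = [l for l in lesions if score(l) >= op_patient]
    if not s_plus:
        return []           # volume judged lesion-free: output an empty list
    # Stage 2: more sensitive lesion-level operating point, applied to the full input S.
    return [l for l in lesions if score(l) >= op_lesion]
```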
Example computer system environment
The illustrative embodiments may be utilized in many different types of data processing environments. In order to provide context for describing the specific elements and functionality of the illustrative embodiments, FIGS. 20 and 21 are provided below as example environments in which aspects of the illustrative embodiments may be implemented. It should be appreciated that FIGS. 20 and 21 are only examples and are not intended to assert or imply any limitation with regard to the environments in which aspects or embodiments of the present invention may be implemented. Many modifications to the depicted environments may be made without departing from the spirit and scope of the present invention.
FIG. 20 depicts a schematic diagram of one illustrative embodiment of a cognitive system 2000 implementing a request processing pipeline 2008, which in some embodiments may be a question answering (QA) pipeline, a treatment recommendation pipeline, a medical imaging augmentation pipeline, or any other artificial intelligence (AI) or cognitive-computing-based pipeline that processes requests using complex artificial intelligence mechanisms that approximate human cognitive processes with regard to the results generated, but via different computer-specific processes. For purposes of this description, it will be assumed that the request processing pipeline 2008 is implemented as a QA pipeline that operates on structured and/or unstructured requests in the form of input questions. One example of a question processing operation which may be used in conjunction with the principles described herein is described in U.S. Patent Application Publication No. 2011/0125734, which is herein incorporated by reference in its entirety.
The cognitive system 2000 is implemented on one or more computing devices 2004A-D (including one or more processors and one or more memories, and potentially any other computing device elements known in the art, including buses, storage devices, communication interfaces, etc.) connected to a computer network 2002. For illustration purposes only, fig. 20 depicts the cognitive system 2000 implemented only on the computing device 2004A, but as described above, the cognitive system 2000 may be distributed across multiple computing devices (such as multiple computing devices 2004A-D). Network 2002 includes a plurality of computing devices 2004A-D operable as server computing devices, and 2010-2012 operable as client computing devices in communication with each other and other devices or components via one or more wired and/or wireless data communication links, wherein each communication link includes one or more of a wire, a router, a switch, a transmitter, a receiver, etc. In some illustrative embodiments, the cognitive system 2000 and the network 2002 implement their question processing and answer generation (QA) functions via the respective computing devices 2010-2012 of one or more cognitive system users. In other embodiments, the cognitive system 2000 and the network 2002 may provide other types of cognitive operations, including but not limited to request processing and cognitive response generation, which may take many different forms depending on the desired implementation (e.g., cognitive information retrieval, training/instruction of the user, cognitive assessment of data, etc.). Other embodiments of the cognitive system 2000 may be used with components, systems, subsystems, and/or devices other than those depicted herein.
The cognitive system 2000 is configured to implement a request processing pipeline 2008 that receives inputs from various sources. The request may be in the form of a natural language question, a natural language request for information, a natural language request for performance of a cognitive operation, or the like. For example, the cognitive system 2000 receives input from the network 2002, a corpus or multiple corpuses of electronic documents 2006, cognitive system users, and/or other data and other possible input sources. In one embodiment, some or all of the inputs to cognitive system 2000 are routed through network 2002. Each computing device 2004A-D on the network 2002 includes an access point for content creators and cognitive system users. Some of the computing devices 2004A-D include a device for storing a corpus or database of multiple corpuses of data 2006 (which are shown as separate entities in fig. 20 for illustrative purposes only). The corpus or portions of the corpus of data 2006 can also be provided on one or more other network-attached storage devices, in one or more databases, or in other computing devices not explicitly shown in fig. 20. In various embodiments, network 2002 includes local network connections and remote connections, such that awareness system 2000 may operate in an environment of any size, including local and global (e.g., the internet).
In one embodiment, the content creator creates content in a corpus of data 2006 or documents of multiple corpuses to be used as part of a corpus of data for the cognitive system 2000. A document includes any file, text, article, or data source used in the cognitive system 2000. Cognitive system users access the cognitive system 2000 via a network connection or an internet connection to the network 2002 and input questions/requests to the cognitive system 2000 that are answered/processed based on content in the corpus or corpora of data 2006. In one embodiment, natural language is used to form the question/request. The cognitive system 2000 parses and interprets the questions/requests via the pipeline 2008 and provides responses to cognitive system users (e.g., cognitive system user 2010) containing one or more answers to the posed questions, responses to the requests, results of processing the requests, and the like. In some embodiments, the cognitive system 2000 provides responses to the user in an ordered list of candidate answers/responses, while in other illustrative embodiments, the cognitive system 2000 provides a single final answer/response or a combination of a final answer/response and an ordered list of other candidate answers/responses.
The cognitive system 2000 implements a pipeline 2008 that includes multiple stages for processing input questions/requests based on information obtained from a corpus or multiple corpuses of data 2006. The pipeline 2008 generates answers/responses to the input questions or requests based on the processing of the input questions/requests and the corpus or corpuses of the data 2006.
In some illustrative embodiments, the cognitive system 2000 may be the IBM Watson™ cognitive system available from International Business Machines Corporation of Armonk, New York, augmented with the mechanisms of the illustrative embodiments described below. As outlined previously, the pipeline of the IBM Watson™ cognitive system receives an input question or request, which it then parses to extract the major features of the question/request, which in turn are used to formulate queries that are applied to the corpus or corpora of data 2006. Based on the application of the queries to the corpus or corpora of data 2006, a set of hypotheses, or candidate answers/responses to the input question/request, are generated by looking across the corpus or corpora of data 2006 (hereinafter referred to simply as the corpus 2006) for portions of the corpus 2006 that have some potential for containing a valuable response to the input question/request (hereinafter assumed to be an input question). The pipeline 2008 of the IBM Watson™ cognitive system then performs a deep analysis of the language of the input question and the language used in each of the portions of the corpus 2006 found during the application of the queries, using a variety of reasoning algorithms.
The scores obtained from the various reasoning algorithms are then weighted against a statistical model that summarizes the level of confidence that the pipeline 2008 of the IBM Watson™ cognitive system 2000 has, in this example, regarding the evidence that the potential candidate answer is inferred by the question. This process is repeated for each of the candidate answers to generate a ranked listing of candidate answers, which may then be presented to the user that submitted the input question (e.g., the user of client computing device 2010), or from which a final answer is selected and presented to the user. More information about the pipeline 2008 of the IBM Watson™ cognitive system 2000 may be obtained, for example, from the IBM Corporation website, IBM Redbooks, and the like. For example, information about the pipeline of the IBM Watson™ cognitive system can be found in Yuan et al., "Watson and Healthcare," IBM developerWorks, 2011 and "The Era of Cognitive Systems: An Inside Look at IBM Watson and How it Works" by Rob High, IBM Redbooks, 2012.
As noted above, while input to the cognitive system 2000 from a client device may be posed in the form of a natural language question, the illustrative embodiments are not limited to such. Rather, the input question may in fact be formatted or structured as any suitable type of request that may be parsed and analyzed using structured and/or unstructured input analysis (including, but not limited to, the natural language parsing and analysis mechanisms of a cognitive system such as IBM Watson™) to determine the basis upon which to perform the cognitive analysis and provide a result of the cognitive analysis. For example, a physician, patient, or the like may issue a request to the cognitive system 2000 via their client computing device 2010 for a particular medical imaging based operation (e.g., "identify the liver lesions present in patient ABC," "provide treatment recommendations for the patient," "identify changes in the liver lesions of patient ABC," etc.). In accordance with the illustrative embodiments, such requests may be directed specifically to cognitive computer operations that employ the lesion detection and classification mechanisms of the illustrative embodiments to provide a list of lesions, contours of the lesions, classifications of the lesions, and contours of the anatomical structure of interest, upon which the cognitive system 2000 operates to provide a cognitive computing output. For example, the request processing pipeline 2008 may process a request such as "identify the liver lesions present in patient ABC" to parse the request and thereby identify that the anatomical structure of interest is the "liver," that the particular input volume is the medical imaging volume for patient "ABC," and that "lesions" in the anatomical structure are to be identified. Based on this parsing, the particular medical imaging volume corresponding to patient "ABC" may be retrieved from the corpus 2006 and input to the lesion detection and classification AI pipeline 2020, which operates on this input volume in the manner previously described to identify a list of liver lesions, which is output to the cognitive computing system 2000 for further evaluation by the request processing pipeline 2008, for generating a medical imaging viewer application output, and so on.
As shown in fig. 20, one or more of these computing devices (e.g., server 2004) may be specially configured to implement a lesion detection and classification AI pipeline 2020 (e.g., like AI pipeline 100 in fig. 1). Configuration of the computing device may include providing specialized hardware, firmware, etc. to facilitate execution of the operations and generation of output described herein with respect to the illustrative embodiments. The configuration of the computing device may also or alternatively include providing a software application stored in one or more storage devices and loaded into memory of the computing device (such as the server 2004) for causing one or more hardware processors of the computing device to execute the software application that configures the processors to perform operations and generate the output described herein with respect to the illustrative embodiments. Moreover, any combination of specialized hardware, firmware, software applications executing on hardware, or the like, may be used without departing from the spirit and scope of the illustrative embodiments.
It should be understood that once a computing device is configured in one of these ways, the computing device becomes a special purpose computing device that is specifically configured to implement the mechanisms of the illustrative embodiments and is not a general purpose computing device. Moreover, as described herein, implementation of the mechanisms of the illustrative embodiments improves the functionality of the computing device and provides useful and specific results that facilitate automatic lesion detection in anatomical structures of interest and classification of such lesions, which reduces errors and improves efficiency relative to manual processes.
As described above, the mechanisms of the illustrative embodiments utilize a specially configured computing device or data processing system to perform operations for performing anatomical structure recognition, lesion detection, and classification. These computing devices or data processing systems may include various hardware elements that are specially configured, either by hardware configuration, software configuration, or a combination of hardware and software configuration, to implement one or more of the systems/subsystems described herein. FIG. 21 is a block diagram of but one example data processing system in which aspects of the illustrative embodiments may be implemented. Data processing system 2100 is an example of a computer, such as server 2004 in fig. 20, in which computer usable code or instructions implementing the processes and aspects of the illustrative embodiments of the present invention may be located and/or executed to achieve the operations, outputs, and external effects of the illustrative embodiments described herein.
In the depicted example, data processing system 2100 employs a hub architecture including a north bridge and memory controller hub (NB/MCH)2102 and a south bridge and input/output (I/O) controller hub (SB/ICH) 2104. Processing unit 2106, main memory 2108, and graphics processor 2110 are connected to NB/MCH 2102. Graphics processor 2110 may be connected to NB/MCH2102 through an Accelerated Graphics Port (AGP).
In the depicted example, Local Area Network (LAN) adapter 2112 connects to SB/ICH 2104. Audio adapter 2116, keyboard and mouse adapter 2120, modem 2122, Read Only Memory (ROM)2124, Hard Disk Drive (HDD)2126, CD-ROM drive 2130, Universal Serial Bus (USB) ports and other communication ports 2132, and PCI/PCIe devices 2134 connect to SB/ICH2104 through bus 2138 and bus 2140. PCI/PCIe devices may include, for example, Ethernet adapters, add-in cards, and PC cards for notebook computers. PCI uses a card bus controller, while PCIe does not. ROM2124 may be, for example, a flash basic input/output system (BIOS).
HDD2126 and CD-ROM drive 2130 connect to SB/ICH2104 via bus 2140. HDD2126 and CD-ROM drive 2130 may use, for example, an Integrated Drive Electronics (IDE) or Serial Advanced Technology Attachment (SATA) interface. A super I/O (SIO) device 2136 may be connected to SB/ICH 2104.
An operating system runs on processing unit 2106. The operating system coordinates and provides control of various components within data processing system 2100 in FIG. 21. As a client, the operating system may be a commercially available operating system such as Microsoft® Windows®. An object-oriented programming system, such as the Java™ programming system, may run in conjunction with the operating system and provides calls to the operating system from Java™ programs or applications executing on data processing system 2100.

As a server, data processing system 2100 may be, for example, an IBM eServer™ System p® computer system, a Power™ processor based computer system, or the like, running the Advanced Interactive Executive (AIX®) operating system or the LINUX® operating system. Data processing system 2100 may be a symmetric multiprocessor (SMP) system including a plurality of processors in processing unit 2106. Alternatively, a single processor system may be employed.
Instructions for the operating system, the object-oriented programming system, and applications or programs are located on storage devices, such as HDD2126, and may be loaded into main memory 2108 for execution by processing unit 2106. The processes for illustrative embodiments of the present invention may be performed by processing unit 2106 using computer usable program code, which may be located in a memory such as, for example, main memory 2108, ROM2124, or in one or more peripheral devices 2126 and 2130, for example.
A bus system, such as bus 2138 or bus 2140 as shown in FIG. 21, may be comprised of one or more buses. Of course, the bus system may be implemented using any type of communication fabric or architecture that provides for a transfer of data between different components or devices attached to the fabric or architecture. A communication unit (such as modem 2122 or network adapter 2112 of fig. 21) may include one or more devices used to transmit and receive data. A memory may be, for example, main memory 2108, ROM2124, or a cache such as found in NB/MCH2102 in FIG. 21.
As described above, in some demonstrative embodiments, the mechanisms of the illustrative embodiments may be implemented as application-specific hardware, firmware, or the like, as application software stored in a storage device, such as HDD2126, and loaded into memory, such as main memory 2108, for execution by one or more hardware processors, such as processing unit 2106, or the like. As such, the computing device shown in fig. 21 becomes specially configured to implement the mechanisms of the illustrative embodiments and to perform operations and generate the outputs described herein with respect to the lesion detection and classification artificial intelligence pipeline.
Those of ordinary skill in the art will appreciate that the hardware in FIGS. 20 and 21 may vary depending on the implementation. Other internal hardware or peripheral devices, such as flash memory, equivalent non-volatile memory, or optical disk drives and the like, may be used in addition to or in place of the hardware depicted in figures 20 and 21. Also, the processes of the illustrative embodiments may be applied to a multiprocessor data processing system, other than the SMP system mentioned previously, without departing from the spirit and scope of the present invention.
Moreover, the data processing system 2100 may take the form of any of a number of different data processing systems including client computing devices, server computing devices, a tablet computer, laptop computer, telephone or other communication device, a Personal Digital Assistant (PDA), or the like. In some illustrative examples, data processing system 2100 may be a portable computing device configured with flash memory to provide non-volatile memory for storing operating system files and/or user-generated data, for example. Essentially, data processing system 2100 may be any known or later developed data processing system without architectural limitation.
As mentioned above, it should be appreciated that the illustrative embodiments may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment containing both hardware and software elements. In an example embodiment, the mechanisms of the illustrative embodiments are implemented in software or program code, which includes but is not limited to firmware, resident software, microcode, etc.
A data processing system suitable for storing and/or executing program code will include at least one processor coupled directly or indirectly to memory elements through a communication bus such as, for example, a system bus. The memory elements can include local memory employed during actual execution of the program code, bulk storage, and cache memories which provide temporary storage of at least some program code in order to reduce the number of times code must be retrieved from bulk storage during execution. The memory can be of various types including, but not limited to, ROM, PROM, EPROM, EEPROM, DRAM, SRAM, flash, solid state memory, and the like.
Input/output or I/O devices (including but not limited to keyboards, displays, pointing devices, etc.) can be coupled to the system either directly or through intervening wired or wireless I/O interfaces and/or controllers, etc. The I/O devices may take many different forms other than traditional keyboards, displays, pointing devices, etc., such as, for example, communication devices coupled by wired or wireless connections, including but not limited to smart phones, tablet computers, touch screen devices, voice recognition devices, etc. Any known or later developed I/O devices are intended to be within the scope of the illustrative embodiments.
Network adapters may also be coupled to the system to enable the data processing system to become coupled to other data processing systems or remote printers or storage devices through intervening private or public networks. Modems, cable modem and Ethernet cards are just a few of the currently available types of network adapters for wired communications. Network adapters based on wireless communications may also be utilized including, but not limited to, 802.11a/b/g/n wireless communications adapters, bluetooth wireless adapters, and the like. Any known or later developed network adapter is intended to be within the spirit and scope of the present invention.
The description of the present invention has been presented for purposes of illustration and description, and is not intended to be exhaustive or limited to the invention in the form disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The embodiment was chosen and described in order to best explain the principles of the invention, the practical application, and to enable others of ordinary skill in the art to understand the invention for various embodiments with various modifications as are suited to the particular use contemplated. The terminology used herein was chosen to best explain the principles of the embodiments, the practical application, or technical improvements found in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims (27)

1. In a data processing system comprising at least one processor and at least one memory including instructions executed by the at least one processor to implement a trained machine learning computer model for seed relabeling for seed-based sliced lesion segmentation, a method comprising:
receiving a lesion mask for a three-dimensional medical image volume, wherein the lesion mask corresponds to lesions detected in the medical image volume, and wherein each detected lesion has a lesion contour;
generating a distance map for a given two-dimensional slice in the medical image volume based on the lesion mask, wherein the distance map comprises distances to a lesion contour for each voxel of the given two-dimensional slice;
performing local maximum identification to select a set of local maxima from the distance map such that each local maximum has a value greater than its immediate neighboring points;
performing a seed re-labeling based on the distance map and the set of local maxima to generate a set of seeds, wherein each seed in the set of seeds represents a center of a different component of a lesion contour; and
performing image segmentation on the lesion mask based on the set of seeds to form a split lesion mask.
2. The method of claim 1, wherein generating the distance map comprises performing Gaussian smoothing on the distance map.
3. The method of claim 1, wherein performing seed relabeling comprises grouping first and second local maxima in response to determining that the first and second local maxima are direct neighbors.
4. The method of claim 1, wherein performing seed relabeling comprises:
determining a circle centered at each local maximum, the circle having a radius equal to the corresponding distance value in the distance map;
calculating an overlap metric of a first circle centered at the first local maximum and a second circle centered at the second local maximum; and
grouping the first local maximum and the second local maximum if the overlap metric is greater than a predetermined threshold.
5. The method of claim 4, wherein the overlap metric is calculated as follows:

$$\text{overlap}(S_1, S_2) = \frac{|S_1 \cap S_2|}{\min(|S_1|, |S_2|)}$$

wherein |S1| represents an area of the first circle, |S2| represents an area of the second circle, and |S1 ∩ S2| represents an area of intersection of the first circle and the second circle.
6. The method of claim 4, wherein the overlap metric is calculated as follows:

$$\text{overlap}(S_1, S_2) = \frac{|S_1 \cap S_2|}{|S_1 \cup S_2|}$$

wherein |S1| represents an area of the first circle, |S2| represents an area of the second circle, |S1 ∩ S2| represents an area of intersection of the first circle and the second circle, and |S1 ∪ S2| represents an area of the union (combination) of the first circle and the second circle.
7. The method of claim 1, wherein performing image segmentation comprises performing a watershed algorithm on the lesion mask based on the set of local maxima to form an initial split lesion mask defining a first set of lesions.
8. The method of claim 7, wherein performing image segmentation further comprises merging lesions of the first set of lesions to form a revised split lesion mask based on the results of the seed relabeling.
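Building on the two sketches above, the following hedged sketch shows one way the claim-8 merging step could combine watershed labels whose seed circles overlap; the union-find bookkeeping and the threshold value of 0.5 are illustrative assumptions.

```python
# Hedged sketch of claim-8 merging: watershed labels whose seed circles
# overlap beyond a threshold are merged into one lesion. Assumes `labels`
# and `peaks` come from the split_lesions sketch, radii are read from the
# distance map, and circle_overlap_min is the claim-5 metric above.
import numpy as np

def merge_split_lesions(labels, peaks, distance, threshold=0.5):
    radii = [distance[r, c] for r, c in peaks]
    parent = list(range(len(peaks)))       # union-find parents, one per seed

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(peaks)):
        for j in range(i + 1, len(peaks)):
            d = np.hypot(peaks[i][0] - peaks[j][0], peaks[i][1] - peaks[j][1])
            if circle_overlap_min(d, radii[i], radii[j]) > threshold:
                parent[find(i)] = find(j)  # same lesion: union the two seeds

    merged = np.zeros_like(labels)
    for idx in range(len(peaks)):
        # Watershed labels start at 1; merged labels are the group roots.
        merged[labels == idx + 1] = find(idx) + 1
    return merged
```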
9. A computer program product comprising a computer readable storage medium having a computer readable program stored therein, wherein the computer readable program, when executed on a computing device, causes the computing device to implement a trained machine learning computer model for seed relabeling for seed-based sliced lesion segmentation, wherein the trained machine learning computer model performs the steps of the method according to any one of claims 1 to 8.
10. An apparatus, comprising:
a processor; and
a memory coupled to the processor, wherein the memory comprises instructions that when executed by the processor cause the processor to implement a trained machine learning computer model for seed relabeling for seed-based sliced lesion segmentation, wherein the trained machine learning computer model performs the steps of the method of any one of claims 1-8.
11. A computer system comprising means for performing the steps of the method according to any one of claims 1 to 8.
12. In a data processing system comprising at least one processor and at least one memory including instructions executable by the at least one processor to implement lesion segmentation logic for refining a lesion contour with combined active contour and inpainting, a method comprising:
receiving an initial segmented medical image having organ tissue, the initial segmented medical image comprising a set of object contours and a contour to be refined;
inpainting object voxels within all contours of the set;
calculating an updated contour around the contour to be refined based on the inpainted object voxels to form an updated segmented medical image;
determining whether the updated segmented medical image is improved as compared to the initial segmented medical image; and
maintaining the updated segmented medical image in response to the updated segmented medical image being improved.
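A minimal two-dimensional sketch of the claim-12 flow follows, assuming grayscale slices and binary boolean masks; skimage's inpaint_biharmonic and morphological_chan_vese stand in for the inpainting and combined-active-contour steps and are not the patented implementation.

```python
# Hedged 2-D sketch of the claim-12 flow: inpaint the other object contours
# so they cannot bias the contour evolution, then evolve an active contour
# around the contour to be refined.
import numpy as np
from skimage.restoration import inpaint_biharmonic
from skimage.segmentation import morphological_chan_vese

def refine_contour(image, object_masks, refine_mask, iterations=50):
    # Inpaint voxels inside every object contour in the set (cf. the
    # "inpainting object voxels" step of claim 12).
    to_inpaint = np.zeros(image.shape, dtype=bool)
    for m in object_masks:
        to_inpaint |= m
    repaired = inpaint_biharmonic(image, to_inpaint)
    # Evolve the contour to be refined on the repaired image.
    updated = morphological_chan_vese(repaired, iterations,
                                      init_level_set=refine_mask)
    return updated.astype(bool), repaired
```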
13. The method of claim 12, further comprising:
in response to the updated segmented medical image not being improved, maintaining the initial segmented medical image.
14. The method of claim 12, wherein determining whether the updated segmented medical image is improved comprises:
determining a first contrast of the initial segmented medical image;
determining a second contrast of the updated segmented medical image; and
determining whether the second contrast is greater than the first contrast.
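The claims do not define "contrast"; one plausible reading, used in the hedged sketch below, is the absolute difference between the mean intensity inside the contour and the mean intensity in a surrounding band.

```python
# One plausible contrast measure for the claim-14 test (an assumption, as
# the claims do not define "contrast"); masks are binary boolean arrays.
import numpy as np
from scipy import ndimage

def contour_contrast(image, interior, band=3):
    # Band of voxels outside the contour but within `band` voxels of it.
    ring = ndimage.binary_dilation(interior, iterations=band) & ~interior
    return abs(float(image[interior].mean()) - float(image[ring].mean()))

# Keep the updated contour when its contrast exceeds the initial contrast.
```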
15. The method of claim 12, wherein determining whether the updated segmented medical image is improved comprises:
determining a first variance between voxels within each contour of the initial segmented medical image and voxels that are outside the contour but remain within a predetermined distance from the contour;
determining a second variance between voxels within each contour of the updated segmented medical image and voxels outside the contour but remaining within a predetermined distance from the contour; and
determining whether the second variance is less than the first variance.
16. The method of claim 15, wherein determining the first variance and the second variance comprises calculating the variance of the set of $n$ voxels $\{x_1, \ldots, x_n\}$ as follows:
$$\sigma^2 = \frac{1}{n} \sum_{i=1}^{n} \left( x_i - \bar{x} \right)^2, \qquad \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$$
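Under the claim-16 formula, a hedged sketch of the claim-15 test follows; the dilation-based band and the pooling of interior and band voxels into one set before computing the variance are assumptions about how "variance between" the two voxel sets is evaluated.

```python
# Hedged sketch of the claim-15/16 variance test: pool the voxels inside
# the contour with the voxels in a surrounding band and compute the
# population variance of the pooled set; masks are binary boolean arrays.
import numpy as np
from scipy import ndimage

def boundary_variance(image, interior, band=3):
    ring = ndimage.binary_dilation(interior, iterations=band) & ~interior
    values = np.concatenate([image[interior], image[ring]])
    return float(np.var(values))   # (1/n) * sum((x_i - mean)^2)

# The updated contour is an improvement when its boundary_variance is
# lower than that of the initial contour (claim 15).
```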
17. the method of claim 12, wherein the set of object contours includes at least one lesion contour.
18. The method of claim 12, wherein the set of object contours includes at least contours of non-organ portions of the segmented medical image.
19. The method of claim 12, wherein the set of object contours includes at least one vessel or artery contour.
20. The method of claim 12, wherein the contour to be refined corresponds to a lesion.
21. The method of claim 12, wherein inpainting the object voxels within all contours of the set comprises inpainting non-organ tissue voxels near a second lesion.
22. The method of claim 12, wherein the lesion segmentation logic is a computer logic stage in a lesion detection and classification Artificial Intelligence (AI) pipeline including a plurality of trained machine learning computer models, and wherein the lesion segmentation logic receives input from a liver/lesion detection logic stage of the AI pipeline.
23. The method of claim 22, wherein the plurality of trained machine learning computer models of the AI pipeline comprises:
one or more first machine learning computer models of the AI pipeline that process an input volume of medical images to detect lesions present in the medical images of the input volume corresponding to an anatomical structure of interest; and
one or more second machine learning computer models of the AI pipeline that process the detected lesions to perform lesion segmentation and combine lesion contours of different ones of the input medical images associated with the same lesion to generate a list of lesions and corresponding lesion contours.
24. The method of claim 22, wherein the plurality of trained machine learning computer models comprises one or more machine learning computer models that process an updated segmented medical image output by the lesion segmentation logic to remove false positives from the updated segmented medical image.
25. A computer program product comprising a computer readable storage medium having a computer readable program stored therein, wherein the computer readable program, when executed on a computing device, causes the computing device to implement lesion segmentation logic for refining a lesion contour with combined active contour and inpainting, wherein the lesion segmentation logic performs the steps of the method of any of claims 12 to 24.
26. An apparatus, comprising:
a processor; and
a memory coupled to the processor, wherein the memory includes instructions that when executed by the processor cause the processor to implement lesion segmentation logic for refining a lesion contour with combined active contour and inpainting, wherein the lesion segmentation logic performs the steps of the method of any of claims 12 to 24.
27. A computer system comprising means for performing the steps of the method according to any one of claims 12 to 24.
CN202111267702.2A 2020-10-30 2021-10-29 Seed relabeling for seed-based segmentation of medical images Pending CN114463248A (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US17/085,042 2020-10-30 2020-10-30 Refining lesion contours with combined active contour and inpainting (US11587236B2)
US17/084,875 2020-10-30 2020-10-30 Seed relabeling for seed-based segmentation of a medical image (US11749401B2)

Publications (1)

Publication Number Publication Date
CN114463248A (en) 2022-05-10

Family

ID=81406408

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111267702.2A Pending CN114463248A (en) 2020-10-30 2021-10-29 Seed relabeling for seed-based segmentation of medical images

Country Status (1)

Country Link
CN (1) CN114463248A (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114972764A (en) * 2022-08-01 2022-08-30 浙江省肿瘤医院 Multi-atlas segmentation method based on feature clustering
CN114972764B (en) * 2022-08-01 2022-11-18 浙江省肿瘤医院 Multi-atlas segmentation method based on feature clustering
CN115546149A (en) * 2022-10-09 2022-12-30 推想医疗科技股份有限公司 Liver segmentation method and device, electronic device and storage medium
CN115546149B (en) * 2022-10-09 2023-11-17 推想医疗科技股份有限公司 Liver segmentation method and device, electronic equipment and storage medium
CN116309593A (en) * 2023-05-23 2023-06-23 天津市中西医结合医院(天津市南开医院) Liver puncture biopsy B ultrasonic image processing method and system based on mathematical model
CN116309593B (en) * 2023-05-23 2023-09-12 天津市中西医结合医院(天津市南开医院) Liver puncture biopsy B ultrasonic image processing method and system based on mathematical model
CN116912213A (en) * 2023-07-20 2023-10-20 中国人民解放军总医院第六医学中心 Medical Dicom image edge contour polygonal detection algorithm and detection system
CN116912213B (en) * 2023-07-20 2024-04-19 中国人民解放军总医院第六医学中心 Medical Dicom image edge contour polygonal detection algorithm and detection system

Similar Documents

Publication Publication Date Title
US11688065B2 (en) Lesion detection artificial intelligence pipeline computing system
US11749401B2 (en) Seed relabeling for seed-based segmentation of a medical image
US11688063B2 (en) Ensemble machine learning model architecture for lesion detection
US11587236B2 (en) Refining lesion contours with combined active contour and inpainting
US11694329B2 (en) Logistic model to determine 3D z-wise lesion connectivity
US11688517B2 (en) Multiple operating point false positive removal for lesion identification
CN114463248A (en) Seed relabeling for seed-based segmentation of medical images
US20080002870A1 (en) Automatic detection and monitoring of nodules and shaped targets in image data
Xu et al. Difficulty-aware bi-network with spatial attention constrained graph for axillary lymph node segmentation
EP4195148A1 (en) Selecting training data for annotation
Jain et al. An automatic cascaded approach for pancreas segmentation via an unsupervised localization using 3D CT volumes
CN114529501A (en) Artificial intelligence pipeline computing system for lesion detection
Jeya Sundari et al. Factorization‐based active contour segmentation and pelican optimization‐based modified bidirectional long short‐term memory for ovarian tumor detection
JP2022074092A (en) Method for relabeling segmentation seed based on medical image seed, computer program, and device (relabeling of segmentation seed based on medical image seed)
Thakur et al. A systematic review of machine and deep learning techniques for the identification and classification of breast cancer through medical image modalities
US11282193B2 (en) Systems and methods for tumor characterization
US20230162353A1 (en) Multistream fusion encoder for prostate lesion segmentation and classification
Chakroborty et al. Predicting Brain Tumor Region from MRI FLAIR Images using Ensemble Method
Hiraman Liver segmentation using 3D CT scans.
REBELO SEMI-AUTOMATIC APPROACH FOR EPICARDIAL FAT SEGMENTATION AND QUANTIFICATION ON NON-CONTRAST CARDIAC CT
Patil et al. Selective Attention UNet for Segmenting Liver Tumors
Zhou et al. DeepNet model empowered cuckoo search algorithm for the effective identification of lung cancer nodules
Shakir Early detection of lung cancer
Zhu DEEP LEARNING FOR VOLUMETRIC MEDICAL IMAGE SEGMENTATION
Allgöwer et al. Liver Tumor Segmentation Using Classical Algorithms & Deep Learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20240411

Address after: Michigan

Applicant after: Maredif USA

Country or region after: U.S.A.

Address before: New York

Applicant before: International Business Machines Corp.

Country or region before: U.S.A.