CN117522891A - 3D medical image segmentation system and method - Google Patents
- Publication number
- CN117522891A (application CN202311498656.6A)
- Authority
- CN
- China
- Prior art keywords
- image
- segmentation
- model
- medical image
- training
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/11—Region-based segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/60—Rotation of whole images or parts thereof
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/40—Image enhancement or restoration using histogram techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
- G06V10/765—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects using rules for classification or partitioning the feature space
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Evolutionary Computation (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Computing Systems (AREA)
- Databases & Information Systems (AREA)
- General Health & Medical Sciences (AREA)
- Medical Informatics (AREA)
- Software Systems (AREA)
- Multimedia (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a 3D medical image segmentation system and method comprising the following steps. Step 1: collect and prepare a 3D medical image dataset comprising 3D medical images and corresponding segmentation labels, ensuring that the dataset includes images with non-uniform contrast. Step 2: enhance the contrast of the images using histogram equalization. Step 3: perform data enhancement on the data to increase the diversity and robustness of the training samples. Step 4: build a 3D CNN model for image segmentation that takes an original 3D medical image as input and outputs a segmentation label for each pixel. Step 5: train the CNN model with the prepared training data, using the equalized images as input during training to ensure that contrast has been enhanced. Step 6: segment new 3D medical images with the trained CNN model. The invention makes the model better suited to conditions of uneven contrast, which helps improve the stability and accuracy of segmentation.
Description
Technical Field
The invention relates to the technical field of medical image processing, in particular to a 3D medical image segmentation system and method.
Background
3D medical image segmentation refers to the process of precisely dividing and labeling different structures or regions in a three-dimensional medical image. This process aims to identify and isolate anatomical structures, organs, tissues, or lesions of interest in the image for further analysis, diagnosis, and treatment planning. Medical image segmentation has important applications in the field of medical imaging, such as computer-aided diagnosis, surgical planning, tumor localization, and quantitative analysis. It is a complex task because medical images often exhibit a high degree of variability: different structures in a medical image have different levels of contrast, some structures may be very bright while others are very dim, and uneven contrast makes it difficult for an automatic segmentation algorithm to determine an appropriate segmentation threshold or feature. Adaptive processing is therefore required in such cases.
Disclosure of Invention
In order to solve the problems, the invention provides a 3D medical image segmentation system and a method.
In order to achieve the above purpose, the technical scheme adopted by the invention is as follows:
in one aspect, the invention discloses a 3D medical image segmentation method, comprising the steps of:
step 1: collecting and preparing a 3D medical image dataset comprising 3D medical images and corresponding segmentation labels, ensuring that the dataset comprises images of non-uniform contrast;
step 2: enhancing contrast of the image using histogram equalization;
step 3: data enhancement is carried out on the data so as to increase the diversity and the robustness of the training samples;
step 4: establishing a 3D CNN model of image segmentation, inputting an original 3D medical image, and outputting a segmentation label of each pixel;
step 5: training the CNN model by using the prepared training data, wherein in the training process, the equalized image is used as input to ensure that contrast has been enhanced;
step 6: the new 3D medical image is segmented with a trained CNN model.
Further: the step 1 comprises the following steps:
acquiring a 3D medical image and a corresponding segmentation label;
preprocessing the obtained original 3D medical image to ensure the consistency and applicability of the data;
generating a corresponding segmentation label for each image;
dividing the data set into a training set, a verification set and a test set;
images with non-uniform contrast are selected from the dataset as examples of appropriate processing and training for the non-uniform contrast problem.
Further: the step 2 comprises the following steps:
calculating a histogram of the original 3D medical image: H(i) = frequency(i), where H(i) represents the frequency of the pixel value i;
calculating the cumulative distribution function CDF of the histogram: CDF(i) = Σ_{j=0}^{i} H(j), wherein CDF(i) is the cumulative distribution function of pixel value i;
normalizing the CDF to range between 0 and 255: CDF_normalized(i) = round(255 × (CDF(i) − min(CDF)) / (max(CDF) − min(CDF))), wherein CDF_normalized(i) is the normalized CDF, and min(CDF) and max(CDF) are the minimum and maximum values of the CDF, respectively;
mapping each pixel value in the original image to a histogram-equalized value using the normalized CDF: EnhancedImage(x, y) = CDF_normalized(OriginalImage(x, y)), wherein OriginalImage(x, y) is a pixel value in the original image, and EnhancedImage(x, y) is the pixel value after histogram equalization.
Further: the step 3 comprises the following steps:
an angle is randomly selected and then the image is rotated by this angle, the rotation angle being given by the following formula: rotation angle = random value × 2π;
the image is randomly scaled to simulate 3D medical images of different resolutions, the scaling being given by the following formula: scaling factor = 1 + random factor, where the random factor is a value of magnitude less than 1 that determines the proportion by which the image is scaled;
randomly selecting a horizontal flip or a vertical flip image;
the segmented labels corresponding to the images are also subjected to the same transformation operation to maintain consistency of the images and labels.
Further: the step 4 comprises the following steps:
establishing a 3D CNN model based on a U-Net architecture, comprising:
an encoder: comprising convolution layers and pooling layers, each convolution layer being followed by an activation function to introduce nonlinearity, which maps the input original 3D medical image to a low-resolution feature map;
a decoder: comprising convolution layers and up-sampling layers used to gradually restore the feature map generated by the encoder to the original image resolution, with an activation function applied after each convolution layer;
an output layer: comprising a convolution layer whose number of output channels equals the number of classes of the segmentation labels, used to generate a class probability distribution for each pixel.
Further: the step 5 comprises the following steps:
selecting a stochastic gradient descent algorithm for training the 3D CNN model;
cross entropy loss and Dice loss are used as loss functions:
cross entropy loss is used for multi-class segmentation tasks, and its formula is as follows:
L_CE(p, q) = −Σ_i p_i log(q_i)
wherein p is the probability distribution of the real label, q is the prediction probability distribution of the model, and i represents the category index;
the Dice penalty is used for binary segmentation tasks, and its formula is as follows:
L_Dice(p, q) = 1 − (2 Σ_i p_i q_i + ε) / (Σ_i p_i + Σ_i q_i + ε)
wherein p is the binary segmentation mask of the real label, q is the prediction mask of the model, i represents the pixel index, and ε is a small constant that avoids a zero denominator;
initializing parameters of a model before training begins;
in each training batch, forward propagation is used to calculate the model's output and loss function values, and then backward propagation is used to calculate the gradient;
the parameters of the model are updated using the stochastic gradient descent algorithm and the gradient information to reduce the value of the loss function, the parameter update rule being as follows:
θ_{t+1} = θ_t − α ∇_θ L(θ_t)
wherein θ_t denotes the model parameters at time step t, α is the learning rate, and ∇_θ L(θ_t) is the gradient of the loss function with respect to the parameters;
repeating the above steps until the model converges;
after each training period is completed, the performance of the model is assessed on the validation dataset, using segmentation performance indices such as the Dice coefficient, Jaccard index, and accuracy to assess the accuracy of the model.
Further: the step 6 comprises the following steps:
preparing a new 3D medical image, ensuring that the image has been equalized to handle non-uniform contrast;
preprocessing the new 3D medical image to ensure that the input is consistent with the model training;
inputting the preprocessed image into a trained 3D CNN model, and generating segmentation prediction for each pixel through a forward propagation process of the model;
generating a final segmentation result by adopting the following strategy according to the output of the model:
for multi-category segmentation, selecting the category with the highest probability as a label of the pixel;
for binary segmentation, a threshold is selected to map the probability to a binary segmentation result.
In another aspect, the invention discloses a 3D medical image segmentation system comprising:
a medical image dataset module: collecting and preparing a 3D medical image dataset comprising 3D medical images and corresponding segmentation labels, ensuring that the dataset comprises images of non-uniform contrast;
histogram equalization module: enhancing contrast of the image using histogram equalization;
a data enhancement module: performing data enhancement on the data to increase the diversity and robustness of the training samples;
a convolutional neural network model: establishing a 3D CNN model for image segmentation, which takes an original 3D medical image as input and outputs a segmentation label for each pixel;
a model training module: training the CNN model by using the prepared training data, wherein in the training process, the equalized image is used as input to ensure that contrast has been enhanced;
an image segmentation module: the new 3D medical image is segmented with a trained CNN model.
Compared with the prior art, the invention has the following technical progress:
more robust to non-uniform contrast: the traditional method is usually very sensitive to uneven contrast and is difficult to determine a proper segmentation threshold or characteristic, however, the method uses histogram equalization to enhance the contrast of the image, so that the model can be more suitable for the situation of uneven contrast, and the stability and the accuracy of segmentation are improved.
Adaptive processing: the invention uses histogram equalization in the data preparation stage, not only improves the contrast of the image, but also enables the model to have self-adaptability, and even if strong contrast non-uniformity exists in the medical image, the model can adapt and provide better segmentation results.
Advantages of deep learning: the invention adopts a 3D Convolutional Neural Network (CNN) model, allows the model to learn more complex features and context information, and compared with the traditional manual feature extraction method, the deep learning can better capture the structure and association in the image, thereby improving the segmentation accuracy.
Data enhancement: data enhancement techniques are used to increase the diversity and robustness of the training samples, which can help the model better adapt to different types of non-uniform contrast and noise, thereby improving the robustness of the segmentation.
Real-time application potential: owing to the efficiency and adaptability of the deep learning model, the invention is expected to be usable in real-time applications, such as real-time image segmentation or navigation, which is of potential significance in fields such as medical diagnosis and surgical navigation.
In summary, the invention uses deep learning and adaptive processing to make the model more robust and accurate, especially when facing medical images with uneven contrast. It is expected to improve the efficiency and accuracy of medical image segmentation, thereby providing a more powerful tool for medical diagnosis and research.
Drawings
The accompanying drawings are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate the invention and together with the embodiments of the invention, serve to explain the invention.
In the drawings:
FIG. 1 is a flow chart of the present invention.
Detailed Description
The following embodiments may be combined with each other, and the same or similar concepts or processes may not be described in detail in some embodiments. Embodiments of the present invention will be described below with reference to the accompanying drawings.
Example 1
As shown in fig. 1, the embodiment discloses a 3D medical image segmentation system and method, including the following steps:
step 1: data preparation
A medical image dataset, including 3D CT scans or MRI images and corresponding segmentation labels, is collected and prepared, ensuring that the dataset includes images with non-uniform contrast for appropriate processing and training.
Step 2: histogram equalization
Histogram equalization is used to enhance the contrast of an image, especially in the case of non-uniform contrast.
Step 3: data enhancement
The data is subjected to data enhancement, including rotation, scaling, flipping and the like, so as to increase the diversity and robustness of the training samples.
Step 4: modeling Convolutional Neural Network (CNN)
A 3D CNN model for image segmentation is designed and built; it takes the original medical image as input and outputs a segmentation label for each pixel.
Step 5: model training
The CNN model is trained using the prepared training data. The loss function is typically cross-entropy loss or another loss function suitable for segmentation tasks, and the equalized images are used as input during training to ensure that contrast has been enhanced.
Step 6: segmentation of images
The new medical image is segmented with a trained CNN model and the equalized image will be used as input so that the model can better handle non-uniform contrast.
The invention combines image enhancement with deep learning to address the problem of uneven contrast and improve the segmentation accuracy of medical images; in addition, the data enhancement and post-processing steps help improve the robustness and performance of the model.
Specifically, step 1 includes:
This step describes how to prepare the medical image dataset, ensuring that it includes images with non-uniform contrast. The main objective is to acquire and organize the medical image data for subsequent processing and training. The specific steps are as follows:
Data acquisition: 3D CT scans or MRI images and corresponding segmentation labels are acquired, typically from medical facilities, research institutions, or medical image databases, ensuring that the image data covers the anatomical structures and organs of interest as well as non-uniform contrast.
Data preprocessing: the raw medical image obtained is subjected to necessary preprocessing to ensure consistency and applicability of the data, including noise removal, calibration of image brightness and contrast, image registration, etc., to eliminate differences caused by different scanning devices and scanning parameters.
Segmentation label generation: corresponding segmentation labels are generated for each image; these labels identify the structures or regions of interest in the image and typically require manual labeling by medical professionals or semi-automatic segmentation tools to ensure accurate labels.
Data set partitioning: the data set is divided into a training set for model training, a validation set for adjusting model hyper-parameters and monitoring model performance, and a test set for final performance assessment for use in model development and evaluation.
Non-uniform contrast image selection: images with non-uniform contrast are selected from the dataset as examples for targeted processing and training on the non-uniform contrast problem, ensuring that the model remains robust when handling non-uniform contrast.
This step is mainly the preparation and arrangement of data to ensure that the subsequent medical image segmentation task can be performed on the prepared dataset.
Specifically, step 2 includes:
Histogram equalization is an image processing technique for enhancing image contrast. By redistributing the pixel values of the image, it stretches or compresses the luminance range to mitigate the problem of uneven contrast. The following steps describe how to implement histogram equalization:
Calculating the histogram: first, a histogram of the original medical image is calculated: H(i) = frequency(i), where H(i) represents the frequency of the pixel value i; the histogram is a graph representing how often the different pixel values occur in the image.
Calculating the cumulative distribution function (CDF): the cumulative distribution function of the histogram is calculated as CDF(i) = Σ_{j=0}^{i} H(j), where CDF(i) is the cumulative distribution function of pixel value i, i.e., the cumulative frequency up to each pixel value.
Normalizing the CDF: the CDF is normalized to the range 0 to 255 using the following equation:
CDF_normalized(i) = round(255 × (CDF(i) − min(CDF)) / (max(CDF) − min(CDF)))
where CDF_normalized(i) is the normalized CDF, and min(CDF) and max(CDF) are the minimum and maximum values of the CDF, respectively.
Mapping pixel values: each pixel value in the original image is mapped to its histogram-equalized value using the normalized CDF, by the following formula:
EnhancedImage(x, y) = CDF_normalized(OriginalImage(x, y))
where OriginalImage(x, y) is a pixel value in the original image, and EnhancedImage(x, y) is the pixel value after histogram equalization.
After the above processing, a contrast-enhanced image is obtained, which should show a better visual effect in case of uneven contrast. If there are multiple medical images, the above process needs to be repeatedly applied to each image to ensure that all images are contrast enhanced.
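As a concrete illustration, the following is a minimal NumPy sketch of this equalization procedure for a 3D volume. It assumes 8-bit (0–255) integer intensities, and the function name is illustrative rather than prescribed by the invention:

```python
import numpy as np

def equalize_histogram_3d(volume: np.ndarray) -> np.ndarray:
    """Histogram-equalize a 3D volume (sketch; assumes uint8 intensities)."""
    hist = np.bincount(volume.ravel(), minlength=256)  # H(i): frequency of value i
    cdf = hist.cumsum()                                # CDF(i): cumulative frequency
    cdf_min, cdf_max = cdf.min(), cdf.max()            # min(CDF), max(CDF) as in the text
    # Normalize the CDF to the range 0..255 and round to integer gray levels
    cdf_norm = np.round(255.0 * (cdf - cdf_min) / (cdf_max - cdf_min))
    cdf_norm = np.clip(cdf_norm, 0, 255).astype(np.uint8)
    return cdf_norm[volume]                            # map each voxel through the CDF

# Usage with a hypothetical volume:
# volume = np.random.randint(0, 256, size=(64, 128, 128), dtype=np.uint8)
# equalized = equalize_histogram_3d(volume)
```

Practical implementations often use the smallest nonzero CDF value in place of min(CDF); either way the procedure is applied to every image in the dataset, as described above.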
Specifically, step 3 includes:
Data enhancement is used to increase the diversity and robustness of training data, especially in deep learning tasks. In medical image segmentation, data enhancement helps the model generalize better to different situations. The general steps are as follows:
Image rotation: the diversity of the data can be increased by randomly selecting an angle and rotating the image by that angle, so as to accommodate images acquired at different angles. The rotation angle is given by:
rotation angle = random value × 2π
Image scaling: the image is randomly scaled to simulate medical images of different resolutions, the scaling being given by:
scaling factor = 1 + random factor
where the random factor is a value of magnitude less than 1 that determines the proportion by which the image is scaled.
Image inversion: the random selection of either the horizontal flip or vertical flip image can increase the mirror variation of the data to increase the robustness of the model.
Other enhancement operations: other data enhancement operations, such as adding noise, brightness adjustment, etc., may also be considered, as desired for the task.
Synchronization enhancement of tags: the segmented labels corresponding to the image also need to undergo the same transformation operation to maintain consistency of the image and the labels.
Repeating the operation: the enhancement operations described above are applied to each sample of the training dataset to generate more training samples; the number of enhancement passes can be specified to determine how many samples are generated.
Through these data enhancement operations (illustrated by the sketch below), the training dataset gains more variation and diversity, so that the model adapts better to various conditions and generalizes better; this helps cope with medical images that have uneven contrast and differ in angle and resolution, thereby improving the performance of the segmentation model.
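A minimal SciPy/NumPy sketch of such paired image–label augmentation follows. The sampling range for the scaling factor and the function name are illustrative assumptions; labels use nearest-neighbor interpolation so class values are preserved:

```python
import numpy as np
from scipy.ndimage import rotate, zoom

def augment_pair(image: np.ndarray, label: np.ndarray, rng: np.random.Generator):
    """Apply one random rotation, scaling, and flip to an image-label pair (sketch)."""
    # Rotation: angle = random value * 2*pi (converted to degrees for scipy)
    angle = np.degrees(rng.random() * 2 * np.pi)
    image = rotate(image, angle, axes=(1, 2), reshape=False, order=1)
    label = rotate(label, angle, axes=(1, 2), reshape=False, order=0)  # nearest for labels
    # Scaling: factor = 1 + random factor (magnitude < 1; the range here is an assumption)
    factor = 1.0 + rng.uniform(-0.2, 0.2)
    image = zoom(image, factor, order=1)
    label = zoom(label, factor, order=0)
    # Random horizontal or vertical flip, applied identically to image and label
    axis = int(rng.choice([1, 2]))
    return np.flip(image, axis).copy(), np.flip(label, axis).copy()

# rng = np.random.default_rng(0); aug_img, aug_lbl = augment_pair(img, lbl, rng)
```

In practice the zoomed volume is cropped or padded back to a fixed shape before batching; that step is omitted here for brevity.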
Specifically, step 4 includes:
In this embodiment, a convolutional neural network (CNN) is used for image segmentation. The following are the implementation steps of a 3D CNN model based on the U-Net architecture:
4.1 U-Net architecture: consists of an encoder (downsampling path) and a decoder (upsampling path), allowing the network to learn features at different scales and thereby perform pixel-level segmentation.
4.2 encoder: the encoder section consists of a series of convolution layers, each typically followed by an activation function (e.g., ReLU) to introduce nonlinearity, and pooling layers to extract features of the image; the encoder maps the input raw medical image to a low-resolution feature map.
The encoder section typically contains multiple convolution layers, each of which uses a convolution operation to extract features of the image. The convolution operation can be expressed as:
C(x, y) = Σ_{i=1}^{k} Σ_{j=1}^{k} I(x + i, y + j) · K(i, j)
where C(x, y) represents the pixel value of the convolved feature image, I(x, y) is the pixel value of the input image, K(i, j) is the weight of the convolution kernel, and k is the size of the convolution kernel.
Pooling layer: after each convolution layer, a pooling layer is used to reduce the spatial resolution of the feature map while preserving important features, and the pooling operation typically employs maximum pooling or average pooling to reduce the computational effort and risk of overfitting.
The max-pooling operation can be expressed as:
P(x, y) = max_{(i, j) ∈ window} I(x + i, y + j)
where P(x, y) represents the pixel value of the pooled feature image, I(x, y) is the pixel value of the input image, and i and j are the pixel coordinates within the pooling window.
Activation function: after each convolution layer, an activation function such as the ReLU (rectified linear unit) is typically applied to introduce nonlinearity, expressed as:
A(x)=max(0,x)
where a (x) represents the activated eigenvalue and x is the eigenvalue of the convolutional layer output.
Multi-layer encoder: there are typically multiple convolution layers and pooling layers that make up the encoder, each of which gradually reduces the spatial resolution of the feature map while extracting higher level features, so that the encoder maps the original medical image to a low resolution feature map.
These steps together constitute the encoder section, which extracts features of the input medical image. The encoder's output is passed to the decoder section for gradual restoration and final segmentation. These are critical steps in the 3D CNN model, helping the model understand the structure and content of the image.
4.3 decoder: the decoder section consists of a series of convolution and up-sampling layers used to gradually restore the encoder-generated feature map to the original image resolution; each convolution layer of the decoder is also followed by an activation function. In addition, the decoder connects encoder feature maps with the corresponding decoder feature maps to preserve important context information. The implementation steps of the decoder are as follows:
feature map restoration:
the decoder section consists of a series of convolution and upsampling layers for gradually restoring the low resolution feature map to the original image resolution, which can be achieved by deconvolution (transpose convolution) or upsampling operations, each step gradually increasing the resolution of the feature map.
Activation function:
after each convolution layer, an activation function (e.g., reLU) is typically used to introduce nonlinearities, which helps the model learn more complex features.
Connection of context information:
in order to preserve important context information, the decoder connects the encoder-generated feature maps with the corresponding decoder feature maps, which can be achieved by a stitching operation, connecting the two sets of feature maps in the channel dimension.
Design of convolution kernel:
the size and number of convolution kernels may be adjusted based on the complexity of the task and the nature of the data set, and more convolution layers and larger convolution kernels may extract more feature information, but may also increase the complexity of the model.
Number of channels of the feature map:
the number of channels of the feature map of the decoder should match the number of channels of the feature map of the encoder to ensure proper feature map connection, and if the number of channels of the encoder and decoder do not match, a 1x1 convolutional layer can be used to adjust the number of channels.
In general, the decoder gradually restores the low-resolution encoder feature map to the original image resolution through a series of convolution and upsampling layers, and retains context information through activation functions and feature-map connections, helping the model take the context of the entire image into account and generate accurate segmentation results.
4.4 output layer: the output layer of the model is typically a convolution layer whose number of output channels equals the number of classes of the segmentation labels; for binary segmentation tasks the number of output channels is 1. The activation function of this layer is typically a Sigmoid function (binary classification) or a Softmax function (multi-class classification), used to generate a class probability distribution for each pixel. The steps to implement the output layer are as follows:
convolution layer: the output layer is typically a convolutional layer with a number of output channels equal to the number of classes of split labels, 1 for binary split tasks and equal to the number of classes for multi-class split tasks, these channels will correspond to class predictions for each pixel.
Activation function: the activation function of the output layer is typically a Sigmoid function or a Softmax function, depending on the task type.
For binary segmentation tasks, a Sigmoid function can be used, which maps the output of each pixel to a probability value between 0 and 1:
σ(x) = 1 / (1 + e^(−x))
For multi-class segmentation tasks, a Softmax function can be used, which maps the output of each pixel to a probability distribution, ensuring that the probabilities of all classes sum to 1:
Softmax(x_i) = e^(x_i) / Σ_j e^(x_j)
where x_i is the output of the i-th channel, and i ranges from 1 to the number of classes.
Prediction result: the final prediction result of the output layer is a class probability distribution corresponding to each pixel point, a threshold value can be selected to map the probability into a binary segmentation result in a binary segmentation task, and the class with the highest probability is generally selected as a prediction label of the pixel in a multi-class segmentation task.
The design of the output layer and the choice of activation function depend on the nature of the task, e.g., binary or multi-class segmentation. The main goal of this layer is to map the features learned by the encoder and decoder to pixel-level segmentation labels, achieving accurate image segmentation.
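Bringing sections 4.1–4.4 together, here is a compact PyTorch sketch of such a U-Net-style 3D CNN. The depth, channel widths, and default class count are illustrative assumptions, not values prescribed by the invention:

```python
import torch
import torch.nn as nn

def conv_block(in_ch: int, out_ch: int) -> nn.Sequential:
    """Two 3D convolutions, each followed by a ReLU activation."""
    return nn.Sequential(
        nn.Conv3d(in_ch, out_ch, kernel_size=3, padding=1), nn.ReLU(inplace=True),
        nn.Conv3d(out_ch, out_ch, kernel_size=3, padding=1), nn.ReLU(inplace=True),
    )

class UNet3D(nn.Module):
    def __init__(self, in_channels: int = 1, num_classes: int = 2):
        super().__init__()
        self.enc1 = conv_block(in_channels, 16)          # encoder level 1
        self.enc2 = conv_block(16, 32)                   # encoder level 2
        self.pool = nn.MaxPool3d(2)                      # halves spatial resolution
        self.bottleneck = conv_block(32, 64)
        self.up2 = nn.ConvTranspose3d(64, 32, kernel_size=2, stride=2)  # upsample
        self.dec2 = conv_block(64, 32)                   # 64 = 32 (skip) + 32 (upsampled)
        self.up1 = nn.ConvTranspose3d(32, 16, kernel_size=2, stride=2)
        self.dec1 = conv_block(32, 16)
        self.out = nn.Conv3d(16, num_classes, kernel_size=1)  # per-voxel class scores

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        e1 = self.enc1(x)
        e2 = self.enc2(self.pool(e1))
        b = self.bottleneck(self.pool(e2))
        d2 = self.dec2(torch.cat([self.up2(b), e2], dim=1))  # skip connection
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))
        return self.out(d1)  # raw logits; Softmax/Sigmoid is applied in the loss
```

Input volumes of shape (N, 1, D, H, W) with D, H, W divisible by 4 pass through cleanly; the channel-dimension concatenations implement the context connections described in section 4.3.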
4.5 loss function: in training the 3D CNN model, an appropriate loss function is used to compare the difference between the model's predictions and the true labels; for medical image segmentation, cross-entropy loss or Dice loss is used to measure the accuracy of the segmentation.
Cross entropy loss is used for multi-class segmentation tasks, which is used to measure the difference between the model's predictive probability distribution and the distribution of real labels.
For each pixel, the output of the model is a probability distribution representing the probability that the pixel belongs to the respective class.
The real tag is a one-hot coded tag, where one category is marked 1 and the rest are 0.
The Dice loss is another loss function for medical image segmentation, typically for binary segmentation tasks, that measures the degree of overlap of model predictions with real labels.
This loss function encourages higher similarity between the model's predictions and the real labels in the overlap region, yielding more accurate segmentation. When training the 3D CNN model, the appropriate loss function can be chosen based on the nature of the task: cross-entropy loss is suited to multi-class segmentation and Dice loss to binary segmentation. These losses help the model optimize during training and generate more accurate segmentation results.
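A minimal PyTorch sketch of these two losses, combined with equal weights (the weighting, the ε value, and the function name are illustrative assumptions):

```python
import torch
import torch.nn.functional as F

def combined_loss(logits: torch.Tensor, target: torch.Tensor, eps: float = 1e-6):
    """Cross-entropy plus Dice loss (sketch).

    logits: (N, C, D, H, W) raw network outputs; target: (N, D, H, W) integer labels.
    """
    ce = F.cross_entropy(logits, target)            # L_CE = -sum_i p_i log(q_i)
    probs = torch.softmax(logits, dim=1)            # q: predicted distribution
    one_hot = F.one_hot(target, logits.shape[1])    # p: one-hot real labels
    one_hot = one_hot.permute(0, 4, 1, 2, 3).float()
    inter = (probs * one_hot).sum(dim=(2, 3, 4))
    denom = probs.sum(dim=(2, 3, 4)) + one_hot.sum(dim=(2, 3, 4))
    dice_loss = 1.0 - ((2.0 * inter + eps) / (denom + eps)).mean()  # epsilon guard
    return ce + dice_loss
```

The soft, probability-valued Dice used here is the common differentiable relaxation of the binary-mask formula given above.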
Specifically, step 5 includes:
Training the 3D CNN model is a key step aimed at enabling the model to predict segmentation labels by optimizing the loss function. The training strategy is implemented in the following steps:
5.1 selecting an optimization algorithm:
A stochastic gradient descent (SGD) optimization algorithm is used to train the model. SGD minimizes the loss function by calculating the gradient of the loss with respect to the model parameters and updating the parameters in the direction opposite to the gradient; different variants, such as SGD with momentum, Adam, or RMSprop, can be chosen to speed up training and improve model performance.
5.2 defining the loss function:
The loss functions include cross-entropy loss and Dice loss; the appropriate loss function is selected according to the nature of the task and defined as follows:
Cross-entropy loss is used for multi-class segmentation tasks:
L_CE(p, q) = −Σ_i p_i log(q_i)
where p is the probability distribution of the real labels, q is the predicted probability distribution of the model, and i denotes the class index.
Dice loss is used for binary segmentation tasks:
L_Dice(p, q) = 1 − (2 Σ_i p_i q_i + ε) / (Σ_i p_i + Σ_i q_i + ε)
where p is the binary segmentation mask of the real label, q is the prediction mask of the model, i denotes the pixel index, and ε is a small constant that avoids a zero denominator.
5.3 initializing model parameters:
Before training begins, the parameters of the model need to be initialized, either randomly or from pre-trained weights.
5.4 backpropagation:
In each training batch, forward propagation is used to calculate the model's output and the loss function value, and backward propagation is then used to calculate the gradient, i.e., the gradient of the loss function with respect to the model parameters, computed via the chain rule.
5.5 parameter update:
The parameters of the model are updated using the optimization algorithm (e.g., SGD) and the gradient information to reduce the value of the loss function. The parameter update rule is:
θ_{t+1} = θ_t − α ∇_θ L(θ_t)
where θ_t denotes the model parameters at time step t, α is the learning rate, and ∇_θ L(θ_t) is the gradient of the loss function with respect to the parameters.
5.6 data enhancement:
during training, data enhancement techniques may be used to increase the diversity of training samples. This is described in detail in step 3.
5.7 iterative training:
the above steps are repeated until the model converges or a predetermined number of training iterations is reached.
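Steps 5.1–5.7 can be summarized in a short PyTorch training-loop sketch; `train`, `train_loader`, and the hyperparameter values are assumed names and settings, and `combined_loss` refers to the loss sketch above:

```python
import torch

def train(model, train_loader, num_epochs: int = 50):
    """Minimal training loop (sketch); expects equalized images in train_loader."""
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
    for epoch in range(num_epochs):
        for images, labels in train_loader:
            optimizer.zero_grad()                  # clear accumulated gradients
            logits = model(images)                 # forward propagation
            loss = combined_loss(logits, labels)   # cross-entropy + Dice (see above)
            loss.backward()                        # backpropagation via the chain rule
            optimizer.step()                       # theta_{t+1} = theta_t - alpha * grad
```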
5.8 validation and evaluation:
After each training period is completed, the performance of the model is evaluated using the validation dataset. Segmentation performance indicators (e.g., Dice coefficient, Jaccard index, accuracy) may be used as needed to evaluate the accuracy of the model. This includes:
5.8.1 the validation dataset is prepared:
first, a separate validation data set is prepared, comprising the 3D medical image and the corresponding segmentation labels, which should be independent of the training data set to evaluate the generalization ability of the model.
5.8.2 prediction:
The images in the validation dataset are predicted using the trained 3D CNN model, which generates a segmentation-label prediction for each pixel.
5.8.3 calculating performance indices:
The accuracy of the model is evaluated using the following segmentation performance indices:
Dice coefficient: used to measure the degree of overlap between the model's segmentation result and the real label:
Dice = 2TP / (2TP + FP + FN)
where TP denotes true positives (the number of pixels the model correctly predicts as the positive class), FP denotes false positives (the number of pixels the model incorrectly predicts as the positive class), and FN denotes false negatives (the number of pixels the model incorrectly predicts as the negative class).
Jaccard index: measures the ratio between the intersection and the union of the model's prediction and the real label:
Jaccard = TP / (TP + FP + FN)
Accuracy: an index used to measure the overall performance of the model, representing the ratio of the number of correctly predicted pixels to the total number of pixels:
Accuracy = (TP + TN) / (TP + TN + FP + FN)
where TN denotes true negatives (the number of pixels the model correctly predicts as the negative class).
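These three indices can be computed directly from binary masks; a short NumPy sketch follows (the function name is illustrative):

```python
import numpy as np

def segmentation_metrics(pred: np.ndarray, truth: np.ndarray):
    """Dice, Jaccard, and accuracy from binary masks of equal shape (sketch)."""
    tp = np.sum((pred == 1) & (truth == 1))  # true positives
    fp = np.sum((pred == 1) & (truth == 0))  # false positives
    fn = np.sum((pred == 0) & (truth == 1))  # false negatives
    tn = np.sum((pred == 0) & (truth == 0))  # true negatives
    dice = 2 * tp / (2 * tp + fp + fn)
    jaccard = tp / (tp + fp + fn)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return dice, jaccard, accuracy
```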
Specifically, step 6 includes:
In this step, the trained 3D CNN model is used to segment a new medical image. The implementation steps are as follows:
preparing an input image:
first, a new medical image is prepared, ensuring that the image has been equalized to handle uneven contrast.
Image preprocessing:
the new medical image is subjected to the same preprocessing steps as the training data, including normalization, cropping, resizing, etc., to ensure that the input is consistent with the model training.
Model inference:
The preprocessed image is input into the trained 3D CNN model; through the model's forward propagation, a segmentation prediction is generated for each pixel, and the model outputs a probability distribution over the categories for each pixel.
Segmentation results:
based on the output of the model, the following strategy may be employed to generate the final segmentation result:
for multi-category segmentation, the category with the highest probability can be selected as the label of the pixel;
for binary segmentation, a threshold may be selected to map the probability to a binary segmentation result. For example, a threshold of 0.5 may be selected, and pixels having a probability greater than 0.5 are classified as positive and pixels having a probability less than or equal to 0.5 are classified as negative.
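The inference and labeling strategy above amounts to a few lines of PyTorch; this sketch assumes a preprocessed, equalized input volume of shape (C, D, H, W) and a model as in the step 4 sketch:

```python
import torch

@torch.no_grad()
def segment(model: torch.nn.Module, volume: torch.Tensor) -> torch.Tensor:
    """Segment one preprocessed volume (sketch)."""
    model.eval()
    logits = model(volume.unsqueeze(0))              # add batch dimension
    if logits.shape[1] == 1:                         # binary: Sigmoid + 0.5 threshold
        return (torch.sigmoid(logits)[0, 0] > 0.5).long()
    return logits.argmax(dim=1)[0]                   # multi-class: most probable label
```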
Example 2
A 3D medical image segmentation system, comprising:
a medical image dataset module: collecting and preparing a 3D medical image dataset comprising 3D medical images and corresponding segmentation labels, ensuring that the dataset comprises images of non-uniform contrast;
histogram equalization module: enhancing contrast of the image using histogram equalization;
a data enhancement module: performing data enhancement on the data to increase the diversity and robustness of the training samples;
a convolutional neural network model: establishing a 3D CNN model for image segmentation, which takes an original 3D medical image as input and outputs a segmentation label for each pixel;
a model training module: training the CNN model by using the prepared training data, wherein in the training process, the equalized image is used as input to ensure that contrast has been enhanced;
an image segmentation module: the new 3D medical image is segmented with a trained CNN model.
The above-described respective modules are used to realize the functions in embodiment 1.
Finally, it should be noted that the foregoing describes only preferred embodiments of the present invention, and the invention is not limited thereto. Although the invention has been described in detail with reference to the foregoing embodiments, those skilled in the art may still modify the technical solutions described therein or substitute equivalents for some of the technical features. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of the present invention shall fall within the scope of the claims of the present invention.
Claims (8)
1. A method of 3D medical image segmentation, comprising the steps of:
step 1: collecting and preparing a 3D medical image dataset comprising 3D medical images and corresponding segmentation labels, ensuring that the dataset comprises images of non-uniform contrast;
step 2: enhancing contrast of the image using histogram equalization;
step 3: data enhancement is carried out on the data so as to increase the diversity and the robustness of the training samples;
step 4: establishing a 3D CNN model of image segmentation, inputting an original 3D medical image, and outputting a segmentation label of each pixel;
step 5: training the CNN model by using the prepared training data, wherein in the training process, the equalized image is used as input to ensure that contrast has been enhanced;
step 6: the new 3D medical image is segmented with a trained CNN model.
2. A 3D medical image segmentation method according to claim 1, wherein step 1 comprises:
acquiring a 3D medical image and a corresponding segmentation label;
preprocessing the obtained original 3D medical image to ensure the consistency and applicability of the data;
generating a corresponding segmentation label for each image;
dividing the data set into a training set, a verification set and a test set;
images with non-uniform contrast are selected from the dataset as examples of appropriate processing and training for the non-uniform contrast problem.
3. A 3D medical image segmentation method according to claim 2, wherein step 2 comprises:
calculating a histogram of the original 3D medical image: H(i) = frequency(i), where H(i) represents the frequency of the pixel value i;
calculating the cumulative distribution function CDF of the histogram: CDF(i) = Σ_{j=0}^{i} H(j), wherein CDF(i) is the cumulative distribution function of pixel value i;
normalizing the CDF to range between 0 and 255: CDF_normalized(i) = round(255 × (CDF(i) − min(CDF)) / (max(CDF) − min(CDF))), wherein CDF_normalized(i) is the normalized CDF, and min(CDF) and max(CDF) are the minimum and maximum values of the CDF, respectively;
mapping each pixel value in the original image to a histogram-equalized value using the normalized CDF: EnhancedImage(x, y) = CDF_normalized(OriginalImage(x, y)), wherein OriginalImage(x, y) is a pixel value in the original image, and EnhancedImage(x, y) is the pixel value after histogram equalization.
4. A 3D medical image segmentation method according to claim 3, wherein said step 3 comprises:
an angle is randomly selected and then the image is rotated by this angle, the rotation angle being given by the following formula: rotation angle = random value × 2π;
the image is randomly scaled to simulate 3D medical images of different resolutions, the scaling being given by the following formula: scaling factor = 1 + random factor, where the random factor is a value of magnitude less than 1 that determines the proportion by which the image is scaled;
randomly selecting a horizontal flip or a vertical flip image;
the segmented labels corresponding to the images are also subjected to the same transformation operation to maintain consistency of the images and labels.
5. The method of 3D medical image segmentation according to claim 4, wherein step 4 comprises:
establishing a 3D CNN model based on a U-Net architecture, comprising:
an encoder: comprising convolution layers and pooling layers, each convolution layer being followed by an activation function to introduce nonlinearity, which maps the input original 3D medical image to a low-resolution feature map;
a decoder: comprising convolution layers and up-sampling layers used to gradually restore the feature map generated by the encoder to the original image resolution, with an activation function applied after each convolution layer;
an output layer: comprising a convolution layer whose number of output channels equals the number of classes of the segmentation labels, used to generate a class probability distribution for each pixel.
6. The method of 3D medical image segmentation according to claim 5, wherein said step 5 comprises:
selecting a stochastic gradient descent algorithm for training the 3D CNN model;
cross entropy loss and Dice loss are used as loss functions:
cross entropy loss is used for multi-class segmentation tasks, and its formula is as follows:
L_CE(p, q) = −Σ_i p_i log(q_i)
wherein p is the probability distribution of the real label, q is the prediction probability distribution of the model, and i represents the category index;
the Dice loss is used for binary segmentation tasks, and its formula is as follows:
L_Dice(p, q) = 1 − (2 Σ_i p_i q_i + ε) / (Σ_i p_i + Σ_i q_i + ε)
wherein p is the binary segmentation mask of the real label, q is the prediction mask of the model, i represents the pixel index, and ε is a small constant that avoids a zero denominator;
initializing parameters of a model before training begins;
in each training batch, forward propagation is used to calculate the model's output and loss function values, and then backward propagation is used to calculate the gradient;
the parameters of the model are updated using the stochastic gradient descent algorithm and the gradient information to reduce the value of the loss function, the parameter update rule being as follows:
θ_{t+1} = θ_t − α ∇_θ L(θ_t)
wherein θ_t denotes the model parameters at time step t, α is the learning rate, and ∇_θ L(θ_t) is the gradient of the loss function with respect to the parameters;
repeating the above steps until the model converges;
after each training period is completed, the performance of the model is assessed on the validation dataset, using segmentation performance indices such as the Dice coefficient, Jaccard index, and accuracy to assess the accuracy of the model.
7. The method of 3D medical image segmentation according to claim 6, wherein step 6 comprises:
preparing a new 3D medical image, ensuring that the image has been equalized to handle non-uniform contrast;
preprocessing the new 3D medical image to ensure that the input is consistent with the model training;
inputting the preprocessed image into a trained 3D CNN model, and generating segmentation prediction for each pixel through a forward propagation process of the model;
generating a final segmentation result by adopting the following strategy according to the output of the model:
for multi-category segmentation, selecting the category with the highest probability as a label of the pixel;
for binary segmentation, a threshold is selected to map the probability to a binary segmentation result.
8. A 3D medical image segmentation system, comprising:
a medical image dataset module: collecting and preparing a 3D medical image dataset comprising 3D medical images and corresponding segmentation labels, ensuring that the dataset comprises images of non-uniform contrast;
histogram equalization module: enhancing contrast of the image using histogram equalization;
a data enhancement module: performing data enhancement on the data to increase the diversity and robustness of the training samples;
a convolutional neural network model: establishing a 3D CNN model for image segmentation, which takes an original 3D medical image as input and outputs a segmentation label for each pixel;
a model training module: training the CNN model by using the prepared training data, wherein in the training process, the equalized image is used as input to ensure that contrast has been enhanced;
an image segmentation module: the new 3D medical image is segmented with a trained CNN model.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311498656.6A CN117522891A (en) | 2023-11-10 | 2023-11-10 | 3D medical image segmentation system and method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311498656.6A CN117522891A (en) | 2023-11-10 | 2023-11-10 | 3D medical image segmentation system and method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117522891A true CN117522891A (en) | 2024-02-06 |
Family
ID=89763882
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311498656.6A Pending CN117522891A (en) | 2023-11-10 | 2023-11-10 | 3D medical image segmentation system and method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117522891A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117934855A (en) * | 2024-03-22 | 2024-04-26 | 北京壹点灵动科技有限公司 | Medical image segmentation method and device, storage medium and electronic equipment |
- 2023-11-10: CN application CN202311498656.6A filed; published as CN117522891A (status: pending)
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117934855A (en) * | 2024-03-22 | 2024-04-26 | 北京壹点灵动科技有限公司 | Medical image segmentation method and device, storage medium and electronic equipment |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111784671B (en) | Pathological image focus region detection method based on multi-scale deep learning | |
CN111798462B (en) | Automatic delineation method of nasopharyngeal carcinoma radiotherapy target area based on CT image | |
CN108921851B (en) | Medical CT image segmentation method based on 3D countermeasure network | |
CN112418329B (en) | Cervical OCT image classification method and system based on multi-scale textural feature fusion | |
CN106056595B (en) | Based on the pernicious assistant diagnosis system of depth convolutional neural networks automatic identification Benign Thyroid Nodules | |
CN110930416B (en) | MRI image prostate segmentation method based on U-shaped network | |
CN115018824B (en) | Colonoscope polyp image segmentation method based on CNN and Transformer fusion | |
CN115661144B (en) | Adaptive medical image segmentation method based on deformable U-Net | |
CN112734764A (en) | Unsupervised medical image segmentation method based on countermeasure network | |
CN113706434B (en) | Post-processing method for chest enhancement CT image based on deep learning | |
CN117132774B (en) | Multi-scale polyp segmentation method and system based on PVT | |
CN114663426B (en) | Bone age assessment method based on key bone region positioning | |
CN117522891A (en) | 3D medical image segmentation system and method | |
CN112950631A (en) | Age estimation method based on saliency map constraint and X-ray head skull positioning lateral image | |
CN113781488A (en) | Tongue picture image segmentation method, apparatus and medium | |
CN112633416A (en) | Brain CT image classification method fusing multi-scale superpixels | |
CN113628230A (en) | Ventricular myocardium segmentation model training method, segmentation method and device in cardiac nuclear magnetic resonance image | |
CN115272170A (en) | Prostate MRI (magnetic resonance imaging) image segmentation method and system based on self-adaptive multi-scale transform optimization | |
Abbas | Nodular-deep: classification of pulmonary nodules using deep neural network | |
CN117934824A (en) | Target region segmentation method and system for ultrasonic image and electronic equipment | |
CN117746042A (en) | Liver tumor CT image segmentation method based on APA-UNet | |
CN117635625A (en) | Pancreatic tumor segmentation method based on automatic data enhancement strategy and multi-attention-assisted UNet | |
CN117953208A (en) | Graph-based edge attention gate medical image segmentation method and device | |
CN117523350A (en) | Oral cavity image recognition method and system based on multi-mode characteristics and electronic equipment | |
CN108154107B (en) | Method for determining scene category to which remote sensing image belongs |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication ||
SE01 | Entry into force of request for substantive examination ||