CN116664594A - Three-dimensional medical image two-stage segmentation method and device based on sharing CNN - Google Patents

Three-dimensional medical image two-stage segmentation method and device based on sharing CNN

Info

Publication number
CN116664594A
Authority
CN
China
Prior art keywords
segmentation
image
map
initial
feature map
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310559706.0A
Other languages
Chinese (zh)
Inventor
白杰云
丘瑞瑜
陆尧胜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jinan University
Original Assignee
Jinan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jinan University filed Critical Jinan University
Priority to CN202310559706.0A priority Critical patent/CN116664594A/en
Publication of CN116664594A publication Critical patent/CN116664594A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/10 Segmentation; Edge detection
    • G06T 7/11 Region-based segmentation
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G06N 3/0455 Auto-encoder networks; Encoder-decoder networks
    • G06N 3/0464 Convolutional networks [CNN, ConvNet]
    • G06N 3/08 Learning methods
    • G06T 7/70 Determining position or orientation of objects or cameras
    • G06T 7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/20 Image preprocessing
    • G06V 10/25 Determination of region of interest [ROI] or a volume of interest [VOI]
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V 10/80 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V 10/806 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level, of extracted features
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20112 Image segmentation details
    • G06T 2207/20132 Image cropping
    • G06T 2207/20161 Level set

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Multimedia (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Measuring And Recording Apparatus For Diagnosis (AREA)
  • Magnetic Resonance Imaging Apparatus (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses a two-stage segmentation method and device for three-dimensional medical images based on a shared CNN. The method comprises: inputting an initial image into a target model for image segmentation to obtain an initial segmentation map; obtaining target position information of the ROI from the initial segmentation map, and determining the position and size of an ROI cropping frame from this information and a preset ROI side length; cropping the initial segmentation map and the initial image to obtain a first feature map and a second feature map respectively; performing feature fusion on the first and second feature maps to obtain a fusion feature map; and inputting the fusion feature map into the same target model to obtain the segmentation result. The method avoids the large amount of computation and the large memory footprint caused by using two models in the prior art, improves the generality of the model, improves the segmentation of three-dimensional medical images, and can be widely applied in the technical field of image processing.

Description

Three-dimensional medical image two-stage segmentation method and device based on sharing CNN
Technical Field
The application relates to the technical field of image processing, and in particular to a two-stage segmentation method and device for three-dimensional medical images based on a shared CNN.
Background
Image segmentation refers to separating the objects of interest to a user from the rest of a picture. Medical image segmentation separates the tissues, organs, and other structures relevant to diagnosis from the image background and from various noise disturbances in medical images such as magnetic resonance imaging (MRI), computed tomography (CT), and ultrasound (US). For example, in the diagnosis and treatment of cardiac patients, the late gadolinium enhancement magnetic resonance image (LGE-MRI) of the heart often needs to be segmented to obtain the patient's atrial structure so that the diseased anatomy can be observed more clearly. As an important part of quantitative medical image analysis, the quality of LGE-MRI segmentation directly affects the accuracy of subsequent diagnosis. Traditional LGE-MRI segmentation relies on manual operation by experienced healthcare workers; to reduce their workload and the operational difficulty of image segmentation, a high-precision automated segmentation scheme is required. However, LGE-MRI images have low contrast, and physiological noise from the human body together with thermal noise from the instrumentation makes the segmentation task difficult to accomplish with conventional image processing algorithms. As a result, current segmentation schemes are still mainly manual or semi-automatic.
In recent years, deep learning, as a branch of artificial intelligence, has developed rapidly and has provided new research directions for automated, intelligent operation in many fields. In the medical field, electronic medical data and Internet-based healthcare are promoting the development of new modes of medical care. Deep-learning-based assisted diagnosis systems can replace manual operation for medical image localization, segmentation, and classification with higher quality and efficiency, and the automatic segmentation accuracy of computers on medical images can exceed that of manual operation. Compared with manual or semi-manual segmentation schemes, designing a three-dimensional atrial segmentation algorithm based on deep learning is therefore a better choice for realizing three-dimensional medical image segmentation.
The convolution kernel of a 2D network lacks the z-axis dimension, so continuity between adjacent slices along the z axis cannot be guaranteed; a 3D network can perceive the context along the z axis, and its segmentation result can be closer to the ground truth. In addition, image segmentation can be understood as a binary classification of each pixel, yet most pixels in an image do not belong to the object to be identified: the critical structure that needs to be segmented occupies only a small part of the image, so the numbers of foreground and background pixels are often severely unbalanced in medical images.
In order to reduce the interference from the picture background or image noise, a CNN can first be used to detect the position of the key structure in the medical image, a region of interest (ROI) is then cropped from the original image according to the detected position information, and finally the ROI is segmented to obtain the target structure. The Double 3D U-Net model adopts exactly this idea and uses two CNN networks to implement detection and segmentation. However, deepening the network in this way inevitably increases the amount of computation and the GPU memory consumption and makes the network more bloated; enlarging the ROI easily causes GPU memory overflow, and the resulting model generalizes poorly.
Disclosure of Invention
In view of this, the embodiments of the application provide a two-stage segmentation method for three-dimensional medical images based on a shared CNN with strong generality.
In one aspect, an embodiment of the present application provides a two-stage segmentation method for a three-dimensional medical image based on a shared CNN, including:
inputting the initial image into a target model for image segmentation to obtain an initial segmentation map;
obtaining target position information of the ROI of the initial segmentation map according to the initial segmentation map;
determining the position and the size of an ROI cutting frame according to the target position information of the ROI and the preset side length of the ROI;
cutting the initial segmentation map and the initial image according to the position and the size of the ROI cutting frame to obtain a first feature map and a second feature map;
performing feature fusion on the first feature map and the second feature map to obtain a fusion feature map;
and inputting the fusion feature map into the target model for image segmentation to obtain a target segmentation result.
Optionally, in the step of inputting the initial image into a target model for image segmentation to obtain an initial segmentation map, the target model is a 3D U-Net network model.
Optionally, the step of inputting the initial image into the target model to perform image segmentation to obtain an initial segmentation map includes:
extracting features of the initial image through an encoder of the target model to obtain a five-layer feature map;
and carrying out information fusion on the five-layer feature images through a decoder of the target model to obtain an initial segmentation image.
Optionally, the step of obtaining target position information of the ROI of the initial segmentation map according to the initial segmentation map includes:
extracting features of the initial segmentation image through a full connection layer to obtain first position information;
mapping the first position information through an activation function to obtain second position information;
and obtaining target position information based on the second position information, the side length of the ROI and the size of the initial image.
Optionally, the step of feature-fusing the first feature map and the second feature map to obtain a fused feature map includes feature-fusing the first feature map and the second feature map by directly adding them together.
Optionally, in the step of extracting features of the initial image by using an encoder of the target model to obtain a five-layer feature map, the encoder is composed of five downsampling modules, and each downsampling module is composed of two convolution layers and a maximum pooling layer; wherein both convolution layers are 3x3x3 in size and the maximum pooling layer is 2x2x2 in size.
Optionally, in the step of performing information fusion on the five-layer feature map through a decoder of the target model to obtain an initial segmentation map, the decoder is composed of five upsampling modules, and each upsampling module is composed of a deconvolution layer, a jump connection layer and two convolution layers; wherein both convolution layers are 3x3x3 in size.
On the other hand, an embodiment of the application also provides a two-stage segmentation device for three-dimensional medical images based on a shared CNN, which comprises:
the first module is used for inputting an initial image into the target model for image segmentation to obtain an initial segmentation map;
a second module, configured to obtain target position information of an ROI of the initial segmentation map according to the initial segmentation map;
a third module, configured to determine a position and a size of an ROI cutting frame according to the target position information of the ROI and a preset ROI side length;
a fourth module, configured to cut the initial segmentation map and the initial image according to the position and the size of the ROI cutting frame, to obtain a first feature map and a second feature map respectively;
a fifth module, configured to perform feature fusion on the first feature map and the second feature map to obtain a fused feature map;
and a sixth module, configured to input the fusion feature map into the target model for image segmentation, so as to obtain a target segmentation result.
On the other hand, the embodiment of the application also provides electronic equipment, which comprises a processor and a memory; the memory is used for storing programs; the processor executes the program to implement the three-dimensional medical image two-stage segmentation method based on the shared CNN as described above.
In another aspect, an embodiment of the present application further provides a computer readable storage medium storing a program that is executed by a processor to implement the aforementioned shared CNN-based two-stage segmentation method for three-dimensional medical images.
Embodiments of the present application also disclose a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The computer instructions may be read from a computer-readable storage medium by a processor of a computer device, and executed by the processor, to cause the computer device to perform the foregoing method.
Embodiments of the present application have at least the following beneficial effects. The first feature map and the second feature map are fused into a fusion feature map, and the information learned from both maps is fed back into the same 3D U-Net network model, so the detection and segmentation tasks promote each other and neither loses performance. Because both the initial image and the fusion feature map are segmented by the same shared 3D U-Net network model, the adjustable range of the ROI cropping-frame size is larger and the segmentation of three-dimensional medical images is improved; the prior-art framework, in which two-stage segmentation is realized by combining two models, is replaced by calling the same model twice, which improves the flexibility and generality of the segmentation and reduces the storage space occupied by the model compared with the prior art that uses two models.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present application, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a step diagram of a two-stage segmentation method for a three-dimensional medical image based on a shared CNN according to an embodiment of the present application;
fig. 2 is a flowchart of a two-stage segmentation method for a three-dimensional medical image based on a shared CNN according to an embodiment of the present application;
FIG. 3 is a graph comparing the segmentation results of two-stage segmentation of a three-dimensional medical image based on a shared CNN with the segmentation results of Double 3D U-Net according to an embodiment of the present application;
fig. 4 is a schematic diagram of a two-stage segmentation apparatus for a three-dimensional medical image based on a shared CNN according to an embodiment of the present application.
Detailed Description
The present application will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present application more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.
To address the problems in the prior art, an embodiment of the application provides a two-stage segmentation method for three-dimensional medical images based on a shared CNN. As shown in Fig. 1, the method comprises steps S100 to S600:
s100: and inputting the initial image into a target model for image segmentation to obtain an initial segmentation map.
Specifically, an LGE-MRI (cardiac magnetic resonance) dataset is first acquired, and the 3D MRI volumes containing atrial fibrillation cases in the dataset are preprocessed. The gray-scale distribution of the MRI images lies in [0, 255] and the spatial resolution is 0.625×0.625×0.625 mm³. Each MRI volume is first cropped to 576×576×80, then the x and y axes are downsampled to 1/3 of their original size, so the volume finally input to the 3D U-Net network model has size 192×192×80.
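For ease of understanding, the preprocessing described above can be sketched as follows. This is an illustrative sketch only, not the implementation of the application; the function name preprocess_mri and the use of trilinear interpolation for the 1/3 downsampling are assumptions.

```python
import numpy as np
import torch
import torch.nn.functional as F

def preprocess_mri(volume: np.ndarray) -> torch.Tensor:
    """Sketch of the preprocessing: center-crop to 576 x 576 x 80, then downsample
    the x and y axes to 1/3 so the network input is 192 x 192 x 80."""
    h, w, z = volume.shape
    ch, cw, cz = 576, 576, 80
    h0, w0, z0 = (h - ch) // 2, (w - cw) // 2, (z - cz) // 2   # assumes volume >= 576 x 576 x 80
    cropped = volume[h0:h0 + ch, w0:w0 + cw, z0:z0 + cz]
    t = torch.from_numpy(cropped.astype(np.float32))[None, None]        # (1, 1, 576, 576, 80)
    t = F.interpolate(t, size=(192, 192, 80), mode="trilinear", align_corners=False)
    return t                                                            # (1, 1, 192, 192, 80)
```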
Specifically, the target model is a 3D U-Net network model. The 3D U-Net is built on the earlier 2D U-Net, has a U-shaped structure, and consists of an encoder and a decoder: the encoder performs feature extraction, and the decoder performs upsampling to restore the feature maps to the original resolution. The model is mainly used to process volumetric (block) images in medical imaging and can improve the efficiency of 3D image processing.
Specifically, the step of inputting the initial image into the target model for image segmentation to obtain an initial segmentation map includes steps S110-S120:
s110: and extracting features of the initial image through an encoder of the target model to obtain a five-layer feature map.
Specifically, the encoder is composed of five downsampling modules, and each downsampling module comprises a 3D residual convolution module and a maximum pooling layer. The 3D residual convolution module contains two 3D convolution layers of size 3×3×3 with a residual connection added at the feature input: the input feature is passed through one 1×1×1 convolution to raise its channel dimension and is then directly added to the feature map obtained from the two convolutions. The maximum pooling layer has size 2×2×2. The initial image is input into the target model and features are extracted by the encoder; the feature map size is reduced by half after each of the five downsampling operations. For example, if the spatial resolution of the initial image is H×W×Z, where H is the height, W the width, and Z the length of the three-dimensional image, a five-layer feature map is obtained after the five downsampling modules of the encoder.
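As an illustrative sketch of one encoder downsampling module (a 3D residual convolution module followed by 2×2×2 max pooling), the following PyTorch code can be given. It is not the implementation of the application; the normalization and activation layers and the channel handling are assumptions.

```python
import torch
import torch.nn as nn

class ResidualConv3D(nn.Module):
    """Two 3x3x3 convolutions with a 1x1x1 projection of the input added back (residual)."""
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv3d(in_ch, out_ch, kernel_size=3, padding=1),
            nn.InstanceNorm3d(out_ch), nn.ReLU(inplace=True),
            nn.Conv3d(out_ch, out_ch, kernel_size=3, padding=1),
            nn.InstanceNorm3d(out_ch), nn.ReLU(inplace=True),
        )
        self.proj = nn.Conv3d(in_ch, out_ch, kernel_size=1)  # 1x1x1 convolution raises the input channels

    def forward(self, x):
        return self.conv(x) + self.proj(x)  # direct addition of the two branches

class DownBlock(nn.Module):
    """One downsampling module: residual convolution block plus 2x2x2 max pooling."""
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.block = ResidualConv3D(in_ch, out_ch)
        self.pool = nn.MaxPool3d(kernel_size=2)

    def forward(self, x):
        feat = self.block(x)           # feature map kept for the decoder's skip connection
        return feat, self.pool(feat)   # pooling halves each spatial dimension
```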
S120: and carrying out information fusion on the five-layer feature images through a decoder of the target model to obtain an initial segmentation image.
Specifically, the decoder is composed of five upsampling modules; each upsampling module consists of a 3D deconvolution layer of size 2×2×2, a skip connection layer, and two 3D convolution layers of size 3×3×3, and corresponds to one downsampling module of the encoder. The skip connection layer fuses the shallow position information with the deep semantic information by concatenation. The five upsampling modules restore the five-layer feature map to the original image size, and the feature map is then mapped to values in (0, 1) by a Sigmoid function. Following the labeling rule of the training labels (background is 0, other regions are 1), regions of the feature map with values greater than 0.5 are regarded as the other regions (the regions of the medical image that need to be processed), and regions with values less than or equal to 0.5 are regarded as the background, yielding the initial segmentation map.
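Similarly, a single decoder upsampling module and the final thresholding step can be sketched as follows; again, this is an assumption-level sketch rather than the code of the application.

```python
import torch
import torch.nn as nn

class UpBlock(nn.Module):
    """One upsampling module: 2x2x2 deconvolution, skip concatenation, two 3x3x3 convolutions."""
    def __init__(self, in_ch: int, skip_ch: int, out_ch: int):
        super().__init__()
        self.up = nn.ConvTranspose3d(in_ch, out_ch, kernel_size=2, stride=2)
        self.conv = nn.Sequential(
            nn.Conv3d(out_ch + skip_ch, out_ch, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            nn.Conv3d(out_ch, out_ch, kernel_size=3, padding=1), nn.ReLU(inplace=True),
        )

    def forward(self, x, skip):
        x = self.up(x)
        x = torch.cat([x, skip], dim=1)  # splice shallow position information with deep semantics
        return self.conv(x)

def to_mask(logits: torch.Tensor) -> torch.Tensor:
    """Map the decoder output to (0, 1) with Sigmoid and binarize at the 0.5 threshold."""
    prob = torch.sigmoid(logits)
    return (prob > 0.5).float()  # 1 = region to be processed, 0 = background
```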
S200: and obtaining target position information of the ROI of the initial segmentation map according to the initial segmentation map.
Specifically, the ROI is the region of interest in the image. In the embodiments of the application, the ROI is the region of the three-dimensional medical image that needs to be processed. For example, if the initial image is a cardiac magnetic resonance image (LGE-MRI), the ROI is the region containing the heart; if the initial image is a lung CT or kidney CT, the ROI is the lung or kidney region in the image; if the initial image is a uterine ultrasound image, the ROI is the uterine region in the ultrasound image. In machine vision and image processing, the region to be processed is outlined on the image in the form of a box, circle, ellipse, irregular polygon, or the like, and is called the region of interest, i.e., the ROI. Machine vision software such as Halcon, OpenCV, or Matlab commonly provides operators and functions to compute the ROI and carry out the next processing step on the image.
Specifically, the step of obtaining the target position information of the ROI of the initial segmentation map according to the initial segmentation map includes steps S201-S203:
s201: and extracting the characteristics of the initial segmentation image through the full connection layer to obtain first position information.
Specifically, the initial segmentation map is input into the fully connected layer to obtain two values of first position information that encode the relative position of the segmentation target, i.e., the ROI position coordinates.
S202: and mapping the first position information through an activation function to obtain second position information.
Specifically, the obtained first position information is mapped with a Sigmoid function to obtain second position information location_0 and location_1 in the range (0, 1); because the position is expressed relatively, it can be applied to the two differently sized images, the initial segmentation map and the initial image.
S203: and obtaining target position information based on the second position information, the side length of the ROI and the size of the initial image.
Specifically, the ROI side length len_of_roi is preset according to the actual images. The exact x and y coordinates are then computed from the second position information and the size of the initial image; only the x and y axes are cropped, while the z axis is not. Concretely, the ROI side length is subtracted from the maximum x value and the maximum y value of the image to ensure the ROI frame does not cross the boundary of the original image, and the two differences are multiplied by location_0 and location_1 respectively, giving the coordinates (roi_x, roi_y) of the lower-left corner of the ROI region in the x-y plane. The position of the ROI frame in the original image is therefore [roi_x : roi_x+len_of_roi, roi_y : roi_y+len_of_roi], i.e., the target position information.
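Steps S201 to S203 can be summarized by the following sketch of a localization head (fully connected layer, Sigmoid, and scaling of the relative coordinates so the ROI box stays inside the image). The class name ROILocator and the flattened input size are assumptions for a 192×192×80 single-channel segmentation map.

```python
import torch
import torch.nn as nn

class ROILocator(nn.Module):
    """Sketch: predict the lower-left corner (roi_x, roi_y) of the ROI box from the initial segmentation map."""
    def __init__(self, in_features: int = 192 * 192 * 80):
        super().__init__()
        self.fc = nn.Linear(in_features, 2)  # two relative coordinates (first position information)

    def forward(self, seg_map, x_max: int, y_max: int, len_of_roi: int):
        location = torch.sigmoid(self.fc(seg_map.flatten(1)))     # location_0, location_1 in (0, 1)
        roi_x = (location[:, 0] * (x_max - len_of_roi)).long()    # subtract the side length so the
        roi_y = (location[:, 1] * (y_max - len_of_roi)).long()    # box cannot cross the image border
        return roi_x, roi_y  # box: [roi_x:roi_x+len_of_roi, roi_y:roi_y+len_of_roi]
```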
S300: and determining the position and the size of an ROI cutting frame according to the target position information of the ROI and the preset side length of the ROI.
Specifically, the size of the ROI crop frame is determined by the preset ROI side length, and its position is determined from the coordinates in the target position information as [roi_x : roi_x+len_of_roi, roi_y : roi_y+len_of_roi].
S400: and cutting the initial segmentation map and the initial image according to the position and the size of the ROI cutting frame to obtain a first characteristic map and a second characteristic map.
Specifically, the region at position [roi_x : roi_x+len_of_roi, roi_y : roi_y+len_of_roi] is cropped out of the initial segmentation map to obtain the first feature map, and the region at the same position is cropped out of the initial image to obtain the second feature map, giving feature maps that contain the ROI at two different stages.
S500: and carrying out feature fusion on the first feature map and the second feature map to obtain a fusion feature map.
Specifically, the first feature map and the second feature map are fused by direct element-wise addition: the dimensions of the fused feature map are unchanged, only the corresponding values are summed, so each dimension contains more feature information.
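Steps S400 and S500 together amount to cropping the same ROI window from the initial segmentation map and the initial image and adding the two crops element-wise; a minimal sketch, assuming (batch, channel, x, y, z) tensors, is:

```python
import torch

def crop_and_fuse(init_seg: torch.Tensor, init_img: torch.Tensor,
                  roi_x: int, roi_y: int, len_of_roi: int) -> torch.Tensor:
    """Crop the ROI window (x/y only, full z) from both volumes and fuse them by direct addition."""
    first_map = init_seg[:, :, roi_x:roi_x + len_of_roi, roi_y:roi_y + len_of_roi, :]
    second_map = init_img[:, :, roi_x:roi_x + len_of_roi, roi_y:roi_y + len_of_roi, :]
    return first_map + second_map  # dimensions unchanged, corresponding values summed
```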
S600: and inputting the fusion feature map into the target model for image segmentation to obtain a target segmentation result.
Specifically, the fusion feature map is input into the 3D U-Net network model for image segmentation to obtain the target segmentation result: regions where the output value is greater than 0.5 are determined to be the region of the medical image that needs processing, and regions where the value is less than or equal to 0.5 are determined to be the background region.
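Putting S100 to S600 together, the essential point is that the same 3D U-Net instance is called in both stages. A schematic inference routine, reusing the illustrative ROILocator and crop_and_fuse helpers sketched above and assuming a batch size of 1 and a model that outputs logits, is:

```python
import torch

@torch.no_grad()
def two_stage_segment(model, locator, init_img, len_of_roi=96):
    """Two-stage segmentation with a single shared 3D U-Net (illustrative sketch, batch size 1)."""
    init_seg = torch.sigmoid(model(init_img))                   # S100: initial segmentation map
    _, _, x_max, y_max, _ = init_img.shape
    roi_x, roi_y = locator(init_seg, x_max, y_max, len_of_roi)  # S200-S300: ROI box position
    fused = crop_and_fuse(init_seg, init_img,
                          int(roi_x), int(roi_y), len_of_roi)   # S400-S500: crop and fuse
    fine_seg = torch.sigmoid(model(fused))                      # S600: the same model is called again
    return (fine_seg > 0.5).float()                             # > 0.5 target region, otherwise background
```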
Referring to fig. 2, an example implementation of the shared-CNN-based two-stage segmentation method of the present application is described below.
1. First, the 154 LGE-MRI volumes of the 2018 Left Atrium Segmentation Challenge, comprising 3D MRI images of patients with atrial fibrillation, are acquired and divided into 100 training volumes and 54 test volumes. The volumes come in two sizes, 640×640×88 and 576×576×88, the MRI gray scale lies in [0, 255], and the spatial resolution is 0.625×0.625×0.625 mm³. The label corresponding to each MRI volume is a binary image in which 0 represents the background and 255 represents the left atrium region. The 640×640×88 and 576×576×88 volumes are first cropped to 576×576×80, the x and y axes are then downsampled to 1/3, and the resulting initial image of size 192×192×80 is input into the 3D U-Net network model;
2. The initial image is input into the 3D U-Net network model; the encoder of the model extracts features from the initial image, each downsampling module halving the feature map size, and a five-layer feature map is finally obtained; the decoder of the model then fuses the information of the five-layer feature map to obtain an initial segmentation map;
3. Then, according to the initial segmentation map, the target position information of the ROI in the map is obtained; in this example the ROI is the left atrium region, so the position information of the left atrium region in the map is obtained;
4. The position and the size of the left atrium region cutting frame are determined according to the position information of the left atrium region in the initial segmentation map and the preset ROI side length;
5. cutting the initial segmentation map and the initial image according to the position and the size of the obtained left atrium region cutting frame to obtain a first feature map and a second feature map respectively; then, the first feature map and the second feature map are directly added to perform feature fusion, and a fusion feature map is obtained;
6. Finally, the fusion feature map is input into the 3D U-Net network model for image segmentation to obtain the target segmentation result: values greater than 0.5 are regarded as atrium and values less than or equal to 0.5 as background. Referring to fig. 3, the segmentation result of the shared-CNN-based two-stage method proposed by the application is compared with the segmentation result of the prior-art Double 3D U-Net: fig. 3 (a) is the left atrium label of the public dataset annotated by a professional doctor, fig. 3 (b) is the segmentation result of the method proposed by the application, and fig. 3 (c) is the segmentation result of the Double 3D U-Net model. The proposed method handles surface details better and its result is closer to the ground truth. Table 1 compares the evaluation metrics of Double 3D U-Net with those of the proposed method, where "Shared 3D U-Net" denotes the shared-CNN-based two-stage segmentation method of the application:
TABLE 1

Model             DICE    Jaccard  HD      Size
Double 3D U-Net   0.916   0.845    1.300   45.18M
Shared 3D U-Net   0.918   0.848    1.211   22.66M
The DICE coefficient measures set similarity and is commonly used to compute the similarity of two samples; its maximum is 1 and its minimum is 0. The Jaccard coefficient compares the similarity and difference between finite sample sets; the larger the Jaccard coefficient, the higher the similarity. HD is the Hausdorff distance between the two sets; the smaller the value, the more similar the two sets. Size is the size of the model and measures its storage footprint. The DICE, Jaccard, and HD values are all computed between the automatic segmentation result and the manual segmentation of the professional doctor.
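For reference, the DICE and Jaccard coefficients between a predicted binary mask and the manual label can be computed as in the following sketch (the Hausdorff distance is usually taken from an existing library and is omitted here):

```python
import torch

def dice_jaccard(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-6):
    """DICE and Jaccard coefficients between two binary masks (illustrative sketch)."""
    pred, target = pred.float().flatten(), target.float().flatten()
    inter = (pred * target).sum()
    dice = (2 * inter + eps) / (pred.sum() + target.sum() + eps)
    jaccard = (inter + eps) / (pred.sum() + target.sum() - inter + eps)
    return dice.item(), jaccard.item()
```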
As can be seen from Table 1, the shared 3D U-Net network model of the application achieves a better segmentation effect, its result is closer to the ground truth, and it occupies less space.
This embodiment is implemented with PyTorch and runs on an NVIDIA RTX 3090 GPU. An Adam optimizer was used in the experiments. The initial learning rate was set to 0.001, the number of iterations to 200, the batch size to 1, and the ROI side length to 96. During training, five-fold cross-validation and an early-stopping strategy were used to continuously monitor the change of the model's loss on the validation set; training is stopped when the validation loss reaches its minimum, which prevents overfitting and preserves the generalization of the model. In addition, the model of each iteration is saved, and the optimal model is selected among them according to the DICE coefficient.
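A simplified sketch of this training configuration is given below. It shows a single fold with Adam (learning rate 0.001) and early stopping on the validation loss; the loss function, data loaders, patience value, and checkpoint selection (here simply the best validation loss rather than per-iteration saving with DICE-based selection) are assumptions.

```python
import torch

def train(model, train_loader, val_loader, loss_fn, epochs=200, patience=20):
    """Training sketch: Adam, lr 0.001, early stopping when the validation loss stops improving."""
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    best_val, wait = float("inf"), 0
    for epoch in range(epochs):
        model.train()
        for img, label in train_loader:          # batch size 1 assumed in the loader
            opt.zero_grad()
            loss = loss_fn(model(img), label)
            loss.backward()
            opt.step()
        model.eval()
        with torch.no_grad():
            val = sum(loss_fn(model(i), l).item() for i, l in val_loader) / len(val_loader)
        if val < best_val:
            best_val, wait = val, 0
            torch.save(model.state_dict(), "best_model.pt")  # keep the best checkpoint
        else:
            wait += 1
            if wait >= patience:
                break  # early stopping to prevent overfitting
```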
In summary, the two-stage segmentation method for the three-dimensional medical image based on the shared CNN provided by the embodiment of the application has the following advantages:
1. according to the application, the first characteristic diagram and the second characteristic diagram are subjected to characteristic fusion to obtain the fusion characteristic diagram, and the information learned on the two characteristic diagrams is updated into the 3D U-Net network model, so that the tasks of detection and segmentation can be mutually promoted, and the performance of detection and segmentation is ensured not to be lost;
2. according to the application, an initial image is input into a target model for image segmentation, and finally the fusion feature map is input into the target model for image segmentation, and the size adjustable range of the ROI cutting frame is larger by sharing a 3D U-Net network model, so that the segmentation effect is improved, the framework that the two-stage segmentation is realized by integrating two models in the prior art is broken, the two-stage segmentation is realized by calling the same model, the flexibility and the universality of model segmentation are improved, and the occupied space of the model is reduced compared with the prior art by using two models.
Referring to fig. 4, an embodiment of the present application further provides a two-stage segmentation apparatus for a three-dimensional medical image based on a shared CNN, including:
a first module 401, configured to input an initial image into a target model for image segmentation, to obtain an initial segmentation map;
a second module 402, configured to obtain target position information of an ROI of the initial segmentation map according to the initial segmentation map;
a third module 403, configured to determine a position and a size of an ROI crop frame according to the target position information of the ROI and a preset ROI side length;
a fourth module 404, configured to crop the initial segmentation map and the initial image according to the position and the size of the ROI crop frame, so as to obtain a first feature map and a second feature map respectively;
a fifth module 405, configured to perform feature fusion on the first feature map and the second feature map to obtain a fused feature map;
and a sixth module 406, configured to input the fusion feature map into the target model for image segmentation, so as to obtain a target segmentation result.
The embodiment of the application also provides electronic equipment, which comprises a processor and a memory; the memory is used for storing programs; the processor executes the program to implement the three-dimensional medical image two-stage segmentation method based on the shared CNN as described above.
The embodiment of the application also provides a computer readable storage medium storing a program which is executed by a processor to implement the three-dimensional medical image two-stage segmentation method based on the shared CNN as described above.
Embodiments of the present application also disclose a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The computer instructions may be read from a computer-readable storage medium by a processor of a computer device, and executed by the processor, to cause the computer device to perform the method shown in fig. 1.
In some alternative embodiments, the functions/acts noted in the block diagrams may occur out of the order noted in the operational illustrations. For example, two blocks shown in succession may in fact be executed substantially concurrently or the blocks may sometimes be executed in the reverse order, depending upon the functionality/acts involved. Furthermore, the embodiments presented and described in the flowcharts of the present application are provided by way of example in order to provide a more thorough understanding of the technology. The disclosed methods are not limited to the operations and logic flows presented herein. Alternative embodiments are contemplated in which the order of various operations is changed, and in which sub-operations described as part of a larger operation are performed independently.
Furthermore, while the application is described in the context of functional modules, it should be appreciated that, unless otherwise indicated, one or more of the described functions and/or features may be integrated in a single physical device and/or software module or one or more functions and/or features may be implemented in separate physical devices or software modules. It will also be appreciated that a detailed discussion of the actual implementation of each module is not necessary to an understanding of the present application. Rather, the actual implementation of the various functional modules in the apparatus disclosed herein will be apparent to those skilled in the art from consideration of their attributes, functions and internal relationships. Accordingly, one of ordinary skill in the art can implement the application as set forth in the claims without undue experimentation. It is also to be understood that the specific concepts disclosed are merely illustrative and are not intended to be limiting upon the scope of the application, which is to be defined in the appended claims and their full scope of equivalents.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application may be embodied essentially or in a part contributing to the prior art or in a part of the technical solution, in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a random access Memory (RAM, random Access Memory), a magnetic disk, or an optical disk, or other various media capable of storing program codes.
Logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions for implementing logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). In addition, the computer readable medium may even be paper or other suitable medium on which the program is printed, as the program may be electronically captured, via, for instance, optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner, if necessary, and then stored in a computer memory.
It is to be understood that portions of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above-described embodiments, the various steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, may be implemented using any one or combination of the following techniques, as is well known in the art: discrete logic circuits having logic gates for implementing logic functions on data signals, application specific integrated circuits having suitable combinational logic gates, programmable Gate Arrays (PGAs), field Programmable Gate Arrays (FPGAs), and the like.
In the description of the present specification, a description referring to terms "one embodiment," "some embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present application. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiments or examples. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
While embodiments of the present application have been shown and described, it will be understood by those of ordinary skill in the art that: many changes, modifications, substitutions and variations may be made to the embodiments without departing from the spirit and principles of the application, the scope of which is defined by the claims and their equivalents.
While the preferred embodiment of the present application has been described in detail, the present application is not limited to the embodiments described above, and those skilled in the art can make various equivalent modifications or substitutions without departing from the spirit of the present application, and these equivalent modifications or substitutions are included in the scope of the present application as defined in the appended claims.

Claims (10)

1. The three-dimensional medical image two-stage segmentation method based on the sharing CNN is characterized by comprising the following steps of:
inputting the initial image into a target model for image segmentation to obtain an initial segmentation map;
obtaining target position information of the ROI of the initial segmentation map according to the initial segmentation map;
determining the position and the size of an ROI cutting frame according to the target position information of the ROI and the preset side length of the ROI;
cutting the initial segmentation map and the initial image according to the position and the size of the ROI cutting frame to obtain a first feature map and a second feature map;
performing feature fusion on the first feature map and the second feature map to obtain a fusion feature map;
and inputting the fusion feature map into the target model for image segmentation to obtain a target segmentation result.
2. The method for two-stage segmentation of three-dimensional medical images based on shared CNN according to claim 1, wherein in the step of inputting an initial image into a target model for image segmentation to obtain an initial segmentation map, the target model is a 3D U-Net network model.
3. The method for two-stage segmentation of a three-dimensional medical image based on a shared CNN according to claim 1, wherein the step of inputting an initial image into a target model for image segmentation to obtain an initial segmentation map comprises:
extracting features of the initial image through an encoder of the target model to obtain a five-layer feature map;
and carrying out information fusion on the five-layer feature images through a decoder of the target model to obtain an initial segmentation image.
4. The method for two-stage segmentation of a three-dimensional medical image based on a shared CNN according to claim 1, wherein the step of obtaining target position information of an ROI of the initial segmentation map from the initial segmentation map comprises:
extracting features of the initial segmentation image through a full connection layer to obtain first position information;
mapping the first position information through an activation function to obtain second position information;
and obtaining target position information based on the second position information, the side length of the ROI and the size of the initial image.
5. The method for two-stage segmentation of a three-dimensional medical image based on a shared CNN according to claim 1, wherein the step of feature-fusing the first feature map and the second feature map to obtain a fused feature map includes feature-fusing the first feature map and the second feature map by directly adding them together.
6. The method for two-stage segmentation of a three-dimensional medical image based on a shared CNN according to claim 3, wherein in the step of extracting features of an initial image by an encoder of the object model to obtain a five-layer feature map, the encoder is composed of five downsampling modules, each downsampling module is composed of two convolution layers and a maximum pooling layer; wherein both convolution layers are 3x3x3 in size and the maximum pooling layer is 2x2x2 in size.
7. The method for two-stage segmentation of a three-dimensional medical image based on a shared CNN according to claim 1, wherein in the step of information fusing the five-layer feature map by a decoder of the object model to obtain an initial segmentation map, the decoder is composed of five upsampling modules, each upsampling module is composed of one deconvolution layer, one skip connection layer and two convolution layers; wherein both convolution layers are 3x3x3 in size.
8. A three-dimensional medical image two-stage segmentation apparatus based on a shared CNN, comprising:
the first module is used for inputting an initial image into the target model for image segmentation to obtain an initial segmentation map;
a second module, configured to obtain target position information of an ROI of the initial segmentation map according to the initial segmentation map;
a third module, configured to determine a position and a size of an ROI cutting frame according to the target position information of the ROI and a preset ROI side length;
a fourth module, configured to cut the initial segmentation map and the initial image according to the position and the size of the ROI cutting frame, to obtain a first feature map and a second feature map;
a fifth module, configured to perform feature fusion on the first feature map and the second feature map to obtain a fused feature map;
and a sixth module, configured to input the fusion feature map into the target model for image segmentation, so as to obtain a target segmentation result.
9. An electronic device comprising a processor and a memory;
the memory is used for storing programs;
the processor executing the program implements the method of any one of claims 1 to 7.
10. A computer-readable storage medium, characterized in that the storage medium stores a program that is executed by a processor to implement the method of any one of claims 1 to 7.
CN202310559706.0A 2023-05-17 2023-05-17 Three-dimensional medical image two-stage segmentation method and device based on sharing CNN Pending CN116664594A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310559706.0A CN116664594A (en) 2023-05-17 2023-05-17 Three-dimensional medical image two-stage segmentation method and device based on sharing CNN

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310559706.0A CN116664594A (en) 2023-05-17 2023-05-17 Three-dimensional medical image two-stage segmentation method and device based on sharing CNN

Publications (1)

Publication Number Publication Date
CN116664594A (en) 2023-08-29

Family

ID=87725252

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310559706.0A Pending CN116664594A (en) 2023-05-17 2023-05-17 Three-dimensional medical image two-stage segmentation method and device based on sharing CNN

Country Status (1)

Country Link
CN (1) CN116664594A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118365803A (en) * 2024-06-19 2024-07-19 南方医科大学珠江医院 Biliary tract three-dimensional reconstruction method based on 3D-MRCP image and related device

Similar Documents

Publication Publication Date Title
CN107274402A (en) A kind of Lung neoplasm automatic testing method and system based on chest CT image
CN111340789A (en) Method, device, equipment and storage medium for identifying and quantifying eye fundus retinal blood vessels
CN108133476B (en) Method and system for automatically detecting pulmonary nodules
WO2019000455A1 (en) Method and system for segmenting image
JP2023540910A (en) Connected Machine Learning Model with Collaborative Training for Lesion Detection
JP2004520923A (en) How to segment digital images
WO2022105623A1 (en) Intracranial vascular focus recognition method based on transfer learning
RU2752690C2 (en) Detecting changes in medical imaging
KR20200082660A (en) Pathological diagnosis method and apparatus based on machine learning
US11830193B2 (en) Recognition method of intracranial vascular lesions based on transfer learning
Poh et al. Automatic segmentation of ventricular cerebrospinal fluid from ischemic stroke CT images
Wei et al. Automated lung segmentation and image quality assessment for clinical 3-D/4-D-computed tomography
CN111292307A (en) Digestive system gallstone recognition method and positioning method
CN116664594A (en) Three-dimensional medical image two-stage segmentation method and device based on sharing CNN
US20220301177A1 (en) Updating boundary segmentations
CN114332132A (en) Image segmentation method and device and computer equipment
Tan et al. Automatic prostate segmentation based on fusion between deep network and variational methods
Feng et al. Automatic localization and segmentation of focal cortical dysplasia in FLAIR‐negative patients using a convolutional neural network
CN115937196A (en) Medical image analysis system, analysis method and computer-readable storage medium
CN116309640A (en) Image automatic segmentation method based on multi-level multi-attention MLMA-UNet network
CN115018863A (en) Image segmentation method and device based on deep learning
US20110194741A1 (en) Brain ventricle analysis
Duan et al. Region growing algorithm combined with morphology and skeleton analysis for segmenting airway tree in CT images
CN110533667B (en) Lung tumor CT image 3D segmentation method based on image pyramid fusion
CN112750110A (en) Evaluation system for evaluating lung lesion based on neural network and related products

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination