CN112465754B - 3D medical image segmentation method and device based on layered perception fusion and storage medium - Google Patents


Info

Publication number
CN112465754B
Authority
CN
China
Prior art keywords
medical image
image
module
network
segmentation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011287175.7A
Other languages
Chinese (zh)
Other versions
CN112465754A (en)
Inventor
孟令龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yunrun Da Data Service Co ltd
Original Assignee
Yunrun Da Data Service Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Yunrun Da Data Service Co ltd
Priority to CN202011287175.7A
Publication of CN112465754A
Application granted
Publication of CN112465754B

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 - Image analysis
    • G06T7/0002 - Inspection of images, e.g. flaw detection
    • G06T7/0012 - Biomedical image inspection
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 - Image analysis
    • G06T7/10 - Segmentation; Edge detection
    • G06T7/11 - Region-based segmentation
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/10 - Image acquisition modality
    • G06T2207/10004 - Still image; Photographic image
    • G06T2207/10012 - Stereo images
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/20 - Special algorithmic details
    • G06T2207/20021 - Dividing image into blocks, subimages or windows
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/20 - Special algorithmic details
    • G06T2207/20084 - Artificial neural networks [ANN]
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/30 - Subject of image; Context of image processing
    • G06T2207/30004 - Biomedical image processing
    • G06T2207/30096 - Tumor; Lesion

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Biophysics (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Medical Informatics (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Radiology & Medical Imaging (AREA)
  • Quality & Reliability (AREA)
  • Measuring And Recording Apparatus For Diagnosis (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a 3D medical image segmentation method, device, and storage medium based on layered perception fusion, wherein the method comprises the following steps: step S1, acquiring and preprocessing a 3D medical image, and slicing the preprocessed 3D medical image to obtain a plurality of slice images; step S2, performing convolution calculation on each slice image through a convolutional neural network semantic segmentation algorithm to obtain a semantic segmentation result for each slice image; step S3, fusing the prediction results of the slice images and outputting the final medical image segmentation result.

Description

3D medical image segmentation method and device based on layered perception fusion and storage medium
Technical Field
The invention relates to the technical field of image segmentation, and in particular to a 3D medical image segmentation method, device, and storage medium based on layered perception fusion.
Background
Medical image processing deals with medical images produced by various imaging mechanisms. The imaging modalities widely used in clinical practice fall mainly into four categories: X-ray computed tomography (X-CT), magnetic resonance imaging (MRI), nuclear medicine imaging (NMI), and ultrasound imaging (UI). In current imaging-based diagnosis, lesions are mainly discovered by observing a group of two-dimensional slice images, which often depends on the experience of the physician. Analyzing and processing two-dimensional slice images with computer image processing techniques enables segmentation, extraction, three-dimensional reconstruction, and three-dimensional display of human organs, soft tissues, and lesions. This assists physicians in the qualitative and even quantitative analysis of lesion volumes and other regions of interest, greatly improving the accuracy and reliability of medical diagnosis; it can also play an important auxiliary role in medical teaching, surgical planning, surgical simulation, and various kinds of medical research. Medical image segmentation mainly takes images of cells, tissues, and organs as its processing objects, and partitions an image into a number of regions according to the similarities or differences between them. Medical image segmentation therefore plays a very important role in disease diagnosis and prognosis.
In 2015, the computer science department and the biological signal research centre of the University of Freiburg, Germany, proposed Unet, a mirror-symmetric fully convolutional neural network. With its small parameter count and strong fitting capability, Unet has been widely applied to defect detection and medical image segmentation, and a series of fully convolutional networks that enlarge the receptive field followed. Thanks to its skip connections and multi-scale network structure, Unet effectively reduces the information lost by the neural network during computation; this same structure is why Unet achieves excellent performance in medical image segmentation.
Constrained by imaging technology and the privacy of medical records, medical data are often scarce. This limits the size of a medical image segmentation model: a model that is too large overfits, while one that is too small loses accuracy. Reasonable data augmentation and an appropriately sized model are therefore key to good medical image segmentation. Although many fully convolutional network methods for image segmentation already exist, fully convolutional networks require large amounts of data to drive them, and large trained models cannot avoid flaws because of their lack of interpretability. At the same time, because medical imaging scenes are complex and variable and data are limited, a single end-to-end fully convolutional network struggles to meet the high-accuracy requirements of medical diagnosis directly. Many current papers improve segmentation accuracy by adding large numbers of optimization modules and deepening the network to extract more features; these methods demand substantial computation and memory, and cannot provide real-time measurement and control in industrial settings where computing resources are limited. Moreover, simply increasing the number of network layers generalizes poorly on small datasets and itself requires large amounts of data, making it unsuitable for medical image segmentation scenarios.
Current medical image segmentation methods cut a 3D image into individual 2D images, apply a segmentation model common in natural-image computer vision to each 2D image, and finally stitch the per-image results back into 3D form. This 2D-slice solution makes full use of the information within each slice, but it ignores the relationship between adjacent slices and loses the correlations across the global slice sequence. Segmenting the 3D image directly with a 3D DCNN is also problematic: a 3D CNN model is computationally expensive, and applying it directly to an original high-resolution 3D volume may exhaust GPU memory, so the full-resolution volume must be cut into individual 3D blocks, each block fed through the 3D CNN, and the outputs stitched together.
Disclosure of Invention
In order to overcome the defects of the prior art, the present invention provides a 3D medical image segmentation method, device, and storage medium based on layered perception fusion. The 3D medical image is divided into three sets of 2D images along the H, W, and C directions and into a plurality of small 3D images; the fusion of a 2D channel-sequence relation model with a 3D model solves the problems that a single model cannot exploit all the information and predicts inaccurately, and a voting mechanism built on the multi-model fusion achieves efficient and accurate 3D medical image segmentation.
In order to achieve the above and other objects, the present invention provides a 3D medical image segmentation method based on layered perception fusion, comprising the following steps:
step S1, acquiring and preprocessing a 3D medical image, and slicing the preprocessed 3D medical image to obtain a plurality of slice images;
step S2, performing convolution calculation on each slice image through a convolutional neural network semantic segmentation algorithm to obtain a semantic segmentation result for each slice image;
step S3, fusing the prediction results of the slice images and outputting the final medical image segmentation result.
Preferably, the step S1 further includes:
s100, acquiring a 3D medical image, marking the 3D medical image, and dividing the 3D medical image into a target part and a background part;
step S101, checking whether the data of the 3D medical image is correct or incorrect after marking is finished;
step S102, segmenting the 3D medical image to obtain three 2D images of the 3D medical image according to H, W, C slices and decomposing the 3D medical image into a plurality of small 3D images;
step S103, data enhancement is performed on each sliced image.
Preferably, the target represents a target region, i.e., an image region of organ pathological tissue, and the background represents the non-organ portion.
Preferably, after the 3D medical image is marked, the image becomes a background with pixel value 0 and a target with pixel value 1, and if there are several kinds of pathological tissue, different pixel values are used to distinguish them.
Preferably, in step S2, the convolutional neural network is an RFEUnet network structure comprising a receptive field enhancement module RFEM.
Preferably, the RFEUnet network structure halves the channel configuration of the existing Unet network to 1/2 of the original, and enlarges the perception range of the network model by adding the receptive field module RFEM at the tail of the Unet encoding structure, the RFEM using Maxpooling, the Mish activation function, and dilated convolution structures.
Preferably, in step S3, the slice images in the H, W, and C directions and the small 3D images enter four RFEUnet network structures to form four segmentation results, and the four outputs are averaged to obtain the final image segmentation result.
In order to achieve the above object, the present invention further provides a 3D medical image segmentation apparatus based on layered perception fusion, including:
a preprocessing module for acquiring and preprocessing a 3D medical image, and slicing the preprocessed 3D medical image to obtain a plurality of slice images;
a segmentation module for performing convolution calculation on each slice image through a convolutional neural network semantic segmentation algorithm to obtain a semantic segmentation result for each slice image;
and a fusion module for fusing the prediction results of the slice images and outputting the final medical image segmentation result.
Preferably, the segmentation module adopts as the convolutional neural network an RFEUnet network structure comprising a receptive field enhancement module RFEM; the RFEUnet network structure halves the channel configuration of the existing Unet network to 1/2 of the original, and the perception range of the network model is enlarged by adding the receptive field module RFEM at the tail of the Unet encoding structure, the RFEM using Maxpooling, the Mish activation function, and dilated convolution structures.
To achieve the above object, the present invention also provides a computer-readable storage medium for storing program code for executing the above 3D medical image segmentation method.
Compared with the prior art, the 3D medical image segmentation method, device, and storage medium based on layered perception fusion of the present invention slice the 3D medical image into three sets of 2D images along the H, W, and C directions and a plurality of small 3D images; the fusion of a 2D channel-sequence relation model with the 3D model solves the problems that a single model cannot exploit all the information and predicts inaccurately, and a voting mechanism built on the multi-model fusion achieves efficient and accurate 3D medical image segmentation.
Drawings
FIG. 1 is a flowchart illustrating the steps of a 3D medical image segmentation method based on layered perception fusion according to the present invention;
FIG. 2 is a schematic structural diagram of the RFEUnet (receptive field enhanced Unet) model in an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of the receptive field enhancement module (RFEM) in an embodiment of the present invention;
FIG. 4 is a system architecture diagram of a 3D medical image segmentation apparatus based on layered perception fusion according to the present invention;
FIG. 5 is a flow chart of 3D medical image segmentation based on layered perception fusion according to an embodiment of the present invention;
FIG. 6 is a comparison of the results of segmenting a 3D medical image with the original Unet network model, the RFEUnet network model, and multi-model fusion in an embodiment of the present invention.
Detailed Description
Other advantages and effects of the present invention will be readily apparent to those skilled in the art from the disclosure herein, which describes embodiments of the invention through specific examples in conjunction with the accompanying drawings. The invention is capable of other and different embodiments, and its several details are capable of modification in various respects, all without departing from the spirit and scope of the present invention.
Fig. 1 is a flow chart of the steps of the 3D medical image segmentation method based on layered perception fusion according to the present invention. As shown in FIG. 1, the 3D medical image segmentation method based on layered perception fusion of the invention comprises the following steps:
Step S1, acquiring a 3D medical image, preprocessing it, and slicing the preprocessed image to obtain a plurality of slice images.
Specifically, step S1 further includes:
and S100, acquiring a 3D medical image, marking the 3D medical image, and dividing the 3D medical image into a target part and a background part.
In the specific embodiment of the invention, after the 3D medical image is obtained, marking is carried out on the 3D medical image by using marking software, and the 3D medical image is divided into a target part and a background part, wherein the target represents a target area, namely an organ pathological tissue image area, and the background represents a non-organ part.
Step S101, after marking is completed, the data of the 3D medical image are checked for correctness.
After marking is completed, the image data are checked for correctness. Since the pixel values produced by the annotation software may not be distributed uniformly, the image is converted to a background with pixel value 0 and a target with pixel value 1; if there are several kinds of pathological tissue, different pixel values can be used to distinguish them, for example pixel value 1 for tumor tissue, pixel value 2 for the tumor periphery, and so on.
Step S102, slicing the 3D medical image to obtain three sets of 2D images along the H, W, and C directions, and decomposing the 3D medical image into a plurality of small 3D images.
That is, in the present invention there are two ways to slice a 3D medical image: the first slices the 3D image into 2D images along each of the H, W, and C directions; the second decomposes the 3D image into several small 3D images.
For medical image segmentation, directly feeding a 3D medical image into a convolutional neural network and outputting a 3D image entails a large amount of computation and comparatively slow inference; moreover, medical image datasets are very small, so using the 3D image directly does not work well. The invention therefore slices the 3D medical image: feeding 2D slices taken along H, W, and C into 2D networks solves the problem of excessive computation, but a 2D network does not account for the correlation between adjacent 2D images. The invention accordingly feeds the 2D images sliced along the three directions H, W, and C, together with the small 3D images decomposed from the 3D volume, through four convolutional neural networks to form four 3D segmentation results. Processing the data this way both exploits the layer-to-layer relationships and resists the overfitting caused by small datasets.
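For illustration, the two slicing schemes can be sketched as follows, assuming the volume is a NumPy array indexed as (H, W, C); the function names, block size, and even-division assumption are ours, not the patent's.

```python
# Minimal sketch of the two slicing schemes: per-axis 2D slicing and 3D blocking.
import numpy as np

def slice_along_axes(volume: np.ndarray):
    """Slice a 3D volume into three stacks of 2D images, one per H/W/C axis."""
    h_slices = [volume[i, :, :] for i in range(volume.shape[0])]  # H direction
    w_slices = [volume[:, j, :] for j in range(volume.shape[1])]  # W direction
    c_slices = [volume[:, :, k] for k in range(volume.shape[2])]  # C direction
    return h_slices, w_slices, c_slices

def split_into_blocks(volume: np.ndarray, block=(32, 32, 32)):
    """Decompose a 3D volume into small 3D blocks (assumes shapes divide evenly)."""
    bh, bw, bc = block
    H, W, C = volume.shape
    return [volume[h:h + bh, w:w + bw, c:c + bc]
            for h in range(0, H, bh)
            for w in range(0, W, bw)
            for c in range(0, C, bc)]

vol = np.zeros((128, 128, 64), dtype=np.float32)
hs, ws, cs = slice_along_axes(vol)
blocks = split_into_blocks(vol)
print(len(hs), len(ws), len(cs), len(blocks))  # 128 128 64 32
```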
Step S103, data augmentation is performed on each slice image.
In this embodiment, data augmentation mainly comprises rotation, cropping, and scaling; that is, each slice image obtained from the slicing is rotated, cropped, scaled, and so on, to prevent overfitting and increase robustness.
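A minimal sketch of these three augmentation operations, using scipy.ndimage for illustration; the rotation range and crop ratio are assumed values, not specified by the patent.

```python
# Minimal augmentation sketch: random rotation, crop, and rescale of one slice.
import numpy as np
import scipy.ndimage as ndi

def augment_slice(img: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    # Random rotation in the slice plane (assumed +/-15 degrees)
    img = ndi.rotate(img, angle=rng.uniform(-15, 15), reshape=False, order=1)
    # Random crop to ~90% of the original size, then zoom back up (scaling)
    h, w = img.shape[:2]
    ch, cw = int(h * 0.9), int(w * 0.9)
    top, left = rng.integers(0, h - ch + 1), rng.integers(0, w - cw + 1)
    crop = img[top:top + ch, left:left + cw]
    zoom = (h / ch, w / cw) + (1,) * (img.ndim - 2)
    return ndi.zoom(crop, zoom, order=1)

rng = np.random.default_rng(0)
aug = augment_slice(np.random.rand(256, 256).astype(np.float32), rng)
print(aug.shape)  # (256, 256)
```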
Step S2, performing convolution calculation on each slice image through a convolutional neural network semantic segmentation algorithm to obtain a semantic segmentation result for each slice image.
In a specific embodiment of the present invention, the convolutional neural network extracts the high-level features of each slice image mainly through five dilated convolution structures, all of which contain hierarchical structure. Specifically, in view of the fact that medical images require global judgment, the invention proposes RFEUnet (receptive field enhanced Unet); the network structure is shown in FIG. 2, where 1/n denotes the image downsampling factor, C denotes the number of output channels, and x2 denotes that the convolution operation is performed twice; downsampling uses Maxpooling, upsampling uses bilinear interpolation, and feature fusion uses channel concatenation.
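For illustration only, here is a minimal PyTorch skeleton following these conventions (Maxpooling downsampling, bilinear upsampling, channel-concatenation fusion, two convolutions per stage, halved channel widths). It is a sketch of the general style, not the patented RFEUnet itself.

```python
# Minimal Unet-style skeleton with halved channel widths [32, 64, 128, 256, 512].
import torch
import torch.nn as nn
import torch.nn.functional as F

def double_conv(cin, cout):
    return nn.Sequential(
        nn.Conv2d(cin, cout, 3, padding=1), nn.BatchNorm2d(cout), nn.ReLU(inplace=True),
        nn.Conv2d(cout, cout, 3, padding=1), nn.BatchNorm2d(cout), nn.ReLU(inplace=True),
    )

class MiniUnet(nn.Module):
    def __init__(self, in_ch=1, n_classes=2, widths=(32, 64, 128, 256, 512)):
        super().__init__()
        self.enc = nn.ModuleList()
        c = in_ch
        for w in widths:
            self.enc.append(double_conv(c, w))
            c = w
        self.dec = nn.ModuleList(
            double_conv(widths[i] + widths[i - 1], widths[i - 1])
            for i in range(len(widths) - 1, 0, -1))
        self.head = nn.Conv2d(widths[0], n_classes, 1)

    def forward(self, x):
        skips = []
        for i, block in enumerate(self.enc):
            x = block(x)
            if i < len(self.enc) - 1:
                skips.append(x)
                x = F.max_pool2d(x, 2)              # Maxpooling downsampling
        for block in self.dec:
            x = F.interpolate(x, scale_factor=2, mode="bilinear", align_corners=False)
            x = torch.cat([x, skips.pop()], dim=1)  # channel-concatenation fusion
            x = block(x)
        return self.head(x)

print(MiniUnet()(torch.zeros(1, 1, 64, 64)).shape)  # torch.Size([1, 2, 64, 64])
```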
Compared with the original Unet network structure, the RFEUnet network structure proposed by the invention makes three main improvements:
1. RFEUnet halves the channel configuration of Unet, giving it faster inference; for example, the channel numbers [64, 128, 256, 512, 1024, 512, 256, 64, 2] become [32, 64, 128, 256, 512, 256, 128, 32, 2]; this configuration gives the network fewer parameters and higher computational efficiency.
2. A receptive field module RFEM is built and added at the tail of the encoding structure; using Maxpooling, the Mish activation function, and dilated convolution structures, it effectively enlarges the perception range of the model. Specifically, the RFEM extracts high-level features through five dilated convolution structures rich in hierarchy, as shown in FIG. 3, where C denotes the number of output channels, the dilation rate denotes the number of holes in the dilated convolution, 1/16 denotes that the image has been downsampled 16 times, and the dilation rates are [1, 3, 6, 12, 18]; the top-left branch is a Maxpooling operation, and each dilated convolution is followed by BN (Batch Normalization) and the Mish activation function.
3. The original Unet skip-connection structure contains no residual module, so overfitting can occur once the network is deepened; a residual concept and the receptive field enhancement module are therefore added at the bottom layer of Unet to form the RFEM, which, as shown in FIG. 3, effectively enlarges the receptive field of the bottom-layer features during convolution without the overfitting caused by deepening the network.
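The following is a minimal PyTorch sketch of an RFEM-style block consistent with this description: a Maxpooling branch plus five dilated convolution branches with dilation rates [1, 3, 6, 12, 18], each followed by batch normalization and Mish, wrapped in a residual connection. The exact branch wiring and channel sizes are our assumptions, since FIG. 3 is not reproduced here.

```python
# Minimal RFEM-style block: Maxpooling branch + five dilated-conv branches,
# each with BatchNorm and Mish, fused by 1x1 conv and a residual connection.
import torch
import torch.nn as nn

class RFEM(nn.Module):
    def __init__(self, channels: int, rates=(1, 3, 6, 12, 18)):
        super().__init__()
        self.pool = nn.Sequential(                    # Maxpooling branch
            nn.MaxPool2d(3, stride=1, padding=1),
            nn.Conv2d(channels, channels, 1),
            nn.BatchNorm2d(channels), nn.Mish())
        self.branches = nn.ModuleList(
            nn.Sequential(                            # dilated-convolution branches
                nn.Conv2d(channels, channels, 3, padding=r, dilation=r),
                nn.BatchNorm2d(channels), nn.Mish())
            for r in rates)
        self.fuse = nn.Conv2d(channels * (len(rates) + 1), channels, 1)

    def forward(self, x):
        feats = [self.pool(x)] + [b(x) for b in self.branches]
        return x + self.fuse(torch.cat(feats, dim=1))  # residual connection

x = torch.zeros(1, 512, 16, 16)
print(RFEM(512)(x).shape)  # torch.Size([1, 512, 16, 16])
```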
In step S3, the prediction results of the slice images are fused, and the final medical image segmentation result is output.
In the present invention, the slice images are the slice images in the H, W, and C directions together with the plurality of small 3D images; they enter four convolutional neural networks to form four 3D segmentation results, and the final result is obtained by averaging the four outputs. That is, the four sets of slice images pass through four RFEUnet networks with different weights and output four results F_A(X), F_B(X), F_C(X), F_D(X); the RFEUnet network structure outputs a probability value for each pixel of each input image, the probabilities of the four outputs are averaged, and the argmax over categories is taken:
F_O(X) = argmax((F_A(X) + F_B(X) + F_C(X) + F_D(X)) / 4)
In brief, the neural network outputs are all numbers between 0 and 1 and can therefore be understood as probability values. The output of the invention is four 3D images, so the average is taken per channel; each pixel has two classes, lesion and non-lesion, and the lesion and non-lesion probabilities are then compared by size, i.e., the argmax is taken. If the lesion probability is larger the pixel is assigned 1, and if the non-lesion probability is larger it is assigned 0, which is exactly the effect of the argmax function; the lesion region is thereby identified. If the image is then multiplied by 255, lesions appear white and non-lesions black, making the distinction visible; the output is the 3D medical image segmentation result, i.e., the lesion regions of the 3D image are identified.
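A minimal sketch of this fusion step, with random arrays standing in as placeholders for the four network outputs F_A(X)..F_D(X):

```python
# Minimal voting/fusion sketch: average four per-voxel probability maps and
# take the argmax over classes to obtain the final segmentation mask.
import numpy as np

def fuse_predictions(prob_maps) -> np.ndarray:
    """prob_maps: list of arrays, each (num_classes, D, H, W), values in [0, 1]."""
    mean_prob = np.mean(np.stack(prob_maps), axis=0)   # (F_A + F_B + F_C + F_D) / 4
    return np.argmax(mean_prob, axis=0)                # per-voxel class index

preds = [np.random.rand(2, 32, 32, 32) for _ in range(4)]  # lesion vs non-lesion
mask = fuse_predictions(preds)            # 0 = non-lesion, 1 = lesion
vis = (mask * 255).astype(np.uint8)       # lesions white, background black
print(mask.shape, vis.dtype)
```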
Thus the 3D medical image is sliced into three sets of 2D images along H, W, and C, and multi-directional information is fed into 2D network structures, so the segmentation exploits directional information that a single 2D model cannot capture; at the same time, splitting the 3D medical image into many small blocks for 3D prediction and stitching exploits the inter-layer relationships. The hierarchical relations of the images are thereby used, and the overfitting caused by small medical datasets is resisted.
Fig. 4 is a system structure diagram of a 3D medical image segmentation apparatus based on layered perception fusion according to the present invention. As shown in fig. 4, the 3D medical image segmentation apparatus based on layered perception fusion of the present invention includes:
the preprocessing module 10 is configured to acquire a 3D medical image, preprocess the 3D medical image, perform slicing processing on the 3D medical image, and obtain a plurality of sliced images after slicing processing.
In the present invention, the preprocessing module 10 is specifically configured to:
acquiring a 3D medical image, marking the 3D medical image, and dividing the 3D medical image into a target part and a background part.
In the specific embodiment of the invention, after the 3D medical image is obtained, it is marked with annotation software and divided into a target portion and a background portion, where the target represents the target region, i.e., the image region of organ pathological tissue, and the background represents the non-organ portion.
After the marking is completed, the data of the 3D medical image are checked for correctness.
Since the pixel values produced by the annotation software may not be distributed uniformly, the image is converted to a background with pixel value 0 and a target with pixel value 1; if there are several kinds of pathological tissue, different pixel values can be used to distinguish them, for example pixel value 1 for tumor tissue, pixel value 2 for the tumor periphery, and so on.
The 3D medical image is then segmented: three sets of 2D images are obtained by slicing along H, W, and C, and the 3D medical image is decomposed into several small 3D images.
That is, in the present invention there are two ways to slice a 3D medical image: the first slices the 3D image into 2D images along each of the H, W, and C directions; the second decomposes the 3D image into several small 3D images.
For medical image segmentation, directly feeding a 3D medical image into a convolutional neural network and outputting a 3D image entails a large amount of computation and comparatively slow inference; moreover, medical image datasets are very small, so using the 3D image directly does not work well. The invention therefore slices the 3D medical image: feeding 2D slices taken along H, W, and C into 2D networks solves the problem of excessive computation, but a 2D network does not account for the correlation between adjacent 2D images. The invention accordingly feeds the 2D images sliced along the three directions H, W, and C, together with the small 3D images decomposed from the 3D volume, through four convolutional neural networks to form four 3D segmentation results. Processing the data this way both exploits the layer-to-layer relationships and resists the overfitting caused by small datasets.
Data augmentation is then performed on each slice image.
In this embodiment, data augmentation mainly comprises rotation, cropping, and scaling; that is, each slice image obtained from the slicing is rotated, cropped, scaled, and so on, to prevent overfitting and increase robustness.
The segmentation module 20 is configured to perform convolution calculation on each slice image through a convolutional neural network semantic segmentation algorithm, obtaining a semantic segmentation result for each slice image.
In a specific embodiment of the present invention, the convolutional neural network extracts the high-level features of each slice image mainly through five dilated convolution structures, all of which contain hierarchical structure. Specifically, in view of the fact that medical images require global judgment, the invention proposes RFEUnet (receptive field enhanced Unet); the network structure is shown in FIG. 2, where 1/n denotes the image downsampling factor, C denotes the number of output channels, and x2 denotes that the convolution operation is performed twice; downsampling uses Maxpooling, upsampling uses bilinear interpolation, and feature fusion uses channel concatenation.
Compared with the original Unet network structure, the RFEUnet network structure proposed by the invention makes three main improvements:
1. RFEUnet halves the channel configuration of Unet, giving it faster inference; for example, the channel numbers [64, 128, 256, 512, 1024, 512, 256, 64, 2] become [32, 64, 128, 256, 512, 256, 128, 32, 2]; this configuration gives the network fewer parameters and higher computational efficiency.
2. A receptive field module RFEM is built and added at the tail of the encoding structure; using Maxpooling, the Mish activation function, and dilated convolution structures, it effectively enlarges the perception range of the model. Specifically, the RFEM extracts high-level features through five dilated convolution structures rich in hierarchy, as shown in FIG. 3, where C denotes the number of output channels, the dilation rate denotes the number of holes in the dilated convolution, 1/16 denotes that the image has been downsampled 16 times, and the dilation rates are [1, 3, 6, 12, 18]; the top-left branch is a Maxpooling operation, and each dilated convolution is followed by BN (Batch Normalization) and the Mish activation function.
3. The original Unet skip-connection structure contains no residual module, so overfitting can occur once the network is deepened; a residual concept and the receptive field enhancement module are therefore added at the bottom layer of Unet to form the RFEM, which, as shown in FIG. 3, effectively enlarges the receptive field of the bottom-layer features during convolution without the overfitting caused by deepening the network.
The fusion module 30 is configured to fuse the prediction results of the slice images and output the final medical image segmentation result.
In the present invention, the slice images are the slice images in the H, W, and C directions together with the plurality of small 3D images; they enter four convolutional neural networks to form four 3D segmentation results, and the fusion module 30 averages the four outputs to obtain the final result. That is, the four sets of slice images pass through four RFEUnet networks with different weights and output four results F_A(X), F_B(X), F_C(X), F_D(X); the RFEUnet network structure outputs a probability value for each pixel of each input image, the probabilities of the four outputs are averaged, and the argmax over categories is taken:
F_O(X) = argmax((F_A(X) + F_B(X) + F_C(X) + F_D(X)) / 4)
In brief, the neural network outputs are all numbers between 0 and 1 and can therefore be understood as probability values. The output of the invention is four 3D images, so the average is taken per channel; each pixel has two classes, lesion and non-lesion, and the lesion and non-lesion probabilities are compared by size, i.e., the argmax is taken. If the lesion probability is larger the pixel is assigned 1, and if the non-lesion probability is larger it is assigned 0, which is exactly the effect of the argmax function; the lesion region is thereby identified. If the image is then multiplied by 255, lesions appear white and non-lesions black, making the distinction visible; the output is the 3D medical image segmentation result, i.e., the lesion regions of the 3D image are identified.
The present invention also provides a computer-readable storage medium for storing program code for performing the 3D medical image segmentation method provided by the above embodiments.
Examples
Fig. 5 is a flowchart of 3D medical image segmentation based on layered perception fusion according to an embodiment of the present invention. In an embodiment of the present invention, a 3D medical image segmentation method based on layered perception fusion comprises:
step one, data making, slicing and enhancing
Step 1.1, marking the 3D medical image by using marking software, wherein the marking is divided into a target part and a background part, the target part represents a target area, namely an organ pathological tissue image area, and the background part represents a non-organ part. If a plurality of pathological tissues can be distinguished by different pixel points, for example, the pixel point 1 is tumor tissue, and the pixel point 2 is tumor periphery, etc.
And 1.2, checking whether the data is correct or incorrect after marking is finished. For example, using programming, check if the data distribution of the background and lesion areas is background: 0. focal zone 1: 1. the focal zones 2:2, etc. are simply checked for a pixel value so as not to affect the subsequent encoding, and are not described herein.
Step 1.3, the 3D medical image is sliced. There are two slicing modes, from the 3D model and the 2D model, the first being that the 3D image is a 2D image sliced differently as in H, W, C. The second is to decompose the 3D image into several small 3D images.
And step 1.4, performing data enhancement on each sliced image. Data enhancement is important for small data sets, mainly rotation, cropping and scaling, and plays a role in preventing overfitting and increasing robustness.
Step 2: in view of the fact that medical images require global judgment, propose the RFEUnet (receptive field enhancement Unet) network structure and feed each slice image into an RFEUnet network to obtain four semantic segmentation results. The network structure modifies the channel numbers of the original Unet structure (the original Unet channel configuration is halved to 1/2) and enlarges the receptive field with the RFEM (the receptive field module RFEM effectively enlarges the perception range of the network model using Maxpooling, the Mish activation function, and dilated convolution structures), so that the network model has a larger perception range for large-area image segmentation.
Step 3: a voting mechanism. Fuse the outputs of the prediction network models for the H, W, and C slice directions and of the 3D network model to obtain the fusion result for the 3D medical image. The network models produce four outputs A, B, C, D in total; each model outputs a probability value for every pixel of every input image, the probabilities of the four outputs are averaged, and the argmax over categories gives the final image segmentation result:
F_O(X) = argmax((F_A(X) + F_B(X) + F_C(X) + F_D(X)) / 4)
FIG. 6 compares the results of segmenting a 3D medical image with the original Unet network model, the RFEUnet network model, and multi-model fusion in an embodiment of the present invention, and Table 1 below compares the prediction capabilities of the original Unet network model, the RFEUnet network model, and multi-model fusion. As can be seen from FIG. 6 and Table 1, the single-model prediction capability of the RFEUnet network model is better than that of the prior-art Unet, and multi-model fusion performs better still because it exploits the direction- and channel-related characteristics.
Table 1. Comparison of the Unet model, the RFEUnet model, and multi-model fusion

Metric | Unet | RFEUnet | Multi-model fusion
Pixel accuracy | 90.2% | 98.6% | 99.8%
Parameter count | 31M | 11M | 40M
Inference speed | 0.08 s | 0.04 s | 0.16 s
The foregoing embodiments merely illustrate the principles and effects of the present invention and are not intended to limit it. Those skilled in the art may modify and vary the above embodiments without departing from the spirit and scope of the present invention. The scope of the invention should therefore be determined by the appended claims.

Claims (7)

1. A 3D medical image segmentation method based on layered perception fusion, comprising the following steps:
step S1, acquiring and preprocessing a 3D medical image, and slicing the preprocessed 3D medical image to obtain a plurality of slice images;
step S2, performing convolution calculation on each slice image through a convolutional neural network semantic segmentation algorithm to obtain a semantic segmentation result for each slice image;
wherein the convolutional neural network is an RFEUnet network structure comprising a receptive field enhancement module RFEM;
a residual concept and a receptive field enhancement module are added at the bottom layer of Unet to form the RFEM, so as to enlarge the receptive field of the bottom-layer features during the convolution calculation;
the RFEUnet network structure halves the channel configuration of the existing Unet network to 1/2 of the original;
the receptive field module RFEM is added at the tail of the Unet encoding structure, and the RFEM enlarges the perception range of the receptive field enhancement model using Maxpooling, the Mish activation function, and dilated convolution structures; specifically:
the receptive field module RFEM extracts high-level features through five dilated convolution structures rich in hierarchy;
each dilated convolution structure is followed, after its convolution operation, by batch normalization and the Mish activation function;
step S3, fusing the prediction results of the slice images and outputting the final medical image segmentation result.
2. The 3D medical image segmentation method based on layered perception fusion according to claim 1, wherein the step S1 further comprises:
step S100, acquiring a 3D medical image, marking the 3D medical image, and dividing it into a target portion and a background portion;
step S101, after marking is finished, checking whether the data of the 3D medical image are correct;
step S102, slicing the 3D medical image to obtain three sets of 2D images along the H, W, and C directions, and decomposing the 3D medical image into a plurality of small 3D images;
step S103, performing data augmentation on each slice image.
3. The 3D medical image segmentation method based on layered perception fusion according to claim 2, wherein: the target represents a target region, i.e., an image region of organ pathological tissue, and the background represents the non-organ portion.
4. The 3D medical image segmentation method based on layered perception fusion according to claim 3, wherein: after the 3D medical image is marked, the image becomes a background with pixel value 0 and a target with pixel value 1, and if there are several kinds of pathological tissue, different pixel values are used to distinguish them.
5. The 3D medical image segmentation method based on layered perception fusion according to claim 4, wherein: in step S3, the slice images in the H, W, and C directions and the small 3D images enter four RFEUnet network structures to form four segmentation results, and the four outputs are averaged to obtain the final image segmentation result.
6. A 3D medical image segmentation apparatus based on layered perception fusion, comprising:
a preprocessing module for acquiring and preprocessing a 3D medical image, and slicing the preprocessed 3D medical image to obtain a plurality of slice images;
a segmentation module for performing convolution calculation on each slice image through a convolutional neural network semantic segmentation algorithm to obtain a semantic segmentation result for each slice image;
a fusion module for fusing the prediction results of the slice images and outputting the final medical image segmentation result;
wherein the convolutional neural network is an RFEUnet network structure comprising a receptive field enhancement module RFEM;
a residual concept and a receptive field enhancement module are added at the bottom layer of Unet to form the RFEM, so as to enlarge the receptive field of the bottom-layer features during the convolution calculation;
the RFEUnet network structure halves the channel configuration of the existing Unet network to 1/2 of the original;
the receptive field module RFEM is added at the tail of the Unet encoding structure, and the RFEM enlarges the perception range of the receptive field enhancement model using Maxpooling, the Mish activation function, and dilated convolution structures; specifically:
the receptive field module RFEM extracts high-level features through five dilated convolution structures rich in hierarchy;
each dilated convolution structure is followed, after its convolution operation, by batch normalization and the Mish activation function.
7. A computer-readable storage medium for storing program code for performing the 3D medical image segmentation method of any one of claims 1-5.
CN202011287175.7A 2020-11-17 2020-11-17 3D medical image segmentation method and device based on layered perception fusion and storage medium Active CN112465754B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011287175.7A CN112465754B (en) 2020-11-17 2020-11-17 3D medical image segmentation method and device based on layered perception fusion and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011287175.7A CN112465754B (en) 2020-11-17 2020-11-17 3D medical image segmentation method and device based on layered perception fusion and storage medium

Publications (2)

Publication Number Publication Date
CN112465754A CN112465754A (en) 2021-03-09
CN112465754B true CN112465754B (en) 2021-09-03

Family

ID=74837066

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011287175.7A Active CN112465754B (en) 2020-11-17 2020-11-17 3D medical image segmentation method and device based on layered perception fusion and storage medium

Country Status (1)

Country Link
CN (1) CN112465754B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113628216A (en) * 2021-08-11 2021-11-09 北京百度网讯科技有限公司 Model training method, image segmentation method, device and related products
CN113963159B (en) * 2021-10-11 2024-08-02 华南理工大学 Three-dimensional medical image segmentation method based on neural network
CN114882315B (en) * 2022-05-23 2023-09-01 北京百度网讯科技有限公司 Sample generation method, model training method, device, equipment and medium
CN115131345B (en) * 2022-08-29 2023-02-03 杭州堃博生物科技有限公司 CT image-based focus detection method and device and computer-readable storage medium

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111563466A (en) * 2020-05-12 2020-08-21 Oppo广东移动通信有限公司 Face detection method and related product

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110689083B (en) * 2019-09-30 2022-04-12 苏州大学 Context pyramid fusion network and image segmentation method
CN110942464A (en) * 2019-11-08 2020-03-31 浙江工业大学 PET image segmentation method fusing 2-dimensional and 3-dimensional models
CN111192245B (en) * 2019-12-26 2023-04-07 河南工业大学 Brain tumor segmentation network and method based on U-Net network
CN111768418A (en) * 2020-06-30 2020-10-13 北京推想科技有限公司 Image segmentation method and device and training method of image segmentation model

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111563466A (en) * 2020-05-12 2020-08-21 Oppo广东移动通信有限公司 Face detection method and related product

Also Published As

Publication number Publication date
CN112465754A (en) 2021-03-09


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: 3D medical image segmentation method, device and storage medium based on layered perception fusion
Effective date of registration: 20220824
Granted publication date: 20210903
Pledgee: Chepi Road Branch of Guangzhou Bank Co.,Ltd.
Pledgor: Yunrun Da Data Service Co.,Ltd.
Registration number: Y2022980013458

PC01 Cancellation of the registration of the contract for pledge of patent right

Date of cancellation: 20230206
Granted publication date: 20210903
Pledgee: Chepi Road Branch of Guangzhou Bank Co.,Ltd.
Pledgor: Yunrun Da Data Service Co.,Ltd.
Registration number: Y2022980013458