CN116645380A - Automatic segmentation method for esophageal cancer CT image tumor area based on two-stage progressive information fusion - Google Patents


Info

Publication number
CN116645380A
Authority
CN
China
Prior art keywords
layer
image
esophageal cancer
convolution
decoder
Prior art date
Legal status
Pending
Application number
CN202310688086.0A
Other languages
Chinese (zh)
Inventor
黄勇
徐凯
张飞翔
Current Assignee
Anhui University
Second Peoples Hospital of Hefei
Original Assignee
Anhui University
Second Peoples Hospital of Hefei
Priority date
Filing date
Publication date
Application filed by Anhui University and Second Peoples Hospital of Hefei
Priority to CN202310688086.0A
Publication of CN116645380A
Legal status: Pending (current)

Classifications

    • G06T7/11 Region-based segmentation
    • G06T3/4053 Scaling of whole images or parts thereof based on super-resolution
    • G06T5/70 Denoising; Smoothing
    • G06T5/80 Geometric correction
    • G06V10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06V10/806 Fusion of extracted features
    • G16H30/20 ICT specially adapted for the handling or processing of medical images, e.g. DICOM, HL7 or PACS
    • G06T2207/10081 Computed x-ray tomography [CT]
    • Y02T10/40 Engine management systems

Abstract

The invention relates to an automatic segmentation method for the tumor region of an esophageal cancer CT image based on two-stage progressive information fusion, which overcomes the difficulty in the prior art of automatically segmenting esophageal cancer CT images. The method comprises the following steps: acquiring and preprocessing esophageal cancer CT images; constructing an esophageal cancer CT image segmentation model; training the esophageal cancer CT image segmentation model; acquiring and preprocessing the esophageal cancer CT image to be segmented; and obtaining the esophageal cancer CT image segmentation result. Because esophageal CT images suffer from strong noise, low resolution and artifacts, the invention extracts features with an image super-resolution reconstruction network and progressively fuses them into the segmentation network. This effectively enhances the quality of the esophageal CT image, allows the network to extract richer detail features, enables effective segmentation and delineation of the esophageal cancer target region, and improves segmentation accuracy and efficiency.

Description

Automatic segmentation method for esophageal cancer CT image tumor area based on two-stage progressive information fusion
Technical Field
The invention relates to the technical field of medical image segmentation, in particular to an automatic segmentation method for esophageal cancer CT image tumor regions based on two-stage progressive information fusion.
Background
Esophageal cancer is an aggressive malignancy that predominantly affects men and includes esophageal squamous cell carcinoma and esophageal adenocarcinoma, which have different pathological characteristics and profiles. Worldwide, squamous cell carcinoma remains the most common type. At present, esophagectomy and three-field lymphadenectomy have reached the limits of local control, and further technical development is required. In addition, esophageal cancer is highly invasive, and lymphatic and hematogenous metastases can occur at an early stage. Because early symptoms are not obvious, most patients already have advanced tumors by the time symptoms such as difficulty swallowing and hoarseness appear, and the window for surgical resection has been missed. For patients with esophageal cancer who cannot undergo surgery, single modalities such as chemotherapy or targeted therapy alone yield poor results and a low 5-year survival rate. In contrast, multimodal comprehensive treatment combining chemotherapy, radiotherapy and endoscopic treatment is becoming mainstream; it can offer long-term survival to a subset of patients with advanced esophageal cancer and plays an important role when surgical resection is unsuitable. One of the main problems in radiation therapy is determining the location of the tumor, which requires a tool that can assist in localization. Computed tomography is therefore widely used in radiotherapy planning.
Accurate radiotherapy requires accurate determination and delineation of the radiotherapy target region. Currently, target delineation is mainly performed manually by experienced physicians and physicists, and its accuracy depends on the physician's level of experience. This process is cumbersome and time consuming: an experienced physician may take two days to complete the annotation of one set of images.
Automatically delineating target regions on medical images has therefore become a popular problem in computer vision. For radiotherapy target volume delineation, however, there is currently no mature, viable solution because of the complexity of the esophagus and its surrounding organs. To improve physicians' efficiency and enable precise treatment of esophageal cancer, automatic delineation of esophageal cancer tumor target regions has become an urgent problem to be solved.
Deep-learning-based medical image segmentation for esophageal cancer is a breakthrough technology: it can automatically classify, identify and segment organs and tumor target regions, and can reveal internal image information that is difficult for physicians to detect. In esophageal cancer diagnosis, AI-assisted imaging allows clinicians to detect cancers quickly and effectively, saving diagnosis time. Recent studies have also shown that this technique has satisfactory robustness and potential.
However, automatic delineation of esophageal cancer tumor target volumes or clinical target volumes is challenging. Segmentation of the tumor region depends on contrast differences between the tumor and the surrounding tissue in CT images; because of the diverse morphology of esophageal cancer lesions, their variable locations and the complexity of the surrounding tissue, it is difficult for conventional deep learning algorithms to capture all the details and features of the tumor.
Automatically segmenting esophageal cancer CT images with complex and varied lesion appearances has therefore become a technical problem that needs to be solved.
Disclosure of Invention
The invention aims to overcome the difficulty in the prior art of automatically segmenting esophageal cancer CT images, and provides an automatic segmentation method for the tumor region of an esophageal cancer CT image based on two-stage progressive information fusion.
In order to achieve the above object, the technical solution of the present invention is as follows:
An automatic segmentation method for the tumor region of an esophageal cancer CT image based on two-stage progressive information fusion comprises the following steps:
11) Acquisition and preprocessing of esophageal cancer CT images: acquire CT images in DICOM format, perform data augmentation on the CT image data of the cervical and abdominal esophageal regions, and slice all CT images, i.e., cut the three-dimensional DICOM CT images into two-dimensional CT image slices in jpg format and binarized label images in png format, to form an esophageal cancer CT image dataset;
12) Construction of the esophageal cancer CT image segmentation model: construct an esophageal cancer CT image segmentation model based on the two-stage progressive information fusion technique;
13) Training of the esophageal cancer CT image segmentation model: input the esophageal cancer CT image dataset into the esophageal cancer CT image segmentation model for training;
14) Acquisition and preprocessing of the esophageal cancer CT image to be segmented;
15) Acquisition of the esophageal cancer CT image segmentation result: input the preprocessed esophageal cancer CT image to be segmented into the trained esophageal cancer CT image segmentation model to obtain the segmented esophageal cancer CT image.
The construction of the esophageal cancer CT image segmentation model comprises the following steps:
21) Set up the esophageal cancer CT image segmentation model, which comprises a Swin Transformer network model for super-resolution reconstruction and a TransResUNet convolutional neural network model; the feature map output by the super-resolution Swin Transformer and the original image undergo progressive information fusion through a concatenation operation, and the result is input into the TransResUNet convolutional neural network model for segmentation to obtain the final segmentation map;
22) Set up the Swin Transformer network model, which comprises 6 residual Swin Transformer blocks (RSTB) and one residual connection structure, each RSTB consisting of 6 Swin Transformer Layers, one convolution and one residual connection;
23) Set up the TransResUNet convolutional neural network model:
231) The TransResUNet convolutional neural network model comprises a downsampling encoder module for feature extraction, a feature pyramid ASPP module for obtaining receptive fields of different scales, and an upsampling decoder module for recovering the image resolution;
232) Set up the downsampling encoder module, which comprises 4 consecutive residual downsampling structures;
each residual downsampling structure comprises a branch A and a branch B: branch A is a stack of a 3×3 convolution layer, a batch normalization layer, a LeakyReLU layer, a Transformer Encoder Block layer, a 3×3 convolution layer and a batch normalization layer, and serves as one branch of the residual structure;
branch B is a stack of a 1×1 convolution layer and a batch normalization layer, and serves as the other branch of the residual structure;
the two branches are added, and the sum finally passes through a LeakyReLU layer;
233) Set up the feature pyramid ASPP module, which comprises:
a first operation module: a 1×1 convolution layer;
a second operation module: a 3×3 convolution layer with a dilation rate of 6;
a third operation module: a 3×3 convolution layer with a dilation rate of 12;
a fourth operation module: a 3×3 convolution layer with a dilation rate of 18;
a fifth operation module: an adaptive average pooling layer, a 1×1 convolution layer and an upsampling operation;
the five operation modules are connected in parallel, the resulting 5 feature maps are concatenated, and the concatenation is passed through a 1×1 convolution layer;
234) Set up the upsampling decoder module, which comprises 4 consecutive residual upsampling structures together with concatenation connections to the outputs of the four residual downsampling structures of the encoder;
the four residual downsampling structures of the encoder produce four outputs of different sizes:
the output of the fourth size is input to the feature pyramid ASPP module and then forms the first input of the decoder,
the output of the third size is concatenated to the second input of the decoder,
the output of the second size is concatenated to the third input of the decoder,
the output of the first size is concatenated to the fourth input of the decoder;
the decoder consists of four consecutive residual upsampling blocks,
and each residual upsampling block is:
a 2× upsampling operation, a concatenation with the corresponding encoder layer, then a stack of a 3×3 convolution layer, a batch normalization layer, a LeakyReLU layer, a 3×3 convolution layer and a batch normalization layer as one branch of the residual structure,
and a stack of a 1×1 convolution layer and a batch normalization layer as the other branch of the residual structure;
the two branches are added, and the sum finally passes through the LeakyReLU layer of the decoder block.
The training of the esophageal cancer CT image segmentation model comprises the following steps:
31) Input the esophageal cancer CT image dataset into the Swin Transformer network model of the esophageal cancer CT image segmentation model and output a feature map from the Swin Transformer network model:
the input to the super-resolution Swin Transformer network model passes through a 1×1 convolution layer, then through 6 consecutive RSTB modules with a residual connection, then through a 1×1 convolution layer, an upsampling operation, a LeakyReLU layer and a final 1×1 convolution layer to obtain the feature map;
32) Perform progressive information fusion between the feature map output by the Swin Transformer and the original image through a concatenation operation to obtain a concatenated feature map;
33) Input the concatenated feature map into the TransResUNet convolutional neural network model;
34) Train the concatenated feature map in the downsampling encoder module:
341) Input the concatenated feature map into the first residual downsampling structure, whose branch A is a stack of a 3×3 convolution layer, a batch normalization layer, a LeakyReLU layer, a Transformer Encoder Block layer, a 3×3 convolution layer and a batch normalization layer, and whose branch B is a stack of a 1×1 convolution layer and a batch normalization layer;
the two branches are added and a LeakyReLU layer is finally applied to obtain the first downsampled output;
342) Feed the first downsampled output into the second residual downsampling structure, add its branch A and branch B, and finally apply a LeakyReLU layer to obtain the second downsampled output;
343) Feed the second downsampled output into the third residual downsampling structure, add its branch A and branch B, and finally apply a LeakyReLU layer to obtain the third downsampled output;
344) Feed the third downsampled output into the fourth residual downsampling structure, add its branch A and branch B, and finally apply a LeakyReLU layer to obtain the fourth downsampled output;
35) The four downsampled outputs are passed to the decoder module, where upsampling recovers the image resolution;
36) The fourth downsampled output is input to the feature pyramid ASPP module and then forms the first input of the decoder;
37) For the first input of the decoder, perform a 2× upsampling operation and a concatenation with the corresponding encoder layer, then apply a stack of a 3×3 convolution layer, a batch normalization layer, a LeakyReLU layer, a 3×3 convolution layer and a batch normalization layer as one branch of the residual structure,
and a stack of a 1×1 convolution layer and a batch normalization layer as the other branch of the residual structure;
the two branches are added and finally pass through the LeakyReLU layer of the decoder block to obtain the second input of the decoder;
38) The third downsampled output is concatenated to the second input of the decoder;
for the second input of the decoder, perform a 2× upsampling operation and a concatenation with the corresponding encoder layer, then apply a stack of a 3×3 convolution layer, a batch normalization layer, a LeakyReLU layer, a 3×3 convolution layer and a batch normalization layer as one branch of the residual structure,
and a stack of a 1×1 convolution layer and a batch normalization layer as the other branch of the residual structure;
the two branches are added and finally pass through the LeakyReLU layer of the decoder block to obtain the third input of the decoder;
39) The second downsampled output is concatenated to the third input of the decoder;
for the third input of the decoder, perform a 2× upsampling operation and a concatenation with the corresponding encoder layer, then apply a stack of a 3×3 convolution layer, a batch normalization layer, a LeakyReLU layer, a 3×3 convolution layer and a batch normalization layer as one branch of the residual structure,
and a stack of a 1×1 convolution layer and a batch normalization layer as the other branch of the residual structure;
the two branches are added and finally pass through the LeakyReLU layer of the decoder block to obtain the fourth input of the decoder;
310) The first downsampled output is concatenated to the fourth input of the decoder;
for the fourth input of the decoder, perform a 2× upsampling operation and a concatenation with the corresponding encoder layer, then apply a stack of a 3×3 convolution layer, a batch normalization layer, a LeakyReLU layer, a 3×3 convolution layer and a batch normalization layer as one branch of the residual structure,
and a stack of a 1×1 convolution layer and a batch normalization layer as the other branch of the residual structure;
the two branches are added, pass through the LeakyReLU layer of the decoder block, and finally through a 2× upsampling operation and a 1×1 convolution layer to obtain the final output of the TransResUNet;
311) Obtain the segmentation probability through forward propagation;
312) Use the cross entropy loss function and the Dice loss function together as the loss function of the esophageal cancer CT image segmentation model, and compute the segmentation loss from the segmentation probability. The expressions are:

\text{CE}(p, q) = -\sum_{i=1}^{C} p_i \log(q_i)

\text{Dice Loss} = 1 - \frac{2|A \cap B|}{|A| + |B|}

where C in the cross entropy loss CE(p, q) is the number of classes, p_i is the ground-truth value and q_i is the predicted value; A and B in the Dice Loss formula are the mask matrices corresponding to the ground-truth label and the model-predicted label respectively, A ∩ B is their intersection, and |A| and |B| are the numbers of elements in A and B; the factor 2 in the numerator is used because the common elements of A and B are counted twice in the denominator;
313) Use the L1 loss function as the loss function of the Swin Transformer network model that performs super-resolution reconstruction of the esophageal cancer CT image. The expression is:

\text{L1 Loss} = \frac{1}{N}\sum_{i=1}^{N} \left| y_i - f(x_i) \right|

where N is the number of samples, y_i is the ground-truth label of the i-th sample, and f(x_i) is the model prediction for the i-th sample;
314) Determine the gradient vectors through back propagation of the loss values and update the parameters of the esophageal cancer CT image segmentation model;
315) Judge whether the set number of training epochs has been reached; if so, the training of the esophageal cancer CT image segmentation model is complete, otherwise training continues.
Advantageous effects
Compared with the prior art, the automatic segmentation method for the esophageal cancer CT image tumor region based on two-stage progressive information fusion addresses the strong noise, low resolution and artifacts of esophageal CT images: features extracted by an image super-resolution reconstruction network are progressively fused into the segmentation network. This effectively enhances the quality of the esophageal CT image, allows the network to extract richer detail features, enables effective segmentation and delineation of the esophageal cancer target region, and improves segmentation accuracy and efficiency.
Because the position of esophageal tumors is variable, the tumor anatomy is complex, the boundary of the tumor target region is blurred and individual differences are large, the improved TransResUNet can extract long-range dependency features, thereby improving the segmentation accuracy of the tumor target region and the robustness of the model. The invention adds an image super-resolution reconstruction branch, Transformer Encoder Block modules for long-range dependency feature extraction, and an ASPP module for multi-scale feature fusion, which strengthens the feature extraction capability of the network.
Drawings
FIG. 1 is a flow chart of the method of the present invention;
FIG. 2 is a structural diagram of the esophageal cancer CT image segmentation model according to the invention;
FIG. 3 is a structural diagram of the TransResUNet convolutional neural network model according to the present invention;
FIG. 4 is a structural diagram of the feature pyramid ASPP module according to the present invention;
FIG. 5 is a structural diagram of the Swin Transformer network model according to the present invention;
FIGS. 6a and 7a are esophageal cancer CT images;
FIGS. 6b and 7b are the label images of the segmentation labels of FIGS. 6a and 7a, respectively;
FIGS. 6c and 7c are the automatically segmented images produced by the method of the present invention for FIGS. 6a and 7a, respectively;
FIGS. 6d and 7d are the automatically segmented images produced by the ResUNet network for FIGS. 6a and 7a, respectively;
FIGS. 6e and 7e are the automatically segmented images produced by the UNet network for FIGS. 6a and 7a, respectively.
Detailed Description
For a further understanding and appreciation of the structural features and advantages of the present invention, preferred embodiments are described below in conjunction with the accompanying drawings:
as shown in fig. 1, the automatic segmentation method of the esophageal cancer CT image tumor region based on two-stage progressive information fusion comprises the following steps:
firstly, acquiring and preprocessing an esophageal cancer CT image: and acquiring a CT image in a DICOM format, performing data enhancement processing on CT image data of an esophageal neck region and an esophageal abdominal region in the CT image, and performing slicing processing on all the CT images, namely performing interception operation on the three-dimensional CT image in the DICOM format to obtain a two-dimensional jpg CT image slice and a png format binarization tag image, thereby forming an esophageal cancer CT image dataset.
Secondly, constructing an esophageal cancer CT image segmentation model: and constructing an esophageal cancer CT image segmentation model based on a two-stage progressive information fusion technology.
CT images in DICOM format store rich medical image information about the patient, but they are not suitable for direct training of deep learning networks. The DICOM data therefore needs to be converted, and the original images and label information extracted from it for model training and subsequent data analysis. Deep learning algorithms usually take RGB-format data as input, so the DICOM medical images must be converted into a common RGB format and assembled into a dataset that meets the requirements of deep learning. The conversion involves two key steps. First, the metadata of the original DICOM file is read, each slice of the patient is parsed individually, and the pixels are extracted and normalized to values between 0 and 1. To facilitate storage and subsequent deep learning analysis, the slice data are then mapped to the range 0 to 255 and stored as 512×512 image slices. Furthermore, the deep learning tumor segmentation task requires the tumor region labels manually delineated by a medical physicist. This step requires carefully reading the metadata and finding the contour corresponding to the label name; the pixels inside the contour are set to 1 and the remaining pixels are set to 0.
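As a concrete illustration of this conversion, the following Python sketch (assuming the pydicom, numpy and Pillow packages; the function names and the simplified contour handling are illustrative assumptions, and reading the contour coordinates from the RT structure set metadata is omitted) normalizes a DICOM slice to [0, 1], maps it to 0-255 and saves a 512×512 jpg slice, and rasterizes a delineated contour into a binarized png label:

```python
import numpy as np
import pydicom
from PIL import Image, ImageDraw

def dicom_slice_to_jpg(dicom_path, out_path, size=(512, 512)):
    """Read one DICOM slice, normalize its pixels to [0, 1], map them to 0-255
    and save the result as a 512x512 jpg image slice."""
    ds = pydicom.dcmread(dicom_path)
    pixels = ds.pixel_array.astype(np.float32)
    rng = pixels.max() - pixels.min()
    pixels = (pixels - pixels.min()) / rng if rng > 0 else np.zeros_like(pixels)
    Image.fromarray((pixels * 255).astype(np.uint8)).resize(size).save(out_path)

def contour_to_mask(contour_xy, out_path, shape=(512, 512)):
    """Rasterize a manually delineated tumor contour (a list of (x, y) pixel
    coordinates) into a binary label: pixels inside the contour are 1, the rest 0.
    The mask is saved as a png with values 0/255 so it is viewable as an image."""
    mask = Image.new("L", (shape[1], shape[0]), 0)
    ImageDraw.Draw(mask).polygon([tuple(p) for p in contour_xy], outline=1, fill=1)
    binary = np.array(mask, dtype=np.uint8)
    Image.fromarray(binary * 255).save(out_path)
    return binary
```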
Because the dataset contains relatively few CT slices of the cervical and abdominal esophageal regions, data augmentation is used to expand the dataset, which strengthens the robustness of the network (an illustrative augmentation sketch is given below). At the same time, because the esophageal tumor region is small and irregular in shape and the CT images are noisy, low in resolution and affected by artifacts, constructing the esophageal cancer CT image segmentation model on the two-stage progressive information fusion technique effectively mitigates these problems and improves the performance of the model.
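The patent does not enumerate the specific augmentation operations, so the following sketch only illustrates the kind of paired augmentation that could be applied to the under-represented cervical and abdominal slices; the chosen transforms and their ranges are assumptions, not settings prescribed by the invention:

```python
import random
import torchvision.transforms.functional as TF

def augment_pair(ct_slice, label):
    """Apply the same random geometric transform to a CT slice and its binary label.
    The operations below (horizontal flip, small rotation) are illustrative only."""
    if random.random() < 0.5:
        ct_slice, label = TF.hflip(ct_slice), TF.hflip(label)
    angle = random.uniform(-10, 10)
    return TF.rotate(ct_slice, angle), TF.rotate(label, angle)
```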
(1) As shown in FIG. 2, the esophageal cancer CT image segmentation model comprises a Swin Transformer network model for super-resolution reconstruction and a TransResUNet convolutional neural network model; the feature map output by the super-resolution Swin Transformer and the original image undergo progressive information fusion through a concatenation operation, and the result is input into the TransResUNet convolutional neural network model for segmentation to obtain the final segmentation map.
(2) As shown in FIG. 5, the Swin Transformer network model comprises 6 residual Swin Transformer blocks (RSTB) and one residual connection structure, where each RSTB consists of 6 Swin Transformer Layers, one convolution and one residual connection.
(3) Set up the TransResUNet convolutional neural network model (a code sketch of the blocks described in a2) to a4) is given after this list):
a1) As shown in FIG. 2, the TransResUNet convolutional neural network model comprises a downsampling encoder module for feature extraction, a feature pyramid ASPP module for obtaining receptive fields of different scales, and an upsampling decoder module for recovering the image resolution;
a2) Set up the downsampling encoder module, which comprises 4 consecutive residual downsampling structures;
each residual downsampling structure comprises a branch A and a branch B: branch A is a stack of a 3×3 convolution layer, a batch normalization layer, a LeakyReLU layer, a Transformer Encoder Block layer, a 3×3 convolution layer and a batch normalization layer, and serves as one branch of the residual structure;
branch B is a stack of a 1×1 convolution layer and a batch normalization layer, and serves as the other branch of the residual structure;
the two branches are added, and the sum finally passes through a LeakyReLU layer;
a3) As shown in FIG. 4, the feature pyramid ASPP module comprises:
a first operation module: a 1×1 convolution layer;
a second operation module: a 3×3 convolution layer with a dilation rate of 6;
a third operation module: a 3×3 convolution layer with a dilation rate of 12;
a fourth operation module: a 3×3 convolution layer with a dilation rate of 18;
a fifth operation module: an adaptive average pooling layer, a 1×1 convolution layer and an upsampling operation;
the five operation modules are connected in parallel, the resulting 5 feature maps are concatenated, and the concatenation is passed through a 1×1 convolution layer;
a4) Set up the upsampling decoder module, which comprises 4 consecutive residual upsampling structures together with concatenation connections to the outputs of the four residual downsampling structures of the encoder;
the four residual downsampling structures of the encoder produce four outputs of different sizes:
the output of the fourth size is input to the feature pyramid ASPP module and then forms the first input of the decoder,
the output of the third size is concatenated to the second input of the decoder,
the output of the second size is concatenated to the third input of the decoder,
the output of the first size is concatenated to the fourth input of the decoder;
the decoder consists of four consecutive residual upsampling blocks,
and each residual upsampling block is:
a 2× upsampling operation, a concatenation with the corresponding encoder layer, then a stack of a 3×3 convolution layer, a batch normalization layer, a LeakyReLU layer, a 3×3 convolution layer and a batch normalization layer as one branch of the residual structure,
and a stack of a 1×1 convolution layer and a batch normalization layer as the other branch of the residual structure;
the two branches are added, and the sum finally passes through the LeakyReLU layer of the decoder block.
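To make the block descriptions above concrete, the following PyTorch sketch outlines the residual downsampling block, the ASPP module and the residual upsampling block of a2) to a4). The Transformer Encoder Block is treated as an externally supplied sub-module, and the channel counts, the stride-2 downsampling and the bilinear upsampling are assumptions rather than details stated in the patent:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResidualDownBlock(nn.Module):
    """Encoder block: branch A = 3x3 conv, BN, LeakyReLU, Transformer Encoder Block,
    3x3 conv, BN; branch B = 1x1 conv, BN; the branches are added, then LeakyReLU."""
    def __init__(self, in_ch, out_ch, transformer_block, stride=2):
        super().__init__()
        self.branch_a = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, stride=stride, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.LeakyReLU(inplace=True),
            transformer_block,  # Transformer Encoder Block layer, supplied externally
            nn.Conv2d(out_ch, out_ch, 3, padding=1),
            nn.BatchNorm2d(out_ch),
        )
        self.branch_b = nn.Sequential(  # shortcut branch
            nn.Conv2d(in_ch, out_ch, 1, stride=stride),
            nn.BatchNorm2d(out_ch),
        )
        self.act = nn.LeakyReLU(inplace=True)

    def forward(self, x):
        return self.act(self.branch_a(x) + self.branch_b(x))

class ASPP(nn.Module):
    """Feature pyramid ASPP: a 1x1 conv, three 3x3 convs with dilation 6/12/18, and a
    pooling branch, run in parallel, concatenated and fused by a final 1x1 conv."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv2d(in_ch, out_ch, 1),
            nn.Conv2d(in_ch, out_ch, 3, padding=6, dilation=6),
            nn.Conv2d(in_ch, out_ch, 3, padding=12, dilation=12),
            nn.Conv2d(in_ch, out_ch, 3, padding=18, dilation=18),
        ])
        self.pool_branch = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Conv2d(in_ch, out_ch, 1))
        self.project = nn.Conv2d(out_ch * 5, out_ch, 1)

    def forward(self, x):
        feats = [branch(x) for branch in self.branches]
        pooled = F.interpolate(self.pool_branch(x), size=x.shape[2:], mode="bilinear", align_corners=False)
        return self.project(torch.cat(feats + [pooled], dim=1))

class ResidualUpBlock(nn.Module):
    """Decoder block: 2x upsampling, concatenation with the matching encoder feature,
    then the same two-branch residual structure followed by a LeakyReLU."""
    def __init__(self, in_ch, skip_ch, out_ch):
        super().__init__()
        fused = in_ch + skip_ch
        self.branch_a = nn.Sequential(
            nn.Conv2d(fused, out_ch, 3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.LeakyReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, 3, padding=1),
            nn.BatchNorm2d(out_ch),
        )
        self.branch_b = nn.Sequential(nn.Conv2d(fused, out_ch, 1), nn.BatchNorm2d(out_ch))
        self.act = nn.LeakyReLU(inplace=True)

    def forward(self, x, skip):
        x = F.interpolate(x, scale_factor=2, mode="bilinear", align_corners=False)
        x = torch.cat([x, skip], dim=1)
        return self.act(self.branch_a(x) + self.branch_b(x))
```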
Third, training of the esophageal cancer CT image segmentation model: input the esophageal cancer CT image dataset into the esophageal cancer CT image segmentation model for training.
During training, the Swin Transformer network model performs super-resolution reconstruction of the original image, the richer features thus obtained are concatenated with the original image to enhance its boundary features, and the result is input into the TransResUNet convolutional neural network model for image segmentation, so that the two-stage progressive information fusion framework achieves better segmentation accuracy.
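The two-stage design described above can be sketched as a thin wrapper around the two networks. The SwinSR branch and the TransResUNet are assumed to be built elsewhere, e.g. from the blocks sketched earlier; the channel handling and the resize guard are illustrative assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TwoStageSegmenter(nn.Module):
    """Stage 1: the Swin Transformer branch super-resolves the CT slice into an enhanced
    feature map. Stage 2: that feature map is concatenated with the original image
    (progressive information fusion) and passed to the TransResUNet for segmentation."""
    def __init__(self, sr_branch: nn.Module, seg_net: nn.Module):
        super().__init__()
        self.sr_branch = sr_branch  # e.g. 1x1 conv -> 6 RSTBs (+ residual) -> 1x1 conv -> upsample -> LeakyReLU -> 1x1 conv
        self.seg_net = seg_net      # TransResUNet: residual encoder + ASPP + residual decoder

    def forward(self, ct_slice):
        sr_feat = self.sr_branch(ct_slice)
        # If the super-resolution branch enlarges the slice, bring it back to the input size
        # before concatenating so the two tensors are spatially aligned.
        if sr_feat.shape[2:] != ct_slice.shape[2:]:
            sr_feat = F.interpolate(sr_feat, size=ct_slice.shape[2:], mode="bilinear", align_corners=False)
        fused = torch.cat([sr_feat, ct_slice], dim=1)
        return self.seg_net(fused)  # segmentation logits
```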
The training of the esophageal cancer CT image segmentation model comprises the following steps:
(1) Input the esophageal cancer CT image dataset into the Swin Transformer network model of the esophageal cancer CT image segmentation model and output a feature map from the Swin Transformer network model:
the input to the super-resolution Swin Transformer network model passes through a 1×1 convolution layer, then through 6 consecutive RSTB modules with a residual connection, then through a 1×1 convolution layer, an upsampling operation, a LeakyReLU layer and a final 1×1 convolution layer to obtain the feature map.
(2) Perform progressive information fusion between the feature map output by the Swin Transformer and the original image through a concatenation operation to obtain a concatenated feature map.
(3) Input the concatenated feature map into the TransResUNet convolutional neural network model.
(4) Train the concatenated feature map in the downsampling encoder module:
b1) Input the concatenated feature map into the first residual downsampling structure, whose branch A is a stack of a 3×3 convolution layer, a batch normalization layer, a LeakyReLU layer, a Transformer Encoder Block layer, a 3×3 convolution layer and a batch normalization layer, and whose branch B is a stack of a 1×1 convolution layer and a batch normalization layer;
the two branches are added and a LeakyReLU layer is finally applied to obtain the first downsampled output;
b2) Feed the first downsampled output into the second residual downsampling structure, add its branch A and branch B, and finally apply a LeakyReLU layer to obtain the second downsampled output;
b3) Feed the second downsampled output into the third residual downsampling structure, add its branch A and branch B, and finally apply a LeakyReLU layer to obtain the third downsampled output;
b4) Feed the third downsampled output into the fourth residual downsampling structure, add its branch A and branch B, and finally apply a LeakyReLU layer to obtain the fourth downsampled output.
(5) The four downsampled outputs are passed to the decoder module, where upsampling recovers the image resolution.
(6) The fourth downsampled output is input to the feature pyramid ASPP module and then forms the first input of the decoder.
(7) For the first input of the decoder, perform a 2× upsampling operation and a concatenation with the corresponding encoder layer, then apply a stack of a 3×3 convolution layer, a batch normalization layer, a LeakyReLU layer, a 3×3 convolution layer and a batch normalization layer as one branch of the residual structure,
and a stack of a 1×1 convolution layer and a batch normalization layer as the other branch of the residual structure;
the two branches are added and finally pass through the LeakyReLU layer of the decoder block to obtain the second input of the decoder.
(8) The third downsampled output is concatenated to the second input of the decoder;
for the second input of the decoder, perform a 2× upsampling operation and a concatenation with the corresponding encoder layer, then apply a stack of a 3×3 convolution layer, a batch normalization layer, a LeakyReLU layer, a 3×3 convolution layer and a batch normalization layer as one branch of the residual structure,
and a stack of a 1×1 convolution layer and a batch normalization layer as the other branch of the residual structure;
the two branches are added and finally pass through the LeakyReLU layer of the decoder block to obtain the third input of the decoder.
(9) The second downsampled output is concatenated to the third input of the decoder;
for the third input of the decoder, perform a 2× upsampling operation and a concatenation with the corresponding encoder layer, then apply a stack of a 3×3 convolution layer, a batch normalization layer, a LeakyReLU layer, a 3×3 convolution layer and a batch normalization layer as one branch of the residual structure,
and a stack of a 1×1 convolution layer and a batch normalization layer as the other branch of the residual structure;
the two branches are added and finally pass through the LeakyReLU layer of the decoder block to obtain the fourth input of the decoder.
(10) The first downsampled output is concatenated to the fourth input of the decoder;
for the fourth input of the decoder, perform a 2× upsampling operation and a concatenation with the corresponding encoder layer, then apply a stack of a 3×3 convolution layer, a batch normalization layer, a LeakyReLU layer, a 3×3 convolution layer and a batch normalization layer as one branch of the residual structure,
and a stack of a 1×1 convolution layer and a batch normalization layer as the other branch of the residual structure;
the two branches are added, pass through the LeakyReLU layer of the decoder block, and finally through a 2× upsampling operation and a 1×1 convolution layer to obtain the final output of the TransResUNet.
(11) Obtain the segmentation probability through forward propagation.
(12) Use the cross entropy loss function and the Dice loss function together as the loss function of the esophageal cancer CT image segmentation model, and compute the segmentation loss from the segmentation probability (a code sketch of the losses in steps (12) and (13) follows this list). The expressions are:

\text{CE}(p, q) = -\sum_{i=1}^{C} p_i \log(q_i)

\text{Dice Loss} = 1 - \frac{2|A \cap B|}{|A| + |B|}

where C in the cross entropy loss CE(p, q) is the number of classes, p_i is the ground-truth value and q_i is the predicted value; A and B in the Dice Loss formula are the mask matrices corresponding to the ground-truth label and the model-predicted label respectively, A ∩ B is their intersection, and |A| and |B| are the numbers of elements in A and B; the factor 2 in the numerator is used because the common elements of A and B are counted twice in the denominator.
(13) Use the L1 loss function as the loss function of the Swin Transformer network model that performs super-resolution reconstruction of the esophageal cancer CT image. The expression is:

\text{L1 Loss} = \frac{1}{N}\sum_{i=1}^{N} \left| y_i - f(x_i) \right|

where N is the number of samples, y_i is the ground-truth label of the i-th sample, and f(x_i) is the model prediction for the i-th sample.
(14) Determine the gradient vectors through back propagation of the loss values and update the parameters of the esophageal cancer CT image segmentation model.
(15) Judge whether the set number of training epochs has been reached; if so, the training of the esophageal cancer CT image segmentation model is complete, otherwise training continues.
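A minimal sketch of the training losses from steps (12) and (13) follows; how the segmentation loss and the super-resolution L1 loss are weighted against each other is not stated in the patent, so the unweighted sum shown at the end is an assumption:

```python
import torch
import torch.nn.functional as F

def dice_loss(probs, target, eps=1e-6):
    """Dice Loss = 1 - 2|A ∩ B| / (|A| + |B|), computed on the foreground probabilities."""
    intersection = (probs * target).sum()
    return 1.0 - (2.0 * intersection + eps) / (probs.sum() + target.sum() + eps)

def segmentation_loss(logits, target):
    """Cross entropy plus Dice loss for the TransResUNet output.
    logits: (N, 2, H, W) class scores; target: (N, H, W) binary label map."""
    ce = F.cross_entropy(logits, target.long())
    foreground = torch.softmax(logits, dim=1)[:, 1]
    return ce + dice_loss(foreground, target.float())

def super_resolution_loss(sr_output, reference):
    """L1 loss (1/N) * sum |y_i - f(x_i)| for the Swin Transformer branch."""
    return F.l1_loss(sr_output, reference)

# Illustrative combination of the two objectives (equal weighting is an assumption):
# total = segmentation_loss(seg_logits, labels) + super_resolution_loss(sr_out, hr_target)
```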
Fourth, acquisition and preprocessing of the esophageal cancer CT image to be segmented.
Fifth, acquisition of the esophageal cancer CT image segmentation result: input the preprocessed esophageal cancer CT image to be segmented into the trained esophageal cancer CT image segmentation model to obtain the segmented esophageal cancer CT image.
FIGS. 6a and 7a are CT slice images of two esophageal cancer patients, and FIGS. 6b and 7b are the corresponding labels. As can be seen from FIGS. 6c and 7c, compared with the ResUNet network model shown in FIGS. 6d and 7d and the UNet network model shown in FIGS. 6e and 7e, the boundary information of the automatic segmentation produced by the method of the present invention is more complete and agrees better with the labels.
DSC denotes the Dice similarity coefficient, which lies in [0, 1]; the larger the value, the higher the accuracy. HD denotes the Hausdorff distance; the smaller the value, the better the boundaries coincide. For a fair comparison, all experiments were run with the same initial training parameters. As can be seen from Table 1, compared with the classical UNet, the method of the present invention improves the DSC and HD metrics by 0.19 and 7.88 respectively, and compared with ResUNet it improves the DSC and HD metrics by 0.09 and 7.88 respectively.
TABLE 1 Comparison of the segmentation accuracy of the method of the present invention with the classical UNet and ResUNet networks on the DSC and HD metrics
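For reference, the two evaluation metrics used in Table 1 can be computed as in the following sketch; the use of scipy's directed_hausdorff for HD is a tooling assumption, not something specified in the patent:

```python
import numpy as np
from scipy.spatial.distance import directed_hausdorff

def dice_coefficient(pred, gt):
    """Dice similarity coefficient 2|A ∩ B| / (|A| + |B|) for binary masks; lies in [0, 1]."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    denom = pred.sum() + gt.sum()
    return 1.0 if denom == 0 else 2.0 * np.logical_and(pred, gt).sum() / denom

def hausdorff_distance(pred, gt):
    """Symmetric Hausdorff distance between the foreground point sets of two binary masks."""
    p, g = np.argwhere(pred > 0), np.argwhere(gt > 0)
    return max(directed_hausdorff(p, g)[0], directed_hausdorff(g, p)[0])
```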
The foregoing has shown and described the basic principles, principal features and advantages of the invention. It will be understood by those skilled in the art that the present invention is not limited to the embodiments described above; the above embodiments and description merely illustrate the principles of the present invention, and various changes and modifications may be made without departing from the spirit and scope of the invention, which is defined by the appended claims and their equivalents.

Claims (3)

1. An automatic segmentation method for the tumor region of an esophageal cancer CT image based on two-stage progressive information fusion, characterized by comprising the following steps:
11) Acquisition and preprocessing of esophageal cancer CT images: acquiring CT images in DICOM format, performing data augmentation on the CT image data of the cervical and abdominal esophageal regions, and slicing all CT images, i.e., cutting the three-dimensional DICOM CT images into two-dimensional CT image slices in jpg format and binarized label images in png format, to form an esophageal cancer CT image dataset;
12) Construction of the esophageal cancer CT image segmentation model: constructing an esophageal cancer CT image segmentation model based on the two-stage progressive information fusion technique;
13) Training of the esophageal cancer CT image segmentation model: inputting the esophageal cancer CT image dataset into the esophageal cancer CT image segmentation model for training;
14) Acquisition and preprocessing of the esophageal cancer CT image to be segmented;
15) Acquisition of the esophageal cancer CT image segmentation result: inputting the preprocessed esophageal cancer CT image to be segmented into the trained esophageal cancer CT image segmentation model to obtain the segmented esophageal cancer CT image.
2. The automatic segmentation method for the tumor region of an esophageal cancer CT image based on two-stage progressive information fusion according to claim 1, characterized in that the construction of the esophageal cancer CT image segmentation model comprises the following steps:
21) Setting up the esophageal cancer CT image segmentation model, which comprises a Swin Transformer network model for super-resolution reconstruction and a TransResUNet convolutional neural network model; the feature map output by the super-resolution Swin Transformer and the original image undergo progressive information fusion through a concatenation operation, and the result is input into the TransResUNet convolutional neural network model for segmentation to obtain the final segmentation map;
22) Setting up the Swin Transformer network model, which comprises 6 residual Swin Transformer blocks (RSTB) and one residual connection structure, each RSTB consisting of 6 Swin Transformer Layers, one convolution and one residual connection;
23) Setting up the TransResUNet convolutional neural network model:
231) the TransResUNet convolutional neural network model comprises a downsampling encoder module for feature extraction, a feature pyramid ASPP module for obtaining receptive fields of different scales, and an upsampling decoder module for recovering the image resolution;
232) setting up the downsampling encoder module, which comprises 4 consecutive residual downsampling structures;
each residual downsampling structure comprises a branch A and a branch B: branch A is a stack of a 3×3 convolution layer, a batch normalization layer, a LeakyReLU layer, a Transformer Encoder Block layer, a 3×3 convolution layer and a batch normalization layer, and serves as one branch of the residual structure;
branch B is a stack of a 1×1 convolution layer and a batch normalization layer, and serves as the other branch of the residual structure;
the two branches are added, and the sum finally passes through a LeakyReLU layer;
233) setting up the feature pyramid ASPP module, which comprises:
a first operation module: a 1×1 convolution layer;
a second operation module: a 3×3 convolution layer with a dilation rate of 6;
a third operation module: a 3×3 convolution layer with a dilation rate of 12;
a fourth operation module: a 3×3 convolution layer with a dilation rate of 18;
a fifth operation module: an adaptive average pooling layer, a 1×1 convolution layer and an upsampling operation;
the five operation modules are connected in parallel, the resulting 5 feature maps are concatenated, and the concatenation is passed through a 1×1 convolution layer;
234) setting up the upsampling decoder module, which comprises 4 consecutive residual upsampling structures together with concatenation connections to the outputs of the four residual downsampling structures of the encoder;
the four residual downsampling structures of the encoder produce four outputs of different sizes:
the output of the fourth size is input to the feature pyramid ASPP module and then forms the first input of the decoder,
the output of the third size is concatenated to the second input of the decoder,
the output of the second size is concatenated to the third input of the decoder,
the output of the first size is concatenated to the fourth input of the decoder;
the decoder consists of four consecutive residual upsampling blocks,
and each residual upsampling block is:
a 2× upsampling operation, a concatenation with the corresponding encoder layer, then a stack of a 3×3 convolution layer, a batch normalization layer, a LeakyReLU layer, a 3×3 convolution layer and a batch normalization layer as one branch of the residual structure,
and a stack of a 1×1 convolution layer and a batch normalization layer as the other branch of the residual structure;
the two branches are added, and the sum finally passes through the LeakyReLU layer of the decoder block.
3. The automatic segmentation method for the tumor region of an esophageal cancer CT image based on two-stage progressive information fusion according to claim 1, characterized in that the training of the esophageal cancer CT image segmentation model comprises the following steps:
31) inputting the esophageal cancer CT image dataset into the Swin Transformer network model of the esophageal cancer CT image segmentation model and outputting a feature map from the Swin Transformer network model:
the input to the super-resolution Swin Transformer network model passes through a 1×1 convolution layer, then through 6 consecutive RSTB modules with a residual connection, then through a 1×1 convolution layer, an upsampling operation, a LeakyReLU layer and a final 1×1 convolution layer to obtain the feature map;
32) performing progressive information fusion between the feature map output by the Swin Transformer and the original image through a concatenation operation to obtain a concatenated feature map;
33) inputting the concatenated feature map into the TransResUNet convolutional neural network model;
34) training the concatenated feature map in the downsampling encoder module:
341) inputting the concatenated feature map into the first residual downsampling structure, whose branch A is a stack of a 3×3 convolution layer, a batch normalization layer, a LeakyReLU layer, a Transformer Encoder Block layer, a 3×3 convolution layer and a batch normalization layer, and whose branch B is a stack of a 1×1 convolution layer and a batch normalization layer;
the two branches are added and a LeakyReLU layer is finally applied to obtain the first downsampled output;
342) feeding the first downsampled output into the second residual downsampling structure, adding its branch A and branch B, and finally applying a LeakyReLU layer to obtain the second downsampled output;
343) feeding the second downsampled output into the third residual downsampling structure, adding its branch A and branch B, and finally applying a LeakyReLU layer to obtain the third downsampled output;
344) feeding the third downsampled output into the fourth residual downsampling structure, adding its branch A and branch B, and finally applying a LeakyReLU layer to obtain the fourth downsampled output;
35) the four downsampled outputs are passed to the decoder module, where upsampling recovers the image resolution;
36) the fourth downsampled output is input to the feature pyramid ASPP module and then forms the first input of the decoder;
37) for the first input of the decoder, performing a 2× upsampling operation and a concatenation with the corresponding encoder layer, then applying a stack of a 3×3 convolution layer, a batch normalization layer, a LeakyReLU layer, a 3×3 convolution layer and a batch normalization layer as one branch of the residual structure,
and a stack of a 1×1 convolution layer and a batch normalization layer as the other branch of the residual structure;
the two branches are added and finally pass through the LeakyReLU layer of the decoder block to obtain the second input of the decoder;
38) the third downsampled output is concatenated to the second input of the decoder;
for the second input of the decoder, performing a 2× upsampling operation and a concatenation with the corresponding encoder layer, then applying a stack of a 3×3 convolution layer, a batch normalization layer, a LeakyReLU layer, a 3×3 convolution layer and a batch normalization layer as one branch of the residual structure,
and a stack of a 1×1 convolution layer and a batch normalization layer as the other branch of the residual structure;
the two branches are added and finally pass through the LeakyReLU layer of the decoder block to obtain the third input of the decoder;
39) the second downsampled output is concatenated to the third input of the decoder;
for the third input of the decoder, performing a 2× upsampling operation and a concatenation with the corresponding encoder layer, then applying a stack of a 3×3 convolution layer, a batch normalization layer, a LeakyReLU layer, a 3×3 convolution layer and a batch normalization layer as one branch of the residual structure,
and a stack of a 1×1 convolution layer and a batch normalization layer as the other branch of the residual structure;
the two branches are added and finally pass through the LeakyReLU layer of the decoder block to obtain the fourth input of the decoder;
310) the first downsampled output is concatenated to the fourth input of the decoder;
for the fourth input of the decoder, performing a 2× upsampling operation and a concatenation with the corresponding encoder layer, then applying a stack of a 3×3 convolution layer, a batch normalization layer, a LeakyReLU layer, a 3×3 convolution layer and a batch normalization layer as one branch of the residual structure,
and a stack of a 1×1 convolution layer and a batch normalization layer as the other branch of the residual structure;
the two branches are added, pass through the LeakyReLU layer of the decoder block, and finally through a 2× upsampling operation and a 1×1 convolution layer to obtain the final output of the TransResUNet;
311) Performing forward propagation to obtain the segmentation probability;
312) Using the cross entropy loss function and the Dice loss function as the loss functions of the esophageal cancer CT image segmentation model, and computing the segmentation loss from the segmentation probability, with the expressions:
CE(p, q) = −∑_{i=1}^{C} p_i · log(q_i)
Dice Loss = 1 − 2|A∩B| / (|A| + |B|)
wherein C in the cross entropy loss function CE(p, q) represents the number of categories, p_i is the true value and q_i is the predicted value; A and B in the Dice Loss formula respectively denote the mask matrices of the ground-truth label and the model-predicted label, |A∩B| is the intersection of A and B, and |A| and |B| respectively denote the number of elements in A and B; the numerator carries a coefficient of 2 because the common elements of A and B are counted twice in the denominator;
313) Using the L1 loss function as the loss function of the Swin Transformer network model for super-resolution reconstruction of the esophageal cancer CT image, with the expression:
L1 = (1/N) ∑_{i=1}^{N} |y_i − f(x_i)|
wherein N represents the number of samples, y_i is the real label of the i-th sample, and f(x_i) is the model prediction for the i-th sample (a code sketch of these loss terms is given after step 315) below);
314) Determining the gradient vectors through back propagation of the loss values, and updating the parameters of the esophageal cancer CT image segmentation model;
315) Judging whether the set number of training rounds has been reached; if so, the training of the esophageal cancer CT image segmentation model is complete, otherwise training continues.
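The following is a minimal PyTorch sketch of one residual downsampling structure of the kind referenced in steps 343)-344). The passage above does not spell out the internal layers of branch A and branch B for the encoder, so the layer composition here is an assumption for illustration: branch A is taken as a stride-2 3×3 convolution stack and branch B as a stride-2 1×1 projection, mirroring the branch layout described for the decoder blocks.

```python
import torch
import torch.nn as nn

class ResidualDownBlock(nn.Module):
    """Assumed form of one downsampling structure with a residual connection."""

    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        # Branch A (assumed): 3x3 conv (stride 2) + BN + LeakyReLU + 3x3 conv + BN
        self.branch_a = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=2, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.LeakyReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_ch),
        )
        # Branch B (assumed): 1x1 conv (stride 2) + BN, so the two branches
        # have matching shapes and can be added element-wise
        self.branch_b = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=1, stride=2),
            nn.BatchNorm2d(out_ch),
        )
        self.act = nn.LeakyReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Add branch A and branch B, then apply the LeakyRelu layer
        return self.act(self.branch_a(x) + self.branch_b(x))
```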
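The decoder block repeated in steps 37)-310) can be sketched as below: two-fold up-sampling, splicing (concatenation) with the corresponding encoder feature map, a residual structure whose branch A is 3×3 conv + BN + LeakyRelu + 3×3 conv + BN and whose branch B is 1×1 conv + BN, followed by addition and a final LeakyRelu layer. Channel counts and the bilinear up-sampling mode are illustrative assumptions.

```python
import torch
import torch.nn as nn

class ResidualUpBlock(nn.Module):
    """Sketch of one decoder block with two residual branches."""

    def __init__(self, in_ch: int, skip_ch: int, out_ch: int):
        super().__init__()
        self.up = nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False)
        cat_ch = in_ch + skip_ch  # channels after splicing with the encoder layer
        # Branch A: 3x3 conv + BN + LeakyReLU + 3x3 conv + BN
        self.branch_a = nn.Sequential(
            nn.Conv2d(cat_ch, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.LeakyReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_ch),
        )
        # Branch B: 1x1 conv + BN
        self.branch_b = nn.Sequential(
            nn.Conv2d(cat_ch, out_ch, kernel_size=1),
            nn.BatchNorm2d(out_ch),
        )
        self.act = nn.LeakyReLU(inplace=True)

    def forward(self, x: torch.Tensor, skip: torch.Tensor) -> torch.Tensor:
        x = self.up(x)                    # two-fold up-sampling
        x = torch.cat([x, skip], dim=1)   # splice with the corresponding encoder layer
        return self.act(self.branch_a(x) + self.branch_b(x))  # add branches, LeakyReLU
```

Applied four times, such a block produces the second, third and fourth decoder inputs and the feature map that, after a final two-fold up-sampling and 1×1 convolution, yields the output of the TransResUnet.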
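The loss terms of steps 312)-313) can be sketched as follows, assuming a two-class (tumour vs. background) segmentation problem; the equal weighting of the cross entropy and Dice terms is an assumption, as the passage above does not fix it.

```python
import torch
import torch.nn.functional as F

def dice_loss(probs: torch.Tensor, target: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Dice Loss = 1 - 2|A∩B| / (|A| + |B|), computed on soft probabilities."""
    inter = (probs * target).sum()
    return 1.0 - (2.0 * inter + eps) / (probs.sum() + target.sum() + eps)

def segmentation_loss(logits: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    # Cross entropy over C categories; logits is (N, C, H, W), target holds class indices (N, H, W)
    ce = F.cross_entropy(logits, target)
    # Dice on the foreground (tumour) probability map
    probs = torch.softmax(logits, dim=1)[:, 1]
    dice = dice_loss(probs, target.float())
    return ce + dice  # equal weighting assumed

def super_resolution_loss(pred: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    # L1 = (1/N) * sum |y_i - f(x_i)| for the Swin Transformer reconstruction branch
    return F.l1_loss(pred, target)
```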
CN202310688086.0A 2023-06-12 2023-06-12 Automatic segmentation method for esophageal cancer CT image tumor area based on two-stage progressive information fusion Pending CN116645380A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310688086.0A CN116645380A (en) 2023-06-12 2023-06-12 Automatic segmentation method for esophageal cancer CT image tumor area based on two-stage progressive information fusion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310688086.0A CN116645380A (en) 2023-06-12 2023-06-12 Automatic segmentation method for esophageal cancer CT image tumor area based on two-stage progressive information fusion

Publications (1)

Publication Number Publication Date
CN116645380A true CN116645380A (en) 2023-08-25

Family

ID=87643352

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310688086.0A Pending CN116645380A (en) 2023-06-12 2023-06-12 Automatic segmentation method for esophageal cancer CT image tumor area based on two-stage progressive information fusion

Country Status (1)

Country Link
CN (1) CN116645380A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117788296A (en) * 2024-02-23 2024-03-29 北京理工大学 Infrared remote sensing image super-resolution reconstruction method based on heterogeneous combined depth network
CN117788296B (en) * 2024-02-23 2024-05-07 北京理工大学 Infrared remote sensing image super-resolution reconstruction method based on heterogeneous combined depth network
CN117934519A (en) * 2024-03-21 2024-04-26 安徽大学 Self-adaptive segmentation method for esophageal tumor CT image synthesized by unpaired enhancement
CN117934519B (en) * 2024-03-21 2024-06-07 安徽大学 Self-adaptive segmentation method for esophageal tumor CT image synthesized by unpaired enhancement
CN118196416A (en) * 2024-03-26 2024-06-14 昆明理工大学 Small target colorectal polyp segmentation method integrating multitasking cooperation and progressive resolution strategy

Similar Documents

Publication Publication Date Title
CN113870258B (en) Counterwork learning-based label-free pancreas image automatic segmentation system
CN116309650B (en) Medical image segmentation method and system based on double-branch embedded attention mechanism
WO2023071531A1 (en) Liver ct automatic segmentation method based on deep shape learning
CN111354002A (en) Kidney and kidney tumor segmentation method based on deep neural network
CN116645380A (en) Automatic segmentation method for esophageal cancer CT image tumor area based on two-stage progressive information fusion
CN109614991A (en) A kind of segmentation and classification method of the multiple dimensioned dilatancy cardiac muscle based on Attention
CN107492071A (en) Medical image processing method and equipment
CN109389584A (en) Multiple dimensioned rhinopharyngeal neoplasm dividing method based on CNN
CN112215844A (en) MRI (magnetic resonance imaging) multi-mode image segmentation method and system based on ACU-Net
WO2024104035A1 (en) Long short-term memory self-attention model-based three-dimensional medical image segmentation method and system
CN114494296A (en) Brain glioma segmentation method and system based on fusion of Unet and Transformer
CN116228690A (en) Automatic auxiliary diagnosis method for pancreatic cancer and autoimmune pancreatitis based on PET-CT
CN117455906B (en) Digital pathological pancreatic cancer nerve segmentation method based on multi-scale cross fusion and boundary guidance
JP2024143991A (en) Image segmentation method and system in a multitask learning network
CN115471512A (en) Medical image segmentation method based on self-supervision contrast learning
Li et al. MCRformer: Morphological constraint reticular transformer for 3D medical image segmentation
CN114565601A (en) Improved liver CT image segmentation algorithm based on DeepLabV3+
Fu et al. MSA-Net: Multiscale spatial attention network for medical image segmentation
CN114387282A (en) Accurate automatic segmentation method and system for medical image organs
CN113205496A (en) Abdominal CT image liver tumor lesion segmentation method based on convolutional neural network
Wang et al. Multimodal parallel attention network for medical image segmentation
CN116468741A (en) Pancreatic cancer segmentation method based on 3D physical space domain and spiral decomposition space domain
CN117058163A (en) Depth separable medical image segmentation algorithm based on multi-scale large convolution kernel
Mani Deep learning models for semantic multi-modal medical image segmentation
CN117934519B (en) Self-adaptive segmentation method for esophageal tumor CT image synthesized by unpaired enhancement

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination