CN118229981A - CT image tumor segmentation method, device and medium combining a convolutional network and a Transformer - Google Patents
CT image tumor segmentation method, device and medium combining a convolutional network and a Transformer
- Publication number
- CN118229981A CN118229981A CN202410641655.0A CN202410641655A CN118229981A CN 118229981 A CN118229981 A CN 118229981A CN 202410641655 A CN202410641655 A CN 202410641655A CN 118229981 A CN118229981 A CN 118229981A
- Authority
- CN
- China
- Prior art keywords
- network
- image
- convolution
- module
- Transformer
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Apparatus For Radiation Diagnosis (AREA)
Abstract
The invention provides a CT image tumor segmentation method, device and medium combining a convolutional network and a Transformer, belonging to the technical field of image processing. The method comprises the following steps: collecting an original data set, preprocessing it, and dividing it into a training set, a validation set and a test set; establishing a CT image tumor segmentation network comprising a basic encoding module, an attention feature fusion module, a spatial pyramid pooling module and an attention gate module, where the basic encoding module consists of a convolutional network and a Transformer network; training the CT image tumor segmentation network with the training set data and optimizing the model parameters; and deploying the trained network, collecting CT images to be segmented, preprocessing them, and inputting them into the network to obtain segmentation results. Compared with existing schemes, the method achieves more accurate liver tumor segmentation and provides strong technical support for clinical diagnosis and treatment planning.
Description
Technical Field
The invention relates to a CT image tumor segmentation method, device and medium combining a convolutional network and a Transformer, belonging to the technical field of image processing.
Background
Automatic segmentation of liver tumors is important for the diagnosis, staging and treatment planning of liver cancer. Compared with manual delineation, automatic segmentation can markedly improve efficiency, reduce labor cost and lighten the workload of physicians. However, automatic tumor segmentation in liver CT images faces the following difficulties: the shape, size and location of liver tumors vary greatly, both across patients and within the same patient at different time points, which challenges the applicability of segmentation algorithms; and, compared with normal liver tissue, tumors show low image contrast and indistinct boundaries, which makes them hard for a segmentation algorithm to recognize.
Currently, liver CT tumor segmentation mostly relies on statistics-based methods, traditional machine learning methods or deep learning methods. The main approaches are as follows:
1. Image segmentation methods based on statistics, such as methods based on probability distribution modeling, methods based on cluster analysis, and the like. The method can capture the difference of tumor and normal tissue on image statistical characteristics (such as gray scale, texture and the like) for classification and segmentation.
2. Classification-based segmentation methods built on traditional machine learning first extract feature vectors that characterize the image statistics, commonly using shape, texture and spectral information. The image is then divided into small patches used as samples, a binary classifier such as a support vector machine or decision tree is trained on these samples, and finally the trained classifier predicts each patch of the test image, with the patch predictions combined into the final segmentation result.
3. Deep learning segmentation methods based on convolutional neural networks directly learn hierarchical semantic feature representations from image data through operations such as convolution and pooling, without hand-crafted features. Representative segmentation networks include the U-shaped UNet and encoder-decoder FCN architectures.
The existing segmentation method has the following technical defects:
1. Traditional segmentation methods based on simple statistical analysis mainly model statistical information such as image intensity distribution and cannot effectively characterize semantic concepts in complex cases. They are sensitive to changes in image quality and contrast, adapt poorly to the organ texture differences, lesion heterogeneity and noise present in test samples, and cannot effectively segment tumors with abnormal morphology.
2. Machine-learning-based segmentation methods rely on hand-designed low-level image features that struggle to capture high-level semantic concepts and have very limited power to express complex lesions. As a result, their segmentation results are structurally coarse, lack detail and have low accuracy, failing to meet the requirements of precise clinical diagnosis and treatment.
3. In existing deep learning segmentation methods, most network structures are convolutional neural networks. While such methods can learn hierarchical feature representations end to end, their core convolution operation focuses on local features. As a result, these networks express the global morphology of lesions weakly, adapt poorly to atypical samples, and localize edges imprecisely, which limits segmentation accuracy.
Disclosure of Invention
The invention aims to provide a CT image tumor segmentation method, device and medium combining a convolutional network and a Transformer, which can effectively alleviate the difficulties posed by varied tumor morphology and unclear boundaries.
This aim is achieved by the following technical scheme:
collecting an original data set, preprocessing the data set, and dividing the data set into a training set, a verification set and a test set;
A CT image tumor segmentation network is established, comprising a basic encoding module, an attention feature fusion module, a spatial pyramid pooling module and an attention gate module, where the basic encoding module consists of a convolutional network and a Transformer network. The CT image passes sequentially through the basic encoding module, the attention feature fusion module and the spatial pyramid pooling module, and is then convolved and upsampled to obtain a feature map F1; the output feature map of the convolutional network is sent to the attention gate module to obtain a feature map F2; feature maps F1 and F2 are concatenated and passed through a convolution layer to generate the segmentation prediction result;
training the CT image tumor segmentation network with the training set data, using a joint loss function combining binary cross-entropy loss and Dice loss as the optimization target to optimize the model parameters;
and deploying the trained CT image tumor segmentation network, collecting CT images to be segmented, preprocessing them, and inputting them into the network to obtain segmentation results.
Preferably, the convolutional network encoder is composed of 5 standard convolution blocks, each comprising a convolution layer, batch normalization (BN) and an activation layer; it encodes the input CT image and extracts local low-level visual features. The Transformer encoder consists of 3 standard Transformer encoder layers, which model the input CT image through a self-attention mechanism and capture long-range dependencies and global context information.
Preferably, the attention feature fusion module processes the feature maps as follows:

The output feature maps of the convolutional network and the Transformer network, Fc and Ft, are spliced in the channel dimension to obtain a feature map Fs;

The feature map Fs is fed into a series of attention weighting modules. Each attention weighting module comprises two paths: in one path, Fs passes sequentially through a convolution, batch normalization, ReLU block followed by a convolution and batch normalization block; in the other path, Fs passes sequentially through a pooling operation and a convolution, batch normalization, ReLU block. The output feature maps of the two paths are spliced, and a Sigmoid function produces the weights. These weights perform a point-wise multiplication on the output of the convolutional network in the basic encoding module, and the result of that weighting then performs a point-wise multiplication on the output of the Transformer network.
Preferably, the spatial pyramid pooling module includes 5 parallel convolution branch paths: the first branch convolves the input feature map with a standard 1×1 convolution kernel to obtain low-level fine-grained local features; the second branch uses dilated convolution with a dilation rate of 6; the third branch uses dilated convolution with a dilation rate of 12; the fourth branch uses dilated convolution with a dilation rate of 18; the fifth branch comprises, in sequence, a global average pooling layer, a 1×1 convolution and bilinear interpolation. The output feature maps of the five branches are spliced in the channel dimension to fuse multi-scale convolution features; the spliced feature map then undergoes channel dimension reduction through a 1×1 convolution, yielding the final multi-scale fused feature representation.
Preferably, the attention gate module takes a low-level feature map from a shallow layer of the convolutional network as the feature input and the deep multi-scale fusion feature map F1 as the gating signal input; each input passes through a convolution and batch normalization before the two are spliced. The spliced feature map passes through a convolution, batch normalization, ReLU block, and a Sigmoid function then generates an attention coefficient feature map, which adaptively weights the spatial regions of the shallow low-level features.
Preferably, the joint loss function L is as follows:

L = λ1·L_BCE + λ2·L_Dice,

L_BCE = −[y·log(p) + (1 − y)·log(1 − p)],

L_Dice = 1 − 2|X ∩ Y| / (|X| + |Y|),

where λ1 is the weight coefficient of the L_BCE loss, λ2 is the weight coefficient of the L_Dice loss, y is the true label, p is the probability output by the model, and X and Y are the pixels of the truly labeled tumor region and of the model prediction region, respectively.
Preferably, λ1 takes the value 0.6 and λ2 takes the value 0.4.
The invention has the following advantages: the fusion of a convolutional network and a Transformer encoder as the basic encoding module acquires local detail features and global context features simultaneously, greatly enhancing feature expression. The attention feature fusion module adaptively weights and fuses the convolutional and Transformer features from the two sources through an attention mechanism, selectively highlighting important features and capturing semantically rich feature representations. The spatial pyramid pooling module and the attention gate module respectively fuse context semantic features and detail features at different scales, giving the model strong adaptability so that tumors of various sizes and shapes can be segmented efficiently. The loss function is designed to balance local pixel-level and global supervision signals, guiding the model to attend to detail and consistency simultaneously and thereby produce fine, accurate segmentation results.
The invention can effectively fuse local and global characteristic information end to end, fully express multi-scale semantic information, improve adaptability to tumor morphological change, fineness of edge positioning and segmentation robustness to complex cases, realize more accurate liver tumor segmentation than the existing scheme, and provide powerful technical support for clinical diagnosis and treatment scheme formulation.
Drawings
The accompanying drawings are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate the invention and together with the embodiments of the invention, serve to explain the invention.
FIG. 1 is a schematic flow chart of the method of the invention.
Fig. 2 is a schematic diagram of a network structure according to the present invention.
Fig. 3 is a schematic diagram of the attention feature fusion module structure of the present invention.
FIG. 4 is a schematic diagram of a spatial pyramid pooling module structure according to the present invention.
Fig. 5 is a schematic diagram of the attention gate module structure of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Example 1
As shown in FIG. 1, a CT image tumor segmentation method combining a convolutional network and a Transformer comprises the steps of raw data collection and preprocessing, network construction, training with a joint loss function, and model application. The specific implementation is as follows:
S1: the method comprises the steps of collecting an original data set, preprocessing the data set, and dividing the data set into a training set, a verification set and a test set.
S2: a CT image tumor segmentation network is established, wherein the CT image tumor segmentation network comprises a basic coding module, an attention feature fusion module, a spatial pyramid pooling module and an attention gate module, wherein the basic coding module consists of a convolution network and a transducer network; the CT image sequentially passes through a basic coding module, an attention feature fusion module and a space pyramid pooling module, and then is rolled and upsampled to obtain a feature mapThe convolution network output feature map is sent to an attention gate module to obtain a feature map/>Feature map/>And/>The concatenation generates a segmentation prediction result through a convolution layer.
S3: training a CT image tumor segmentation network by using training set data, and using a joint loss function combining binary cross entropy loss and Dice loss as an optimization target optimization model parameter.
S4: and deploying a CT image tumor segmentation network after training, collecting CT images to be segmented, preprocessing, and inputting the CT image tumor segmentation network to obtain segmentation results.
As a refinement of step S1, the data used in the invention come from the LiTS (Liver Tumor Segmentation Benchmark) dataset, which collected data from 7 different medical centers and comprises 131 training cases and 70 test cases. Each patient has a gold-standard liver tumor segmentation annotation. These CT scans reflect many variations across patients, scanning devices and pathology types; all images are 512×512 in size.
In order to enhance the generalization capability and robustness of the model, the method performs data augmentation on all training images before training. Specifically, a series of image processing transformations are performed on the paired original liver CT images and the corresponding labeled segmented images in the same step, including multiple data augmentation modes such as horizontal and vertical flipping, rotation, scaling, translation, gaussian noise addition, brightness adjustment, contrast adjustment, and the like. Therefore, different scanning conditions and pathological changes in a real scene can be simulated, so that the model is contacted with more abundant and various data during training, and better generalization performance is obtained.
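A minimal numpy sketch of this paired augmentation follows. The particular transform set and noise level are illustrative assumptions, not the patent's exact parameters; the key point shown is that geometric transforms are applied identically to the CT slice and its segmentation mask, while intensity transforms touch only the image.

```python
import numpy as np

def paired_augment(image, mask, rng):
    """Apply the same random geometric transforms to a CT slice and its
    segmentation mask. Flip/rotation choices and the Gaussian noise level
    are illustrative, not values stated in the patent."""
    if rng.random() < 0.5:                      # horizontal flip
        image, mask = image[:, ::-1], mask[:, ::-1]
    if rng.random() < 0.5:                      # vertical flip
        image, mask = image[::-1, :], mask[::-1, :]
    k = rng.integers(0, 4)                      # rotation by k * 90 degrees
    image, mask = np.rot90(image, k), np.rot90(mask, k)
    # intensity-only transforms go to the image, never the mask
    image = image + rng.normal(0.0, 0.01, image.shape)
    return image, mask
```

Because both arrays go through the same flips and rotations, tumor pixels in the mask stay aligned with the corresponding image pixels after augmentation.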
After data augmentation, a sufficiently large number of varied training samples is obtained. The invention takes the augmented paired CT images and segmentation annotation images as inputs and training targets, feeding them into the proposed network architecture for end-to-end training.
As a refinement of step S2, as shown in fig. 2 to 5, a specific module structure is schematically shown.
1. Basic encoding module
The convolutional network and the Transformer network form the basic encoding module. The convolutional encoder consists of 5 standard convolution blocks, each comprising a convolution layer, BN and an activation layer; through these basic operations it encodes the input CT image and extracts local low-level visual features. The Transformer encoder consists of 3 standard Transformer encoder layers, which model the input CT image through a self-attention mechanism, capturing long-range dependencies and global context information to form a higher-level feature representation. Feature maps from different levels of the convolutional encoder and the Transformer encoder are fed in parallel into the attention feature fusion module.
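The self-attention mechanism at the core of the Transformer branch can be illustrated with a minimal single-head version in numpy; the token count, embedding size and weight matrices below are toy stand-ins, since the patent does not specify head counts or dimensions. Every token attends to every other token, which is how long-range dependencies are captured.

```python
import numpy as np

def self_attention(x, Wq, Wk, Wv):
    """Single-head self-attention over a sequence of patch tokens x
    (tokens, dim): softmax(Q K^T / sqrt(d)) V."""
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = (q @ k.T) / np.sqrt(k.shape[-1])       # (tokens, tokens)
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    attn = np.exp(scores)
    attn = attn / attn.sum(axis=-1, keepdims=True)  # softmax over all tokens
    return attn @ v                                  #每 token: weighted mix of all tokens
```

Unlike a convolution, whose output at a pixel depends only on a local window, each output row here is a weighted combination of every input token.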
Specifically, the attention feature fusion module processes these features as follows:

S2-1: The attention feature fusion module first splices the feature maps Fc and Ft from the convolutional network branch and the Transformer branch in the channel dimension. The spliced features are then fed through a series of attention weighting modules to explicitly model the interdependencies between the feature channels.
S2-2: the structure of the attention weighting module is shown in fig. 3, the spliced characteristic diagram is divided into two paths, and the characteristic diagram in one pathSequentially passing through convolution, batch normalization, relu function blocks, convolution and batch normalization blocks, and feature map/>, in another pathAnd sequentially executing pooling operation, convolution, batch normalization and Relu function blocks, splicing the output feature graphs of the two paths, and then acquiring weights by using a Sigmoid function.
S2-3: the output of the convolution network in the basic coding module is weighted by the weight, and then the obtained output is used for further carrying out dot multiplication on the output of the transform network, so that the weighting operation is realized.
The attention mechanism module can adaptively allocate attention weights to different feature channels and selectively highlight or inhibit different features, thereby capturing multi-scale context semantics and enhancing the feature expression of a tumor region.
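A heavily simplified numpy sketch of this flow follows. The conv/BN/ReLU paths that produce the weights are collapsed into a channel mean here, as an assumption, to keep only the essential structure: splice the two branches, derive per-channel Sigmoid weights, point-multiply the convolutional features, then use that result to point-multiply the Transformer features.

```python
import numpy as np

def attention_fuse(f_conv, f_trans):
    """Sketch of the fusion step for feature maps shaped (C, H, W).
    A channel mean stands in for the module's conv/BN weight paths."""
    spliced = np.concatenate([f_conv, f_trans], axis=0)   # (2C, H, W)
    logits = spliced.mean(axis=(1, 2))                    # stand-in weight path
    weights = 1.0 / (1.0 + np.exp(-logits[: f_conv.shape[0]]))  # Sigmoid
    weighted_conv = weights[:, None, None] * f_conv       # weight conv branch
    return weighted_conv * f_trans                        # then weight Transformer branch
```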
2. Spatial pyramid pooling module
To accommodate a wide variety of tumor sizes and morphologies, the spatial pyramid pooling module receives the output of the attention feature fusion module, which includes 5 parallel convolution branch paths:
the first branch path convolves the input feature map using a standard 1 x1 convolution kernel to obtain low-level fine-grained local features.
The second branch uses dilated convolution with a dilation rate of 6, the third branch a dilation rate of 12, and the fourth branch a dilation rate of 18; progressively increasing the dilation rate gradually enlarges the receptive field, extracting medium-scale semantic features.
The fifth path sequentially comprises a global average pooling layer, 1×1 convolution and bilinear interpolation, and is restored to the original resolution to capture global profile features.
The output feature maps of the five paths are spliced in the channel dimension, fusing multi-scale convolution features; the spliced feature map then undergoes channel dimension reduction through a 1×1 convolution, outputting the final multi-scale fused feature representation. Through this module, the invention simultaneously integrates local detail, intermediate semantics and global context at different scales, giving the feature expression strong adaptability and robustness and overcoming the influence of size and shape variation on segmentation, thereby improving the accuracy and generalization of liver tumor segmentation.
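Why larger dilation rates see more context can be checked with simple arithmetic: a k×k kernel with dilation rate r covers k + (k − 1)(r − 1) pixels per axis. The rates 6, 12 and 18 used below are illustrative values typical of atrous pyramids.

```python
def dilated_receptive_field(kernel=3, rate=1):
    """Pixels spanned (per axis) by one dilated convolution:
    k + (k - 1) * (r - 1)."""
    return kernel + (kernel - 1) * (rate - 1)
```

A 3×3 kernel thus spans 3 pixels at rate 1 but 13, 25 and 37 pixels at rates 6, 12 and 18, which is how the parallel branches capture progressively larger context without extra parameters.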
3. Attention gate module
The attention gate module takes a low-level feature map from a shallow layer of the convolutional network as the feature input and the deep multi-scale fusion feature map F1 as the gating signal input; each input passes through a convolution and batch normalization before the two are spliced. The spliced feature map passes through a convolution, batch normalization, ReLU block, and a Sigmoid function then generates an attention coefficient feature map. This attention coefficient feature map adaptively weights the spatial regions of the shallow low-level features, retaining and strengthening the important detail information related to the gating signal (high-level semantics) while suppressing irrelevant background regions.
The output feature map of the attention gate module is concatenated with the feature maps of the other branches, and the segmentation prediction is generated through a convolution layer and an upsampling layer, completing the network. This mechanism addresses the problem of unclear edge details in segmentation results caused by poor image quality: after multiple convolution and downsampling operations in the network, some detailed semantic information in the original input image can be lost, affecting the fineness of the final segmentation contour. The attention gate module fuses shallow detail and deep semantic features, helping the segmentation network better preserve detail and thereby produce finer, more accurate segmentation contours.
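A minimal numpy sketch of the gating idea follows. The convolution and batch-normalization stacks are collapsed into a simple elementwise sum here as an assumption, keeping only the Sigmoid coefficient map and the spatial re-weighting of the shallow features.

```python
import numpy as np

def attention_gate(low_feat, gate):
    """Per-pixel gating: a Sigmoid of the combined signals yields an
    attention coefficient map in (0, 1) that re-weights the shallow
    low-level features spatially."""
    coeff = 1.0 / (1.0 + np.exp(-(low_feat + gate)))  # attention coefficients
    return coeff * low_feat
```

Regions where the deep gating signal agrees with strong shallow responses get coefficients near 1 and are kept; background regions are pushed toward 0.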
As a refinement of step S3, the joint loss function L is as follows:

L = λ1·L_BCE + λ2·L_Dice,

L_BCE = −[y·log(p) + (1 − y)·log(1 − p)],

L_Dice = 1 − 2|X ∩ Y| / (|X| + |Y|),

where λ1 is the weight coefficient of the L_BCE loss, λ2 is the weight coefficient of the L_Dice loss, y is the true label, p is the probability output by the model, and X and Y are the pixels of the truly labeled tumor region and of the model prediction region, respectively.
L_BCE is a standard binary cross-entropy loss that compares prediction and ground truth pixel by pixel, describing local detail differences; L_Dice computes the Dice similarity coefficient between the prediction mask and the truth mask, providing a global evaluation.
Combining the two provides complementary loss gradient information: the Dice term introduces moderate variation, while the presence of the BCE loss keeps the loss surface smooth and stable. Tailored to the characteristics of tumors in liver CT images, λ1 takes the value 0.6 and λ2 takes the value 0.4. Through this joint loss design, the segmentation network synthesizes supervision signals at the local pixel level and the global semantic level, helping to generate finer and more accurate segmentation results.
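The joint loss can be sketched in a few lines of numpy. The weights 0.6 / 0.4 follow the text; the smoothing constant `eps` is an implementation assumption (the patent does not specify one) added to avoid log(0) and division by zero.

```python
import numpy as np

def joint_loss(p, y, lam1=0.6, lam2=0.4, eps=1e-7):
    """Weighted sum of pixel-wise binary cross-entropy and a soft Dice
    loss over flattened probability map p and binary ground truth y."""
    p = np.clip(p, eps, 1.0 - eps)                               # avoid log(0)
    bce = -np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))  # local term
    dice = 1.0 - 2.0 * np.sum(p * y) / (np.sum(p) + np.sum(y) + eps)  # global term
    return lam1 * bce + lam2 * dice
```

A perfect prediction drives both terms toward 0, while an inverted prediction is penalized by both the pixel-level and the overlap-based term.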
As a refinement of step S4, reasonable parameter settings such as batch size, optimizer and learning rate are adopted during network training; in this embodiment, the batch size is 4, the optimizer is Adam, and the learning rate is 0.001, with the joint loss function L as the optimization target so that the model attends to local detail and global similarity simultaneously. Training continues until the segmentation performance on the validation set no longer improves. Finally, the model parameters with optimal performance are saved, in preparation for subsequent clinical application.
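The stopping rule ("until validation performance no longer improves") can be sketched as a small tracker; the patience value is an assumption, since the text does not state how many stagnant epochs are tolerated.

```python
class EarlyStopper:
    """Tracks a validation metric (e.g. Dice score) and signals a stop
    after `patience` consecutive epochs without improvement."""

    def __init__(self, patience=3):
        self.patience = patience
        self.best = float("-inf")
        self.bad_epochs = 0

    def step(self, val_metric):
        """Record one epoch's validation metric; return True to stop."""
        if val_metric > self.best:
            self.best = val_metric     # save point for best model weights
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience
```

The epoch that updates `best` is where the model parameters would be checkpointed for later deployment.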
In practical application, the trained model is deployed and integrated to a server side of the medical image analysis system. After uploading new CT image data of the patient, the system automatically performs necessary preprocessing on the image, adjusts the image to be consistent with training data, and then inputs the image into a segmentation model for forward reasoning. The model outputs a high-quality segmentation probability map, generates a final binary segmentation contour based on the probability map, and outputs a visualized segmentation result after fusion with the original image data.
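The deployment-time preprocessing ("adjusted to be consistent with training data") might look like the sketch below for abdominal CT: clip to a Hounsfield-unit window and scale to [0, 1]. The window bounds are an assumption for illustration — the patent does not specify them — and any resizing to the 512×512 training resolution is omitted.

```python
import numpy as np

def preprocess_ct(slice_hu, window=(-100.0, 400.0)):
    """Clip a CT slice (in Hounsfield units) to a soft-tissue window and
    normalize to [0, 1]. The (-100, 400) window is an illustrative
    assumption, not a value from the patent."""
    lo, hi = window
    x = np.clip(slice_hu.astype(np.float64), lo, hi)
    return (x - lo) / (hi - lo)
```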
The segmentation results can be widely applied to clinical tumor detection, three-dimensional reconstruction, surgical navigation, efficacy evaluation, volume calculation and other tasks, greatly improving diagnosis and treatment efficiency. Moreover, the system can continuously optimize the segmentation model: new annotated samples collected during practical use can be added to the training set for periodic incremental learning or fine-tuning, and strategies such as multi-model ensembling can further improve segmentation performance; the attention module weights can be adjusted according to feedback, and techniques such as semi-supervised/weakly supervised learning can be introduced, giving the model stronger adaptability and robustness and maintaining high segmentation accuracy over the long term.
Example 2
The embodiment of the disclosure also provides a CT image tumor segmentation device combining a convolution network and a transducer, which comprises a processor and a memory. Optionally, the apparatus may further comprise a communication interface (CommunicationInterface) and a bus. The processor, the communication interface and the memory can complete communication with each other through the bus. The communication interface may be used for information transfer. The processor may invoke logic instructions in the memory to perform the CT image tumor segmentation method of the above-described embodiments that combines a convolutional network and a transducer.
Further, the logic instructions in the memory described above may be implemented in the form of software functional units and stored in a computer-readable storage medium when sold or used as a stand-alone product.
The memory is used as a computer readable storage medium for storing a software program, a computer executable program, and program instructions/modules corresponding to the methods in the embodiments of the present disclosure. The processor executes the program instructions/modules stored in the memory to perform the functional application and data processing, i.e., to implement the CT image tumor segmentation method in combination with the convolutional network and the transducer in the above embodiments.
The memory may include a program storage area and a data storage area, wherein the program storage area may store an operating system, at least one application program required for a function; the storage data area may store data created according to the use of the terminal device, etc. Further, the memory may include a high-speed random access memory, and may also include a nonvolatile memory.
Embodiments of the present disclosure provide a computer readable storage medium storing computer executable instructions configured to perform the above-described CT image tumor segmentation method combining a convolutional network and a transducer.
The computer readable storage medium may be a transitory computer readable storage medium or a non-transitory computer readable storage medium.
Embodiments of the present disclosure may be embodied in a software product stored on a storage medium, including one or more instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to perform all or part of the steps of a method according to embodiments of the present disclosure. And the aforementioned storage medium may be a non-transitory storage medium including: a plurality of media capable of storing program codes, such as a usb disk, a removable hard disk, a Read-only memory (ROM), a random access memory (RAM, randomAccessMemory), a magnetic disk, or an optical disk, or a transitory storage medium.
Finally, it should be noted that: the foregoing description is only a preferred embodiment of the present invention, and the present invention is not limited thereto, but it is to be understood that modifications and equivalents of some of the technical features described in the foregoing embodiments may be made by those skilled in the art, although the present invention has been described in detail with reference to the foregoing embodiments. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
Claims (9)
1. A CT image tumor segmentation method combining a convolutional network and a transducer, comprising:
collecting an original data set, preprocessing the data set, and dividing the data set into a training set, a verification set and a test set;
A CT image tumor segmentation network is established, wherein the CT image tumor segmentation network comprises a basic coding module, an attention feature fusion module, a spatial pyramid pooling module and an attention gate module, wherein the basic coding module consists of a convolution network and a transducer network; the CT image sequentially passes through a basic coding module, an attention feature fusion module and a space pyramid pooling module, and then is rolled and upsampled to obtain a feature map The convolution network output feature map is sent to an attention gate module to obtain a feature map/>Feature map/>And/>Generating a segmentation prediction result through a convolution layer in cascade connection;
training a CT image tumor segmentation network by using training set data, and using a combined loss function combining binary cross entropy loss and Dice loss as an optimization target optimization model parameter;
and deploying a CT image tumor segmentation network after training, collecting CT images to be segmented, preprocessing, and inputting the CT image tumor segmentation network to obtain segmentation results.
2. The method for segmenting a tumor of a CT image combining a convolutional network and a fransformer according to claim 1, wherein the convolutional network encoder is composed of 5 standard convolutional blocks, each block comprises a convolutional layer, a BN layer and an active layer, the input CT image is encoded, and local underlying visual features are extracted; the transducer encoder consists of 3 standard transducer encoder layers, models an input CT image through a self-attention mechanism, and captures long-range dependency and global context information.
3. The CT image tumor segmentation method combining convolutional network and transducer according to claim 1, wherein the attention feature fusion module processes the feature map as follows:
Feature map of outputs of convolutional and transform networks And/>Splicing in the channel dimension to obtain a feature map/>;
Map the characteristic mapAlternately inputting the attention weighting modules; the attention weighting module comprises two paths, wherein a feature map/>, in one pathSequentially passing through convolution, batch normalization, relu function blocks, convolution and batch normalization blocks, and feature map/>, in another pathSequentially executing pooling operation, convolution, batch normalization and Relu function blocks, splicing output feature graphs of two paths, and acquiring weights by using a Sigmoid function; and performing point multiplication weighting operation on the output of the convolution network in the basic coding module by using the weight, and performing point multiplication weighting operation on the output of the transform network by using the output obtained by the point multiplication weighting.
4. The CT image tumor segmentation method combining a convolutional network and a transducer of claim 1, wherein the spatial pyramid pooling module comprises 5 parallel convolutional branch paths: the first branch path uses a standard 1 multiplied by 1 convolution check input feature map to carry out convolution, and low-level fine-grained local features are obtained; the second branch path is convolved with a hole with an expansion rate of 6; the third branch path uses hole convolution with expansion rate of 6; the fourth branch path uses hole convolution with expansion rate of 6; the fifth path sequentially comprises a global average pooling layer, 1×1 convolution and bilinear interpolation; the output feature graphs of the five paths are spliced in the channel dimension, and multi-scale convolution features are fused; and carrying out channel dimension reduction on the spliced feature map through 1X 1 convolution, and outputting a final multi-scale fusion feature representation.
5. The CT image tumor segmentation method combining convolutional network and transducer as recited in claim 1, wherein the attention gate module takes a low-level feature map of a shallow layer of the convolutional network as a feature input and takes the feature map as a feature inputAs gating signal input, the signals are spliced after being respectively subjected to convolution and batch normalization; the spliced feature images are subjected to convolution, batch normalization and Relu function blocks, and then a Sigmoid function is used for generating an attention coefficient feature image; the spatial regions of the shallow low-level features are adaptively weighted by the attention coefficient feature map.
6. The CT image tumor segmentation method combining convolutional network and transducer of claim 1, wherein the joint loss functionThe following are provided:
,
,
,
Wherein, Representation/>Weight coefficient of loss,/>Representation/>Weight coefficient of loss,/>Is true mark,/>Output probability for model,/>And/>Pixels of the tumor region and the model prediction region are truly labeled.
7. The method for CT image segmentation as recited in claim 6, wherein the CT image segmentation method comprises the steps of,The value is 0.6,/>The value is 0.4.
8. A CT image tumor segmentation apparatus incorporating a convolutional network and a transducer comprising a processor and a memory storing program instructions, wherein the processor is configured to perform the CT image tumor segmentation method incorporating a convolutional network and a transducer as claimed in any one of claims 1-7 when executing the program instructions.
9. A computer readable storage medium having stored thereon a computer program for execution by a processor of a CT image tumor segmentation method combining a convolutional network and a transducer according to any one of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410641655.0A CN118229981A (en) | 2024-05-23 | 2024-05-23 | CT image tumor segmentation method, device and medium combining convolutional network and transducer |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410641655.0A CN118229981A (en) | 2024-05-23 | 2024-05-23 | CT image tumor segmentation method, device and medium combining convolutional network and transducer |
Publications (1)
Publication Number | Publication Date |
---|---|
CN118229981A true CN118229981A (en) | 2024-06-21 |
Family
ID=91508806
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202410641655.0A Pending CN118229981A (en) | 2024-05-23 | 2024-05-23 | CT image tumor segmentation method, device and medium combining convolutional network and transducer |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN118229981A (en) |
Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2021104056A1 (en) * | 2019-11-27 | 2021-06-03 | 中国科学院深圳先进技术研究院 | Automatic tumor segmentation system and method, and electronic device |
CN113469119A (en) * | 2021-07-20 | 2021-10-01 | 合肥工业大学 | Cervical cell image classification method based on visual converter and graph convolution network |
CN113837193A (en) * | 2021-09-23 | 2021-12-24 | 中南大学 | Zinc flotation froth image segmentation algorithm based on improved U-Net network |
WO2022141723A1 (en) * | 2020-12-29 | 2022-07-07 | 江苏大学 | Image classification and segmentation apparatus and method based on feature guided network, and device and medium |
CN115641340A (en) * | 2022-09-07 | 2023-01-24 | 闽江学院 | Retina blood vessel image segmentation method based on multi-scale attention gating network |
CN116309278A (en) * | 2022-12-16 | 2023-06-23 | 安徽大学 | Medical image segmentation model and method based on multi-scale context awareness |
CN116309615A (en) * | 2023-01-09 | 2023-06-23 | 西南科技大学 | Multi-mode MRI brain tumor image segmentation method |
CN116739985A (en) * | 2023-05-10 | 2023-09-12 | 浙江医院 | Pulmonary CT image segmentation method based on transducer and convolutional neural network |
KR20230147492A (en) * | 2022-04-14 | 2023-10-23 | 한국교통대학교산학협력단 | Method and apparatus for segmenting brain tumor regions in brain magnetic resonance image based on deep learning |
WO2024000161A1 (en) * | 2022-06-28 | 2024-01-04 | 中国科学院深圳先进技术研究院 | Ct pancreatic tumor automatic segmentation method and system, terminal and storage medium |
CN117523204A (en) * | 2023-11-28 | 2024-02-06 | 辽宁科技大学 | Liver tumor image segmentation method and device oriented to medical scene and readable storage medium |
-
2024
- 2024-05-23 CN CN202410641655.0A patent/CN118229981A/en active Pending
Patent Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2021104056A1 (en) * | 2019-11-27 | 2021-06-03 | 中国科学院深圳先进技术研究院 | Automatic tumor segmentation system and method, and electronic device |
WO2022141723A1 (en) * | 2020-12-29 | 2022-07-07 | 江苏大学 | Image classification and segmentation apparatus and method based on feature guided network, and device and medium |
CN113469119A (en) * | 2021-07-20 | 2021-10-01 | 合肥工业大学 | Cervical cell image classification method based on visual converter and graph convolution network |
CN113837193A (en) * | 2021-09-23 | 2021-12-24 | 中南大学 | Zinc flotation froth image segmentation algorithm based on improved U-Net network |
KR20230147492A (en) * | 2022-04-14 | 2023-10-23 | 한국교통대학교산학협력단 | Method and apparatus for segmenting brain tumor regions in brain magnetic resonance image based on deep learning |
WO2024000161A1 (en) * | 2022-06-28 | 2024-01-04 | 中国科学院深圳先进技术研究院 | Ct pancreatic tumor automatic segmentation method and system, terminal and storage medium |
CN115641340A (en) * | 2022-09-07 | 2023-01-24 | 闽江学院 | Retina blood vessel image segmentation method based on multi-scale attention gating network |
CN116309278A (en) * | 2022-12-16 | 2023-06-23 | 安徽大学 | Medical image segmentation model and method based on multi-scale context awareness |
CN116309615A (en) * | 2023-01-09 | 2023-06-23 | 西南科技大学 | Multi-mode MRI brain tumor image segmentation method |
CN116739985A (en) * | 2023-05-10 | 2023-09-12 | 浙江医院 | Pulmonary CT image segmentation method based on transducer and convolutional neural network |
CN117523204A (en) * | 2023-11-28 | 2024-02-06 | 辽宁科技大学 | Liver tumor image segmentation method and device oriented to medical scene and readable storage medium |
Non-Patent Citations (1)
Title |
---|
郝晓宇;熊俊峰;薛旭东;石军;文可;韩文廷;李骁扬;赵俊;傅小龙;: "融合双注意力机制3D U-Net的肺肿瘤分割", 中国图象图形学报, no. 10, 16 October 2020 (2020-10-16) * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN113077471B (en) | Medical image segmentation method based on U-shaped network | |
CN111627019B (en) | Liver tumor segmentation method and system based on convolutional neural network | |
CN109598727B (en) | CT image lung parenchyma three-dimensional semantic segmentation method based on deep neural network | |
CN111798462B (en) | Automatic delineation method of nasopharyngeal carcinoma radiotherapy target area based on CT image | |
CN111784671B (en) | Pathological image focus region detection method based on multi-scale deep learning | |
CN111612754B (en) | MRI tumor optimization segmentation method and system based on multi-modal image fusion | |
CN111738363B (en) | Alzheimer disease classification method based on improved 3D CNN network | |
CN110889853A (en) | Tumor segmentation method based on residual error-attention deep neural network | |
CN110889852A (en) | Liver segmentation method based on residual error-attention deep neural network | |
CN113034505B (en) | Glandular cell image segmentation method and glandular cell image segmentation device based on edge perception network | |
CN112258488A (en) | Medical image focus segmentation method | |
Popescu et al. | Retinal blood vessel segmentation using pix2pix gan | |
CN111260667A (en) | Neurofibroma segmentation method combined with space guidance | |
CN115375711A (en) | Image segmentation method of global context attention network based on multi-scale fusion | |
CN115311194A (en) | Automatic CT liver image segmentation method based on transformer and SE block | |
CN112750137A (en) | Liver tumor segmentation method and system based on deep learning | |
CN116309806A (en) | CSAI-Grid RCNN-based thyroid ultrasound image region of interest positioning method | |
CN112465754A (en) | 3D medical image segmentation method and device based on layered perception fusion and storage medium | |
CN115457057A (en) | Multi-scale feature fusion gland segmentation method adopting deep supervision strategy | |
CN115578406A (en) | CBCT jaw bone region segmentation method and system based on context fusion mechanism | |
CN112489048B (en) | Automatic optic nerve segmentation method based on depth network | |
CN117649385A (en) | Lung CT image segmentation method based on global and local attention mechanisms | |
CN113256657A (en) | Efficient medical image segmentation method and system, terminal and medium | |
CN112488996A (en) | Inhomogeneous three-dimensional esophageal cancer energy spectrum CT (computed tomography) weak supervision automatic labeling method and system | |
CN116542924A (en) | Prostate focus area detection method, device and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination |