CN116416261A - CT image super-resolution segmentation method assisted by super-resolution reconstruction - Google Patents
- Publication number
- CN116416261A (application CN202310682299.2A)
- Authority
- CN
- China
- Prior art keywords
- super
- resolution
- reconstruction
- segmentation
- feature
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/11—Region-based segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0499—Feedforward networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T11/00—2D [Two Dimensional] image generation
- G06T11/003—Reconstruction from projections, e.g. tomography
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformation in the plane of the image
- G06T3/40—Scaling the whole image or part thereof
- G06T3/4007—Interpolation-based scaling, e.g. bilinear interpolation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformation in the plane of the image
- G06T3/40—Scaling the whole image or part thereof
- G06T3/4046—Scaling the whole image or part thereof using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformation in the plane of the image
- G06T3/40—Scaling the whole image or part thereof
- G06T3/4053—Super resolution, i.e. output image resolution higher than sensor resolution
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10072—Tomographic images
- G06T2207/10081—Computed x-ray tomography [CT]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
Abstract
The invention discloses a CT image super-resolution segmentation method assisted by super-resolution reconstruction, which comprises the following steps: down-sampling the original CT image by a factor of 4 with a bicubic algorithm; inputting the resulting low-resolution image I_lr into an encoder and performing super-resolution reconstruction and super-resolution segmentation through two independent decoder branches; extracting multi-scale features with a multi-scale fusion module (MSFB) and passing the intermediate features of the encoder to the decoders; fusing the intermediate features of the encoder and the decoders with a dual-channel attention module (DCAB); and optimizing the model with a loss function.
Description
Technical Field
The invention belongs to the technical field of medical image processing, relates to the technical field of CT image reconstruction and segmentation, and particularly relates to a CT image super-resolution segmentation method assisted by super-resolution reconstruction.
Background
In clinical diagnosis, it is important to perform super-resolution (SR) segmentation on medical scan images. Existing segmentation techniques aim at segmenting regions of interest, such as vital organs or infected regions, from medical images, thereby obtaining important information about the size, shape and location of the region. However, for some regions where the anatomy is complex, the segmentation mask of the original resolution may not accurately express the segmented region, and thus it is necessary to predict a segmentation mask of high resolution from a CT image of low resolution using the SR segmentation method. However, the low resolution image contains limited detailed information that is insufficient to support the prediction of a precise high resolution segmentation mask. Therefore, we consider predicting a corresponding high resolution CT from a low resolution CT using SR reconstruction techniques, with low-level features restored during reconstruction, such as texture and edges, to aid in predicting the high resolution segmentation mask.
Existing methods that use SR reconstruction to assist SR segmentation fall mainly into two categories. The first treats SR reconstruction as an image pre-processing step, which ignores the correlation and complementarity between the SR reconstruction and SR segmentation tasks: in fact, not only can the detail information from the SR reconstruction process help the SR segmentation process generate a more accurate segmentation mask, but the abstract semantic information provided by the SR segmentation process can also guide the SR reconstruction process to generate texture details closer to the real distribution. The second combines the SR reconstruction model and the SR segmentation model in series, which allows the two processes to interact and adjust to each other, but the interaction between them is still insufficient; moreover, the serial approach can lead to an accumulation of errors. A method that effectively combines SR reconstruction and SR segmentation so that the two processes promote each other is therefore still lacking.
Disclosure of Invention
The invention aims to: provide a CT image super-resolution segmentation method assisted by super-resolution reconstruction. The method exploits the complementarity between the SR reconstruction and SR segmentation tasks: detail features such as textures and edges restored during SR reconstruction help the SR segmentation process predict a more accurate segmentation mask, while abstract semantic features extracted during SR segmentation guide the SR reconstruction process to generate texture details closer to the real distribution. In addition, because the size of regions of interest in CT images varies greatly, the method uses multi-scale large-kernel convolutions to extract multi-scale features, further improving reconstruction and segmentation performance.
The technical scheme is as follows: in order to achieve the above object, the present invention provides a method for CT image super-resolution segmentation with super-resolution reconstruction assistance, comprising the steps of:
s1: downsampling the original CT image using a bicubic interpolation algorithm to downsample the original CT image 4 times to a low resolution imageUse of the original CT image +.>And split tag->Respectively used as a super-resolution reconstruction tag and a super-resolution segmentation tag;
s2: the low resolution image in step S1Inputting the super-resolution reconstruction and super-resolution segmentation into an encoder through two independent decoder branches;
s3: the multi-scale features are extracted using a multi-scale fusion Module (MSFB) and the intermediate features generated during the encoding in step S2 are passed to a decoder.
S4: the intermediate features of the encoder and decoder in step S2 are fused using a two-channel attention module (DCAB).
S5: the model described above is optimized by a loss function.
The scheme makes full use of the complementarity between the SR reconstruction and SR segmentation tasks: detail features such as textures and edges restored during SR reconstruction help the SR segmentation process predict a more accurate segmentation mask, and abstract semantic features extracted during SR segmentation guide the SR reconstruction process to generate texture details closer to the real distribution.
Further, in step S2 a common encoder is used to extract features, after which features suited to each task are extracted through independent decoder branches. Specifically, the intermediate features of the reconstruction-branch decoder contain more low-semantic detail features such as edges and textures, while the intermediate features of the segmentation-branch decoder contain more abstract high-level features. The procedure is as follows: a given input I_lr is first encoded by three serial convolution modules, each comprising two convolution-ReLU-BN layers, with max-pooling used for downsampling. I_lr is processed by the first convolution module to obtain the first coding feature; this feature passes through a downsampling layer and the second convolution module to produce the second coding feature; continuing in the same way yields the third coding feature and the bottleneck feature. The bottleneck feature is fed to two decoders of identical structure, each comprising three serial upsampling layers and convolution modules, the upsampling layers using bilinear interpolation. Taking the SR segmentation-branch decoder as an example, the bottleneck feature is input to the decoder, and an upsampling layer followed by a convolution module yields the third segmentation feature; proceeding likewise yields the second and first segmentation features, and, in the SR reconstruction-branch decoder, the third, second and first reconstruction features. The common encoder exploits the correlation and complementarity between the reconstruction and segmentation tasks to perform a preliminary fusion of reconstruction and segmentation features, while the independent decoder branches account for the differences between the tasks and avoid negative interference between them.
Further, in step S3 the MSFB modules fuse the features from each encoder layer, extract multi-scale features, and send the result to the decoders; each branch contains three parallel MSFB modules. Specifically, the first MSFB module of the SR segmentation branch interpolates the first, second and third coding features to the same size and concatenates them into a spliced feature; three parallel large-kernel convolutions with kernel sizes of 9 x 9, 27 x 27 and 3 x 3 extract the first multi-scale segmentation residual, which a feed-forward network (FFN) further adjusts into the first segmentation residual; this residual is concatenated with the corresponding decoder feature and fed to the subsequent modules of the segmentation decoder. In the same way, the remaining MSFB modules extract the second and third segmentation residuals and the first, second and third reconstruction residuals, each spliced with the corresponding decoder intermediate feature. These residual features contain multi-scale information, enabling the model to better cope with the large variation in the size of organ or lesion regions in medical images, and they supplement the information the encoder loses during downsampling.
Further, the MSFB module in step S3 uses large-kernel convolutions, each of which is decomposed in this method into three smaller serial convolutions: a depthwise convolution (DWConv), a dilated depthwise convolution (DWDConv), and a pointwise convolution (PWConv). To realise a 9 x 9 large-kernel convolution, the input passes in sequence through a 3 x 3 DWConv, a 5 x 5 DWDConv and a PWConv, yielding the first-scale feature; to realise a 27 x 27 large-kernel convolution, the input passes in sequence through a 5 x 5 DWConv, a 7 x 7 DWDConv and a PWConv, yielding the second-scale feature; the 3 x 3 convolution is not decomposed and yields the third-scale feature. The three scale features are concatenated and fused, and the result is sent to the subsequent modules of the MSFB. Large-kernel convolutions give the model a larger receptive field and thus better performance, but their high computational cost hinders the deployment of the algorithm; by decomposing each large-kernel convolution into three serial convolutions, our method effectively reduces this overhead.
Further, step S4 uses the DCAB module to fuse the features of the reconstruction and segmentation branches with a cross-attention mechanism. The segmentation feature, reconstruction feature and input feature are each adjusted by a convolution module to obtain the segmentation fusion feature, the reconstruction fusion feature and the input fusion feature. To supplement the segmentation fusion feature with detail information while avoiding a negative influence on the reconstruction feature, the input fusion feature is added to the segmentation fusion feature to obtain a new segmentation fusion feature. The segmentation and reconstruction fusion features are then mapped separately into segmentation query, key and value features and reconstruction query, key and value features, which are fused by a cross-attention operation: each branch's query attends to the other branch's keys and values, the attention weights being softmax-normalised inner products scaled by the square root of the feature dimension d. The fusion results pass through a locally enhanced feed-forward network (LEFF) to obtain the new segmentation fusion feature and reconstruction fusion feature; to keep the features stable, these outputs are added back to the segmentation and reconstruction fusion features, respectively. The module fully exploits the complementarity between the reconstruction and segmentation features: detail features such as edges and textures in the reconstruction features supplement the low-resolution input and help the segmentation branch accurately predict a high-resolution segmentation mask, while abstract semantic features from the segmentation process guide the reconstruction branch to generate more realistic detail features.
Further, in step S5 the outputs of the reconstruction and segmentation branches are up-sampled by PixelShuffle and the loss functions are computed against the corresponding labels: the L1 loss is computed between the reconstruction output and the real super-resolution reconstruction label, and the cross-entropy loss and Dice loss are computed between the segmentation output and the super-resolution segmentation label. To balance the loss functions of the two tasks, a dynamic adjustment mechanism computes dynamically changing scale factors from the reconstruction loss and the segmentation loss and weights the two losses accordingly to obtain the final loss function.
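The training objective above can be sketched as follows. The text describes the dynamic scale factors only qualitatively, so the loss-ratio weighting below is an assumption, as are the function names:

```python
import torch
import torch.nn.functional as F

def dice_loss(probs: torch.Tensor, target: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Soft Dice loss; probs are per-pixel probabilities, target a same-shaped mask."""
    inter = (probs * target).sum()
    return 1.0 - (2.0 * inter + eps) / (probs.sum() + target.sum() + eps)

def combined_loss(sr_pred, sr_label, seg_probs, seg_label):
    """L1 for SR reconstruction, cross-entropy + Dice for SR segmentation,
    balanced by dynamically changing scale factors (assumed: loss-ratio weights)."""
    l_rec = F.l1_loss(sr_pred, sr_label)
    l_seg = F.binary_cross_entropy(seg_probs, seg_label) + dice_loss(seg_probs, seg_label)
    with torch.no_grad():  # the weights track current loss magnitudes, no gradient
        total = l_rec + l_seg + 1e-8
        w_rec, w_seg = l_rec / total, l_seg / total
    return w_rec * l_rec + w_seg * l_seg
```

In this sketch each task's weight follows its current loss magnitude; the patent's exact weighting formula is not recoverable from the text.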
The beneficial effects are that: compared with the prior art, the invention has the following advantages:
1. the invention provides a CT image super-resolution segmentation method assisted by super-resolution reconstruction by utilizing complementarity between super-resolution reconstruction and super-resolution segmentation;
2. the invention fuses the features of each encoder layer with parallel large-kernel convolutions and extracts multi-scale features from them, so as to better handle the large variation in organ sizes in medical images;
3. the invention provides a DCAB module which utilizes cross-attention operation to effectively fuse SR reconstruction characteristics and SR segmentation characteristics, so that the performances of two tasks can be improved;
4. the present invention uses dynamic weights to balance the reconstruction and segmentation tasks, dynamically adjusting the loss function.
Drawings
FIG. 1 is a schematic flow chart of a CT image super-resolution segmentation model assisted by super-resolution reconstruction;
FIG. 2 is a general frame structure diagram of a CT image super-resolution segmentation model assisted by super-resolution reconstruction provided by the invention;
FIG. 3 is a schematic diagram of a dual channel attention module (DCAB) topology according to the present invention;
FIG. 4 is a graph showing the comparison of the results of super-resolution segmentation;
fig. 5 is a comparative graph of the results of super-resolution reconstruction.
Detailed Description
The present invention is further illustrated by the accompanying drawings and the following detailed description, which are to be understood as merely illustrative of the invention and not limiting of its scope; various equivalent modifications that occur to those skilled in the art upon reading the invention fall within the scope defined by the appended claims.
Examples: the super-resolution segmentation task and the super-resolution reconstruction task have great relevance and complementarity. The super-resolution reconstruction task can gradually restore detail characteristics in the reconstruction process, supplement limited detail information in the input low-resolution CT image, and help the super-resolution segmentation task to more accurately predict the segmentation mask; the super-resolution segmentation can extract abstracted semantic information and guide the reconstruction process to generate texture details which are more in line with real distribution. The method uses two parallel branches to process reconstruction and segmentation tasks simultaneously, and designs a special fusion module to effectively fuse the middle characteristics of different branches, so that the reconstruction and segmentation tasks are mutually promoted.
The invention comprises the following steps:
s1: downsampling the original CT image using a bicubic interpolation algorithm to downsample the original CT image 4 times to a low resolution imageUse of the original CT image +.>And split tag->Respectively used as a super-resolution reconstruction tag and a super-resolution segmentation tag;
the data set used contains the CT image and its corresponding segmentation label. To meet the requirement of the method, we downsampled the CT image 4 times by the bicubic interpolation algorithm in the traditional image processing algorithm, and take it as the low resolution input of the model.
S2: the low resolution image in step S1The encoder is input and super-resolution reconstruction and super-resolution segmentation are performed by two independent decoder branches, respectively.
As shown in fig. 2, the method uses a common encoder to extract features, performs a preliminary fusion of the SR reconstruction and SR segmentation features, and then extracts features suited to each task through independent decoder branches. Specifically, the intermediate features of the reconstruction-branch decoder contain more low-semantic detail features such as edges and textures, while the intermediate features of the segmentation-branch decoder contain more abstract high-level features. A given input I_lr is first encoded by three serial convolution modules, each comprising two convolution-ReLU-BN layers, with max-pooling used for downsampling. I_lr is processed by the first convolution module to obtain the first coding feature; this feature passes through a downsampling layer and the second convolution module to produce the second coding feature; continuing in the same way yields the third coding feature and the bottleneck feature. The bottleneck feature is fed to two decoders of identical structure, each comprising three serial upsampling layers and convolution modules, the upsampling layers using bilinear interpolation. Taking the SR segmentation-branch decoder as an example, the bottleneck feature is input to the decoder, and an upsampling layer followed by a convolution module yields the third segmentation feature; proceeding likewise yields the second and first segmentation features, and, in the SR reconstruction-branch decoder, the third, second and first reconstruction features. The common encoder exploits the correlation and complementarity between the reconstruction and segmentation tasks to perform a preliminary fusion of reconstruction and segmentation features, while the independent decoder branches account for the differences between the tasks and avoid negative interference between them.
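A minimal sketch of this shared-encoder/dual-decoder layout. The channel widths, the 3 x 3 kernel size, and the omission of the MSFB/DCAB connections are assumptions of the sketch, not details from the text:

```python
import torch
import torch.nn as nn

def conv_block(c_in: int, c_out: int) -> nn.Sequential:
    """Two convolution-ReLU-BN layers, as described for each encoder/decoder module."""
    layers = []
    for cin, cout in [(c_in, c_out), (c_out, c_out)]:
        layers += [nn.Conv2d(cin, cout, 3, padding=1),
                   nn.ReLU(inplace=True),
                   nn.BatchNorm2d(cout)]
    return nn.Sequential(*layers)

class SharedEncoderDualDecoder(nn.Module):
    """Shared encoder with two independent decoder branches (segmentation / reconstruction)."""
    def __init__(self, c: int = 32):
        super().__init__()
        self.enc1, self.enc2, self.enc3 = conv_block(1, c), conv_block(c, 2 * c), conv_block(2 * c, 4 * c)
        self.pool = nn.MaxPool2d(2)                 # max-pooling downsampling
        self.bottleneck = conv_block(4 * c, 8 * c)
        def decoder():
            return nn.ModuleList([conv_block(8 * c, 4 * c),
                                  conv_block(4 * c, 2 * c),
                                  conv_block(2 * c, c)])
        self.dec_seg, self.dec_rec = decoder(), decoder()
        self.up = nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False)

    def forward(self, x):
        f1 = self.enc1(x)                    # first coding feature
        f2 = self.enc2(self.pool(f1))        # second coding feature
        f3 = self.enc3(self.pool(f2))        # third coding feature
        fb = self.bottleneck(self.pool(f3))  # bottleneck feature
        seg, rec = fb, fb
        seg_feats, rec_feats = [], []
        for ds, dr in zip(self.dec_seg, self.dec_rec):
            seg = ds(self.up(seg)); seg_feats.append(seg)
            rec = dr(self.up(rec)); rec_feats.append(rec)
        return seg_feats, rec_feats          # third -> first segmentation / reconstruction features
```

In the full model the MSFB residuals and DCAB fusion features would be concatenated into each decoder stage; the sketch keeps only the backbone.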
S3: the multi-scale features are extracted using a multi-scale fusion Module (MSFB) and the intermediate features generated during the encoding in step S2 are passed to a decoder.
As shown in fig. 2, the method uses the MSFB modules to fuse the features from each encoder layer, extract multi-scale features, and send the result to the decoders. The first MSFB module of the SR segmentation branch interpolates the first, second and third coding features to the same size and concatenates them into a spliced feature; three parallel large-kernel convolutions with kernel sizes of 9 x 9, 27 x 27 and 3 x 3 extract the first multi-scale segmentation residual, which a feed-forward network (FFN) further adjusts into the first segmentation residual; this residual is concatenated with the corresponding decoder feature and fed to the subsequent modules of the segmentation decoder. In the same way, the remaining MSFB modules extract the second and third segmentation residuals and the first, second and third reconstruction residuals, each spliced with the corresponding decoder intermediate feature. These residual features contain multi-scale information, enabling the model to better cope with the large variation in the size of organ or lesion regions in medical images, and they supplement the information the encoder loses during downsampling.
The MSFB module in step S3 uses large-kernel convolutions, each of which is decomposed in this method into three smaller serial convolutions: a depthwise convolution (DWConv), a dilated depthwise convolution (DWDConv), and a pointwise convolution (PWConv). To realise a 9 x 9 large-kernel convolution, the input passes in sequence through a 3 x 3 DWConv, a 5 x 5 DWDConv and a PWConv, yielding the first-scale feature; to realise a 27 x 27 large-kernel convolution, the input passes in sequence through a 5 x 5 DWConv, a 7 x 7 DWDConv and a PWConv, yielding the second-scale feature; the 3 x 3 convolution is not decomposed and yields the third-scale feature. The three scale features are concatenated and fused, and the result is sent to the subsequent modules of the MSFB. Large-kernel convolutions give the model a larger receptive field and thus better performance, but their high computational cost hinders the deployment of the algorithm; by decomposing each large-kernel convolution into three serial convolutions, our method effectively reduces this overhead.
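The decomposition can be sketched as follows. The dilation rates are not given in the text, so dilation = 2 for the 9 x 9 path is an assumption; comparing parameter counts against the dense 9 x 9 convolution it replaces illustrates the claimed saving:

```python
import torch
import torch.nn as nn

class DecomposedLargeKernel(nn.Module):
    """A large-kernel convolution approximated by three serial convolutions:
    depthwise (DWConv) -> dilated depthwise (DWDConv) -> pointwise (PWConv)."""
    def __init__(self, channels: int, dw_k: int, dwd_k: int, dilation: int):
        super().__init__()
        self.dw = nn.Conv2d(channels, channels, dw_k,
                            padding=dw_k // 2, groups=channels)
        self.dwd = nn.Conv2d(channels, channels, dwd_k,
                             padding=dilation * (dwd_k // 2),
                             dilation=dilation, groups=channels)
        self.pw = nn.Conv2d(channels, channels, 1)  # mixes channels pointwise

    def forward(self, x):
        return self.pw(self.dwd(self.dw(x)))

def n_params(module: nn.Module) -> int:
    return sum(p.numel() for p in module.parameters())

# 9x9 path of the MSFB: 3x3 DWConv -> 5x5 DWDConv -> PWConv (dilation assumed).
nine = DecomposedLargeKernel(channels=32, dw_k=3, dwd_k=5, dilation=2)
dense = nn.Conv2d(32, 32, 9, padding=4)  # the dense convolution it replaces
```

For 32 channels the decomposition needs roughly 2.2k parameters against roughly 83k for the dense 9 x 9 convolution, which is the overhead reduction the text refers to.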
S4: the intermediate features of the encoder and decoder in step S2 are fused using a two-channel attention module (DCAB).
As shown in fig. 3, the DCAB module fuses the features of the reconstruction and segmentation branches using a cross-attention mechanism.
The segmentation feature, reconstruction feature and input feature are each adjusted by a convolution module to obtain the segmentation fusion feature, the reconstruction fusion feature and the input fusion feature. To supplement the segmentation fusion feature with detail information while avoiding a negative influence on the reconstruction feature, the input fusion feature is added to the segmentation fusion feature to obtain a new segmentation fusion feature. The segmentation and reconstruction fusion features are then mapped separately into segmentation query, key and value features and reconstruction query, key and value features, which are fused by a cross-attention operation: each branch's query attends to the other branch's keys and values, the attention weights being softmax-normalised inner products scaled by the square root of the feature dimension d. The fusion results pass through a locally enhanced feed-forward network (LEFF) to obtain the new segmentation fusion feature and reconstruction fusion feature; to keep the features stable, these outputs are added back to the segmentation and reconstruction fusion features, respectively. The module fully exploits the complementarity between the reconstruction and segmentation features: detail features such as edges and textures in the reconstruction features supplement the low-resolution input and help the segmentation branch accurately predict a high-resolution segmentation mask, while abstract semantic features from the segmentation process guide the reconstruction branch to generate more realistic detail features.
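The cross-attention fusion at the heart of the DCAB can be sketched with plain arrays; the token features below are random stand-ins for the projected segmentation and reconstruction fusion features, not learned weights:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(q, k, v):
    """softmax(Q K^T / sqrt(d)) V - queries from one branch, keys/values from the other."""
    d = q.shape[-1]
    attn = softmax(q @ k.T / np.sqrt(d))
    return attn @ v, attn

# N tokens with d channels per branch (stand-ins for the Q/K/V projections).
n, d = 6, 16
q_seg, k_rec, v_rec = (rng.normal(size=(n, d)) for _ in range(3))
q_rec, k_seg, v_seg = (rng.normal(size=(n, d)) for _ in range(3))

seg_fused, attn_s = cross_attention(q_seg, k_rec, v_rec)  # segmentation attends to reconstruction
rec_fused, attn_r = cross_attention(q_rec, k_seg, v_seg)  # reconstruction attends to segmentation
```

In the full module the fused outputs would pass through the LEFF and be added back to the branch features; the sketch shows only the attention exchange between the two branches.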
S5: the model described above is optimized by a loss function.
To demonstrate the effectiveness of the present invention, the present invention also provides the following comparative experiments:
in particular, the present invention uses the public SegTHOR dataset, a chest multi-organ segmentation dataset containing segmentation labels for 4 organs: heart, aorta, trachea and esophagus. The dataset contains CT scans of 40 patients; we randomly select 28 as the training set, 4 as the validation set and 8 as the test set. Before formal training, we clip the Hu values to [-128, 384] and pre-process the data as described in step S1. Training uses the AdamW optimizer with an initial learning rate of 0.001 for a total of 150 epochs.
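The Hu-value clipping can be sketched as below; the rescaling to [0, 1] afterwards is an assumed normalisation step, not stated in the text:

```python
import numpy as np

def preprocess_ct(hu: np.ndarray, lo: float = -128.0, hi: float = 384.0) -> np.ndarray:
    """Clip Hounsfield units to [-128, 384] as in the experiments, then
    rescale to [0, 1] (assumed normalisation)."""
    v = np.clip(hu.astype(np.float32), lo, hi)
    return (v - lo) / (hi - lo)
```

The clipped volume would then be bicubically down-sampled by a factor of 4 to form the model input, as in step S1.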
To verify the effectiveness of the method for reconstruction and segmentation, we compare it with current state-of-the-art segmentation and reconstruction algorithms. On the segmentation task, the experimental results of the method are compared with CPFNet, KiU-Net, UNet++ and UCTransNet, using the Dice coefficient and HD95 as evaluation metrics; the comparison results are shown in Table 1, where the method shows a clear improvement in segmentation performance over the other methods. On the reconstruction task, the experimental results are compared with RDN, EDSR and NLSA, using PSNR and SSIM as evaluation metrics; the comparison results are shown in Table 2.
Table 1. Comparison of the method with other algorithms on the segmentation task; bolded data represent the best results. (Eso: esophagus, Hea: heart, Tra: trachea, Aor: aorta)
Table 2. Comparison of the method with other algorithms on the reconstruction task; bolded data represent the best results.
To demonstrate the effectiveness of the method intuitively, we also compare it with the other methods visually. FIG. 3 shows the segmentation results of the methods; it can be seen that the method segments the organs more accurately than the other methods. FIG. 4 shows the reconstruction results of the methods; as shown, thanks to the semantic information provided by the segmentation process, the method accurately restores the unclear boundary between the esophagus and the aorta.
Claims (6)
1. A CT image super-resolution segmentation method assisted by super-resolution reconstruction, characterized by comprising the following steps:
S1: downsampling the original CT image I_HR by a factor of 4 using a bicubic interpolation algorithm to obtain a low-resolution image I_LR, and using the original CT image I_HR and the segmentation label Y_seg as the super-resolution reconstruction label and the super-resolution segmentation label respectively;
S2: inputting the low-resolution image I_LR from step S1 into a shared encoder, and performing super-resolution reconstruction and super-resolution segmentation through two independent decoder branches;
S3: extracting multi-scale features using the multi-scale fusion module MSFB, and transmitting the intermediate features generated in the encoding process of step S2 to the decoders;
S4: fusing the intermediate features of the encoder and the decoders from step S2 using the dual-channel attention module DCAB;
S5: optimizing the above model through a loss function.
2. The super-resolution reconstruction-assisted CT image super-resolution segmentation method according to claim 1, wherein: in the step S2, features are extracted by a common encoder so that the SR reconstruction features and the SR segmentation features are preliminarily fused, and then features matching the respective tasks are extracted by independent decoder branches, specifically as follows: a given input I_LR is first encoded by three serial convolution modules, each comprising 2 layers of a convolution-ReLU activation-BN normalization stack, the downsampling layers being max-pooling layers; I_LR is processed by the first convolution module to obtain the first encoding feature F_1, which passes through a downsampling layer and the second convolution module to generate the second encoding feature F_2; and so on, the third encoding feature F_3 and the bottleneck feature F_4 are obtained. F_4 is sent to the two decoders, which are identical in structure, each comprising three serial upsampling layers and convolution modules, the upsampling layers using bilinear interpolation.
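A minimal PyTorch sketch of the shared-encoder / dual-decoder layout of claim 2; the channel widths and the 3×3 kernel size are illustrative assumptions (the claim does not fix them), and the MSFB/DCAB connections of claims 3-5 are omitted:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def conv_block(c_in, c_out):
    # 2 x (convolution -> ReLU -> BN), per claim 2; 3x3 kernels are assumed
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, 3, padding=1), nn.ReLU(inplace=True), nn.BatchNorm2d(c_out),
        nn.Conv2d(c_out, c_out, 3, padding=1), nn.ReLU(inplace=True), nn.BatchNorm2d(c_out),
    )

class SharedEncoderDualDecoder(nn.Module):
    """Shared encoder followed by two structurally identical decoders
    (one for SR reconstruction, one for SR segmentation)."""
    def __init__(self, in_ch=1, widths=(32, 64, 128, 256), n_classes=5):
        super().__init__()
        w = widths
        self.enc1 = conv_block(in_ch, w[0])
        self.enc2 = conv_block(w[0], w[1])
        self.enc3 = conv_block(w[1], w[2])
        self.bottleneck = conv_block(w[2], w[3])
        def decoder(out_ch):
            return nn.ModuleList([conv_block(w[3], w[2]),
                                  conv_block(w[2], w[1]),
                                  conv_block(w[1], out_ch)])
        self.dec_rec = decoder(1)           # reconstruction branch
        self.dec_seg = decoder(n_classes)   # segmentation branch

    def forward(self, x):
        f1 = self.enc1(x)                          # first encoding feature F1
        f2 = self.enc2(F.max_pool2d(f1, 2))        # second encoding feature F2
        f3 = self.enc3(F.max_pool2d(f2, 2))        # third encoding feature F3
        f4 = self.bottleneck(F.max_pool2d(f3, 2))  # bottleneck feature F4
        outs = []
        for dec in (self.dec_rec, self.dec_seg):
            h = f4
            for block in dec:                      # three upsample+conv stages,
                h = block(F.interpolate(h, scale_factor=2,  # bilinear upsampling
                                        mode="bilinear", align_corners=False))
            outs.append(h)
        return outs[0], outs[1]
```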
3. The super-resolution reconstruction-assisted CT image super-resolution segmentation method according to claim 1, wherein: in the step S3, the MSFB modules fuse the features in each layer of the encoder, extract multi-scale features, and send the result to the decoders, each branch comprising three parallel MSFB modules. Taking the first MSFB module of the SR segmentation branch as an example: the module interpolates the encoding features F_1, F_2 and F_3 to the same size and concatenates them to obtain the concatenated feature F_cat; three parallel large-kernel convolution branches with different equivalent kernel sizes extract the first multi-scale segmentation residual from F_cat; a feed-forward neural network FFN further adjusts it to obtain the first segmentation residual, which is combined by concatenation with the corresponding decoder feature to obtain a new feature that is input to the subsequent module of the segmentation decoder. Similarly, the remaining MSFB modules extract the second segmentation residual, the third segmentation residual, and the first, second and third reconstruction residuals from F_1, F_2 and F_3, and concatenate them with the corresponding intermediate features of the decoders.
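The resize-concatenate-refine flow of one MSFB module can be sketched as follows; ordinary convolutions stand in for the large-kernel branches detailed in claim 4, and the channel counts are assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MSFBSketch(nn.Module):
    """Sketch of one MSFB: resize the three encoder features to a common size,
    concatenate, run parallel multi-scale branches, refine with an FFN.
    Plain 3/5/7 convolutions stand in for the large-kernel branches of claim 4."""
    def __init__(self, chans=(32, 64, 128), mid=64):
        super().__init__()
        total = sum(chans)
        self.branches = nn.ModuleList(
            [nn.Conv2d(total, mid, k, padding=k // 2) for k in (3, 5, 7)])
        self.ffn = nn.Sequential(nn.Conv2d(3 * mid, mid, 1), nn.GELU(),
                                 nn.Conv2d(mid, mid, 1))

    def forward(self, f1, f2, f3, size):
        feats = [F.interpolate(f, size=size, mode="bilinear", align_corners=False)
                 for f in (f1, f2, f3)]
        cat = torch.cat(feats, dim=1)                              # concatenated feature
        multi = torch.cat([b(cat) for b in self.branches], dim=1)  # multi-scale residual
        return self.ffn(multi)  # residual to concatenate with the decoder feature
```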
4. The super-resolution reconstruction-assisted CT image super-resolution segmentation method according to claim 1, wherein: the MSFB module in step S3 uses large-kernel convolutions, in which a large-kernel convolution is decomposed into three parts: a depth-wise convolution (DWconv), a depth-wise dilated convolution (DWDconv) and a point-wise convolution (PWconv). The concatenated feature passes through a 3×3 DWconv, a 5×5 DWDconv and a PWconv in sequence to obtain the first scale feature; to approximate a 27×27 large-kernel convolution, the concatenated feature passes through a 5×5 DWconv, a 7×7 DWDconv and a PWconv in sequence to obtain the second scale feature; a 3×3 convolution produces the third scale feature. The three scale features are concatenated and fused, and the result is sent to the subsequent module of the MSFB.
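The DWconv → DWDconv → PWconv decomposition can be sketched as follows; the dilation rate is an assumption, since the claim names the 27×27 target kernel but not the dilation:

```python
import torch
import torch.nn as nn

class LargeKernelBranch(nn.Module):
    """Decompose a large-kernel convolution into a depth-wise conv (DWconv),
    a depth-wise dilated conv (DWDconv) and a point-wise conv (PWconv)."""
    def __init__(self, channels, dw_k, dwd_k, dilation):
        super().__init__()
        self.dw = nn.Conv2d(channels, channels, dw_k,
                            padding=dw_k // 2, groups=channels)
        self.dwd = nn.Conv2d(channels, channels, dwd_k,
                             padding=(dwd_k // 2) * dilation,
                             dilation=dilation, groups=channels)
        self.pw = nn.Conv2d(channels, channels, 1)  # point-wise conv mixes channels

    def forward(self, x):
        return self.pw(self.dwd(self.dw(x)))

# The 5x5 DWconv + 7x7 DWDconv pairing of the second branch; dilation=4 is an
# assumed value chosen to reach roughly the stated 27x27 receptive field.
```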
5. The super-resolution reconstruction-assisted CT image super-resolution segmentation method according to claim 1, wherein: in the step S4, the DCAB module fuses the features of the reconstruction and segmentation branches using a cross-attention mechanism. The segmentation feature, the reconstruction feature and the input feature are each adjusted by a convolution module to obtain the segmentation fusion feature F_seg, the reconstruction fusion feature F_rec and the input fusion feature F_in. F_in is added to F_seg and F_rec respectively to obtain updated fusion features; F_seg and F_rec are then mapped separately into the segmentation query feature Q_seg, segmentation key feature K_seg and segmentation value feature V_seg, and the reconstruction query feature Q_rec, reconstruction key feature K_rec and reconstruction value feature V_rec. These features are fused by a cross-attention operation, and the fusion result passes through a locally enhanced feed-forward network LEFF to obtain the new segmentation fusion feature F'_seg and the new reconstruction fusion feature F'_rec. The process can be expressed as:

F'_seg = LEFF(softmax(Q_seg K_rec^T / sqrt(d)) V_rec)
F'_rec = LEFF(softmax(Q_rec K_seg^T / sqrt(d)) V_seg)
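A single-head sketch of the cross-attention exchange used by the DCAB module, where one branch's queries attend to the other branch's keys and values; the 1×1-convolution projections and the single-head form are simplifying assumptions:

```python
import torch
import torch.nn as nn

class CrossAttentionFuse(nn.Module):
    """Cross-attention: queries come from x_q, keys/values from x_kv.
    Spatial maps are flattened into token sequences (single head, for clarity)."""
    def __init__(self, c):
        super().__init__()
        self.q = nn.Conv2d(c, c, 1)  # query projection (assumed 1x1 conv)
        self.k = nn.Conv2d(c, c, 1)  # key projection
        self.v = nn.Conv2d(c, c, 1)  # value projection

    def forward(self, x_q, x_kv):
        b, c, h, w = x_q.shape
        q = self.q(x_q).flatten(2).transpose(1, 2)   # (B, HW, C)
        k = self.k(x_kv).flatten(2).transpose(1, 2)  # (B, HW, C)
        v = self.v(x_kv).flatten(2).transpose(1, 2)  # (B, HW, C)
        attn = torch.softmax(q @ k.transpose(1, 2) / c ** 0.5, dim=-1)
        return (attn @ v).transpose(1, 2).reshape(b, c, h, w)
```

In the DCAB the result would then pass through the LEFF and be added back to the input feature for stability.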
6. The super-resolution reconstruction-assisted CT image super-resolution segmentation method according to claim 1, wherein: in the step S5, pixelshuffle upsampling is applied to the segmentation output and the reconstruction output respectively, and the loss function is computed against the corresponding labels; the loss function of the SR segmentation task comprises a cross-entropy loss and a Dice loss, and the loss function of the SR reconstruction task comprises an L1 loss; in order to balance the loss functions of the two tasks, a dynamic adjustment mechanism is used, and the specific expression of the loss function is as follows:
wherein O_rec and O_seg respectively represent the final results of applying pixelshuffle upsampling to the reconstruction and segmentation outputs, Y_rec and Y_seg respectively represent the real super-resolution reconstruction label and super-resolution segmentation label, L1(·) represents computing the L1 loss, CE(·) represents computing the cross-entropy loss, and Dice(·) represents computing the Dice loss; the reconstruction loss and segmentation loss are computed as

L_rec = L1(O_rec, Y_rec), L_seg = CE(O_seg, Y_seg) + Dice(O_seg, Y_seg);

in order to balance the loss functions of the two tasks, dynamically changing scaling factors are computed from L_rec and L_seg and used to weight the two losses, giving the final loss function L_total.
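A hedged sketch of the combined loss; the exact dynamic adjustment mechanism is not given in the text, so the scaling factors below (each task weighted by the relative magnitude of the other task's detached loss) are one plausible reading, not the patented formula:

```python
import torch
import torch.nn.functional as F

def dice_loss(logits, target, eps=1e-6):
    """Soft multi-class Dice loss; target holds integer class indices."""
    probs = torch.softmax(logits, dim=1)
    onehot = F.one_hot(target, logits.shape[1]).permute(0, 3, 1, 2).float()
    inter = (probs * onehot).sum(dim=(0, 2, 3))
    denom = probs.sum(dim=(0, 2, 3)) + onehot.sum(dim=(0, 2, 3))
    return 1.0 - ((2 * inter + eps) / (denom + eps)).mean()

def total_loss(rec_out, rec_label, seg_logits, seg_label):
    l_rec = F.l1_loss(rec_out, rec_label)            # SR reconstruction: L1 loss
    l_seg = (F.cross_entropy(seg_logits, seg_label)  # SR segmentation: CE + Dice
             + dice_loss(seg_logits, seg_label))
    # dynamic balancing (assumed form): scale each loss by the relative
    # magnitude of the other, detached so the weights carry no gradient
    s = (l_rec + l_seg).detach()
    w_rec, w_seg = (l_seg / s).detach(), (l_rec / s).detach()
    return w_rec * l_rec + w_seg * l_seg
```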
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310682299.2A CN116416261B (en) | 2023-06-09 | 2023-06-09 | CT image super-resolution segmentation method assisted by super-resolution reconstruction |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116416261A true CN116416261A (en) | 2023-07-11 |
CN116416261B CN116416261B (en) | 2023-09-12 |
Family
ID=87049598
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310682299.2A Active CN116416261B (en) | 2023-06-09 | 2023-06-09 | CT image super-resolution segmentation method assisted by super-resolution reconstruction |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116416261B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113657388A (en) * | 2021-07-09 | 2021-11-16 | 北京科技大学 | Image semantic segmentation method fusing image super-resolution reconstruction |
CN114841859A (en) * | 2022-04-28 | 2022-08-02 | 南京信息工程大学 | Single-image super-resolution reconstruction method based on lightweight neural network and Transformer |
CN115953494A (en) * | 2023-03-09 | 2023-04-11 | 南京航空航天大学 | Multi-task high-quality CT image reconstruction method based on low dose and super-resolution |
WO2023098289A1 (en) * | 2021-12-01 | 2023-06-08 | 浙江大学 | Automatic unlabeled pancreas image segmentation system based on adversarial learning |
Non-Patent Citations (4)
Title |
---|
JIA ZHAOHONG 等: "Two-Branch network for brain tumor segmentation using attention mechanism and super-resolution reconstruction", 《COMPUTERS IN BIOLOGY AND MEDICINE》, vol. 157, pages 1 - 11 * |
YANG LIUTAO 等: "Low-Dose CT Denoising via Sinogram Inner-Structure Transformer", 《IEEE TRANSACTIONS ON MEDICAL IMAGING》, vol. 42, no. 4, pages 910 - 921, XP011938010, DOI: 10.1109/TMI.2022.3219856 * |
ZHANG QIAN 等: "Collaborative Network for Super-Resolution and Semantic Segmentation of Remote Sensing Images", 《IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING》, vol. 60, pages 1 - 12 * |
刘伟: "基于深度学习的三维头部MRI超分辨率重建", 《中国优秀硕士学位论文全文数据库 医药卫生科技辑》, no. 2 * |
Also Published As
Publication number | Publication date |
---|---|
CN116416261B (en) | 2023-09-12 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||