CN116416261A - CT image super-resolution segmentation method assisted by super-resolution reconstruction - Google Patents


Info

Publication number
CN116416261A
Authority
CN
China
Prior art keywords: super-resolution, reconstruction, segmentation, feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310682299.2A
Other languages: Chinese (zh)
Other versions: CN116416261B (en)
Inventor
葛荣骏
徐颖
张道强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Aeronautics and Astronautics
Original Assignee
Nanjing University of Aeronautics and Astronautics
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Aeronautics and Astronautics
Priority to CN202310682299.2A
Publication of CN116416261A
Application granted
Publication of CN116416261B
Legal status: Active


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0499Feedforward networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T11/002D [Two Dimensional] image generation
    • G06T11/003Reconstruction from projections, e.g. tomography
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformation in the plane of the image
    • G06T3/40Scaling the whole image or part thereof
    • G06T3/4007Interpolation-based scaling, e.g. bilinear interpolation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformation in the plane of the image
    • G06T3/40Scaling the whole image or part thereof
    • G06T3/4046Scaling the whole image or part thereof using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformation in the plane of the image
    • G06T3/40Scaling the whole image or part thereof
    • G06T3/4053Super resolution, i.e. output image resolution higher than sensor resolution
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10072Tomographic images
    • G06T2207/10081Computed x-ray tomography [CT]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a CT image super-resolution segmentation method assisted by super-resolution reconstruction, which comprises the following steps: downsampling the original CT image by a factor of 4 using a bicubic algorithm; inputting the resulting low-resolution image I_lr into an encoder and performing super-resolution reconstruction and super-resolution segmentation through two independent decoder branches; extracting multi-scale features using a multi-scale fusion module (MSFB) and passing the intermediate features of the encoder to the decoder; fusing the intermediate features of the encoder and decoder using a dual-channel attention module (DCAB); and optimizing the above model by a loss function.

Description

CT image super-resolution segmentation method assisted by super-resolution reconstruction
Technical Field
The invention belongs to the technical field of medical image processing, relates to the technical field of CT image reconstruction and segmentation, and particularly relates to a CT image super-resolution segmentation method assisted by super-resolution reconstruction.
Background
In clinical diagnosis, it is important to perform super-resolution (SR) segmentation on medical scan images. Existing segmentation techniques aim at segmenting regions of interest, such as vital organs or infected regions, from medical images, thereby obtaining important information about the size, shape and location of the region. However, for some regions where the anatomy is complex, the segmentation mask of the original resolution may not accurately express the segmented region, and thus it is necessary to predict a segmentation mask of high resolution from a CT image of low resolution using the SR segmentation method. However, the low resolution image contains limited detailed information that is insufficient to support the prediction of a precise high resolution segmentation mask. Therefore, we consider predicting a corresponding high resolution CT from a low resolution CT using SR reconstruction techniques, with low-level features restored during reconstruction, such as texture and edges, to aid in predicting the high resolution segmentation mask.
Existing methods that use SR reconstruction to assist SR segmentation fall mainly into two categories. The first treats SR reconstruction as an image preprocessing step, which ignores the correlation and complementarity between the SR reconstruction and SR segmentation tasks: in fact, the detail information from the SR reconstruction process can help the SR segmentation process generate a more accurate segmentation mask, and the abstract semantic information provided by the SR segmentation process can in turn guide the SR reconstruction process to generate texture details that better conform to the real distribution. The second combines the SR reconstruction model and the SR segmentation model in a serial manner, which allows the two processes to interact and adjust to each other, but the interaction remains insufficient; moreover, the serial approach can lead to an accumulation of errors. There is thus still no method that effectively combines SR reconstruction and SR segmentation so that the two processes can interact.
Disclosure of Invention
Purpose of the invention: the invention aims to provide a CT image super-resolution segmentation method assisted by super-resolution reconstruction. The method exploits the complementarity between the SR reconstruction and SR segmentation tasks: detail features such as textures and edges restored during SR reconstruction help the SR segmentation process predict a more accurate segmentation mask, while abstract semantic features extracted during SR segmentation guide the SR reconstruction process to generate texture details that better conform to the real distribution. In addition, considering that the size of the region of interest in CT images varies greatly, the method uses multi-scale large-kernel convolutions to extract multi-scale features, further improving reconstruction and segmentation performance.
The technical scheme is as follows: in order to achieve the above object, the present invention provides a method for CT image super-resolution segmentation with super-resolution reconstruction assistance, comprising the steps of:
S1: downsample the original CT image by a factor of 4 using a bicubic interpolation algorithm to obtain a low-resolution image I_lr, and use the original CT image I_hr and the segmentation label S_hr as the super-resolution reconstruction label and the super-resolution segmentation label, respectively;
S2: input the low-resolution image I_lr from step S1 into the encoder and perform super-resolution reconstruction and super-resolution segmentation through two independent decoder branches;
S3: extract multi-scale features using a multi-scale fusion module (MSFB) and pass the intermediate features generated during encoding in step S2 to the decoder;
S4: fuse the intermediate features of the encoder and decoder from step S2 using a dual-channel attention module (DCAB);
S5: optimize the above model by a loss function.
This scheme fully exploits the complementarity between the SR reconstruction and SR segmentation tasks: detail features such as textures and edges restored during SR reconstruction help the SR segmentation process predict a more accurate segmentation mask, and abstract semantic features extracted during SR segmentation guide the SR reconstruction process to generate texture details that better conform to the real distribution.
Further, in step S2 a common encoder is used to extract features, after which independent decoder branches extract features suited to their respective tasks. Specifically, the intermediate features of the reconstruction-branch decoder contain more low-semantic detail features such as edges and textures, while the intermediate features of the segmentation-branch decoder contain more abstract high-level features. The procedure is as follows: a given input I_lr is first encoded by three serial convolution modules, each comprising two convolution-ReLU-BN layers, with max pooling used for the downsampling layers. I_lr is processed by the first convolution module to obtain the first encoding feature; this feature passes through a downsampling layer and the second convolution module to generate the second encoding feature; and so on, yielding the third encoding feature and the bottleneck feature. The bottleneck feature is then fed into the two decoders, which have identical structures comprising three serial upsampling layers and convolution modules, with bilinear interpolation used for upsampling. Taking the SR segmentation-branch decoder as an example, after the bottleneck feature is input to the decoder, an upsampling layer and a convolution module produce the third segmentation feature; continuing in this way yields the second and first segmentation features, and the SR reconstruction-branch decoder likewise yields the third, second and first reconstruction features. The common encoder exploits the correlation and complementarity between the reconstruction and segmentation tasks to perform a preliminary fusion of the reconstruction and segmentation features, while the independent decoder branches account for the differences between the tasks and avoid adverse interaction between them.
Further, in step S3 the MSFB modules fuse the features from every layer of the encoder, extract multi-scale features and send the result to the decoder, each branch comprising three parallel MSFB modules. Specifically, the first MSFB module of the SR segmentation branch interpolates the first, second and third encoding features to the same size and concatenates them to obtain a concatenated feature. Three parallel large-kernel convolutions with kernel sizes of 3×3, 9×9 and 27×27 extract a first multi-scale segmentation residual, and a feed-forward neural network FFN further adjusts it to obtain a first segmentation residual, which is combined by concatenation with the first segmentation feature and input to the subsequent modules of the segmentation decoder. By analogy, the remaining MSFB modules extract the second and third segmentation residuals and the first, second and third reconstruction residuals from the encoding features and concatenate them with the corresponding decoder intermediate features. These residual features contain multi-scale information, which enables the model to better cope with the large variation in the size of different organ or lesion regions in medical images, and they replenish the information the encoder loses during downsampling.
Further, the MSFB module in step S3 uses large-kernel convolutions, and in this method we decompose each large-kernel convolution into three smaller serial convolutions. A large-kernel convolution is decomposed into three parts: a depthwise convolution (DWconv), a depthwise dilated convolution (DWDconv) and a pointwise convolution (PWconv). To realize the 9×9 large-kernel convolution, the input passes through a 3×3 DWconv, a 5×5 DWDconv and a PWconv in sequence to obtain the first-scale feature; to realize the 27×27 large-kernel convolution, the input passes through a 5×5 DWconv, a 7×7 DWDconv and a PWconv in sequence to obtain the second-scale feature; the 3×3 convolution is not decomposed and directly yields the third-scale feature. The three scale features are concatenated and fused, and the result is sent to the subsequent modules of the MSFB. Large-kernel convolutions give the model a larger receptive field and thereby improve performance, but their drawback is a high computational cost, which hinders deployment of the algorithm; by decomposing each large-kernel convolution into three serial convolutions, our method effectively reduces this overhead.
Further, in step S4 the DCAB module fuses the features of the reconstruction and segmentation branches using a cross-attention mechanism. The segmentation decoder feature, the reconstruction decoder feature and the corresponding encoder feature are each adjusted by a convolution module to obtain a segmentation fusion feature F_s, a reconstruction fusion feature F_r and an input fusion feature F_in. To supplement the segmentation fusion feature with detail information while avoiding a negative influence on the reconstruction feature, F_in and F_s are added to obtain a new segmentation fusion feature F_s'. F_s' and F_r are then separately mapped into a segmentation query feature Q_s, segmentation key feature K_s and segmentation value feature V_s, and a reconstruction query feature Q_r, reconstruction key feature K_r and reconstruction value feature V_r. These features are fused by a cross-attention operation in which each branch attends to the other, and the fusion results pass through a locally enhanced feed-forward network LEFF to obtain a new segmentation fusion feature F_s'' and a new reconstruction fusion feature F_r''. The process can be expressed as:

F_s'' = LEFF( softmax( Q_s · K_r^T / √d ) · V_r )
F_r'' = LEFF( softmax( Q_r · K_s^T / √d ) · V_s )

where d denotes the feature dimension and LEFF denotes the locally enhanced feed-forward network. To guarantee feature stability, F_s'' and F_r'' are added to F_s' and F_r respectively to obtain the final fused features. The module fully exploits the complementarity between the reconstruction and segmentation features: detail features such as edges and textures in the reconstruction features supplement the low-resolution input and help the segmentation branch accurately predict a high-resolution segmentation mask, while abstract semantic features generated during segmentation guide the reconstruction branch to generate more realistic detail features.
Further, in step S5 the first segmentation feature and the first reconstruction feature (the final outputs of the two decoder branches) are each upsampled by PixelShuffle, and a loss function is computed against the corresponding label. The loss function of the SR segmentation task comprises a cross-entropy loss and a Dice loss, and the loss function of the SR reconstruction task comprises an L1 loss; to balance the loss functions of the two tasks, a dynamic adjustment mechanism is used. The loss function is expressed as:

L_rec = L1(Y_rec, I_hr)
L_seg = CE(Y_seg, S_hr) + Dice(Y_seg, S_hr)
L_total = w_rec · L_rec + w_seg · L_seg

where Y_rec and Y_seg denote the final results of applying PixelShuffle upsampling to the reconstruction and segmentation outputs, I_hr and S_hr denote the real super-resolution reconstruction label and super-resolution segmentation label respectively, L1(·) denotes the L1 loss, CE(·) denotes the cross-entropy loss, Dice(·) denotes the Dice loss, and L_rec and L_seg denote the reconstruction loss and segmentation loss respectively. To balance the two tasks, dynamically varying scale factors w_rec and w_seg are computed from L_rec and L_seg and used to weight the two losses, giving the final loss function L_total.
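As a concrete illustration, the following is a minimal PyTorch sketch of this loss under stated assumptions: the Dice term is a standard multi-class soft Dice, and the dynamic scale factors are taken as normalized magnitudes of the opposite task's detached loss. The patent states only that dynamically varying factors are computed from the two losses, so the exact weighting rule here is an assumption.

```python
import torch
import torch.nn.functional as F

def total_loss(rec_pred, seg_logits, hr_image, seg_label):
    # L1 loss for the SR reconstruction branch.
    l_rec = F.l1_loss(rec_pred, hr_image)

    # Cross-entropy plus soft Dice for the SR segmentation branch.
    # seg_logits: (B, K, H, W); seg_label: (B, H, W), dtype long.
    ce = F.cross_entropy(seg_logits, seg_label)
    probs = torch.softmax(seg_logits, dim=1)
    one_hot = F.one_hot(seg_label, probs.shape[1]).permute(0, 3, 1, 2).float()
    inter = (probs * one_hot).sum(dim=(2, 3))
    dice = 1 - ((2 * inter + 1e-5) /
                (probs.sum(dim=(2, 3)) + one_hot.sum(dim=(2, 3)) + 1e-5)).mean()
    l_seg = ce + dice

    # Assumed dynamic balancing rule: weight each task by the relative
    # magnitude of the other task's (detached) loss.
    denom = l_rec.detach() + l_seg.detach()
    w_rec, w_seg = l_seg.detach() / denom, l_rec.detach() / denom
    return w_rec * l_rec + w_seg * l_seg
```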
The beneficial effects are that: compared with the prior art, the invention has the following advantages:
1. the invention provides a CT image super-resolution segmentation method assisted by super-resolution reconstruction by utilizing complementarity between super-resolution reconstruction and super-resolution segmentation;
2. the invention uses parallel large-kernel convolutions to fuse the features of each encoder layer and extracts multi-scale features from them, so as to better handle the large variation in organ sizes in medical images;
3. the invention provides a DCAB module which utilizes cross-attention operation to effectively fuse SR reconstruction characteristics and SR segmentation characteristics, so that the performances of two tasks can be improved;
4. the present invention uses dynamic weights to balance the reconstruction and segmentation tasks, dynamically adjusting the loss function.
Drawings
FIG. 1 is a schematic flow chart of a CT image super-resolution segmentation model assisted by super-resolution reconstruction;
FIG. 2 is a general frame structure diagram of a CT image super-resolution segmentation model assisted by super-resolution reconstruction provided by the invention;
FIG. 3 is a schematic diagram of a dual channel attention module (DCAB) topology according to the present invention;
FIG. 4 is a graph showing the comparison of the results of super-resolution segmentation;
fig. 5 is a comparative graph of the results of super-resolution reconstruction.
Detailed Description
The present invention is further illustrated by the accompanying drawings and the detailed description below, which are to be understood as merely illustrative of the invention and not limiting of its scope; various equivalent modifications of the invention that occur to those skilled in the art upon reading this disclosure fall within the scope defined by the appended claims.
Examples: the super-resolution segmentation task and the super-resolution reconstruction task are strongly correlated and complementary. The super-resolution reconstruction task gradually restores detail features during reconstruction, supplementing the limited detail information in the input low-resolution CT image and helping the super-resolution segmentation task predict the segmentation mask more accurately; super-resolution segmentation extracts abstract semantic information and guides the reconstruction process to generate texture details that better conform to the real distribution. The method uses two parallel branches to handle the reconstruction and segmentation tasks simultaneously and designs dedicated fusion modules to effectively fuse the intermediate features of the different branches, so that the reconstruction and segmentation tasks promote each other.
The invention comprises the following steps:
S1: downsample the original CT image by a factor of 4 using a bicubic interpolation algorithm to obtain a low-resolution image I_lr, and use the original CT image I_hr and the segmentation label S_hr as the super-resolution reconstruction label and the super-resolution segmentation label, respectively;
The dataset used contains CT images and their corresponding segmentation labels. To meet the requirements of the method, we downsample each CT image by a factor of 4 with the bicubic interpolation algorithm from traditional image processing and take the result as the low-resolution input of the model.
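A minimal sketch of this preprocessing step, assuming PyTorch and a (B, 1, H, W) tensor layout (neither is specified by the patent):

```python
import torch
import torch.nn.functional as F

def make_lr_input(hr_ct: torch.Tensor) -> torch.Tensor:
    # Bicubic x4 downsampling of the HR CT image, as in step S1.
    return F.interpolate(hr_ct, scale_factor=0.25, mode='bicubic',
                         align_corners=False)
```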
S2: the low resolution image in step S1
Figure SMS_97
The encoder is input and super-resolution reconstruction and super-resolution segmentation are performed by two independent decoder branches, respectively.
As shown in fig. 2, the method uses a common encoder to extract features, performs preliminary fusion on SR reconstruction features and SR segmentation features, and then extracts features fitting respective tasks through independent decoder branches. Specifically, the intermediate features of the decoder of the reconstruction branch comprise more detail features of low semantic information such as edges, textures and the like, and the intermediate features of the decoder of the segmentation branch comprise more abstractAdvanced features. For a given input
Figure SMS_100
It is first encoded by three serial convolution modules comprising 2 layers +.>
Figure SMS_105
convolution-Relu activation function-BN normalization layer, downsampling layer is max-pooling downsampling, +.>
Figure SMS_109
Processing by a first convolution module to obtain a first coding feature +.>
Figure SMS_101
The feature generates a second coding feature via the downsampling layer and a second convolution module>
Figure SMS_103
And so on, respectively obtaining the third coding feature +.>
Figure SMS_107
And bottleneck characteristics->
Figure SMS_111
Will get->
Figure SMS_98
And the two decoders are identical in structure and comprise three serial up-sampling layers and a convolution module, and the up-sampling layers use a bilinear interpolation method. Taking the SR split branch decoder as an example, +.>
Figure SMS_102
After input to the decoder, the third segmentation feature is obtained by an upsampling layer and a convolution module>
Figure SMS_106
By doing so, the second segmentation feature +.>
Figure SMS_110
First segmentation feature->
Figure SMS_99
And third reconstruction feature +.>
Figure SMS_104
Second reconstruction feature->
Figure SMS_108
First reconstruction feature->
Figure SMS_112
. The common encoder utilizes the correlation and complementarity between reconstruction and segmentation tasks to perform preliminary fusion of reconstruction and segmentation features. The independent decoder branches take the difference between different tasks into consideration, so that the mutual side effect between the tasks is avoided.
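The shared-encoder/dual-decoder layout can be sketched as follows. This is an illustrative PyTorch skeleton only: the channel widths, the class count, and the omission of the MSFB/DCAB fusion paths and of the final x4 PixelShuffle heads are all assumptions or simplifications, not details stated by the patent.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def conv_block(cin, cout):
    # Two convolution-ReLU-BN layers, as described for each module.
    return nn.Sequential(
        nn.Conv2d(cin, cout, 3, padding=1), nn.ReLU(inplace=True), nn.BatchNorm2d(cout),
        nn.Conv2d(cout, cout, 3, padding=1), nn.ReLU(inplace=True), nn.BatchNorm2d(cout))

class SharedEncoderDualDecoder(nn.Module):
    def __init__(self, cin=1, base=32, n_classes=5):
        super().__init__()
        self.enc1 = conv_block(cin, base)
        self.enc2 = conv_block(base, base * 2)
        self.enc3 = conv_block(base * 2, base * 4)
        self.bottleneck = conv_block(base * 4, base * 8)
        self.pool = nn.MaxPool2d(2)  # max-pooling downsampling
        # Two structurally identical decoders: three upsample+conv stages each.
        self.dec_seg = nn.ModuleList([conv_block(base * 8, base * 4),
                                      conv_block(base * 4, base * 2),
                                      conv_block(base * 2, base)])
        self.dec_rec = nn.ModuleList([conv_block(base * 8, base * 4),
                                      conv_block(base * 4, base * 2),
                                      conv_block(base * 2, base)])
        self.head_seg = nn.Conv2d(base, n_classes, 1)
        self.head_rec = nn.Conv2d(base, cin, 1)

    def _decode(self, f, stages):
        for block in stages:
            f = F.interpolate(f, scale_factor=2, mode='bilinear', align_corners=False)
            f = block(f)  # third -> second -> first decoder feature
        return f

    def forward(self, x):
        f1 = self.enc1(x)                    # first encoding feature
        f2 = self.enc2(self.pool(f1))        # second encoding feature
        f3 = self.enc3(self.pool(f2))        # third encoding feature
        fb = self.bottleneck(self.pool(f3))  # bottleneck feature
        seg = self.head_seg(self._decode(fb, self.dec_seg))
        rec = self.head_rec(self._decode(fb, self.dec_rec))
        return seg, rec  # LR-resolution outputs, before x4 PixelShuffle
```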
S3: the multi-scale features are extracted using a multi-scale fusion Module (MSFB) and the intermediate features generated during the encoding in step S2 are passed to a decoder.
As shown in fig. 2, the method utilizes the MSFB module to fuse the features in each layer of the encoder, extract the multi-scale features, and send the result to the decoder. The first MFSB module of the SR partition branch, which module is to
Figure SMS_125
,/>
Figure SMS_115
,/>
Figure SMS_119
Interpolation to the same size, and splicing to obtain splicing characteristics->
Figure SMS_126
Three parallel large-kernel convolutions are utilized, the convolution kernels are respectively
Figure SMS_128
Extracting a first multi-scale segmentation residual +.>
Figure SMS_127
Feed-forward neural network FFN pair +.>
Figure SMS_129
Further adjusting to obtain a first split residual +.>
Figure SMS_118
,/>
Figure SMS_123
In a spliced manner with->
Figure SMS_114
Binding to obtain new->
Figure SMS_121
The subsequent modules input to the split decoder and so on, the remaining MSFB modules are according to +.>
Figure SMS_116
Extracting a second segmentation residual
Figure SMS_120
Third partition residual->
Figure SMS_117
First reconstruction residual->
Figure SMS_122
Second reconstructed residual->
Figure SMS_113
Third reconstruction residual
Figure SMS_124
And spliced with corresponding decoder intermediate features. These residual features contain multi-scale information, which enables the model to better cope with the problem of large changes in the size of different organs or lesion areas in the medical image, and supplements the information lost by the encoder during downsampling.
The MSFB module in step S3 uses large-kernel convolutions, which in this method we decompose into three smaller serial convolutions. A large-kernel convolution is decomposed into three parts: a depthwise convolution (DWconv), a depthwise dilated convolution (DWDconv) and a pointwise convolution (PWconv). To realize the 9×9 large-kernel convolution, the input passes through a 3×3 DWconv, a 5×5 DWDconv and a PWconv in sequence to obtain the first-scale feature; to realize the 27×27 large-kernel convolution, the input passes through a 5×5 DWconv, a 7×7 DWDconv and a PWconv in sequence to obtain the second-scale feature; the 3×3 convolution is not decomposed and directly yields the third-scale feature. The three scale features are concatenated and fused, and the result is sent to the subsequent modules of the MSFB. Large-kernel convolutions give the model a larger receptive field and thereby improve performance, but their drawback is a high computational cost, which hinders deployment of the algorithm; by decomposing each large-kernel convolution into three serial convolutions, our method effectively reduces this overhead.
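The decomposition and the surrounding MSFB can be sketched as below. The depthwise and depthwise-dilated kernel sizes follow the text, but the dilation rates, channel widths and the FFN form are assumptions, since the patent does not state them.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DecomposedLKConv(nn.Module):
    # Large-kernel convolution decomposed into DWconv -> DWDconv -> PWconv.
    def __init__(self, c, dw_k, dwd_k, dilation):
        super().__init__()
        self.dw = nn.Conv2d(c, c, dw_k, padding=dw_k // 2, groups=c)
        self.dwd = nn.Conv2d(c, c, dwd_k, dilation=dilation,
                             padding=(dwd_k // 2) * dilation, groups=c)
        self.pw = nn.Conv2d(c, c, 1)

    def forward(self, x):
        return self.pw(self.dwd(self.dw(x)))

class MSFB(nn.Module):
    # Fuses the three encoder features and extracts a multi-scale residual.
    def __init__(self, enc_channels, c):
        super().__init__()
        self.proj = nn.Conv2d(sum(enc_channels), c, 1)
        self.branch_3 = nn.Conv2d(c, c, 3, padding=1)           # 3x3, not decomposed
        self.branch_9 = DecomposedLKConv(c, 3, 5, dilation=2)   # approximates 9x9
        self.branch_27 = DecomposedLKConv(c, 5, 7, dilation=3)  # approximates 27x27
        self.ffn = nn.Sequential(nn.Conv2d(3 * c, c, 1), nn.GELU(),
                                 nn.Conv2d(c, c, 1))

    def forward(self, enc_feats, size):
        # Interpolate all encoder features to a common size and concatenate.
        x = torch.cat([F.interpolate(f, size=size, mode='bilinear',
                                     align_corners=False) for f in enc_feats], dim=1)
        x = self.proj(x)
        ms = torch.cat([self.branch_3(x), self.branch_9(x), self.branch_27(x)], dim=1)
        return self.ffn(ms)  # residual, to be concatenated with a decoder feature
```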
S4: the intermediate features of the encoder and decoder in step S2 are fused using a two-channel attention module (DCAB).
As shown in fig. 3, the DCAB module fuses features of the reconstructed and split branches using a cross-attention mechanism.
Figure SMS_156
、/>
Figure SMS_145
、/>
Figure SMS_150
Feature adjustment is carried out through a convolution module respectively to obtain segmentation fusion features +.>
Figure SMS_155
Reconstruction of fusion characteristics->
Figure SMS_159
Input fusion feature->
Figure SMS_157
To supplement the segmentation fusion feature with detailed information, avoiding its negative influence on the reconstructed feature, will +.>
Figure SMS_160
And->
Figure SMS_142
Adding to obtain new segmentation fusion feature->
Figure SMS_148
Figure SMS_146
And->
Figure SMS_149
Is mapped separately into separate query features>
Figure SMS_144
Split key feature->
Figure SMS_153
Segmentation candidate feature->
Figure SMS_154
And rebuild query feature->
Figure SMS_158
Reconstruction of key value features->
Figure SMS_147
Reconstruction candidate feature->
Figure SMS_152
The above features are fused by using a cross-attention operation, and the fusion result is subjected to a local enhancement feed-forward network LEFF to obtain a new segmentation fusion feature +.>
Figure SMS_143
And reconstructing fusion characteristics->
Figure SMS_151
The process can be expressed as:
Figure SMS_161
wherein, the liquid crystal display device comprises a liquid crystal display device,
Figure SMS_162
representing feature dimensions, LEFF represents parameters of a locally enhanced feed forward network, < + > in order to guarantee feature stability>
Figure SMS_163
And->
Figure SMS_164
Respectively and->
Figure SMS_165
And->
Figure SMS_166
Adding to obtain new->
Figure SMS_167
And->
Figure SMS_168
. The module fully exploits complementarity between the reconstructed and segmented features: the detail features such as edges, textures and the like generated in the reconstructed features can be used as supplements of low-resolution input, so that the segmentation branches can be helped to accurately predict a high-resolution segmentation mask; abstract semantic features generated in the segmentation process can also guide the reconstruction branches to generate more real detail features.
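A token-level sketch of this cross-attention fusion follows. The single-head formulation, the linear projections and the plain MLP standing in for the LEFF are simplifying assumptions; spatial feature maps are assumed to be flattened to (B, N, C) token sequences beforehand.

```python
import torch
import torch.nn as nn

class DCAB(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.to_qkv_s = nn.Linear(dim, dim * 3)  # segmentation Q, K, V
        self.to_qkv_r = nn.Linear(dim, dim * 3)  # reconstruction Q, K, V
        self.leff_s = nn.Sequential(nn.Linear(dim, dim * 2), nn.GELU(),
                                    nn.Linear(dim * 2, dim))
        self.leff_r = nn.Sequential(nn.Linear(dim, dim * 2), nn.GELU(),
                                    nn.Linear(dim * 2, dim))
        self.scale = dim ** -0.5

    def forward(self, f_seg, f_rec):
        # f_seg: segmentation fusion feature (input feature already added);
        # f_rec: reconstruction fusion feature; both (B, N, C).
        q_s, k_s, v_s = self.to_qkv_s(f_seg).chunk(3, dim=-1)
        q_r, k_r, v_r = self.to_qkv_r(f_rec).chunk(3, dim=-1)
        # Each branch queries the other branch's keys and values.
        attn_s = torch.softmax(q_s @ k_r.transpose(-2, -1) * self.scale, dim=-1)
        attn_r = torch.softmax(q_r @ k_s.transpose(-2, -1) * self.scale, dim=-1)
        new_seg = f_seg + self.leff_s(attn_s @ v_r)  # residual add for stability
        new_rec = f_rec + self.leff_r(attn_r @ v_s)
        return new_seg, new_rec
```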
S5: the model described above is optimized by a loss function.
To demonstrate the effectiveness of the present invention, the present invention also provides the following comparative experiments:
Specifically, the invention uses the public SegTHOR dataset, a thoracic multi-organ segmentation dataset containing segmentation labels for 4 organs: heart, aorta, trachea and esophagus. The dataset contains CT scans of 40 patients, of which we randomly selected 28 for the training set, 4 for the validation set and 8 for the test set. Before formal training, we clip the HU values to [-128, 384] and preprocess the data as described in step S1. Training uses the AdamW optimizer with an initial learning rate of 0.001 for a total of 150 epochs.
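A one-function sketch of this intensity preprocessing (the rescaling to [0, 1] after clipping is an assumption; the patent states only the HU clipping range):

```python
import torch

def preprocess_hu(ct: torch.Tensor, lo: float = -128.0, hi: float = 384.0) -> torch.Tensor:
    # Clip HU values to [lo, hi], then rescale to [0, 1].
    ct = ct.clamp(lo, hi)
    return (ct - lo) / (hi - lo)
```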
In order to verify the effectiveness of the method in reconstruction and segmentation, we compare with the segmentation and reconstruction algorithm with the best current effect. On the segmentation task, the experimental results of the method are compared with CPFNet, kiU-Net, unet++, UCTransNet, and the evaluation indexes are universal dice and hd95, and the comparison results are shown in Table 1. It can be seen that the method has a more obvious improvement in segmentation performance compared with other methods; in the reconstruction task, the experimental result of the method is compared with RDN, EDSR, NSLA, the evaluation indexes are psnr and ssim, and the comparison result is shown in table 2.
Table 1. Results of the method compared with other algorithms on the segmentation task (Dice/HD95); bolded data represent the best performance. (Eso: esophagus; Hea: heart; Tra: trachea; Aor: aorta)
Table 2. Results of the method compared with other algorithms on the reconstruction task (PSNR/SSIM); bolded data represent the best performance.
To intuitively demonstrate the effectiveness of the present method, we compare the results of the present method with other methods in visual effect. FIG. 3 shows the segmentation results of the methods, and it can be seen that the method can segment the organs more accurately than other methods; FIG. 4 is a reconstruction of the various methods, which can accurately restore the unclear boundary between the esophagus and the aorta, as shown, thanks to the semantic information provided by the segmentation process.

Claims (6)

1. A CT image super-resolution segmentation method assisted by super-resolution reconstruction, characterized by comprising the following steps:
S1: downsampling the original CT image by a factor of 4 using a bicubic interpolation algorithm to obtain a low-resolution image I_lr, and using the original CT image I_hr and the segmentation label S_hr as the super-resolution reconstruction label and the super-resolution segmentation label, respectively;
S2: inputting the low-resolution image I_lr from step S1 into an encoder and performing super-resolution reconstruction and super-resolution segmentation through two independent decoder branches;
S3: extracting multi-scale features using a multi-scale fusion module MSFB and passing the intermediate features generated during encoding in step S2 to a decoder;
S4: fusing the intermediate features of the encoder and decoder in step S2 using a dual-channel attention module DCAB;
S5: optimizing the above model by a loss function.
2. The super-resolution reconstruction-assisted CT image super-resolution segmentation method according to claim 1, characterized in that: in step S2, features are extracted by a common encoder, the SR reconstruction features and SR segmentation features are preliminarily fused, and features matching the respective tasks are then extracted by independent decoder branches, as follows: a given input I_lr is first encoded by three serial convolution modules, each comprising two convolution-ReLU-BN layers, with max-pooling downsampling layers; I_lr is processed by the first convolution module to obtain the first encoding feature, which passes through a downsampling layer and the second convolution module to generate the second encoding feature, and so on, yielding the third encoding feature and the bottleneck feature; the bottleneck feature is fed into the two decoders, which have identical structures comprising three serial upsampling layers and convolution modules, the upsampling layers using bilinear interpolation.
3. The super-resolution reconstruction-assisted CT image super-resolution segmentation method according to claim 1, characterized in that: in step S3, the MSFB modules fuse the features from every layer of the encoder, extract multi-scale features and send the result to the decoder, each branch comprising three parallel MSFB modules; specifically, the first MSFB module of the SR segmentation branch interpolates the first, second and third encoding features to the same size and concatenates them to obtain a concatenated feature; three parallel large-kernel convolutions with kernel sizes of 3×3, 9×9 and 27×27 extract a first multi-scale segmentation residual; a feed-forward neural network FFN further adjusts it to obtain a first segmentation residual, which is combined by concatenation with the first segmentation feature and input to the subsequent modules of the segmentation decoder; by analogy, the remaining MSFB modules extract the second and third segmentation residuals and the first, second and third reconstruction residuals from the encoding features and concatenate them with the corresponding decoder intermediate features.
4. The super-resolution reconstruction-assisted CT image super-resolution segmentation method according to claim 1, characterized in that: the MSFB module in step S3 uses large-kernel convolutions, each decomposed into three parts: a depthwise convolution DWconv, a depthwise dilated convolution DWDconv and a pointwise convolution PWconv; to realize the 9×9 large-kernel convolution, the input passes through a 3×3 DWconv, a 5×5 DWDconv and a PWconv in sequence to obtain a first-scale feature; to realize the 27×27 large-kernel convolution, the input passes through a 5×5 DWconv, a 7×7 DWDconv and a PWconv in sequence to obtain a second-scale feature; the 3×3 convolution is not decomposed and directly yields a third-scale feature; the three scale features are concatenated and fused, and the result is sent to the subsequent modules of the MSFB.
5. The super-resolution reconstruction-assisted CT image super-resolution segmentation method according to claim 1, characterized in that: in step S4 the DCAB module fuses the features of the reconstruction and segmentation branches using a cross-attention mechanism; the segmentation decoder feature, the reconstruction decoder feature and the corresponding encoder feature are each adjusted by a convolution module to obtain a segmentation fusion feature F_s, a reconstruction fusion feature F_r and an input fusion feature F_in; F_in and F_s are added to obtain a new segmentation fusion feature F_s'; F_s' and F_r are separately mapped into a segmentation query feature Q_s, a segmentation key feature K_s and a segmentation value feature V_s, and a reconstruction query feature Q_r, a reconstruction key feature K_r and a reconstruction value feature V_r; the above features are fused by a cross-attention operation in which each branch attends to the other, and the fusion results pass through a locally enhanced feed-forward network LEFF to obtain a new segmentation fusion feature F_s'' and a new reconstruction fusion feature F_r''; the process can be expressed as:

F_s'' = LEFF( softmax( Q_s · K_r^T / √d ) · V_r )
F_r'' = LEFF( softmax( Q_r · K_s^T / √d ) · V_s )

wherein d denotes the feature dimension and LEFF denotes the locally enhanced feed-forward network; to guarantee feature stability, F_s'' and F_r'' are added to F_s' and F_r respectively to obtain the final fused features.
6. The super-resolution reconstruction-assisted CT image super-resolution segmentation method according to claim 1, characterized in that: in step S5, the final outputs of the reconstruction branch and the segmentation branch are each upsampled by PixelShuffle, and a loss function is computed against the corresponding label; the loss function of the SR segmentation task comprises a cross-entropy loss and a Dice loss, and the loss function of the SR reconstruction task comprises an L1 loss; to balance the loss functions of the two tasks, a dynamic adjustment mechanism is used, and the loss function is expressed as:

L_rec = L1(Y_rec, I_hr)
L_seg = CE(Y_seg, S_hr) + Dice(Y_seg, S_hr)
L_total = w_rec · L_rec + w_seg · L_seg

wherein Y_rec and Y_seg denote the final results of applying PixelShuffle upsampling to the reconstruction and segmentation outputs, I_hr and S_hr denote the real super-resolution reconstruction label and super-resolution segmentation label respectively, L1(·) denotes the L1 loss, CE(·) denotes the cross-entropy loss, Dice(·) denotes the Dice loss, and L_rec and L_seg denote the reconstruction loss and segmentation loss respectively; to balance the two tasks, dynamically varying scale factors w_rec and w_seg are computed from L_rec and L_seg and used to weight the two losses, giving the final loss function L_total.

Priority Applications (1)

Application CN202310682299.2A — priority date 2023-06-09, filing date 2023-06-09 — CT image super-resolution segmentation method assisted by super-resolution reconstruction (granted as CN116416261B).


Publications (2)

Publication Number Publication Date
CN116416261A — 2023-07-11
CN116416261B — 2023-09-12

Family

ID=87049598

Family Applications (1)

CN202310682299.2A — CT image super-resolution segmentation method assisted by super-resolution reconstruction — Active (granted as CN116416261B)

Country Status (1)

Country Link
CN (1) CN116416261B (en)


Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113657388A (en) * 2021-07-09 2021-11-16 北京科技大学 Image semantic segmentation method fusing image super-resolution reconstruction
WO2023098289A1 (en) * 2021-12-01 2023-06-08 浙江大学 Automatic unlabeled pancreas image segmentation system based on adversarial learning
CN114841859A (en) * 2022-04-28 2022-08-02 南京信息工程大学 Single-image super-resolution reconstruction method based on lightweight neural network and Transformer
CN115953494A (en) * 2023-03-09 2023-04-11 南京航空航天大学 Multi-task high-quality CT image reconstruction method based on low dose and super-resolution

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
JIA Zhaohong et al., "Two-Branch network for brain tumor segmentation using attention mechanism and super-resolution reconstruction", Computers in Biology and Medicine, vol. 157, pp. 1-11.
YANG Liutao et al., "Low-Dose CT Denoising via Sinogram Inner-Structure Transformer", IEEE Transactions on Medical Imaging, vol. 42, no. 4, pp. 910-921.
ZHANG Qian et al., "Collaborative Network for Super-Resolution and Semantic Segmentation of Remote Sensing Images", IEEE Transactions on Geoscience and Remote Sensing, vol. 60, pp. 1-12.
LIU Wei, "Deep-learning-based super-resolution reconstruction of 3D head MRI" (基于深度学习的三维头部MRI超分辨率重建), China Masters' Theses Full-text Database, Medicine & Health Sciences, no. 2.

Also Published As

Publication number Publication date
CN116416261B (en) 2023-09-12

Similar Documents

Publication Publication Date Title
CN111145170B (en) Medical image segmentation method based on deep learning
CN115482241A (en) Cross-modal double-branch complementary fusion image segmentation method and device
CN110738660B (en) Vertebra CT image segmentation method and device based on improved U-net
CN116433914A (en) Two-dimensional medical image segmentation method and system
CN112785593A (en) Brain image segmentation method based on deep learning
CN112700460A (en) Image segmentation method and system
CN111260670B (en) Tubular structure segmentation graph fracture repairing method and system of three-dimensional image based on deep learning network
CN112150470A (en) Image segmentation method, image segmentation device, image segmentation medium, and electronic device
CN114219755A (en) Intelligent pulmonary tuberculosis detection method and system based on images and clinical data
CN115526829A (en) Honeycomb lung focus segmentation method and network based on ViT and context feature fusion
CN117058307A (en) Method, system, equipment and storage medium for generating heart three-dimensional nuclear magnetic resonance image
CN110599495B (en) Image segmentation method based on semantic information mining
KR102419270B1 (en) Apparatus and method for segmenting medical image using mlp based architecture
CN117078930A (en) Medical image segmentation method based on boundary sensing and attention mechanism
Wang et al. Automatic consecutive context perceived transformer GAN for serial sectioning image blind inpainting
CN116416261B (en) CT image super-resolution segmentation method assisted by super-resolution reconstruction
CN117292704A (en) Voice-driven gesture action generation method and device based on diffusion model
CN117152173A (en) Coronary artery segmentation method and system based on DUNetR model
CN115100731B (en) Quality evaluation model training method and device, electronic equipment and storage medium
CN115984560A (en) Image segmentation method based on CNN and Transformer
CN116309278A (en) Medical image segmentation model and method based on multi-scale context awareness
Hou et al. Lung nodule segmentation algorithm with SMR-UNet
Li et al. Image analysis and diagnosis of skin diseases-a review
CN116385720A (en) Breast cancer focus ultrasonic image segmentation algorithm
Chen et al. TMTrans: texture mixed transformers for medical image segmentation

Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination
GR01: Patent grant