CN116416261B - CT image super-resolution segmentation method assisted by super-resolution reconstruction

CT image super-resolution segmentation method assisted by super-resolution reconstruction

Info

Publication number
CN116416261B
CN116416261B (application CN202310682299.2A)
Authority
CN
China
Prior art keywords
segmentation
super
resolution
reconstruction
convolution
Prior art date
Legal status
Active
Application number
CN202310682299.2A
Other languages
Chinese (zh)
Other versions
CN116416261A (en)
Inventor
葛荣骏
徐颖
张道强
Current Assignee
Nanjing University of Aeronautics and Astronautics
Original Assignee
Nanjing University of Aeronautics and Astronautics
Priority date
Filing date
Publication date
Application filed by Nanjing University of Aeronautics and Astronautics filed Critical Nanjing University of Aeronautics and Astronautics
Priority to CN202310682299.2A
Publication of CN116416261A
Application granted
Publication of CN116416261B


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/11 Region-based segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0499 Feedforward networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T11/00 2D [Two Dimensional] image generation
    • G06T11/003 Reconstruction from projections, e.g. tomography
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformation in the plane of the image
    • G06T3/40 Scaling the whole image or part thereof
    • G06T3/4007 Interpolation-based scaling, e.g. bilinear interpolation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformation in the plane of the image
    • G06T3/40 Scaling the whole image or part thereof
    • G06T3/4046 Scaling the whole image or part thereof using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformation in the plane of the image
    • G06T3/40 Scaling the whole image or part thereof
    • G06T3/4053 Super resolution, i.e. output image resolution higher than sensor resolution
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10072 Tomographic images
    • G06T2207/10081 Computed x-ray tomography [CT]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The application discloses a CT image super-resolution segmentation method assisted by super-resolution reconstruction, comprising the following steps: downsampling the original CT image by a factor of 4 using a bicubic algorithm; inputting the resulting low-resolution image I_lr into an encoder and performing super-resolution reconstruction and super-resolution segmentation through two independent decoder branches; extracting multi-scale features using the multi-scale fusion module MSFB and passing the intermediate features of the encoder to the decoders; fusing the intermediate features of the encoder and decoders using the dual-channel attention module DCAB; and optimizing the above model through a loss function.

Description

CT image super-resolution segmentation method assisted by super-resolution reconstruction
Technical Field
The application belongs to the technical field of medical image processing, relates to the technical field of CT image reconstruction and segmentation, and particularly relates to a CT image super-resolution segmentation method assisted by super-resolution reconstruction.
Background
In clinical diagnosis, super-resolution (SR) segmentation of medical scan images is important. Existing segmentation techniques aim to segment regions of interest, such as vital organs or infected regions, from medical images, thereby obtaining important information about the size, shape and location of those regions. For some anatomically complex regions, however, a segmentation mask at the original resolution may not accurately delineate the segmented region, so an SR segmentation method is needed to predict a high-resolution segmentation mask from a low-resolution CT image. Yet a low-resolution image contains limited detail information, insufficient to support the prediction of a precise high-resolution segmentation mask. We therefore consider using SR reconstruction techniques to predict the corresponding high-resolution CT from a low-resolution CT, so that low-level features restored during reconstruction, such as texture and edges, aid the prediction of the high-resolution segmentation mask.
Existing methods that use SR reconstruction to assist SR segmentation fall mainly into two categories. The first treats SR reconstruction as an image preprocessing step, which ignores the correlation and complementarity between the SR reconstruction and SR segmentation tasks: not only can the detail information from the SR reconstruction process help the SR segmentation process generate a more accurate segmentation mask, but the abstract semantic information provided by SR segmentation can also guide SR reconstruction to generate texture details that better match the real distribution. The second combines the SR reconstruction model and the SR segmentation model in a serial manner, which allows the two processes to interact and adjust to each other, but the interaction remains insufficient; moreover, the serial approach can lead to an accumulation of errors. There is thus still no method that effectively combines SR reconstruction and SR segmentation so that the two processes promote each other.
Disclosure of Invention
The application aims to provide a CT image super-resolution segmentation method assisted by super-resolution reconstruction. The method exploits the complementarity between the SR reconstruction and SR segmentation tasks: detail features such as textures and edges restored during SR reconstruction help the SR segmentation process predict a more accurate segmentation mask, while abstract semantic features extracted during SR segmentation guide the SR reconstruction process to generate texture details that better match the real distribution. In addition, considering that the size of regions of interest in CT images varies greatly, the method uses multi-scale large-kernel convolutions to extract multi-scale features, further improving reconstruction and segmentation performance.
The technical scheme is as follows: in order to achieve the above object, the present application provides a method for CT image super-resolution segmentation with super-resolution reconstruction assistance, comprising the steps of:
S1: downsample the original CT image by a factor of 4 using a bicubic interpolation algorithm to obtain the low-resolution image I_lr; use the original CT image I_hr and the segmentation label I_mask as the super-resolution reconstruction label and the super-resolution segmentation label, respectively;
S2: input the low-resolution image I_lr from step S1 into an encoder and perform super-resolution reconstruction and super-resolution segmentation through two independent decoder branches;
S3: extract multi-scale features using the multi-scale fusion module (MSFB) and pass the intermediate features generated during the encoding in step S2 to the decoders;
S4: fuse the intermediate features of the encoder and decoders in step S2 using the dual-channel attention module (DCAB);
S5: optimize the above model through a loss function.
The scheme fully utilizes the complementarity between the SR reconstruction and SR segmentation tasks: detail features such as textures and edges restored during SR reconstruction help the SR segmentation process predict a more accurate segmentation mask, while abstract semantic features extracted during SR segmentation guide the SR reconstruction process to generate texture details that better match the real distribution.
Further, in step S2, features are extracted by a common encoder, and features matching each task are then extracted by independent decoder branches. Specifically, the intermediate features of the reconstruction-branch decoder contain more low-semantic detail features such as edges and textures, while the intermediate features of the segmentation-branch decoder contain more abstract high-level features. The method is as follows: a given input I_lr is first encoded by three serial convolution modules and downsampling layers; each convolution module comprises two 3×3 convolution-ReLU-BN layers, and each downsampling layer is max-pooling. I_lr is processed by the first convolution module to obtain the first encoding feature F_enc1; this feature passes through a downsampling layer and the second convolution module to generate the second encoding feature F_enc2; continuing likewise yields the third encoding feature F_enc3 and the bottleneck feature F_bot. The obtained F_bot is sent to the SR reconstruction branch decoder and the SR segmentation branch decoder; the two decoders are identical in structure, each comprising three serial upsampling layers and convolution modules, with bilinear interpolation used for upsampling. Taking the SR segmentation branch decoder as an example, after F_bot is input to the decoder, an upsampling layer and a convolution module produce the third segmentation feature F_seg3; continuing in this way yields the second segmentation feature F_seg2 and the first segmentation feature F_seg1, and likewise the third, second and first reconstruction features F_recon3, F_recon2 and F_recon1. The common encoder exploits the correlation and complementarity between the reconstruction and segmentation tasks to perform a preliminary fusion of the reconstruction and segmentation features, while the independent decoder branches account for the differences between tasks and avoid negative interference between them.
Further, in step S3, the MSFB modules fuse the features in each layer of the encoder, extract multi-scale features, and send the results to the decoders; each branch contains three parallel MSFB modules. Specifically, the first MSFB module of the SR segmentation branch interpolates F_enc1, F_enc2 and F_enc3 to the same size and concatenates them into the concatenated feature F_cat. Three parallel large-kernel convolutions with kernel sizes 3×3, 9×9 and 21×21 extract the first multi-scale segmentation residual R_msseg1, and a feed-forward network (FFN) further adjusts R_msseg1 to obtain the first segmentation residual R_seg1. R_seg1 is combined with F_seg1 by concatenation to obtain a new F_seg1, which is input to the subsequent modules of the segmentation decoder. By analogy, the remaining MSFB modules extract from F_cat the second segmentation residual R_seg2, the third segmentation residual R_seg3, the first reconstruction residual R_recon1, the second reconstruction residual R_recon2 and the third reconstruction residual R_recon3, which are concatenated with the corresponding decoder intermediate features. These residual features contain multi-scale information, which enables the model to better cope with the large size variation of organs and lesion regions in medical images and supplements the information lost by the encoder during downsampling.
Further, the MSFB module in step S3 uses large-kernel convolutions, each of which we decompose into three smaller serial convolutions. A K×K convolution is decomposed into three parts: a (2d-1)×(2d-1) depthwise convolution (DWconv), a ⌈K/d⌉×⌈K/d⌉ depthwise dilation convolution (DWDconv) with dilation d, and a pointwise convolution (PWconv). In the method, to realize a 9×9 large-kernel convolution, F_cat passes through a 3×3 DWconv, a 5×5 DWDconv and a PWconv in sequence to obtain the first scale feature F_scale1; to realize a 21×21 large-kernel convolution, F_cat passes through a 5×5 DWconv, a 7×7 DWDconv and a PWconv in sequence to obtain the second scale feature F_scale2; the 3×3 convolution is not decomposed, and F_cat passes directly through a 3×3 convolution to obtain the third scale feature F_scale3. F_scale1, F_scale2 and F_scale3 are concatenated and fused, and the result is sent to the subsequent modules of the MSFB. Large-kernel convolutions give the model a larger receptive field and thus improve performance, but their large computational cost hinders deployment; by decomposing each large-kernel convolution into three serial convolutions, the method effectively reduces this overhead.
Further, step S4 uses the DCAB module to fuse the features of the reconstruction and segmentation branches through a cross-attention mechanism. F_seg3, F_recon3 and I_lr are each adjusted by a convolution module to obtain the segmentation fusion feature F_fuse_seg, the reconstruction fusion feature F_fuse_recon and the input fusion feature F_fuse_in. To supplement the segmentation fusion feature with detail information while avoiding a negative influence on the reconstruction feature, F_fuse_seg and F_fuse_in are added to obtain a new segmentation fusion feature F_fuse_seg. F_fuse_seg and F_fuse_recon are each mapped into the segmentation query feature Q_seg, segmentation key feature K_seg and segmentation value feature V_seg, and the reconstruction query feature Q_recon, reconstruction key feature K_recon and reconstruction value feature V_recon. These features are fused by a cross-attention operation in which each branch's query attends to the other branch's keys and values, and the fusion results pass through a locally-enhanced feed-forward network (LEFF) to obtain the new segmentation fusion feature F_fuse_seg and reconstruction fusion feature F_fuse_recon. The process can be expressed as:

F_fuse_seg = LEFF(softmax(Q_seg · K_recon^T / √d) · V_recon),
F_fuse_recon = LEFF(softmax(Q_recon · K_seg^T / √d) · V_seg),

where d denotes the feature dimension and LEFF denotes the locally-enhanced feed-forward network. To ensure feature stability, F_fuse_seg and F_fuse_recon are added to F_seg3 and F_recon3 respectively to obtain new F_seg3 and F_recon3. This module fully exploits the complementarity between the reconstruction and segmentation features: detail features such as edges and textures in the reconstruction features supplement the low-resolution input and help the segmentation branch predict the high-resolution segmentation mask more accurately, while abstract semantic features generated during segmentation guide the reconstruction branch to generate more realistic detail features.
Further, in step S5, F_seg3 and F_recon3 are each upsampled by PixelShuffle, and loss functions are computed against the respective labels. The loss function of the SR segmentation task comprises a cross-entropy loss and a dice loss, and the loss function of the SR reconstruction task comprises an L1 loss; to balance the loss functions of the two tasks, a dynamic adjustment mechanism is used. The losses are expressed as:

L_seg = L_ce(I_seg, I_mask) + L_dice(I_seg, I_mask),
L_recon = L_1(I_sr, I_hr),
L_total = w_recon · L_recon + w_seg · L_seg,

where I_seg and I_sr denote the final results of PixelShuffle upsampling of F_seg3 and F_recon3 respectively, I_hr and I_mask denote the ground-truth super-resolution reconstruction label and super-resolution segmentation label, L_1 denotes the L1 loss, L_ce the cross-entropy loss, L_dice the dice loss, and L_recon and L_seg the reconstruction and segmentation losses. To balance the two tasks, dynamically varying scaling factors w_recon and w_seg are computed from L_seg and L_recon and used to weight the two losses, yielding the final loss function L_total.
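For illustration, the loss computation described above can be sketched in PyTorch as follows. The PixelShuffle heads and the cross-entropy, dice and L1 terms follow the text; the exact form of the dynamic scaling factors is not given, so a loss-magnitude-based weighting is assumed here, and all names and shapes are illustrative:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def dice_loss(logits, target_onehot, eps=1e-6):
    """Soft dice loss over class probabilities (target is one-hot, float)."""
    probs = logits.softmax(dim=1)
    inter = (probs * target_onehot).sum(dim=(2, 3))
    union = probs.sum(dim=(2, 3)) + target_onehot.sum(dim=(2, 3))
    return (1 - (2 * inter + eps) / (union + eps)).mean()

def total_loss(seg_logits, mask_onehot, I_sr, I_hr):
    L_seg = F.cross_entropy(seg_logits, mask_onehot.argmax(1)) \
            + dice_loss(seg_logits, mask_onehot)          # L_ce + L_dice
    L_recon = F.l1_loss(I_sr, I_hr)                       # L_1
    # Assumed dynamic weighting: each loss is scaled by the relative
    # magnitude of the other (detached so the weights carry no gradient).
    s, r = L_seg.detach(), L_recon.detach()
    w_seg, w_recon = r / (s + r), s / (s + r)
    return w_seg * L_seg + w_recon * L_recon              # L_total

# PixelShuffle (x4) heads: a conv expands channels 16-fold, then pixels are
# rearranged; 5 classes and 32 decoder channels are assumed for illustration.
seg_head = nn.Sequential(nn.Conv2d(32, 5 * 16, 3, padding=1), nn.PixelShuffle(4))
sr_head  = nn.Sequential(nn.Conv2d(32, 1 * 16, 3, padding=1), nn.PixelShuffle(4))

F_seg3, F_recon3 = torch.randn(1, 32, 64, 64), torch.randn(1, 32, 64, 64)
I_seg_logits = seg_head(F_seg3)    # (1, 5, 256, 256)
I_sr = sr_head(F_recon3)           # (1, 1, 256, 256)
I_hr = torch.randn(1, 1, 256, 256)
I_mask = F.one_hot(torch.randint(0, 5, (1, 256, 256)), 5).permute(0, 3, 1, 2).float()
loss = total_loss(I_seg_logits, I_mask, I_sr, I_hr)
```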
The beneficial effects are that: compared with the prior art, the application has the following advantages:
1. The application provides a CT image super-resolution segmentation method assisted by super-resolution reconstruction, exploiting the complementarity between super-resolution reconstruction and super-resolution segmentation;
2. The application uses parallel large-kernel convolutions to fuse the features of each encoder layer and extracts multi-scale features from them, so as to better handle the large size differences between organs in medical images;
3. The application provides a DCAB module that uses cross-attention operations to effectively fuse SR reconstruction features and SR segmentation features, improving the performance of both tasks;
4. The application uses dynamic weights to balance the reconstruction and segmentation tasks, dynamically adjusting the loss function.
Drawings
FIG. 1 is a schematic flow chart of the super-resolution-reconstruction-assisted CT image super-resolution segmentation model;
FIG. 2 is an overall framework diagram of the super-resolution-reconstruction-assisted CT image super-resolution segmentation model provided by the application;
FIG. 3 is a schematic diagram of the topology of the dual-channel attention module (DCAB) according to the application;
FIG. 4 is a comparison of super-resolution segmentation results;
FIG. 5 is a comparison of super-resolution reconstruction results.
Detailed Description
The present application is further illustrated by the accompanying drawings and the detailed description below, which should be understood as merely illustrative of the application and not limiting its scope; various equivalent modifications of the application that occur to those skilled in the art upon reading the application fall within the scope defined by the appended claims.
Examples: the super-resolution segmentation and super-resolution reconstruction tasks are strongly correlated and complementary. The super-resolution reconstruction task gradually restores detail features during reconstruction, supplements the limited detail information in the input low-resolution CT image, and helps the super-resolution segmentation task predict the segmentation mask more accurately; super-resolution segmentation extracts abstract semantic information and guides the reconstruction process to generate texture details that better match the real distribution. The method uses two parallel branches to handle the reconstruction and segmentation tasks simultaneously and designs dedicated fusion modules to effectively fuse the intermediate features of the different branches, so that the reconstruction and segmentation tasks promote each other.
The application comprises the following steps:
S1: downsample the original CT image by a factor of 4 using a bicubic interpolation algorithm to obtain the low-resolution image I_lr; use the original CT image I_hr and the segmentation label I_mask as the super-resolution reconstruction label and the super-resolution segmentation label, respectively.
The dataset used contains CT images and their corresponding segmentation labels. To meet the requirements of the method, we downsample each CT image by a factor of 4 with the bicubic interpolation algorithm from traditional image processing and take the result as the low-resolution input of the model.
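For illustration, step S1 can be sketched in PyTorch as follows; this is a minimal sketch, and the function name and tensor shapes are assumptions for illustration rather than details from the patent:

```python
import torch
import torch.nn.functional as F

def make_lr_input(ct_hr: torch.Tensor, scale: int = 4) -> torch.Tensor:
    """Downsample a CT tensor (B, 1, H, W) by `scale` with bicubic interpolation."""
    h, w = ct_hr.shape[-2:]
    return F.interpolate(ct_hr, size=(h // scale, w // scale),
                         mode="bicubic", align_corners=False)

I_hr = torch.randn(1, 1, 256, 256)  # toy high-resolution CT slice (SR reconstruction label)
I_lr = make_lr_input(I_hr)          # (1, 1, 64, 64) low-resolution model input
```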
S2: the low resolution image in step S1The encoder is input and super-resolution reconstruction and super-resolution segmentation are performed by two independent decoder branches, respectively.
As shown in fig. 2, the method extracts features with a common encoder, performs a preliminary fusion of the SR reconstruction and SR segmentation features, and then extracts features fitting each task through independent decoder branches. Specifically, the intermediate features of the reconstruction-branch decoder contain more low-semantic detail features such as edges and textures, while the intermediate features of the segmentation-branch decoder contain more abstract high-level features. A given input I_lr is first encoded by three serial convolution modules and downsampling layers; each convolution module comprises two 3×3 convolution-ReLU-BN layers, and each downsampling layer is max-pooling. I_lr is processed by the first convolution module to obtain the first encoding feature F_enc1; this feature passes through a downsampling layer and the second convolution module to generate the second encoding feature F_enc2; continuing likewise yields the third encoding feature F_enc3 and the bottleneck feature F_bot. F_bot is sent to the SR reconstruction branch decoder and the SR segmentation branch decoder. The two decoders are identical in structure, each comprising three serial upsampling layers and convolution modules, with bilinear interpolation used for upsampling. Taking the SR segmentation branch decoder as an example, after F_bot is input to the decoder, an upsampling layer and a convolution module produce the third segmentation feature F_seg3; continuing in this way yields the second segmentation feature F_seg2 and the first segmentation feature F_seg1, and likewise the third, second and first reconstruction features F_recon3, F_recon2 and F_recon1. The common encoder exploits the correlation and complementarity between the reconstruction and segmentation tasks to perform a preliminary fusion of the reconstruction and segmentation features, while the independent decoder branches account for the differences between tasks and avoid negative interference between them.
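A minimal PyTorch sketch of this shared-encoder, dual-decoder backbone is given below. The channel widths and module names are assumptions for illustration, and the MSFB and DCAB fusion paths described in steps S3 and S4 are omitted so that only the data flow above is shown:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def conv_block(cin, cout):
    """Convolution module: two (3x3 conv -> ReLU -> BN) layers, per the text."""
    layers = []
    for c_in, c_out in [(cin, cout), (cout, cout)]:
        layers += [nn.Conv2d(c_in, c_out, 3, padding=1),
                   nn.ReLU(inplace=True),
                   nn.BatchNorm2d(c_out)]
    return nn.Sequential(*layers)

class SharedEncoderDualDecoder(nn.Module):
    """Common encoder with independent SR-segmentation and SR-reconstruction decoders."""
    def __init__(self, ch=(32, 64, 128, 256)):
        super().__init__()
        self.enc1, self.enc2 = conv_block(1, ch[0]), conv_block(ch[0], ch[1])
        self.enc3, self.bot = conv_block(ch[1], ch[2]), conv_block(ch[2], ch[3])
        self.pool = nn.MaxPool2d(2)
        def decoder():  # three serial (bilinear upsample + conv module) stages
            return nn.ModuleList([conv_block(ch[3], ch[2]),
                                  conv_block(ch[2], ch[1]),
                                  conv_block(ch[1], ch[0])])
        self.dec_seg, self.dec_recon = decoder(), decoder()

    def forward(self, x):
        f1 = self.enc1(x)              # F_enc1
        f2 = self.enc2(self.pool(f1))  # F_enc2
        f3 = self.enc3(self.pool(f2))  # F_enc3
        fb = self.bot(self.pool(f3))   # F_bot, sent to both decoders
        outs = {}
        for name, dec in [("seg", self.dec_seg), ("recon", self.dec_recon)]:
            h, feats = fb, []
            for stage in dec:
                h = F.interpolate(h, scale_factor=2, mode="bilinear",
                                  align_corners=False)
                h = stage(h)
                feats.append(h)        # F_*3, F_*2, F_*1 in turn
            outs[name] = feats
        return outs

outs = SharedEncoderDualDecoder()(torch.randn(1, 1, 64, 64))
F_seg3, F_seg2, F_seg1 = outs["seg"]   # segmentation decoder intermediate features
```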
S3: the multi-scale features are extracted using a multi-scale fusion Module (MSFB) and the intermediate features generated during the encoding in step S2 are passed to a decoder.
As shown in fig. 2, the method uses the MSFB modules to fuse the features from each encoder layer, extract multi-scale features, and send the results to the decoders; each branch contains three parallel MSFB modules. The first MSFB module of the SR segmentation branch interpolates F_enc1, F_enc2 and F_enc3 to the same size and concatenates them into the concatenated feature F_cat. Three parallel large-kernel convolutions with kernel sizes 3×3, 9×9 and 21×21 extract the first multi-scale segmentation residual R_msseg1, and a feed-forward network (FFN) further adjusts R_msseg1 to obtain the first segmentation residual R_seg1. R_seg1 is combined with F_seg1 by concatenation to obtain a new F_seg1, which is input to the subsequent modules of the segmentation decoder. By analogy, the remaining MSFB modules extract from F_cat the second segmentation residual R_seg2, the third segmentation residual R_seg3, the first reconstruction residual R_recon1, the second reconstruction residual R_recon2 and the third reconstruction residual R_recon3, which are concatenated with the corresponding decoder intermediate features. These residual features contain multi-scale information, which enables the model to better cope with the large size variation of organs and lesion regions in medical images and supplements the information lost by the encoder during downsampling.
The MSFB module in step S3 uses large-kernel convolutions, each decomposed into three smaller serial convolutions. A K×K convolution is decomposed into three parts: a (2d-1)×(2d-1) depthwise convolution (DWconv), a ⌈K/d⌉×⌈K/d⌉ depthwise dilation convolution (DWDconv) with dilation d, and a pointwise convolution (PWconv). In the method, to realize a 9×9 large-kernel convolution, F_cat passes through a 3×3 DWconv, a 5×5 DWDconv and a PWconv in sequence to obtain the first scale feature F_scale1; to realize a 21×21 large-kernel convolution, F_cat passes through a 5×5 DWconv, a 7×7 DWDconv and a PWconv in sequence to obtain the second scale feature F_scale2; the 3×3 convolution is not decomposed, and F_cat passes directly through a 3×3 convolution to obtain the third scale feature F_scale3. F_scale1, F_scale2 and F_scale3 are concatenated and fused, and the result is sent to the subsequent modules of the MSFB. Large-kernel convolutions give the model a larger receptive field and thus improve performance, but their large computational cost hinders deployment; by decomposing each large-kernel convolution into three serial convolutions, the method effectively reduces this overhead.
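The decomposition is straightforward to express in PyTorch. The sketch below follows the kernel pairings named above (3×3 DWconv + 5×5 DWDconv for the 9×9 path, 5×5 DWconv + 7×7 DWDconv for the 21×21 path); the class name, channel count and dilation values are assumptions for illustration:

```python
import torch
import torch.nn as nn

class LargeKernelConv(nn.Module):
    """K x K large-kernel convolution decomposed into a (2d-1)x(2d-1) depthwise
    conv, a ceil(K/d) x ceil(K/d) depthwise dilated conv (dilation d), and a
    1x1 pointwise conv."""
    def __init__(self, channels, dw_k, dwd_k, dilation):
        super().__init__()
        self.dw = nn.Conv2d(channels, channels, dw_k,
                            padding=dw_k // 2, groups=channels)
        self.dwd = nn.Conv2d(channels, channels, dwd_k,
                             padding=(dwd_k // 2) * dilation,
                             dilation=dilation, groups=channels)
        self.pw = nn.Conv2d(channels, channels, 1)

    def forward(self, x):
        return self.pw(self.dwd(self.dw(x)))

c = 64
lk9   = LargeKernelConv(c, dw_k=3, dwd_k=5, dilation=2)  # 9x9-equivalent path
lk21  = LargeKernelConv(c, dw_k=5, dwd_k=7, dilation=3)  # 21x21-equivalent path
conv3 = nn.Conv2d(c, c, 3, padding=1)                    # undecomposed 3x3 path
fuse  = nn.Conv2d(3 * c, c, 1)                           # fuse the three scales

F_cat = torch.randn(1, c, 64, 64)
R_ms = fuse(torch.cat([conv3(F_cat), lk9(F_cat), lk21(F_cat)], dim=1))
# R_ms would then be adjusted by an FFN and concatenated with the
# corresponding decoder feature, as described above.
```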
S4: the intermediate features of the encoder and decoder in step S2 are fused using a two-channel attention module (DCAB).
As shown in fig. 3, the DCAB module fuses the features of the reconstruction and segmentation branches using a cross-attention mechanism.
F_seg3, F_recon3 and I_lr are each adjusted by a convolution module to obtain the segmentation fusion feature F_fuse_seg, the reconstruction fusion feature F_fuse_recon and the input fusion feature F_fuse_in. To supplement the segmentation fusion feature with detail information while avoiding a negative influence on the reconstruction feature, F_fuse_seg and F_fuse_in are added to obtain a new segmentation fusion feature F_fuse_seg. F_fuse_seg and F_fuse_recon are each mapped into the segmentation query feature Q_seg, segmentation key feature K_seg and segmentation value feature V_seg, and the reconstruction query feature Q_recon, reconstruction key feature K_recon and reconstruction value feature V_recon. These features are fused by a cross-attention operation in which each branch's query attends to the other branch's keys and values, and the fusion results pass through a locally-enhanced feed-forward network (LEFF) to obtain the new segmentation fusion feature F_fuse_seg and reconstruction fusion feature F_fuse_recon. The process can be expressed as:

F_fuse_seg = LEFF(softmax(Q_seg · K_recon^T / √d) · V_recon),
F_fuse_recon = LEFF(softmax(Q_recon · K_seg^T / √d) · V_seg),

where d denotes the feature dimension and LEFF denotes the locally-enhanced feed-forward network. To ensure feature stability, F_fuse_seg and F_fuse_recon are added to F_seg3 and F_recon3 respectively to obtain new F_seg3 and F_recon3. This module fully exploits the complementarity between the reconstruction and segmentation features: detail features such as edges and textures in the reconstruction features supplement the low-resolution input and help the segmentation branch predict the high-resolution segmentation mask more accurately, while abstract semantic features generated during segmentation guide the reconstruction branch to generate more realistic detail features.
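A minimal sketch of this cross-attention fusion is given below. It assumes the pairing in which each branch's query attends to the other branch's keys and values; the convolutional adjustment of F_seg3, F_recon3 and I_lr and the residual additions are omitted, and a plain MLP stands in for the LEFF:

```python
import torch
import torch.nn as nn

class DualCrossAttention(nn.Module):
    """Cross-attention between segmentation and reconstruction fusion features
    (token sequences of shape (B, N, C))."""
    def __init__(self, dim):
        super().__init__()
        self.qkv_seg = nn.Linear(dim, dim * 3)
        self.qkv_rec = nn.Linear(dim, dim * 3)
        self.scale = dim ** -0.5
        # Plain MLPs stand in for the locally-enhanced feed-forward network (LEFF).
        self.leff_seg = nn.Sequential(nn.Linear(dim, dim * 2), nn.GELU(),
                                      nn.Linear(dim * 2, dim))
        self.leff_rec = nn.Sequential(nn.Linear(dim, dim * 2), nn.GELU(),
                                      nn.Linear(dim * 2, dim))

    def forward(self, f_seg, f_rec):
        q_s, k_s, v_s = self.qkv_seg(f_seg).chunk(3, dim=-1)
        q_r, k_r, v_r = self.qkv_rec(f_rec).chunk(3, dim=-1)
        # Each branch's query attends to the other branch's keys/values.
        a_s = torch.softmax(q_s @ k_r.transpose(-2, -1) * self.scale, dim=-1)
        a_r = torch.softmax(q_r @ k_s.transpose(-2, -1) * self.scale, dim=-1)
        return self.leff_seg(a_s @ v_r), self.leff_rec(a_r @ v_s)

dcab = DualCrossAttention(dim=32)
f_seg = torch.randn(1, 64 * 64, 32)    # flattened F_fuse_seg tokens
f_rec = torch.randn(1, 64 * 64, 32)    # flattened F_fuse_recon tokens
new_seg, new_rec = dcab(f_seg, f_rec)  # added back to F_seg3 / F_recon3 for stability
```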
S5: the model described above is optimized by a loss function.
To demonstrate the effectiveness of the present application, the present application also provides the following comparative experiments:
Specifically, the application uses the public SegTHOR dataset, a thoracic multi-organ segmentation dataset containing segmentation labels for 4 organs: heart, aorta, trachea and esophagus. The dataset contains CT scans of 40 patients; we randomly selected 28 as the training set, 4 as the validation set and 8 as the test set. Before training, we clip the HU values to [-128, 384] and pre-process the data as described in step S1. Training uses the AdamW optimizer with an initial learning rate of 0.001 for 150 epochs in total.
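The HU clipping step can be sketched as follows; rescaling to [0, 1] after clipping is an assumed detail, since the text specifies only the clipping range:

```python
import numpy as np

def preprocess_ct(volume_hu: np.ndarray) -> np.ndarray:
    """Clip HU values to [-128, 384]; the subsequent rescale to [0, 1] is assumed."""
    v = np.clip(volume_hu, -128.0, 384.0)
    return (v + 128.0) / 512.0
```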
To verify the effectiveness of the method in both reconstruction and segmentation, we compare it with the best-performing current segmentation and reconstruction algorithms. On the segmentation task, the results of the method are compared with CPFNet, KiU-Net, UNet++ and UCTransNet using the Dice and HD95 metrics; the comparison results are shown in Table 1, where the method shows a clear improvement in segmentation performance over the other methods. On the reconstruction task, the results are compared with RDN, EDSR and NSLA using the PSNR and SSIM metrics; the comparison results are shown in Table 2.
Table 1: comparison of the method with other algorithms on the segmentation task; the bolded data represent the best results. (Eso: esophagus; Hea: heart; Tra: trachea; Aor: aorta)
Table 2: comparison of the method with other algorithms on the reconstruction task; the bolded data represent the best results.
To demonstrate the effectiveness of the method intuitively, we compare its visual results with those of the other methods. FIG. 4 shows the segmentation results of the various methods; the method segments the organs more accurately than the others. FIG. 5 shows the reconstruction results of the various methods; as shown, the method accurately restores the unclear boundary between the esophagus and the aorta, thanks to the semantic information provided by the segmentation process.

Claims (4)

1. The CT image super-resolution segmentation method assisted by super-resolution reconstruction is characterized by comprising the following steps of:
S1: downsampling the original CT image by a factor of 4 using a bicubic interpolation algorithm to obtain a low-resolution image I_lr, and using the original CT image and the segmentation label as the true super-resolution reconstruction label I_hr and the super-resolution segmentation label I_mask, respectively;
S2: the low resolution image I in step S1 lr Inputting the super-resolution reconstruction and super-resolution segmentation into an encoder through two independent decoder branches; in step S2, features are extracted by using a common encoder, the SR reconstruction features and the SR segmentation features are primarily fused, and then features matching with respective tasks are respectively extracted by independent decoder branches, which is specifically as follows: for a given input I lr Firstly, the method comprises the steps of encoding the data through three serial convolution modules and a downsampling layer, wherein the convolution modules comprise 2 layers of 3 multiplied by 3 convolution-Relu activation function-BN standardization layers, the downsampling layer is the maximum pooling downsampling layer, and I lr Processing by a first convolution module to obtain a first coding feature F encl The feature is passed through a downsampling layer and a second convolution module to generate a second encoded feature F enc2 And so on, respectively obtaining third coding features F enc3 And bottleneck feature F bot F to be obtained bot The method comprises the steps of sending the data to an SR reconstruction branch decoder and an SR segmentation branch decoder, wherein the two decoders have the same structure and comprise three serial up-sampling layers and a convolution module, and the up-sampling layers use a bilinear interpolation method;
S3: extracting multi-scale features using the multi-scale fusion module MSFB and passing the intermediate features generated during the encoding in step S2 to the decoders;
S4: fusing the intermediate features of the encoder and decoders in step S2 using the dual-channel attention module DCAB;
S5: optimizing through a loss function; wherein in step S5, F_seg3 and F_recon3 are each upsampled by PixelShuffle and loss functions are computed against the respective labels, the loss function of the SR segmentation task comprising a cross-entropy loss and a dice loss and the loss function of the SR reconstruction task comprising an L1 loss; to balance the loss functions of the two tasks, a dynamic adjustment mechanism is used, with the loss functions expressed as:

L_seg = L_ce(I_seg, I_mask) + L_dice(I_seg, I_mask),
L_recon = L_1(I_sr, I_hr),

wherein I_seg and I_sr respectively represent the final results of PixelShuffle upsampling of F_seg3 and F_recon3, I_hr and I_mask represent the true super-resolution reconstruction label and super-resolution segmentation label, L_1 represents the computed L1 loss, L_ce represents the computed cross-entropy loss, L_dice represents the computed dice loss, and L_recon and L_seg represent the computed reconstruction loss and segmentation loss respectively; to balance the loss functions of the two tasks, dynamically varying scaling factors are computed using L_seg and L_recon and used to weight the two losses, yielding the final loss function L_total.
2. The super-resolution reconstruction-assisted CT image super-resolution segmentation method according to claim 1, wherein: in step S3, the MSFB modules fuse the features in each layer of the encoder, extract multi-scale features and send the results to the decoders, each branch comprising three parallel MSFB modules; specifically, the first MSFB module of the SR segmentation branch interpolates F_enc1, F_enc2 and F_enc3 to the same size and concatenates them to obtain the concatenated feature F_cat; three parallel large-kernel convolutions with kernel sizes 3×3, 9×9 and 21×21 extract a first multi-scale segmentation residual R_msseg1; a feed-forward network FFN further adjusts R_msseg1 to obtain a first segmentation residual R_seg1; R_seg1 is combined with F_seg1 by concatenation to obtain a new F_seg1, which is input to the subsequent modules of the segmentation decoder; and so on, the remaining MSFB modules extract from F_cat a second segmentation residual R_seg2, a third segmentation residual R_seg3, a first reconstruction residual R_recon1, a second reconstruction residual R_recon2 and a third reconstruction residual R_recon3, which are concatenated with the corresponding decoder intermediate features.
3. The super-resolution reconstruction-assisted CT image super-resolution segmentation method according to claim 2, wherein: the MSFB module in step S3 uses large-kernel convolutions, each of which is decomposed into three parts for a K×K convolution: a (2d-1)×(2d-1) depthwise convolution DWconv, a ⌈K/d⌉×⌈K/d⌉ depthwise dilation convolution DWDconv with dilation d, and a point-wise convolution PWconv; F_cat passes through a 3×3 DWconv, a 5×5 DWDconv and a PWconv in sequence to obtain a first scale feature F_scale1, realizing a 9×9 large-kernel convolution; F_cat passes through a 5×5 DWconv, a 7×7 DWDconv and a PWconv in sequence to obtain a second scale feature F_scale2, realizing a 21×21 large-kernel convolution; for the 3×3 convolution, F_cat directly passes through a 3×3 convolution to obtain a third scale feature F_scale3; F_scale1, F_scale2 and F_scale3 are concatenated and fused, and the result is sent to the subsequent modules of the MSFB.
4. The super-resolution reconstruction-assisted CT image super-resolution segmentation method according to claim 1, wherein: step S4 uses the DCAB module to fuse the features of the reconstruction and segmentation branches through a cross-attention mechanism; F_seg3, F_recon3 and I_lr are each adjusted by a convolution module to obtain a segmentation fusion feature F_fuse_seg, a reconstruction fusion feature F_fuse_recon and an input fusion feature F_fuse_in; F_fuse_seg and F_fuse_in are added to obtain a new segmentation fusion feature F_fuse_seg; F_fuse_seg and F_fuse_recon are each mapped into a segmentation query feature Q_seg, segmentation key feature K_seg and segmentation value feature V_seg, and a reconstruction query feature Q_recon, reconstruction key feature K_recon and reconstruction value feature V_recon; the above features are fused by a cross-attention operation, and the fusion results pass through a locally-enhanced feed-forward network LEFF to obtain a new segmentation fusion feature F_fuse_seg and reconstruction fusion feature F_fuse_recon; the process can be expressed as:

F_fuse_seg = LEFF(softmax(Q_seg · K_recon^T / √d) · V_recon),
F_fuse_recon = LEFF(softmax(Q_recon · K_seg^T / √d) · V_seg),

where d represents the feature dimension and LEFF represents the locally-enhanced feed-forward network; to ensure feature stability, F_fuse_seg and F_fuse_recon are respectively added to F_seg3 and F_recon3 to obtain a new F_seg3 and F_recon3.
CN202310682299.2A 2023-06-09 2023-06-09 CT image super-resolution segmentation method assisted by super-resolution reconstruction Active CN116416261B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310682299.2A CN116416261B (en) 2023-06-09 2023-06-09 CT image super-resolution segmentation method assisted by super-resolution reconstruction

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310682299.2A CN116416261B (en) 2023-06-09 2023-06-09 CT image super-resolution segmentation method assisted by super-resolution reconstruction

Publications (2)

Publication Number Publication Date
CN116416261A (en) 2023-07-11
CN116416261B 2023-09-12

Family

ID=87049598

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310682299.2A Active CN116416261B (en) 2023-06-09 2023-06-09 CT image super-resolution segmentation method assisted by super-resolution reconstruction

Country Status (1)

Country Link
CN (1) CN116416261B (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113657388A (en) * 2021-07-09 2021-11-16 北京科技大学 Image semantic segmentation method fusing image super-resolution reconstruction
WO2023098289A1 (en) * 2021-12-01 2023-06-08 浙江大学 Automatic unlabeled pancreas image segmentation system based on adversarial learning
CN114841859A (en) * 2022-04-28 2022-08-02 南京信息工程大学 Single-image super-resolution reconstruction method based on lightweight neural network and Transformer
CN115953494A (en) * 2023-03-09 2023-04-11 南京航空航天大学 Multi-task high-quality CT image reconstruction method based on low dose and super-resolution

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Yang Liutao et al. Low-Dose CT Denoising via Sinogram Inner-Structure Transformer. IEEE Transactions on Medical Imaging, vol. 42, no. 4, pp. 910-921. *

Also Published As

Publication number Publication date
CN116416261A (en) 2023-07-11

Similar Documents

Publication Publication Date Title
CN111563902A (en) Lung lobe segmentation method and system based on three-dimensional convolutional neural network
Chi et al. Computed tomography (CT) image quality enhancement via a uniform framework integrating noise estimation and super-resolution networks
CN116433697B (en) Abdominal multi-organ CT image segmentation method based on eye movement instrument
CN116433914A (en) Two-dimensional medical image segmentation method and system
CN112700460A (en) Image segmentation method and system
CN112150470A (en) Image segmentation method, image segmentation device, image segmentation medium, and electronic device
CN117058307A (en) Method, system, equipment and storage medium for generating heart three-dimensional nuclear magnetic resonance image
CN116128898A (en) Skin lesion image segmentation method based on transducer double-branch model
CN116823850A (en) Cardiac MRI segmentation method and system based on U-Net and transducer fusion improvement
CN111260670A (en) Tubular structure segmentation graph fracture repairing method and system of three-dimensional image based on deep learning network
CN114419277A (en) Image-based step-by-step generation type human body reconstruction method and device
CN116416261B (en) CT image super-resolution segmentation method assisted by super-resolution reconstruction
Zhou et al. High-resolution hierarchical adversarial learning for OCT speckle noise reduction
CN117292704A (en) Voice-driven gesture action generation method and device based on diffusion model
CN114708353B (en) Image reconstruction method and device, electronic equipment and storage medium
CN117333428A (en) CT image segmentation method, device, equipment and storage medium
CN116258730A (en) Semi-supervised medical image segmentation method based on consistency loss function
CN115984560A (en) Image segmentation method based on CNN and Transformer
Joshi et al. Efficient diffeomorphic image registration using multi-scale dual-phased learning
Han et al. Edge-directed single image super-resolution via cross-resolution sharpening function learning
Wei et al. SRP&PASMLP‐Net: Lightweight skin lesion segmentation network based on structural re‐parameterization and parallel axial shift multilayer perceptron
Yan et al. MRSNet: Joint consistent optic disc and cup segmentation based on large kernel residual convolutional attention and self-attention
CN109840888A (en) A kind of image super-resolution rebuilding method based on joint constraint
WO2022255025A1 (en) Region extraction model creation assistance device, method for operating region extraction model creation assistance device, and program for operating region extraction model creation assistance device
CN114299053A (en) Parallel multi-resolution coding and decoding network model and medical image segmentation method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant