CN116416261B - CT image super-resolution segmentation method assisted by super-resolution reconstruction

CT image super-resolution segmentation method assisted by super-resolution reconstruction

Info

Publication number
CN116416261B
CN116416261B (application CN202310682299.2A)
Authority
CN
China
Prior art keywords
segmentation
super
resolution
reconstruction
convolution
Prior art date
Legal status
Active
Application number
CN202310682299.2A
Other languages
Chinese (zh)
Other versions
CN116416261A (en)
Inventor
葛荣骏
徐颖
张道强
Current Assignee
Nanjing University of Aeronautics and Astronautics
Original Assignee
Nanjing University of Aeronautics and Astronautics
Priority date
Filing date
Publication date
Application filed by Nanjing University of Aeronautics and Astronautics filed Critical Nanjing University of Aeronautics and Astronautics
Priority to CN202310682299.2A
Publication of CN116416261A
Application granted
Publication of CN116416261B


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/11 Region-based segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0499 Feedforward networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T11/00 2D [Two Dimensional] image generation
    • G06T11/003 Reconstruction from projections, e.g. tomography
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformation in the plane of the image
    • G06T3/40 Scaling the whole image or part thereof
    • G06T3/4007 Interpolation-based scaling, e.g. bilinear interpolation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformation in the plane of the image
    • G06T3/40 Scaling the whole image or part thereof
    • G06T3/4046 Scaling the whole image or part thereof using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformation in the plane of the image
    • G06T3/40 Scaling the whole image or part thereof
    • G06T3/4053 Super resolution, i.e. output image resolution higher than sensor resolution
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10072 Tomographic images
    • G06T2207/10081 Computed x-ray tomography [CT]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The application discloses a CT image super-resolution segmentation method assisted by super-resolution reconstruction, comprising the following steps: downsampling the original CT image by a factor of 4 using a bicubic algorithm; inputting the resulting low-resolution image I_lr into an encoder and performing super-resolution reconstruction and super-resolution segmentation through two independent decoder branches; extracting multi-scale features using the multi-scale fusion module MSFB and passing the intermediate features of the encoder to the decoders; fusing the intermediate features of the encoder and decoders using the dual-channel attention module DCAB; and optimizing the above model through a loss function.

Description

CT image super-resolution segmentation method assisted by super-resolution reconstruction
Technical Field
The application belongs to the technical field of medical image processing, relates to the technical field of CT image reconstruction and segmentation, and particularly relates to a CT image super-resolution segmentation method assisted by super-resolution reconstruction.
Background
In clinical diagnosis, super-resolution (SR) segmentation of medical scan images is important. Existing segmentation techniques aim to segment regions of interest, such as vital organs or infected regions, from medical images, thereby obtaining important information about the size, shape and location of those regions. For some anatomically complex regions, however, a segmentation mask at the original resolution may not accurately delineate the segmented region, so an SR segmentation method is needed to predict a high-resolution segmentation mask from a low-resolution CT image. Yet a low-resolution image contains limited detail information, insufficient to support the prediction of a precise high-resolution segmentation mask. We therefore consider using SR reconstruction techniques to predict the corresponding high-resolution CT from a low-resolution CT, so that low-level features restored during reconstruction, such as texture and edges, aid the prediction of the high-resolution segmentation mask.
Existing methods that use SR reconstruction to assist SR segmentation fall mainly into two categories. The first treats SR reconstruction as an image preprocessing step, which ignores the correlation and complementarity between the SR reconstruction and SR segmentation tasks: not only can the detail information from the SR reconstruction process help the SR segmentation process generate a more accurate segmentation mask, but the abstract semantic information provided by SR segmentation can also guide SR reconstruction to generate texture details that better match the real distribution. The second combines the SR reconstruction model and the SR segmentation model in a serial manner, which allows the two processes to interact and adjust to each other, but the interaction remains insufficient; moreover, the serial approach can lead to an accumulation of errors. There is thus still no method that effectively combines SR reconstruction and SR segmentation so that the two processes promote each other.
Disclosure of Invention
The application aims to provide a CT image super-resolution segmentation method assisted by super-resolution reconstruction. The method exploits the complementarity between the SR reconstruction and SR segmentation tasks: detail features such as textures and edges restored during SR reconstruction help the SR segmentation process predict a more accurate segmentation mask, while abstract semantic features extracted during SR segmentation guide the SR reconstruction process to generate texture details that better match the real distribution. In addition, considering that the size of regions of interest in CT images varies greatly, the method uses multi-scale large-kernel convolutions to extract multi-scale features, further improving reconstruction and segmentation performance.
The technical scheme is as follows: in order to achieve the above object, the present application provides a method for CT image super-resolution segmentation with super-resolution reconstruction assistance, comprising the steps of:
S1: downsample the original CT image by a factor of 4 using a bicubic interpolation algorithm to obtain the low-resolution image I_lr; use the original CT image I_hr and the segmentation label I_mask as the super-resolution reconstruction label and the super-resolution segmentation label, respectively;
S2: input the low-resolution image I_lr from step S1 into an encoder and perform super-resolution reconstruction and super-resolution segmentation through two independent decoder branches;
S3: extract multi-scale features using the multi-scale fusion module (MSFB) and pass the intermediate features generated during the encoding in step S2 to the decoders;
S4: fuse the intermediate features of the encoder and decoders in step S2 using the dual-channel attention module (DCAB);
S5: optimize the above model through a loss function.
The scheme fully utilizes the complementarity between the SR reconstruction and SR segmentation tasks: detail features such as textures and edges restored during SR reconstruction help the SR segmentation process predict a more accurate segmentation mask, while abstract semantic features extracted during SR segmentation guide the SR reconstruction process to generate texture details that better match the real distribution.
Further, in step S2, features are extracted by a common encoder, and features matching each task are then extracted by independent decoder branches. Specifically, the intermediate features of the reconstruction-branch decoder contain more low-semantic detail features such as edges and textures, while the intermediate features of the segmentation-branch decoder contain more abstract high-level features. The method is as follows: a given input I_lr is first encoded by three serial convolution modules and downsampling layers; each convolution module comprises two 3×3 convolution-ReLU-BN layers, and each downsampling layer is max-pooling. I_lr is processed by the first convolution module to obtain the first encoding feature F_enc1; this feature passes through a downsampling layer and the second convolution module to generate the second encoding feature F_enc2; continuing likewise yields the third encoding feature F_enc3 and the bottleneck feature F_bot. The obtained F_bot is sent to the SR reconstruction branch decoder and the SR segmentation branch decoder; the two decoders are identical in structure, each comprising three serial upsampling layers and convolution modules, with bilinear interpolation used for upsampling. Taking the SR segmentation branch decoder as an example, after F_bot is input to the decoder, an upsampling layer and a convolution module produce the third segmentation feature F_seg3; continuing in this way yields the second segmentation feature F_seg2 and the first segmentation feature F_seg1, and likewise the third, second and first reconstruction features F_recon3, F_recon2 and F_recon1. The common encoder exploits the correlation and complementarity between the reconstruction and segmentation tasks to perform a preliminary fusion of the reconstruction and segmentation features, while the independent decoder branches account for the differences between tasks and avoid negative interference between them.
Further, in step S3, the MSFB modules fuse the features in each layer of the encoder, extract multi-scale features, and send the results to the decoders; each branch contains three parallel MSFB modules. Specifically, the first MSFB module of the SR segmentation branch interpolates F_enc1, F_enc2 and F_enc3 to the same size and concatenates them into the concatenated feature F_cat. Three parallel large-kernel convolutions with kernel sizes 3×3, 9×9 and 21×21 extract the first multi-scale segmentation residual R_msseg1, and a feed-forward network (FFN) further adjusts R_msseg1 to obtain the first segmentation residual R_seg1. R_seg1 is combined with F_seg1 by concatenation to obtain a new F_seg1, which is input to the subsequent modules of the segmentation decoder. By analogy, the remaining MSFB modules extract from F_cat the second segmentation residual R_seg2, the third segmentation residual R_seg3, the first reconstruction residual R_recon1, the second reconstruction residual R_recon2 and the third reconstruction residual R_recon3, which are concatenated with the corresponding decoder intermediate features. These residual features contain multi-scale information, which enables the model to better cope with the large size variation of organs and lesion regions in medical images and supplements the information lost by the encoder during downsampling.
Further, the MSFB module in step S3 uses large-kernel convolutions, each of which we decompose into three smaller serial convolutions. A K×K convolution is decomposed into three parts: a (2d-1)×(2d-1) depthwise convolution (DWconv), a ⌈K/d⌉×⌈K/d⌉ depthwise dilation convolution (DWDconv) with dilation d, and a pointwise convolution (PWconv). In the method, to realize a 9×9 large-kernel convolution, F_cat passes through a 3×3 DWconv, a 5×5 DWDconv and a PWconv in sequence to obtain the first scale feature F_scale1; to realize a 21×21 large-kernel convolution, F_cat passes through a 5×5 DWconv, a 7×7 DWDconv and a PWconv in sequence to obtain the second scale feature F_scale2; the 3×3 convolution is not decomposed, and F_cat passes directly through a 3×3 convolution to obtain the third scale feature F_scale3. F_scale1, F_scale2 and F_scale3 are concatenated and fused, and the result is sent to the subsequent modules of the MSFB. Large-kernel convolutions give the model a larger receptive field and thus improve performance, but their large computational cost hinders deployment; by decomposing each large-kernel convolution into three serial convolutions, the method effectively reduces this overhead.
Further, step S4 uses the DCAB module to fuse the features of the reconstruction and segmentation branches through a cross-attention mechanism. F_seg3, F_recon3 and I_lr are each adjusted by a convolution module to obtain the segmentation fusion feature F_fuse_seg, the reconstruction fusion feature F_fuse_recon and the input fusion feature F_fuse_in. To supplement the segmentation fusion feature with detail information while avoiding a negative influence on the reconstruction feature, F_fuse_seg and F_fuse_in are added to obtain a new segmentation fusion feature F_fuse_seg. F_fuse_seg and F_fuse_recon are each mapped into the segmentation query feature Q_seg, segmentation key feature K_seg and segmentation value feature V_seg, and the reconstruction query feature Q_recon, reconstruction key feature K_recon and reconstruction value feature V_recon. These features are fused by a cross-attention operation in which each branch's query attends to the other branch's keys and values, and the fusion results pass through a locally-enhanced feed-forward network (LEFF) to obtain the new segmentation fusion feature F_fuse_seg and reconstruction fusion feature F_fuse_recon. The process can be expressed as:

F_fuse_seg = LEFF(softmax(Q_seg · K_recon^T / √d) · V_recon),
F_fuse_recon = LEFF(softmax(Q_recon · K_seg^T / √d) · V_seg),

where d denotes the feature dimension and LEFF denotes the locally-enhanced feed-forward network. To ensure feature stability, F_fuse_seg and F_fuse_recon are added to F_seg3 and F_recon3 respectively to obtain new F_seg3 and F_recon3. This module fully exploits the complementarity between the reconstruction and segmentation features: detail features such as edges and textures in the reconstruction features supplement the low-resolution input and help the segmentation branch predict the high-resolution segmentation mask more accurately, while abstract semantic features generated during segmentation guide the reconstruction branch to generate more realistic detail features.
Further, in step S5, F_seg3 and F_recon3 are each upsampled by PixelShuffle, and loss functions are computed against the respective labels. The loss function of the SR segmentation task comprises a cross-entropy loss and a dice loss, and the loss function of the SR reconstruction task comprises an L1 loss; to balance the loss functions of the two tasks, a dynamic adjustment mechanism is used. The losses are expressed as:

L_seg = L_ce(I_seg, I_mask) + L_dice(I_seg, I_mask),
L_recon = L_1(I_sr, I_hr),
L_total = w_recon · L_recon + w_seg · L_seg,

where I_seg and I_sr denote the final results of PixelShuffle upsampling of F_seg3 and F_recon3 respectively, I_hr and I_mask denote the ground-truth super-resolution reconstruction label and super-resolution segmentation label, L_1 denotes the L1 loss, L_ce the cross-entropy loss, L_dice the dice loss, and L_recon and L_seg the reconstruction and segmentation losses. To balance the two tasks, dynamically varying scaling factors w_recon and w_seg are computed from L_seg and L_recon and used to weight the two losses, yielding the final loss function L_total.
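For illustration, the loss computation described above can be sketched in PyTorch as follows. The PixelShuffle heads and the cross-entropy, dice and L1 terms follow the text; the exact form of the dynamic scaling factors is not given, so a loss-magnitude-based weighting is assumed here, and all names and shapes are illustrative:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def dice_loss(logits, target_onehot, eps=1e-6):
    """Soft dice loss over class probabilities (target is one-hot, float)."""
    probs = logits.softmax(dim=1)
    inter = (probs * target_onehot).sum(dim=(2, 3))
    union = probs.sum(dim=(2, 3)) + target_onehot.sum(dim=(2, 3))
    return (1 - (2 * inter + eps) / (union + eps)).mean()

def total_loss(seg_logits, mask_onehot, I_sr, I_hr):
    L_seg = F.cross_entropy(seg_logits, mask_onehot.argmax(1)) \
            + dice_loss(seg_logits, mask_onehot)          # L_ce + L_dice
    L_recon = F.l1_loss(I_sr, I_hr)                       # L_1
    # Assumed dynamic weighting: each loss is scaled by the relative
    # magnitude of the other (detached so the weights carry no gradient).
    s, r = L_seg.detach(), L_recon.detach()
    w_seg, w_recon = r / (s + r), s / (s + r)
    return w_seg * L_seg + w_recon * L_recon              # L_total

# PixelShuffle (x4) heads: a conv expands channels 16-fold, then pixels are
# rearranged; 5 classes and 32 decoder channels are assumed for illustration.
seg_head = nn.Sequential(nn.Conv2d(32, 5 * 16, 3, padding=1), nn.PixelShuffle(4))
sr_head  = nn.Sequential(nn.Conv2d(32, 1 * 16, 3, padding=1), nn.PixelShuffle(4))

F_seg3, F_recon3 = torch.randn(1, 32, 64, 64), torch.randn(1, 32, 64, 64)
I_seg_logits = seg_head(F_seg3)    # (1, 5, 256, 256)
I_sr = sr_head(F_recon3)           # (1, 1, 256, 256)
I_hr = torch.randn(1, 1, 256, 256)
I_mask = F.one_hot(torch.randint(0, 5, (1, 256, 256)), 5).permute(0, 3, 1, 2).float()
loss = total_loss(I_seg_logits, I_mask, I_sr, I_hr)
```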
The beneficial effects are that: compared with the prior art, the application has the following advantages:
1. The application provides a CT image super-resolution segmentation method assisted by super-resolution reconstruction, exploiting the complementarity between super-resolution reconstruction and super-resolution segmentation;
2. The application uses parallel large-kernel convolutions to fuse the features of each encoder layer and extracts multi-scale features from them, so as to better handle the large size differences between organs in medical images;
3. The application provides a DCAB module that uses cross-attention operations to effectively fuse SR reconstruction features and SR segmentation features, improving the performance of both tasks;
4. The application uses dynamic weights to balance the reconstruction and segmentation tasks, dynamically adjusting the loss function.
Drawings
FIG. 1 is a schematic flow chart of the super-resolution-reconstruction-assisted CT image super-resolution segmentation model;
FIG. 2 is an overall framework diagram of the super-resolution-reconstruction-assisted CT image super-resolution segmentation model provided by the application;
FIG. 3 is a schematic diagram of the topology of the dual-channel attention module (DCAB) according to the application;
FIG. 4 is a comparison of super-resolution segmentation results;
FIG. 5 is a comparison of super-resolution reconstruction results.
Detailed Description
The present application is further illustrated by the accompanying drawings and the detailed description below, which should be understood as merely illustrative of the application and not limiting its scope; various equivalent modifications of the application that occur to those skilled in the art upon reading the application fall within the scope defined by the appended claims.
Examples: the super-resolution segmentation and super-resolution reconstruction tasks are strongly correlated and complementary. The super-resolution reconstruction task gradually restores detail features during reconstruction, supplements the limited detail information in the input low-resolution CT image, and helps the super-resolution segmentation task predict the segmentation mask more accurately; super-resolution segmentation extracts abstract semantic information and guides the reconstruction process to generate texture details that better match the real distribution. The method uses two parallel branches to handle the reconstruction and segmentation tasks simultaneously and designs dedicated fusion modules to effectively fuse the intermediate features of the different branches, so that the reconstruction and segmentation tasks promote each other.
The application comprises the following steps:
S1: downsample the original CT image by a factor of 4 using a bicubic interpolation algorithm to obtain the low-resolution image I_lr; use the original CT image I_hr and the segmentation label I_mask as the super-resolution reconstruction label and the super-resolution segmentation label, respectively.
The dataset used contains CT images and their corresponding segmentation labels. To meet the requirements of the method, we downsample each CT image by a factor of 4 with the bicubic interpolation algorithm from traditional image processing and take the result as the low-resolution input of the model.
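For illustration, step S1 can be sketched in PyTorch as follows; this is a minimal sketch, and the function name and tensor shapes are assumptions for illustration rather than details from the patent:

```python
import torch
import torch.nn.functional as F

def make_lr_input(ct_hr: torch.Tensor, scale: int = 4) -> torch.Tensor:
    """Downsample a CT tensor (B, 1, H, W) by `scale` with bicubic interpolation."""
    h, w = ct_hr.shape[-2:]
    return F.interpolate(ct_hr, size=(h // scale, w // scale),
                         mode="bicubic", align_corners=False)

I_hr = torch.randn(1, 1, 256, 256)  # toy high-resolution CT slice (SR reconstruction label)
I_lr = make_lr_input(I_hr)          # (1, 1, 64, 64) low-resolution model input
```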
S2: the low resolution image in step S1The encoder is input and super-resolution reconstruction and super-resolution segmentation are performed by two independent decoder branches, respectively.
As shown in fig. 2, the method extracts features with a common encoder, performs a preliminary fusion of the SR reconstruction and SR segmentation features, and then extracts features fitting each task through independent decoder branches. Specifically, the intermediate features of the reconstruction-branch decoder contain more low-semantic detail features such as edges and textures, while the intermediate features of the segmentation-branch decoder contain more abstract high-level features. A given input I_lr is first encoded by three serial convolution modules and downsampling layers; each convolution module comprises two 3×3 convolution-ReLU-BN layers, and each downsampling layer is max-pooling. I_lr is processed by the first convolution module to obtain the first encoding feature F_enc1; this feature passes through a downsampling layer and the second convolution module to generate the second encoding feature F_enc2; continuing likewise yields the third encoding feature F_enc3 and the bottleneck feature F_bot. F_bot is sent to the SR reconstruction branch decoder and the SR segmentation branch decoder. The two decoders are identical in structure, each comprising three serial upsampling layers and convolution modules, with bilinear interpolation used for upsampling. Taking the SR segmentation branch decoder as an example, after F_bot is input to the decoder, an upsampling layer and a convolution module produce the third segmentation feature F_seg3; continuing in this way yields the second segmentation feature F_seg2 and the first segmentation feature F_seg1, and likewise the third, second and first reconstruction features F_recon3, F_recon2 and F_recon1. The common encoder exploits the correlation and complementarity between the reconstruction and segmentation tasks to perform a preliminary fusion of the reconstruction and segmentation features, while the independent decoder branches account for the differences between tasks and avoid negative interference between them.
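A minimal PyTorch sketch of this shared-encoder, dual-decoder backbone is given below. The channel widths and module names are assumptions for illustration, and the MSFB and DCAB fusion paths described in steps S3 and S4 are omitted so that only the data flow above is shown:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def conv_block(cin, cout):
    """Convolution module: two (3x3 conv -> ReLU -> BN) layers, per the text."""
    layers = []
    for c_in, c_out in [(cin, cout), (cout, cout)]:
        layers += [nn.Conv2d(c_in, c_out, 3, padding=1),
                   nn.ReLU(inplace=True),
                   nn.BatchNorm2d(c_out)]
    return nn.Sequential(*layers)

class SharedEncoderDualDecoder(nn.Module):
    """Common encoder with independent SR-segmentation and SR-reconstruction decoders."""
    def __init__(self, ch=(32, 64, 128, 256)):
        super().__init__()
        self.enc1, self.enc2 = conv_block(1, ch[0]), conv_block(ch[0], ch[1])
        self.enc3, self.bot = conv_block(ch[1], ch[2]), conv_block(ch[2], ch[3])
        self.pool = nn.MaxPool2d(2)
        def decoder():  # three serial (bilinear upsample + conv module) stages
            return nn.ModuleList([conv_block(ch[3], ch[2]),
                                  conv_block(ch[2], ch[1]),
                                  conv_block(ch[1], ch[0])])
        self.dec_seg, self.dec_recon = decoder(), decoder()

    def forward(self, x):
        f1 = self.enc1(x)              # F_enc1
        f2 = self.enc2(self.pool(f1))  # F_enc2
        f3 = self.enc3(self.pool(f2))  # F_enc3
        fb = self.bot(self.pool(f3))   # F_bot, sent to both decoders
        outs = {}
        for name, dec in [("seg", self.dec_seg), ("recon", self.dec_recon)]:
            h, feats = fb, []
            for stage in dec:
                h = F.interpolate(h, scale_factor=2, mode="bilinear",
                                  align_corners=False)
                h = stage(h)
                feats.append(h)        # F_*3, F_*2, F_*1 in turn
            outs[name] = feats
        return outs

outs = SharedEncoderDualDecoder()(torch.randn(1, 1, 64, 64))
F_seg3, F_seg2, F_seg1 = outs["seg"]   # segmentation decoder intermediate features
```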
S3: the multi-scale features are extracted using a multi-scale fusion Module (MSFB) and the intermediate features generated during the encoding in step S2 are passed to a decoder.
As shown in fig. 2, the method uses the MSFB modules to fuse the features from each encoder layer, extract multi-scale features, and send the results to the decoders; each branch contains three parallel MSFB modules. The first MSFB module of the SR segmentation branch interpolates F_enc1, F_enc2 and F_enc3 to the same size and concatenates them into the concatenated feature F_cat. Three parallel large-kernel convolutions with kernel sizes 3×3, 9×9 and 21×21 extract the first multi-scale segmentation residual R_msseg1, and a feed-forward network (FFN) further adjusts R_msseg1 to obtain the first segmentation residual R_seg1. R_seg1 is combined with F_seg1 by concatenation to obtain a new F_seg1, which is input to the subsequent modules of the segmentation decoder. By analogy, the remaining MSFB modules extract from F_cat the second segmentation residual R_seg2, the third segmentation residual R_seg3, the first reconstruction residual R_recon1, the second reconstruction residual R_recon2 and the third reconstruction residual R_recon3, which are concatenated with the corresponding decoder intermediate features. These residual features contain multi-scale information, which enables the model to better cope with the large size variation of organs and lesion regions in medical images and supplements the information lost by the encoder during downsampling.
The MSFB module in step S3 uses large-kernel convolutions, each decomposed into three smaller serial convolutions. A K×K convolution is decomposed into three parts: a (2d-1)×(2d-1) depthwise convolution (DWconv), a ⌈K/d⌉×⌈K/d⌉ depthwise dilation convolution (DWDconv) with dilation d, and a pointwise convolution (PWconv). In the method, to realize a 9×9 large-kernel convolution, F_cat passes through a 3×3 DWconv, a 5×5 DWDconv and a PWconv in sequence to obtain the first scale feature F_scale1; to realize a 21×21 large-kernel convolution, F_cat passes through a 5×5 DWconv, a 7×7 DWDconv and a PWconv in sequence to obtain the second scale feature F_scale2; the 3×3 convolution is not decomposed, and F_cat passes directly through a 3×3 convolution to obtain the third scale feature F_scale3. F_scale1, F_scale2 and F_scale3 are concatenated and fused, and the result is sent to the subsequent modules of the MSFB. Large-kernel convolutions give the model a larger receptive field and thus improve performance, but their large computational cost hinders deployment; by decomposing each large-kernel convolution into three serial convolutions, the method effectively reduces this overhead.
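The decomposition is straightforward to express in PyTorch. The sketch below follows the kernel pairings named above (3×3 DWconv + 5×5 DWDconv for the 9×9 path, 5×5 DWconv + 7×7 DWDconv for the 21×21 path); the class name, channel count and dilation values are assumptions for illustration:

```python
import torch
import torch.nn as nn

class LargeKernelConv(nn.Module):
    """K x K large-kernel convolution decomposed into a (2d-1)x(2d-1) depthwise
    conv, a ceil(K/d) x ceil(K/d) depthwise dilated conv (dilation d), and a
    1x1 pointwise conv."""
    def __init__(self, channels, dw_k, dwd_k, dilation):
        super().__init__()
        self.dw = nn.Conv2d(channels, channels, dw_k,
                            padding=dw_k // 2, groups=channels)
        self.dwd = nn.Conv2d(channels, channels, dwd_k,
                             padding=(dwd_k // 2) * dilation,
                             dilation=dilation, groups=channels)
        self.pw = nn.Conv2d(channels, channels, 1)

    def forward(self, x):
        return self.pw(self.dwd(self.dw(x)))

c = 64
lk9   = LargeKernelConv(c, dw_k=3, dwd_k=5, dilation=2)  # 9x9-equivalent path
lk21  = LargeKernelConv(c, dw_k=5, dwd_k=7, dilation=3)  # 21x21-equivalent path
conv3 = nn.Conv2d(c, c, 3, padding=1)                    # undecomposed 3x3 path
fuse  = nn.Conv2d(3 * c, c, 1)                           # fuse the three scales

F_cat = torch.randn(1, c, 64, 64)
R_ms = fuse(torch.cat([conv3(F_cat), lk9(F_cat), lk21(F_cat)], dim=1))
# R_ms would then be adjusted by an FFN and concatenated with the
# corresponding decoder feature, as described above.
```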
S4: the intermediate features of the encoder and decoder in step S2 are fused using a two-channel attention module (DCAB).
As shown in fig. 3, the DCAB module fuses the features of the reconstruction and segmentation branches using a cross-attention mechanism.
F_seg3, F_recon3 and I_lr are each adjusted by a convolution module to obtain the segmentation fusion feature F_fuse_seg, the reconstruction fusion feature F_fuse_recon and the input fusion feature F_fuse_in. To supplement the segmentation fusion feature with detail information while avoiding a negative influence on the reconstruction feature, F_fuse_seg and F_fuse_in are added to obtain a new segmentation fusion feature F_fuse_seg. F_fuse_seg and F_fuse_recon are each mapped into the segmentation query feature Q_seg, segmentation key feature K_seg and segmentation value feature V_seg, and the reconstruction query feature Q_recon, reconstruction key feature K_recon and reconstruction value feature V_recon. These features are fused by a cross-attention operation in which each branch's query attends to the other branch's keys and values, and the fusion results pass through a locally-enhanced feed-forward network (LEFF) to obtain the new segmentation fusion feature F_fuse_seg and reconstruction fusion feature F_fuse_recon. The process can be expressed as:

F_fuse_seg = LEFF(softmax(Q_seg · K_recon^T / √d) · V_recon),
F_fuse_recon = LEFF(softmax(Q_recon · K_seg^T / √d) · V_seg),

where d denotes the feature dimension and LEFF denotes the locally-enhanced feed-forward network. To ensure feature stability, F_fuse_seg and F_fuse_recon are added to F_seg3 and F_recon3 respectively to obtain new F_seg3 and F_recon3. This module fully exploits the complementarity between the reconstruction and segmentation features: detail features such as edges and textures in the reconstruction features supplement the low-resolution input and help the segmentation branch predict the high-resolution segmentation mask more accurately, while abstract semantic features generated during segmentation guide the reconstruction branch to generate more realistic detail features.
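A minimal sketch of this cross-attention fusion is given below. It assumes the pairing in which each branch's query attends to the other branch's keys and values; the convolutional adjustment of F_seg3, F_recon3 and I_lr and the residual additions are omitted, and a plain MLP stands in for the LEFF:

```python
import torch
import torch.nn as nn

class DualCrossAttention(nn.Module):
    """Cross-attention between segmentation and reconstruction fusion features
    (token sequences of shape (B, N, C))."""
    def __init__(self, dim):
        super().__init__()
        self.qkv_seg = nn.Linear(dim, dim * 3)
        self.qkv_rec = nn.Linear(dim, dim * 3)
        self.scale = dim ** -0.5
        # Plain MLPs stand in for the locally-enhanced feed-forward network (LEFF).
        self.leff_seg = nn.Sequential(nn.Linear(dim, dim * 2), nn.GELU(),
                                      nn.Linear(dim * 2, dim))
        self.leff_rec = nn.Sequential(nn.Linear(dim, dim * 2), nn.GELU(),
                                      nn.Linear(dim * 2, dim))

    def forward(self, f_seg, f_rec):
        q_s, k_s, v_s = self.qkv_seg(f_seg).chunk(3, dim=-1)
        q_r, k_r, v_r = self.qkv_rec(f_rec).chunk(3, dim=-1)
        # Each branch's query attends to the other branch's keys/values.
        a_s = torch.softmax(q_s @ k_r.transpose(-2, -1) * self.scale, dim=-1)
        a_r = torch.softmax(q_r @ k_s.transpose(-2, -1) * self.scale, dim=-1)
        return self.leff_seg(a_s @ v_r), self.leff_rec(a_r @ v_s)

dcab = DualCrossAttention(dim=32)
f_seg = torch.randn(1, 64 * 64, 32)    # flattened F_fuse_seg tokens
f_rec = torch.randn(1, 64 * 64, 32)    # flattened F_fuse_recon tokens
new_seg, new_rec = dcab(f_seg, f_rec)  # added back to F_seg3 / F_recon3 for stability
```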
S5: the model described above is optimized by a loss function.
To demonstrate the effectiveness of the present application, the present application also provides the following comparative experiments:
Specifically, the application uses the public SegTHOR dataset, a thoracic multi-organ segmentation dataset containing segmentation labels for 4 organs: heart, aorta, trachea and esophagus. The dataset contains CT scans of 40 patients; we randomly selected 28 as the training set, 4 as the validation set and 8 as the test set. Before training, we clip the HU values to [-128, 384] and pre-process the data as described in step S1. Training uses the AdamW optimizer with an initial learning rate of 0.001 for 150 epochs in total.
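The HU clipping step can be sketched as follows; rescaling to [0, 1] after clipping is an assumed detail, since the text specifies only the clipping range:

```python
import numpy as np

def preprocess_ct(volume_hu: np.ndarray) -> np.ndarray:
    """Clip HU values to [-128, 384]; the subsequent rescale to [0, 1] is assumed."""
    v = np.clip(volume_hu, -128.0, 384.0)
    return (v + 128.0) / 512.0
```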
To verify the effectiveness of the method in both reconstruction and segmentation, we compare it with the best-performing current segmentation and reconstruction algorithms. On the segmentation task, the results of the method are compared with CPFNet, KiU-Net, UNet++ and UCTransNet using the Dice and HD95 metrics; the comparison results are shown in Table 1, where the method shows a clear improvement in segmentation performance over the other methods. On the reconstruction task, the results are compared with RDN, EDSR and NSLA using the PSNR and SSIM metrics; the comparison results are shown in Table 2.
Table 1: comparison of the method with other algorithms on the segmentation task; the bolded data represent the best results. (Eso: esophagus; Hea: heart; Tra: trachea; Aor: aorta)
Table 2: comparison of the method with other algorithms on the reconstruction task; the bolded data represent the best results.
To demonstrate the effectiveness of the method intuitively, we compare its visual results with those of the other methods. FIG. 4 shows the segmentation results of the various methods; the method segments the organs more accurately than the others. FIG. 5 shows the reconstruction results of the various methods; as shown, the method accurately restores the unclear boundary between the esophagus and the aorta, thanks to the semantic information provided by the segmentation process.

Claims (4)

1. The CT image super-resolution segmentation method assisted by super-resolution reconstruction is characterized by comprising the following steps of:
S1: downsampling the original CT image by a factor of 4 using a bicubic interpolation algorithm to obtain a low-resolution image I_lr, and using the original CT image and the segmentation label as the true super-resolution reconstruction label I_hr and the super-resolution segmentation label I_mask, respectively;
S2: the low resolution image I in step S1 lr Inputting the super-resolution reconstruction and super-resolution segmentation into an encoder through two independent decoder branches; in step S2, features are extracted by using a common encoder, the SR reconstruction features and the SR segmentation features are primarily fused, and then features matching with respective tasks are respectively extracted by independent decoder branches, which is specifically as follows: for a given input I lr Firstly, the method comprises the steps of encoding the data through three serial convolution modules and a downsampling layer, wherein the convolution modules comprise 2 layers of 3 multiplied by 3 convolution-Relu activation function-BN standardization layers, the downsampling layer is the maximum pooling downsampling layer, and I lr Processing by a first convolution module to obtain a first coding feature F encl The feature is passed through a downsampling layer and a second convolution module to generate a second encoded feature F enc2 And so on, respectively obtaining third coding features F enc3 And bottleneck feature F bot F to be obtained bot The method comprises the steps of sending the data to an SR reconstruction branch decoder and an SR segmentation branch decoder, wherein the two decoders have the same structure and comprise three serial up-sampling layers and a convolution module, and the up-sampling layers use a bilinear interpolation method;
S3: extracting multi-scale features using the multi-scale fusion module MSFB and passing the intermediate features generated during the encoding in step S2 to the decoders;
S4: fusing the intermediate features of the encoder and decoders in step S2 using the dual-channel attention module DCAB;
S5: optimizing through a loss function; wherein in step S5, F_seg3 and F_recon3 are each upsampled by PixelShuffle and loss functions are computed against the respective labels, the loss function of the SR segmentation task comprising a cross-entropy loss and a dice loss and the loss function of the SR reconstruction task comprising an L1 loss; to balance the loss functions of the two tasks, a dynamic adjustment mechanism is used, with the loss functions expressed as:

L_seg = L_ce(I_seg, I_mask) + L_dice(I_seg, I_mask),
L_recon = L_1(I_sr, I_hr),

wherein I_seg and I_sr respectively represent the final results of PixelShuffle upsampling of F_seg3 and F_recon3, I_hr and I_mask represent the true super-resolution reconstruction label and super-resolution segmentation label, L_1 represents the computed L1 loss, L_ce represents the computed cross-entropy loss, L_dice represents the computed dice loss, and L_recon and L_seg represent the computed reconstruction loss and segmentation loss respectively; to balance the loss functions of the two tasks, dynamically varying scaling factors are computed using L_seg and L_recon and used to weight the two losses, yielding the final loss function L_total.
2. The super-resolution reconstruction-assisted CT image super-resolution segmentation method according to claim 1, wherein: in step S3, the MSFB modules fuse the features in each layer of the encoder, extract multi-scale features and send the results to the decoders, each branch comprising three parallel MSFB modules; specifically, the first MSFB module of the SR segmentation branch interpolates F_enc1, F_enc2 and F_enc3 to the same size and concatenates them to obtain the concatenated feature F_cat; three parallel large-kernel convolutions with kernel sizes 3×3, 9×9 and 21×21 extract a first multi-scale segmentation residual R_msseg1; a feed-forward network FFN further adjusts R_msseg1 to obtain a first segmentation residual R_seg1; R_seg1 is combined with F_seg1 by concatenation to obtain a new F_seg1, which is input to the subsequent modules of the segmentation decoder; and so on, the remaining MSFB modules extract from F_cat a second segmentation residual R_seg2, a third segmentation residual R_seg3, a first reconstruction residual R_recon1, a second reconstruction residual R_recon2 and a third reconstruction residual R_recon3, which are concatenated with the corresponding decoder intermediate features.
3. The super-resolution reconstruction-assisted CT image super-resolution segmentation method according to claim 2, wherein: the MSFB module in step S3 uses large-kernel convolutions, each of which is decomposed into three parts for a K×K convolution: a (2d-1)×(2d-1) depthwise convolution DWconv, a ⌈K/d⌉×⌈K/d⌉ depthwise dilation convolution DWDconv with dilation d, and a point-wise convolution PWconv; F_cat passes through a 3×3 DWconv, a 5×5 DWDconv and a PWconv in sequence to obtain a first scale feature F_scale1, realizing a 9×9 large-kernel convolution; F_cat passes through a 5×5 DWconv, a 7×7 DWDconv and a PWconv in sequence to obtain a second scale feature F_scale2, realizing a 21×21 large-kernel convolution; for the 3×3 convolution, F_cat directly passes through a 3×3 convolution to obtain a third scale feature F_scale3; F_scale1, F_scale2 and F_scale3 are concatenated and fused, and the result is sent to the subsequent modules of the MSFB.
4. The super-resolution reconstruction-assisted CT image super-resolution segmentation method according to claim 1, wherein: step S4 uses the DCAB module to fuse the features of the reconstruction and segmentation branches through a cross-attention mechanism; F_seg3, F_recon3 and I_lr are each adjusted by a convolution module to obtain a segmentation fusion feature F_fuse_seg, a reconstruction fusion feature F_fuse_recon and an input fusion feature F_fuse_in; F_fuse_seg and F_fuse_in are added to obtain a new segmentation fusion feature F_fuse_seg; F_fuse_seg and F_fuse_recon are each mapped into a segmentation query feature Q_seg, segmentation key feature K_seg and segmentation value feature V_seg, and a reconstruction query feature Q_recon, reconstruction key feature K_recon and reconstruction value feature V_recon; the above features are fused by a cross-attention operation, and the fusion results pass through a locally-enhanced feed-forward network LEFF to obtain a new segmentation fusion feature F_fuse_seg and reconstruction fusion feature F_fuse_recon; the process can be expressed as:

F_fuse_seg = LEFF(softmax(Q_seg · K_recon^T / √d) · V_recon),
F_fuse_recon = LEFF(softmax(Q_recon · K_seg^T / √d) · V_seg),

where d represents the feature dimension and LEFF represents the locally-enhanced feed-forward network; to ensure feature stability, F_fuse_seg and F_fuse_recon are respectively added to F_seg3 and F_recon3 to obtain a new F_seg3 and F_recon3.
CN202310682299.2A 2023-06-09 2023-06-09 CT image super-resolution segmentation method assisted by super-resolution reconstruction Active CN116416261B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310682299.2A CN116416261B (en) 2023-06-09 2023-06-09 CT image super-resolution segmentation method assisted by super-resolution reconstruction

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310682299.2A CN116416261B (en) 2023-06-09 2023-06-09 CT image super-resolution segmentation method assisted by super-resolution reconstruction

Publications (2)

Publication Number Publication Date
CN116416261A (en) 2023-07-11
CN116416261B 2023-09-12

Family

ID=87049598

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310682299.2A Active CN116416261B (en) 2023-06-09 2023-06-09 CT image super-resolution segmentation method assisted by super-resolution reconstruction

Country Status (1)

Country Link
CN (1) CN116416261B (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113657388A (en) * 2021-07-09 2021-11-16 北京科技大学 Image semantic segmentation method fusing image super-resolution reconstruction
WO2023098289A1 (en) * 2021-12-01 2023-06-08 浙江大学 Automatic unlabeled pancreas image segmentation system based on adversarial learning
CN114841859A (en) * 2022-04-28 2022-08-02 南京信息工程大学 Single-image super-resolution reconstruction method based on lightweight neural network and Transformer
CN115953494A (en) * 2023-03-09 2023-04-11 南京航空航天大学 Multi-task high-quality CT image reconstruction method based on low dose and super-resolution

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Yang Liutao et al. Low-Dose CT Denoising via Sinogram Inner-Structure Transformer. IEEE Transactions on Medical Imaging, vol. 42, no. 4, pp. 910-921. *

Also Published As

Publication number Publication date
CN116416261A (en) 2023-07-11

Similar Documents

Publication Publication Date Title
CN111563902A (en) Lung lobe segmentation method and system based on three-dimensional convolutional neural network
Chi et al. Computed tomography (CT) image quality enhancement via a uniform framework integrating noise estimation and super-resolution networks
CN116433697B (en) Abdominal multi-organ CT image segmentation method based on eye movement instrument
CN116433914A (en) Two-dimensional medical image segmentation method and system
CN112700460A (en) Image segmentation method and system
CN112150470A (en) Image segmentation method, image segmentation device, image segmentation medium, and electronic device
CN117058307A (en) Method, system, equipment and storage medium for generating heart three-dimensional nuclear magnetic resonance image
CN116128898A (en) Skin lesion image segmentation method based on transducer double-branch model
CN116823850A (en) Cardiac MRI segmentation method and system based on U-Net and transducer fusion improvement
CN111260670A (en) Tubular structure segmentation graph fracture repairing method and system of three-dimensional image based on deep learning network
CN114419277A (en) Image-based step-by-step generation type human body reconstruction method and device
CN116416261B (en) CT image super-resolution segmentation method assisted by super-resolution reconstruction
Zhou et al. High-resolution hierarchical adversarial learning for OCT speckle noise reduction
CN117292704A (en) Voice-driven gesture action generation method and device based on diffusion model
CN114708353B (en) Image reconstruction method and device, electronic equipment and storage medium
CN117333428A (en) CT image segmentation method, device, equipment and storage medium
CN116258730A (en) Semi-supervised medical image segmentation method based on consistency loss function
CN115984560A (en) Image segmentation method based on CNN and Transformer
Joshi et al. Efficient diffeomorphic image registration using multi-scale dual-phased learning
Han et al. Edge-directed single image super-resolution via cross-resolution sharpening function learning
Wei et al. SRP&PASMLP‐Net: Lightweight skin lesion segmentation network based on structural re‐parameterization and parallel axial shift multilayer perceptron
Yan et al. MRSNet: Joint consistent optic disc and cup segmentation based on large kernel residual convolutional attention and self-attention
CN109840888A (en) A kind of image super-resolution rebuilding method based on joint constraint
WO2022255025A1 (en) Region extraction model creation assistance device, method for operating region extraction model creation assistance device, and program for operating region extraction model creation assistance device
CN114299053A (en) Parallel multi-resolution coding and decoding network model and medical image segmentation method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant