CN116416156A - Swin Transformer-based medical image denoising method - Google Patents

Swin Transformer-based medical image denoising method

Info

Publication number
CN116416156A
Authority
CN
China
Prior art keywords
swin
image
network
denoising
medical image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310246661.1A
Other languages
Chinese (zh)
Inventor
苏进
李学俊
王华彬
张弓
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Canada Institute Of Health Engineering Hefei Co ltd
Original Assignee
China Canada Institute Of Health Engineering Hefei Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Canada Institute Of Health Engineering Hefei Co ltd filed Critical China Canada Institute Of Health Engineering Hefei Co ltd
Priority to CN202310246661.1A priority Critical patent/CN116416156A/en
Publication of CN116416156A publication Critical patent/CN116416156A/en
Pending legal-status Critical Current


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/70 Denoising; Smoothing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/0455 Auto-encoder networks; Encoder-decoder networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/082 Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30168 Image quality inspection

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a Swin Transformer-based medical image denoising method, belonging to the technical field of image processing. The invention comprises the following steps: step one, obtaining a noisy medical image and a clean medical image as a training set and a testing set; step two, adding a Swin Transformer to the neural network to design an RSTB block; step three, training a network by adopting an Adam algorithm and constructing a medical image denoising network; step four, inputting the noisy medical image into the network to obtain a denoising result; step five, evaluating the network by using image quality evaluation indices. According to the invention, the Swin Transformer module serves as the main body of the network structure, so texture details can be refined using the global information of the picture; downsampling increases the receptive field, allowing the 3×3 convolutions to extract features over a larger image range, and the model can better reconstruct the details and texture of the image.

Description

Swin Transformer-based medical image denoising method
Technical Field
The invention relates to the technical field of image processing, in particular to a Swin Transformer-based medical image denoising method.
Background
Image restoration is a long-standing problem in low-level vision; its purpose is to restore a high-quality, noise-free image from a low-quality image, such as a downscaled, noisy, or compressed image. Advanced image restoration methods are based on convolutional neural networks, but due to the limitations of local modeling they cannot solve the problem of long-range information dependence.
Most CNN-based approaches focus on complex architectural designs such as residual learning and dense connections. Although they significantly improve performance over traditional model-based approaches, they face two fundamental problems, both of which stem from the underlying convolutional layers. First, the interaction between the image and the convolution kernel is content-independent, and using the same convolution kernel to recover different image regions may not be the best choice. Second, under the principle of local processing, convolution is ineffective for modeling long-range dependencies.
The Transformer introduced a self-attention mechanism to capture global interactions between contexts and has shown good performance on several vision problems. However, vision Transformers for image restoration typically divide an input image into fixed-size patches (e.g., 48×48) and process each patch independently. This strategy inevitably brings two disadvantages. First, boundary pixels cannot use neighboring pixels outside the patch for image restoration. Second, the restored image may introduce boundary artifacts around each patch. Although this problem can be alleviated by patch overlapping, overlapping imposes additional computational burden.
After searching, Chinese patent No. CN114140353A, titled "Channel attention-based Swin-Transformer image denoising method and system", was found. In that application, a noisy image is input into a trained and optimized denoising network model; a shallow feature extraction network in the model first extracts shallow feature information such as noise and channel information of the noisy image, the extracted shallow feature information is then input into a deep feature extraction network to obtain deep feature information, and the shallow and deep feature information are input into a reconstruction network for feature fusion to obtain a clean image. However, that application differs from the present patent in how the Swin Transformer is applied to medical image denoising.
Disclosure of Invention
1. Technical problem to be solved by the invention
In view of the shortcomings of the prior art, the invention provides a Swin Transformer-based medical image denoising method and a strong baseline image restoration model, USwinTrans, based on the Swin Transformer, which combines the advantages of CNNs and Transformers: on one hand, thanks to the local attention mechanism, it has the CNN advantage of processing large images; on the other hand, with the advantage of the Transformer, long-range dependencies can be modeled with the shifted window scheme.
2. Technical proposal
In order to achieve the above purpose, the technical scheme provided by the invention is as follows:
the invention discloses a Swin Transformer-based medical image denoising method, which comprises the following steps:
step one, obtaining a noisy medical image and a clean medical image as a training set and a testing set;
step two, adding a Swin Transformer to the neural network to design an RSTB block;
step three, training a network by adopting an Adam algorithm and constructing a medical image denoising network;
step four, inputting the noisy medical image into the network to obtain a denoising result;
step five, evaluating the network by using image quality evaluation indices.
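The five steps above can be sketched as a runnable pipeline. The sketch below is illustrative only: a hypothetical 3×3 mean filter stands in for the trained USwinTrans network of steps two to four, the "medical image" is a synthetic smooth ramp, and PSNR serves as the image quality evaluation index of step five.

```python
import numpy as np

# Step one: obtain a noisy and a clean image (synthetic ramp stands in
# for a real medical image; noise is additive Gaussian).
rng = np.random.default_rng(0)
clean = np.outer(np.linspace(0, 1, 32), np.linspace(0, 1, 32))
noisy = clean + rng.normal(0.0, 0.05, clean.shape)

# Steps two-four stand-in: the real denoiser F(.) is the USwinTrans network
# built from RSTB blocks and trained with Adam; a 3x3 mean filter is used
# here only so the pipeline runs end to end.
def denoise(img):
    padded = np.pad(img, 1, mode="edge")
    out = np.zeros_like(img)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = padded[i:i + 3, j:j + 3].mean()
    return out

# Step five: evaluate the output with an image quality index (PSNR here).
def psnr(ref, test, peak=1.0):
    mse = np.mean((ref - test) ** 2)
    return 10 * np.log10(peak ** 2 / mse)

denoised = denoise(noisy)
```

Swapping the placeholder `denoise` for a trained model and PSNR for SSIM/RMSE recovers the full evaluation protocol used in the embodiment.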
Further, step one performs data augmentation on the obtained image training set by cropping the images.
Furthermore, a depth feature extraction module is added between the encoder and the decoder of the U-net network; the module introduces a Swin Transformer and combines it with convolution operations to extract local and global information respectively.
Still further, the encoder of the U-net network uses two downsampling operations with a step size of 2, and the decoder uses two upsampling operations with a step size of 2.
Still further, the depth feature extraction module comprises a plurality of RSTB blocks and a convolution block; each RSTB block comprises a plurality of Swin Transformer layers and a convolution block, connected in series and followed by a residual connection.
Still further, the depth feature extraction module comprises 5 RSTB blocks and 1 convolution block, and each RSTB block is a residual module formed by connecting 6 Swin Transformer layers and 1 convolution block in series.
Furthermore, the Swin Transformer layer consists of two residual blocks: the first residual block is normalized by a LayerNorm layer followed by the multi-head self-attention module MSA; the second residual block is normalized by a LayerNorm layer followed by the multi-layer perceptron MLP.
Still further, step three trains the network using the Adam optimizer, and the L2 loss function is calculated as:

$\mathcal{L} = \left\| F(\hat{X}) - X \right\|_2^2$

where $X$ represents the clean image, $\hat{X}$ represents the noisy image, and $F(\cdot)$ represents the network.
3. Advantageous effects
Compared with the prior art, the technical scheme provided by the invention has the following remarkable effects:
(1) The Swin Transformer-based medical image denoising method improves a U-net-based medical image denoising model and constructs a strong baseline image restoration model, USwinTrans, based on the Swin Transformer. The Swin Transformer is added into the model and combined with a convolutional module to form the depth feature extraction part, a self-attention mechanism is introduced into the model, and finally a convolutional layer is used as the decoder to output the denoising result, so the details and textures of the image can be reconstructed better and the performance is superior to other denoising methods;
(2) On the basic U-net structure, the Swin Transformer and a convolutional module are introduced to improve the depth feature extraction part of the network, so the model not only improves its ability to capture local image information but also promotes its understanding of information between image patches; meanwhile, the Swin Transformer extracts more global information and achieves a good effect in denoising medical images. The method is not only effective for denoising medical images but also produces good visual results when denoising natural images.
Drawings
FIG. 1 is a medical image denoising flowchart of the present invention;
FIG. 2 is a schematic diagram of the medical image denoising model structure;
FIG. 3 is a structure diagram of the Residual Swin Transformer Block (RSTB);
FIG. 4 is a schematic diagram of the Swin Transformer Layer (STL);
FIG. 5 is a schematic diagram of the processing results of the denoising methods on a noisy grayscale image with Gaussian noise σ=0.001 and multiplicative noise σ=0.005;
FIG. 6 is a schematic diagram of the processing results of the denoising methods on a noisy grayscale nuclear magnetic resonance image with Gaussian noise σ=0.001 and multiplicative noise σ=0.005.
Detailed Description
For a further understanding of the present invention, the present invention will be described in detail with reference to the drawings and examples.
Example 1
Referring to fig. 1, this embodiment mainly comprises original medical image data augmentation, construction of a medical image denoising model, training the network on noisy medical images, and testing the training result; the method specifically comprises the following steps:
Step one, acquire a medical image data set and divide it into an image training set and an image testing set as required; perform data augmentation on the obtained image training set by cropping the images;
Step two, add a depth feature extraction module between the encoder and decoder of the U-net network, introduce a Swin Transformer, and combine it with convolution operations to extract local and global information respectively. The strong baseline image restoration model based on the Swin Transformer constructed in this embodiment is called USwinTrans; the specific structure is shown in FIG. 2.
The image encoder uses two downsampling operations with a step size of 2, and the decoder uses two upsampling operations with a step size of 2. The depth feature extraction module consists of five residual Swin Transformer blocks (RSTB blocks) and one convolution block; each RSTB block comprises six Swin Transformer layers and one convolution block, connected in series and followed by a residual connection. Multi-level information of the image is obtained through up- and downsampling, and the depth feature extraction module focuses on recovering the high-frequency information of the image.
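The shape flow through the two stride-2 downsampling and two stride-2 upsampling operations can be checked with a minimal stand-in (2×2 average pooling and nearest-neighbour repetition; the network itself uses learned operations, so this is only a shape sketch):

```python
import numpy as np

def down2(x):
    # stride-2 downsampling stand-in: 2x2 average pooling
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def up2(x):
    # stride-2 upsampling stand-in: nearest-neighbour repetition
    return x.repeat(2, axis=0).repeat(2, axis=1)

x = np.zeros((64, 64))   # input feature map
e1 = down2(x)            # encoder stage 1: 32 x 32
e2 = down2(e1)           # encoder stage 2: 16 x 16 (RSTB depth features operate here)
d1 = up2(e2)             # decoder stage 1: 32 x 32
d0 = up2(d1)             # decoder stage 2: 64 x 64, back to input resolution
```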
The Residual Swin Transformer Block (RSTB) is a residual block consisting of Swin Transformer Layers (STL) and a convolution block; the structure is shown in fig. 3. This design has two benefits. First, although the Transformer can be viewed as a specific instance of spatially varying convolution, a convolution layer with spatially invariant filters can enhance the translational invariance of USwinTrans. Second, the residual connection provides an identity-based connection from different blocks to the reconstruction module, allowing features at different levels to be aggregated.
The Swin Transformer Layer (STL) is based on the standard multi-head self-attention mechanism of the original Transformer layer; the main differences are local attention and the shifted window mechanism. The STL structure is shown in fig. 4: it consists of two residual blocks, the first normalized by a LayerNorm layer followed by a multi-head self-attention module (MSA), the second normalized by a LayerNorm layer followed by a multi-layer perceptron (MLP).
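The two residual blocks of the STL can be sketched as follows; `msa` is passed in as a callable because the windowed attention itself is treated as a black box here, and the GELU uses its common tanh approximation (both stand-in choices, not the patent's exact implementation):

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    # LayerNorm over the channel (last) dimension
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def gelu(x):
    # tanh approximation of the GELU nonlinearity
    return 0.5 * x * (1 + np.tanh(np.sqrt(2 / np.pi) * (x + 0.044715 * x ** 3)))

def stl_forward(x, msa, w1, w2):
    # first residual block: x + MSA(LN(x))
    x = x + msa(layer_norm(x))
    # second residual block: x + MLP(LN(x)),
    # MLP = two fully connected layers with GELU between them
    return x + gelu(layer_norm(x) @ w1) @ w2
```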
Given an input of size H×W×C, the Swin Transformer first reshapes the input into features of size (HW/M²)×M²×C by partitioning it into non-overlapping M×M local windows, where HW/M² is the total number of windows. Then, the Swin Transformer computes standard self-attention (i.e., local attention) for each window separately. For a local window feature $X$, the query, key, and value matrices $Q$, $K$, $V$ in the attention mechanism are computed as:

$Q = XP_Q,\quad K = XP_K,\quad V = XP_V$

where $P_Q$, $P_K$, and $P_V$ are projection matrices shared across windows. The attention matrix is then computed within the local window by the self-attention mechanism:

$\mathrm{Attention}(Q,K,V) = \mathrm{SoftMax}\!\left(QK^{T}/\sqrt{d} + B\right)V$

where $B$ is a learnable relative position code and $d$ is the dimension of $K$. This embodiment performs the attention function $h$ times in parallel and concatenates the results as the multi-head self-attention module (MSA). Next, a multi-layer perceptron (MLP) with two fully connected layers and a GELU nonlinearity between them performs further feature transformation. A LayerNorm (LN) layer is added before both MSA and MLP, and both modules use residual connections.
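The window partitioning and per-window attention above can be sketched directly (single head, hypothetical shapes; the learnable projections $P_Q$, $P_K$, $P_V$ and the relative position code $B$ are random/zero placeholders):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def window_partition(x, m):
    # (H, W, C) -> (HW/m^2, m^2, C): non-overlapping m x m local windows
    h, w, c = x.shape
    x = x.reshape(h // m, m, w // m, m, c).transpose(0, 2, 1, 3, 4)
    return x.reshape(-1, m * m, c)

def window_attention(x, p_q, p_k, p_v, b):
    # Attention(Q, K, V) = SoftMax(Q K^T / sqrt(d) + B) V, computed per window
    q, k, v = x @ p_q, x @ p_k, x @ p_v
    d = k.shape[-1]
    return softmax(q @ k.transpose(0, 2, 1) / np.sqrt(d) + b) @ v
```

Because attention is computed inside each M×M window, the cost grows linearly with the number of windows rather than quadratically with image size.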
Step three, train the network using the Adam optimizer and an L2 loss function, where the L2 loss function is calculated as:

$\mathcal{L} = \left\| F(\hat{X}) - X \right\|_2^2$

where $X$ represents the clean image, $\hat{X}$ represents the noisy image, and $F(\cdot)$ represents the network.
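The Adam/L2 pairing of step three can be illustrated on a toy one-parameter "network" $F(x) = w\,x$ (a hypothetical stand-in for USwinTrans); the Adam update follows the standard bias-corrected form:

```python
import numpy as np

def l2_loss(pred, clean):
    # L = || F(noisy) - clean ||_2^2
    return float(np.sum((pred - clean) ** 2))

def adam_step(w, grad, m, v, t, lr=1e-2, b1=0.9, b2=0.999, eps=1e-8):
    # one Adam update with bias-corrected first/second moment estimates
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad ** 2
    m_hat = m / (1 - b1 ** t)
    v_hat = v / (1 - b2 ** t)
    return w - lr * m_hat / (np.sqrt(v_hat) + eps), m, v

rng = np.random.default_rng(0)
clean = rng.random(256)
noisy = clean + rng.normal(0.0, 0.05, clean.shape)
w, m, v = 0.2, 0.0, 0.0
for t in range(1, 301):
    grad = np.sum(2 * (w * noisy - clean) * noisy)  # dL/dw for F(x) = w * x
    w, m, v = adam_step(w, grad, m, v, t)
```

In the real method the gradient is obtained by backpropagation through the whole USwinTrans network rather than by this closed-form derivative.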
Step four, input the noisy medical image into the network to obtain the denoising result.
Step five, test USwinTrans with the image testing set from step one, and evaluate the model using the image evaluation indices.
In this embodiment, the Swin Transformer module serves as the main body of the network structure and can refine texture details using the global information of the picture; the downsampling operations increase the receptive field, enabling the 3×3 convolutions to extract features over a larger image range.
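The receptive-field claim can be checked with the standard recurrence r ← r + (k − 1)·j, j ← j·s over the layer sequence (kernel k, stride s). Modeling each stride-2 downsampling as a kernel-2, stride-2 layer (an assumption for illustration), a single 3×3 convolution after two downsamples covers a 12×12 input region instead of 3×3:

```python
def receptive_field(layers):
    # layers: list of (kernel_size, stride); returns receptive field at the input
    r, j = 1, 1  # receptive field size and cumulative stride ("jump")
    for k, s in layers:
        r += (k - 1) * j
        j *= s
    return r

rf_plain = receptive_field([(3, 1)])                    # 3x3 conv at full resolution
rf_down2 = receptive_field([(2, 2), (2, 2), (3, 1)])    # after two stride-2 downsamples
```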
This embodiment performs experiments on natural images (grayscale and color) and medical images respectively. The experiments are divided into three groups to verify the effectiveness of the algorithm, and the method is compared with several other methods (PM, LEPM, DEPS, FDOGC).
Referring to tables 1-3, for each group of experiments this embodiment selects a plurality of pictures to be tested with the different methods, calculates the corresponding indices (PSNR, SSIM, RMSE) from each result image and the corresponding clean noise-free image, and averages the indices. From the quantitative analysis, the indices of the experimental results of the invention are the best values (maximum PSNR and SSIM, minimum RMSE), which shows that the method of the invention is superior to the other methods in maintaining image structural similarity and improving the signal-to-noise ratio of images.
Table 1 comparison of results under various indices for different methods of gray natural images
Method PSNR SSIM RMSE
PM 30.1910 0.8247 0.0246
LEPM 27.7090 0.7866 0.0336
DEPS 30.2095 0.8248 0.0245
FDOGC 27.7117 0.7865 0.0464
USwinTrans 31.1550 0.8472 0.0216
Table 2 comparison of results under various indices for different methods of color natural images
Method PSNR SSIM RMSE
PM 31.9112 0.9230 0.0263
LEPM 30.6955 0.8970 0.0311
DEPS 31.9161 0.9230 0.0263
FDOGC 30.6891 0.8970 0.0312
USwinTrans 33.0638 0.9345 0.0228
Table 3 comparison of results under various indices for different methods of greyscale medical images
Method PSNR SSIM RMSE
PM 27.8698 0.6919 0.0234
LEPM 27.5688 0.6765 0.0242
DEPS 27.7713 0.6919 0.0236
FDOGC 27.4868 0.7254 0.0244
USwinTrans 27.9732 0.7320 0.0231
Fig. 5 and 6 show the comparison between the method of the invention and the other denoising methods, where (A) is the input image, (B), (C), (D), and (E) are the processing results of the PM, LEPM, DEPS, and FDOGC denoising methods respectively, (F) is the processing result of the USwinTrans denoising method, and (G) is the clean image. From the qualitative analysis, the method of the invention reconstructs the details and textures of the image better; its denoising effect is clearly superior to the other methods, and it is stronger in preserving detail textures, demonstrating its effectiveness for denoising both medical images and natural images.
The invention and its embodiments have been described above by way of illustration and not limitation, and the actual structure is not limited to what is shown in the accompanying drawings. Therefore, if a person of ordinary skill in the art, informed by this disclosure, devises structural modes and embodiments similar to this technical scheme without creative design and without departing from the gist of the present invention, they shall fall within the scope of protection of the invention.

Claims (8)

1. A Swin Transformer-based medical image denoising method, characterized by comprising the following steps:
step one, obtaining a noisy medical image and a clean medical image as a training set and a testing set;
step two, adding a Swin Transformer to the neural network to design an RSTB block;
step three, training a network by adopting an Adam algorithm and constructing a medical image denoising network;
step four, inputting the noisy medical image into the network to obtain a denoising result;
step five, evaluating the network by using image quality evaluation indices.
2. The Swin Transformer-based medical image denoising method according to claim 1, wherein: step one performs data augmentation on the obtained image training set by cropping the images.
3. The Swin Transformer-based medical image denoising method according to claim 1 or 2, wherein: step two adds a depth feature extraction module between the encoder and decoder of the U-net network, introduces a Swin Transformer, and combines it with convolution operations to extract local and global information respectively.
4. The Swin Transformer-based medical image denoising method according to claim 3, wherein: the encoder of the U-net network uses two downsampling operations with a step size of 2, and the decoder uses two upsampling operations with a step size of 2.
5. The Swin Transformer-based medical image denoising method according to claim 4, wherein: the depth feature extraction module comprises a plurality of RSTB blocks and a convolution block; each RSTB block comprises a plurality of Swin Transformer layers and a convolution block, connected in series and followed by a residual connection.
6. The Swin Transformer-based medical image denoising method according to claim 5, wherein: the depth feature extraction module comprises 5 RSTB blocks and 1 convolution block, and each RSTB block is a residual module formed by connecting 6 Swin Transformer layers and 1 convolution block in series.
7. The Swin Transformer-based medical image denoising method according to claim 6, wherein: the Swin Transformer layer consists of two residual blocks; the first residual block is normalized by a LayerNorm layer followed by the multi-head self-attention module MSA, and the second residual block is normalized by a LayerNorm layer followed by the multi-layer perceptron MLP.
8. The Swin Transformer-based medical image denoising method according to claim 7, wherein: step three trains the network using the Adam optimizer and an L2 loss function, and the L2 loss function is calculated as:

$\mathcal{L} = \left\| F(\hat{X}) - X \right\|_2^2$

where $X$ represents the clean image, $\hat{X}$ represents the noisy image, and $F(\cdot)$ represents the network.
CN202310246661.1A 2023-03-10 2023-03-10 Swin Transformer-based medical image denoising method Pending CN116416156A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310246661.1A CN116416156A (en) 2023-03-10 2023-03-10 Swin Transformer-based medical image denoising method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310246661.1A CN116416156A (en) 2023-03-10 2023-03-10 Swin Transformer-based medical image denoising method

Publications (1)

Publication Number Publication Date
CN116416156A true CN116416156A (en) 2023-07-11

Family

ID=87052398

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310246661.1A Pending CN116416156A (en) 2023-03-10 2023-03-10 Swin transducer-based medical image denoising method

Country Status (1)

Country Link
CN (1) CN116416156A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117115149A (en) * 2023-10-20 2023-11-24 北京邮电大学 Image quality evaluation method, device, equipment and storage medium
CN117115149B (en) * 2023-10-20 2024-02-06 北京邮电大学 Image quality evaluation method, device, equipment and storage medium
CN117250657A (en) * 2023-11-17 2023-12-19 东北石油大学三亚海洋油气研究院 Seismic data reconstruction denoising integrated method
CN117250657B (en) * 2023-11-17 2024-03-08 东北石油大学三亚海洋油气研究院 Seismic data reconstruction denoising integrated method

Similar Documents

Publication Publication Date Title
CN114140353B (en) Swin-Transformer image denoising method and system based on channel attention
CN111062872B (en) Image super-resolution reconstruction method and system based on edge detection
CN110992275A (en) Refined single image rain removing method based on generation countermeasure network
CN116416156A (en) Swin Transformer-based medical image denoising method
CN111127374B (en) Pan-sharing method based on multi-scale dense network
CN110443768B (en) Single-frame image super-resolution reconstruction method based on multiple consistency constraints
CN103093433B (en) Natural image denoising method based on regionalism and dictionary learning
CN109214989B (en) Single image super resolution ratio reconstruction method based on Orientation Features prediction priori
CN106952228A (en) The super resolution ratio reconstruction method of single image based on the non local self-similarity of image
CN111127354B (en) Single-image rain removing method based on multi-scale dictionary learning
CN112270654A (en) Image denoising method based on multi-channel GAN
CN111598804B (en) Deep learning-based image multi-level denoising method
CN112037304B (en) Two-stage edge enhancement QSM reconstruction method based on SWI phase image
CN114266957A (en) Hyperspectral image super-resolution restoration method based on multi-degradation mode data augmentation
CN114266939A (en) Brain extraction method based on ResTLU-Net model
CN115631107A (en) Edge-guided single image noise removal
Gangeh et al. Document enhancement system using auto-encoders
CN113240581A (en) Real world image super-resolution method for unknown fuzzy kernel
CN117408924A (en) Low-light image enhancement method based on multiple semantic feature fusion network
Liu et al. Multi-level wavelet network based on CNN-Transformer hybrid attention for single image deraining
CN115861108A (en) Image restoration method based on wavelet self-attention generation countermeasure network
CN115861749A (en) Remote sensing image fusion method based on window cross attention
CN111325765B (en) Image edge detection method based on redundant wavelet transform
CN114219738A (en) Single-image multi-scale super-resolution reconstruction network structure and method
CN112907456B (en) Deep neural network image denoising method based on global smooth constraint prior model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination