CN113506222A - Multi-modal image super-resolution method based on a convolutional neural network


Info

Publication number: CN113506222A (application CN202110870612.6A); granted as CN113506222B
Authority: CN (China)
Prior art keywords: resolution, image, super, network, low
Legal status: Granted; Active
Inventors: 刘羽; 朱文瑜; 成娟; 李畅; 宋仁成; 陈勋
Assignee (original and current): Hefei University of Technology
Application filed by Hefei University of Technology; priority to CN202110870612.6A, priority date 2021-07-30
Publication of CN113506222A: 2021-10-15; publication of CN113506222B (grant): 2024-03-01
Other languages: Chinese (zh)

Classifications

    • G06T 3/40: Geometric image transformations in the plane of the image; scaling of whole images or parts thereof, e.g. expanding or contracting
      • G06T 3/4007: based on interpolation, e.g. bilinear interpolation
      • G06T 3/4046: using neural networks
      • G06T 3/4053: based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
    • G06N 3/00: Computing arrangements based on biological models; neural networks
      • G06N 3/045: Combinations of networks
      • G06N 3/084: Backpropagation, e.g. using gradient descent
    • Y02D 10/00: Energy efficient computing, e.g. low power processors, power management or thermal management


Abstract

The invention discloses a multi-modal image super-resolution method based on a convolutional neural network, comprising the following steps: first, data preparation; second, construction of a super-resolution network cascading several dense residual attention modules and comprising a shallow feature extraction part, a feature refinement part and an image reconstruction part; third, super-resolution of the input low-resolution images, including super-resolution network training and super-resolution image testing. The invention makes full use of the complementary and redundant information in medical images of different modalities to reconstruct high-resolution images of better quality, providing better images for human observation while supporting computer vision tasks on the images such as segmentation and classification.

Description

Multi-modal image super-resolution method based on a convolutional neural network
Technical Field
The invention relates to the technical field of image super-resolution, and in particular to a multi-modal image super-resolution method based on a convolutional neural network.
Background
Image super-resolution refers to the process of reconstructing a high-resolution image from one or more given low-resolution images, the high-resolution image being obtained purely by algorithm. Most current state-of-the-art super-resolution methods operate on images of a single modality; although they can produce high-resolution images with good effect, they necessarily ignore the complementary and redundant information among multi-modal images, which can be important to the reconstruction result. With the explosive growth of information technology, data resources take increasingly diverse forms, and multi-modal data has become the main form of data in use. Generally speaking, more information means stronger feature expression capability and tends to yield better reconstructed high-resolution images. Research on multi-modal learning methods that involve multiple input modalities, giving the super-resolution network more prior information, therefore has a huge application prospect and broad research value.
In the field of natural images, multi-modal image data are increasingly common, for example visible-light images and infrared images. Visible-light images have higher resolution, higher contrast and good visual effect, while infrared images are less affected by environmental factors and are therefore more generally applicable. In many computer vision tasks, such as pedestrian re-identification and face recognition, combining images of different modalities achieves better results. In the current super-resolution field, however, only a few methods combine images of multiple modalities, so the performance of many network structures cannot be further improved.
Medical imaging includes multiple image modalities, and multi-modal magnetic resonance imaging (MRI) is common among medical image data. The more common MRI modalities include T1-weighted imaging (T1) and T2-weighted imaging (T2). Generally, only one-sided medical information can be obtained from a single-modality MRI image; to obtain more complete and accurate information, the mutual complementarity of different MRI modalities plays a crucial role.
Meanwhile, medical image super-resolution has always been a major topic in the field of image super-resolution. Conventional image super-resolution methods, for example the nearest-neighbor algorithm, the bilinear algorithm and bicubic interpolation, run fast and are easy to implement, but their results often show edge blurring and loss of high-frequency detail, which is a fatal problem in the medical imaging field. Super-resolution methods based on deep learning can extract deep features through long training and thereby reconstruct high-resolution images of higher quality than conventional methods. However, the obtained high-resolution images still suffer from artifacts, loss of detail and similar problems, so reliable high-resolution medical images remain difficult to obtain.
Compared with natural images, the super-resolution problem for medical images is more complex and more demanding, with the following characteristics: 1) medical images generally require high accuracy, and the super-resolution result must faithfully reflect the actual situation; once deviation occurs, subsequent processing steps (such as high-level tasks like segmentation and classification) go seriously wrong; 2) the anatomical tissue structures and shapes of the human body are complex and show individual differences, which makes image super-resolution difficult; 3) the acquisition of medical image information is extremely susceptible to various factors, such as external noise, bias field effects, partial volume effects, unconscious movement of the subject and unavoidable tissue activity, which inevitably produce motion artifacts, non-uniformity and similar problems, bringing great difficulty to applying deep-learning-based image super-resolution methods to medical images. It is therefore necessary to study super-resolution methods more deeply with respect to the above characteristics of medical images, and to consider combining the information in medical images of different modalities to improve the performance of super-resolution methods.
Disclosure of Invention
To overcome the problems of existing image super-resolution techniques, the invention provides a multi-modal image super-resolution method based on a convolutional neural network, so as to make full use of the complementary and redundant information among images of different modalities, provide better image feature expression and reconstruct high-resolution images of higher quality, thereby providing better images for human observation while supporting computer vision tasks such as image segmentation and classification.
To solve the above problems, the invention adopts the following technical scheme:
the invention discloses a multi-modal image super-resolution method based on a convolutional neural network, which is characterized by comprising the following steps of:
step 1, data preparation:
acquiring any group of reference image sets {I_HR^(1), I_HR^(2), ..., I_HR^(S)} of S modalities with resolution K × L, wherein I_HR^(s) represents the reference image of the s-th modality, s = 1, 2, ..., S; obtaining the corresponding group of low-resolution image sets {I_LR^(1), I_LR^(2), ..., I_LR^(S)} with resolution ηK × ηL, wherein I_LR^(s) represents the low-resolution image of the s-th modality, η represents the scaling factor, and 0 < η < 1;
step 2, constructing a multi-modal image super-resolution network comprising: a shallow feature extraction part, a feature refinement part and an image reconstruction part;
step 2.1, the shallow feature extraction part comprises one convolutional layer with kernel size N × N and an activation function;
the low-resolution images {I_LR^(1), ..., I_LR^(S)} of the S modalities are concatenated to obtain a concatenated low-resolution image I_in of size ηK × ηL × S, which is input into the multi-modal image super-resolution network; after processing by the shallow feature extraction part, a shallow feature map F_init of size ηK × ηL × C is output, wherein C is the number of channels set by the network;
step 2.2, the feature refinement part consists of G dense residual attention modules, one M × M convolutional layer and one N × N convolutional layer;
the G dense residual attention modules are denoted DRAB_1, DRAB_2, ..., DRAB_g, ..., DRAB_G, wherein DRAB_g is the g-th dense residual attention module;
the g-th dense residual attention module DRAB_g is formed by cascading the g-th dense residual unit and the g-th channel attention unit;
the g-th dense residual unit consists of Y N × N convolutional layers and one M × M convolutional layer, the Y N × N convolutional layers being densely connected;
the g-th channel attention unit consists of the g-th global pooling layer P_G, the g-th one-dimensional convolutional layer whose kernel size is adaptively adjusted, and the g-th activation function F_A;
when g = 1, the shallow feature map F_init serves as the input feature of the g-th dense residual unit; within the unit, the output feature map of each N × N convolutional layer is concatenated with the input feature map F_init, and the concatenation is passed through the M × M convolutional layer to obtain the g-th intermediate feature F_mid^(g); the g-th intermediate feature F_mid^(g) is added to the input feature of the g-th dense residual unit to obtain the output feature F_DR^(g) of the g-th dense residual unit;
the output feature F_DR^(g) of the g-th dense residual unit serves as the input feature of the g-th channel attention unit, which produces a weight vector L_A^(g); the output of the g-th channel attention unit, which is also the output feature F_DRA^(g) of the g-th dense residual attention module DRAB_g, is then obtained by formula (1):
F_DRA^(g) = L_A^(g) × F_DR^(g)    (1)
when 2 ≤ g ≤ G, the input feature of the g-th dense residual attention module DRAB_g is the output feature F_DRA^(g-1) of the (g-1)-th dense residual attention module DRAB_{g-1}, so that the G dense residual attention modules yield the output features F_DRA^(1), ..., F_DRA^(G); these are concatenated and passed sequentially through the M × M convolutional layer and the N × N convolutional layer of the feature refinement part, outputting the intermediate feature map F'_fine of the feature refinement part with size ηK × ηL × C; the intermediate feature map F'_fine and the shallow feature map F_init are then added through a skip connection to obtain the final low-resolution-space feature map F_LR of size ηK × ηL × C;
Step 2.3, the image reconstruction part comprises an up-sampling layer and S image reconstruction branches, wherein the S-th image reconstruction branch comprises: h, NxN convolutional layers with activation functions and one NxN convolutional layer without activation functions;
final feature F of low resolution spaceLRInputting the image into the image reconstruction part, and obtaining a high-resolution spatial feature F through an up-sampling layerHRAnd outputting a residual error map of the s-th mode after passing through the s-th image reconstruction branch
Figure BDA0003188971810000041
Thus, residual error maps of all modes are obtained by the S image reconstruction branches
Figure BDA0003188971810000042
Step 2.4, for low resolution image set
Figure BDA0003188971810000043
Upsampling to obtain an interpolated low resolution image set
Figure BDA0003188971810000044
Wherein the content of the first and second substances,
Figure BDA0003188971810000045
low resolution image representing the s-th modality
Figure BDA0003188971810000046
Carrying out up-sampling to obtain an interpolated low-resolution image;
step 2.5, the residual map I_res^(s) of the s-th modality and the interpolated low-resolution image I_int^(s) of the s-th modality are added to obtain the super-resolution image I_SR^(s) of the s-th modality, so that the super-resolution images {I_SR^(1), ..., I_SR^(S)} of the S modalities are obtained;
Step 3, training the multi-mode image super-resolution network:
step 3.1, obtaining R groups of reference image sets, and R groups of low-resolution image sets corresponding to them, according to the process of step 1;
step 3.2, defining the current iteration number as t and initializing t = 0; defining the maximum number of iterations as T = ⌈R/X⌉ × Z, wherein Z is the preset maximum number of training epochs of the super-resolution network and X is the number of groups drawn per iteration;
step 3.3, randomly drawing X groups of low-resolution image sets from the R groups at the t-th iteration and inputting them into the multi-modal image super-resolution network for training, obtaining the super-resolution image sets {I_SR^(s,x)} output by the t-th iteration, wherein I_SR^(s,x) represents the super-resolution image of the s-th modality in the x-th group of multi-modal super-resolution images output at the t-th iteration, x = 1, 2, ..., X;
and correspondingly taking the X groups of images from the R groups of reference image sets at the t-th iteration, and constructing the loss function L_t(θ) of the t-th iteration shown in formula (2):
L_t(θ) = (1/X) Σ_{x=1}^{X} Σ_{s=1}^{S} ‖ I_SR^(s,x) − I_HR^(s,x) ‖_1    (2)
in formula (2), I_HR^(s,x) represents the reference image of the s-th modality in the x-th group of reference images; the loss function constructed by formula (2) is optimized by a back-propagation algorithm, thereby adjusting all parameters of the deep learning network;
step 3.4, after assigning t + 1 to t, judging whether t is greater than T; if so, the finally trained super-resolution network model is obtained; otherwise, returning to step 3.3 for sequential execution;
and step 3.5, inputting the low-resolution images to be tested into the trained super-resolution network model, so as to obtain the predicted super-resolution images.
Compared with the prior art, the invention has the beneficial effects that:
1. the invention provides a unified network framework that performs the super-resolution tasks of images of different modalities simultaneously; it makes full use of the redundant and complementary information among images of different modalities, reconstructs high-resolution images of multiple modalities at the same time, and improves the super-resolution effect on any single modality. Whereas the prior art must train a separate network for each modality's super-resolution task, the invention achieves simultaneous super-resolution of multi-modal images with only one round of network training.
2. Compared with most existing deep-learning super-resolution networks, the invention designs a lightweight network for multi-modal image super-resolution, which improves computational efficiency, reduces storage cost and is more practical. In addition, upsampling is performed at the end of the feature extraction part, so that the bulk of the convolution operations run in the low-resolution space, and the image reconstruction stage adopts a residual-map reconstruction strategy, making the network easy to train and computationally efficient.
3. The invention designs a basic structure cascading multiple dense residual attention modules, which makes the information flow in the network more reasonable, reduces the loss of feature information, and allows the network to learn deeper features with complex hierarchical structure while preventing massive loss of the feature information of the original images, greatly improving the quality of the super-resolution results. In addition, by adopting both global and local residual learning, the invention further improves the flow of information through the network, preventing the loss of shallow features while deeper features are extracted, so that the features extracted by the network are more comprehensive.
Drawings
FIG. 1 is a flow chart of a multi-modal image super-resolution method based on a convolutional neural network of the present invention;
FIG. 2 is a schematic diagram of the specific framework of the present invention, taking S = 2 as an example;
FIG. 3 is a schematic diagram of a dense residual attention module according to the present invention;
FIG. 4 is a diagram illustrating a dense residual error unit structure according to the present invention;
FIG. 5 is a schematic diagram of a channel attention mechanism unit according to the present invention.
Detailed Description
In this embodiment, MRI of two different modalities is taken as an example; the specific network framework is shown in fig. 2, and the multi-modal image super-resolution method based on a convolutional neural network, shown in fig. 1, comprises the following steps:
step 1, data preparation:
acquiring any group of reference image sets {I_HR^(1), I_HR^(2), ..., I_HR^(S)} of S modalities with resolution K × L, wherein I_HR^(s) represents the reference image of the s-th modality, s = 1, 2, ..., S; obtaining the corresponding group of low-resolution image sets {I_LR^(1), I_LR^(2), ..., I_LR^(S)} with resolution ηK × ηL, wherein I_LR^(s) represents the low-resolution image of the s-th modality, η represents the scaling factor, and 0 < η < 1;
in this embodiment, the T1 and T2 MR images of MICCAI BraTS 2019, each of size 240 × 240 × 155, are used as raw data; the data set contains 457 3D MR images. The 3D MR images are sliced along the Z axis; starting from the 60th slice of each 3D image, one of every 5 slices is selected as training data, 10 slices being taken from each 3D image, yielding 4570 groups of 2D MR images of 2 modalities;
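By way of illustration, this slice selection can be sketched as follows; the nibabel dependency, the file paths and the function name are assumptions for illustration and are not part of the patent:

```python
# Minimal sketch of the slice selection described above (assumed tooling:
# nibabel for reading BraTS NIfTI volumes; paths are illustrative).
import nibabel as nib

def extract_slices(t1_path, t2_path, start=60, step=5, n_slices=10):
    """From the 60th slice on, take one of every 5 axial slices, 10 per volume."""
    t1 = nib.load(t1_path).get_fdata()   # 240 x 240 x 155 volume
    t2 = nib.load(t2_path).get_fdata()
    groups = []
    for i in range(n_slices):
        z = start + i * step             # slices 60, 65, ..., 105
        groups.append((t1[:, :, z], t2[:, :, z]))  # one 2-modality group
    return groups
```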
the obtained data are used as the reference image sets {I_HR^(1), I_HR^(2)}, and the reference images are bicubically downsampled with different scaling factors to obtain the low-resolution image sets {I_LR^(1), I_LR^(2)} at different scale factors; the scale factor adopted in this embodiment is 2 (i.e. η = 0.5), but other scale factors also achieve good results in this network;
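A minimal sketch of this bicubic downsampling, assuming PyTorch; note that PyTorch's bicubic kernel differs slightly from other implementations such as MATLAB's imresize:

```python
# Bicubic downsampling of one reference group with zoom factor eta = 0.5.
import torch
import torch.nn.functional as F

def make_lr(hr_group, eta=0.5):
    """hr_group: tensor (S, K, L) of reference images -> (S, eta*K, eta*L)."""
    x = hr_group.unsqueeze(0).float()    # add batch dimension: (1, S, K, L)
    lr = F.interpolate(x, scale_factor=eta, mode='bicubic', align_corners=False)
    return lr.squeeze(0)                 # low-resolution group (S, eta*K, eta*L)
```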
step 2, constructing a multi-modal image super-resolution network comprising: a shallow feature extraction part, a feature refinement part and an image reconstruction part;
step 2.1, the shallow feature extraction part comprises one convolutional layer with kernel size N × N and an activation function; in this embodiment, N = 3 and ReLU is adopted as the activation function, since a convolutional layer with kernel size 3 achieves good results without introducing so many parameters that network training slows down;
the low-resolution images {I_LR^(1), I_LR^(2)} of the S modalities are concatenated to obtain a concatenated low-resolution image I_in of size ηK × ηL × S; in this embodiment, 2 images of different modalities are used, each of size 120 × 120, i.e. S = 2, K = L = 240, and η = 0.5;
the concatenated low-resolution image I_in is input into the multi-modal image super-resolution network and, after processing by the shallow feature extraction part, the shallow feature map F_init of size ηK × ηL × C is output, wherein C is the number of channels set by the network; in this embodiment, C = 64;
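Under the stated settings (S = 2 concatenated modalities, 3 × 3 kernel, ReLU, C = 64), the shallow feature extraction part can be sketched as below; the module and variable names are illustrative:

```python
import torch
import torch.nn as nn

class ShallowExtractor(nn.Module):
    """One 3x3 convolution plus ReLU mapping I_in (B, S, hK, hL) to F_init."""
    def __init__(self, modalities=2, channels=64):
        super().__init__()
        self.conv = nn.Conv2d(modalities, channels, kernel_size=3, padding=1)
        self.act = nn.ReLU(inplace=True)

    def forward(self, i_in):
        return self.act(self.conv(i_in))   # F_init: (B, C, hK, hL)
```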
step 2.2, the feature refinement part is composed of G dense residual attention modules, one M × M convolutional layer and one N × N convolutional layer; in this embodiment, N = 3, M = 1, and G = 3;
the G dense residual attention modules are denoted DRAB_1, DRAB_2, ..., DRAB_g, ..., DRAB_G, wherein DRAB_g is the g-th dense residual attention module; the specific structure of the dense residual attention module is shown in fig. 3;
the g-th dense residual attention module DRAB_g is formed by cascading the g-th dense residual unit and the g-th channel attention unit;
in this embodiment, Y = 6 and the growth rate of the dense connections is 32; the specific structure of the dense residual unit is shown in fig. 4. Dense connections allow feature maps to be reused continuously, so the network achieves good results at a shallower depth, which improves its computational efficiency;
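A sketch of one dense residual unit under these settings (Y = 6 densely connected 3 × 3 layers with growth rate 32, a 1 × 1 layer back to 64 channels, and the local residual connection); the ReLU after each dense layer is an assumption, as the patent does not specify the activation inside this unit:

```python
import torch
import torch.nn as nn

class DenseResidualUnit(nn.Module):
    def __init__(self, channels=64, growth=32, layers=6):
        super().__init__()
        self.convs = nn.ModuleList()
        c = channels
        for _ in range(layers):                # Y densely connected 3x3 layers
            self.convs.append(nn.Sequential(
                nn.Conv2d(c, growth, kernel_size=3, padding=1),
                nn.ReLU(inplace=True)))
            c += growth                        # each layer sees all earlier maps
        self.fuse = nn.Conv2d(c, channels, kernel_size=1)   # 1x1 back to C

    def forward(self, x):
        feats = [x]
        for conv in self.convs:
            feats.append(conv(torch.cat(feats, dim=1)))
        f_mid = self.fuse(torch.cat(feats, dim=1))  # intermediate feature F_mid
        return f_mid + x                            # local residual -> F_DR
```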
the g-th channel attention unit consists of the g-th global pooling layer P_G, the g-th one-dimensional convolutional layer whose kernel size is adaptively adjusted, and the g-th activation function F_A; in this embodiment, softmax is adopted as the activation function F_A; the specific structure of the channel attention unit is shown in fig. 5;
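A sketch of the channel attention unit follows; the adaptive kernel-size rule used below follows ECA-Net and is an assumption, while the global pooling and softmax weighting match this embodiment:

```python
import math
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, channels=64, gamma=2, b=1):
        super().__init__()
        k = int((math.log2(channels) + b) / gamma)   # ECA-style adaptive size
        k = k if k % 2 else k + 1                    # kernel size must be odd
        self.pool = nn.AdaptiveAvgPool2d(1)          # global pooling P_G
        self.conv = nn.Conv1d(1, 1, kernel_size=k, padding=k // 2)
        self.act = nn.Softmax(dim=-1)                # activation F_A

    def forward(self, f_dr):                         # f_dr: (B, C, H, W)
        w = self.pool(f_dr).squeeze(-1).transpose(1, 2)   # (B, 1, C)
        w = self.act(self.conv(w))                        # weight vector L_A
        w = w.transpose(1, 2).unsqueeze(-1)               # (B, C, 1, 1)
        return f_dr * w                                   # formula (1): F_DRA
```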
when g = 1, the shallow feature map F_init serves as the input feature of the g-th dense residual unit; the output feature map of each N × N convolutional layer is concatenated with the input feature map F_init, and the concatenation is passed through the M × M convolutional layer to obtain the g-th intermediate feature F_mid^(g). In this embodiment, each N × N convolutional layer outputs a 32-channel feature map and each M × M convolutional layer outputs a 64-channel feature map, so the intermediate feature F_mid^(g) has 64 channels. The g-th intermediate feature F_mid^(g) is added to the input feature of the g-th dense residual unit to obtain the output feature F_DR^(g) of the g-th dense residual unit;
with this local residual learning strategy, the input feature map is passed directly to the deeper layers through a skip connection, so the network keeps extracting deep features without losing shallow ones and obtains a more comprehensive feature map;
the output feature F_DR^(g) of the g-th dense residual unit serves as the input feature of the g-th channel attention unit, which produces a weight vector L_A^(g); the output of the g-th channel attention unit, which is also the output feature F_DRA^(g) of the g-th dense residual attention module DRAB_g, is then obtained by formula (1):
F_DRA^(g) = L_A^(g) × F_DR^(g)    (1)
when 2 ≤ g ≤ G, the input feature of the g-th dense residual attention module DRAB_g is the output feature F_DRA^(g-1) of the (g-1)-th dense residual attention module DRAB_{g-1}, so that the G dense residual attention modules yield the output features F_DRA^(1), ..., F_DRA^(G); these are concatenated and passed sequentially through the M × M convolutional layer and the N × N convolutional layer of the feature refinement part, outputting the intermediate feature map F'_fine of size ηK × ηL × C; the intermediate feature map F'_fine and the shallow feature map F_init are then added through a skip connection to obtain the final low-resolution-space feature F_LR of size ηK × ηL × C. Concatenating the outputs of the G dense residual attention modules gives a feature map of G × 64 channels, 192 channels in this embodiment; this feature map first passes through the 1 × 1 convolutional layer, which compresses it back to 64 channels while performing a degree of feature fusion, and is then further refined by the 3 × 3 convolutional layer to obtain the intermediate feature map F'_fine of size 120 × 120 × 64. Adding F'_fine to the shallow feature map F_init constitutes global residual learning, which further improves the representation capability of the network;
step 2.3, the image reconstruction part comprises an upsampling layer and S image reconstruction branches, the s-th image reconstruction branch comprising H N × N convolutional layers with activation functions and one N × N convolutional layer without an activation function; in this embodiment, the upsampling layer is an efficient sub-pixel convolutional layer and H = 1;
the final low-resolution-space feature F_LR is input into the image reconstruction part, where the upsampling layer produces the high-resolution-space feature F_HR; the s-th image reconstruction branch then outputs the residual map I_res^(s) of the s-th modality, so that the S branches yield the residual maps {I_res^(1), ..., I_res^(S)} of all modalities. In this embodiment there are 2 image reconstruction branches, reconstructing the residual maps of the 2 modalities; performing feature upsampling at the rear of the network allows most of the network's convolution operations to run in the low-resolution space, which also saves computational resources;
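A sketch of the reconstruction part under these settings (sub-pixel upsampling with scale 2, H = 1, and S = 2 single-channel branches); names are illustrative:

```python
import torch
import torch.nn as nn

class Reconstruction(nn.Module):
    def __init__(self, channels=64, scale=2, modalities=2):
        super().__init__()
        self.upsample = nn.Sequential(              # efficient sub-pixel layer
            nn.Conv2d(channels, channels * scale ** 2, 3, padding=1),
            nn.PixelShuffle(scale))
        self.branches = nn.ModuleList(
            nn.Sequential(                          # H=1 activated 3x3 layer
                nn.Conv2d(channels, channels, 3, padding=1),
                nn.ReLU(inplace=True),
                nn.Conv2d(channels, 1, 3, padding=1))  # final layer, no act
            for _ in range(modalities))

    def forward(self, f_lr):
        f_hr = self.upsample(f_lr)                  # F_HR in high-res space
        return [branch(f_hr) for branch in self.branches]  # [I_res^(s)]
```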
step 2.4, the low-resolution image set {I_LR^(1), I_LR^(2)} is upsampled to obtain the interpolated low-resolution image set {I_int^(1), I_int^(2)}, wherein I_int^(s) represents the interpolated low-resolution image obtained by upsampling the low-resolution image I_LR^(s) of the s-th modality; in this embodiment, bicubic interpolation is adopted for upsampling the low-resolution images;
step 2.5, the residual map I_res^(s) of the s-th modality and the interpolated low-resolution image I_int^(s) of the s-th modality are added to obtain the super-resolution image I_SR^(s) of the s-th modality, so that the super-resolution images {I_SR^(1), ..., I_SR^(S)} of the S modalities are obtained; by reconstructing residual maps, the network reduces the difficulty of image reconstruction and becomes easier to train;
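Steps 2.4 and 2.5 then reduce to an interpolation and an addition, as in this sketch (function name illustrative):

```python
import torch.nn.functional as F

def compose_sr(i_lr, residuals, scale=2):
    """i_lr: (B, S, hK, hL); residuals: list of S maps (B, 1, K, L) -> I_SR."""
    i_int = F.interpolate(i_lr, scale_factor=scale, mode='bicubic',
                          align_corners=False)      # interpolated I_int^(s)
    return [i_int[:, s:s + 1] + residuals[s]        # I_SR^(s) per modality
            for s in range(i_int.shape[1])]
```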
step 3, training the multi-modal image super-resolution network:
step 3.1, obtaining R groups of reference image sets, and R groups of low-resolution image sets corresponding to them, according to the process of step 1;
step 3.2, defining the current iteration number as t and initializing t = 0; defining the maximum number of iterations as T = ⌈R/X⌉ × Z, wherein Z is the maximum number of training epochs of the super-resolution network and X is the number of groups drawn per iteration; in this embodiment, X = 32 and Z = 200;
step 3.3, randomly drawing X groups of low-resolution image sets from the R groups at the t-th iteration and inputting them into the multi-modal image super-resolution network for training, obtaining the super-resolution image sets {I_SR^(s,x)} output by the t-th iteration, wherein I_SR^(s,x) represents the super-resolution image of the s-th modality in the x-th group of multi-modal super-resolution images output at the t-th iteration, x = 1, 2, ..., X;
and correspondingly taking the X groups of images from the R groups of reference image sets at the t-th iteration, and constructing the loss function L_t(θ) of the t-th iteration shown in formula (2):
L_t(θ) = (1/X) Σ_{x=1}^{X} Σ_{s=1}^{S} ‖ I_SR^(s,x) − I_HR^(s,x) ‖_1    (2)
in formula (2), I_HR^(s,x) represents the reference image of the s-th modality in the x-th group of reference images; the loss function L_t(θ) constructed by formula (2) is optimized by a back-propagation algorithm, thereby adjusting all parameters of the whole network;
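A single training step might look as follows; the L1 distance written in formula (2) and the Adam optimizer are assumptions where the source specifies only a loss over the X groups and S modalities solved by back-propagation, and the model is assumed to stack its per-modality outputs into one tensor:

```python
import torch

def train_step(model, optimizer, lr_batch, hr_batch):
    """lr_batch: (X, S, hK, hL) inputs; hr_batch: (X, S, K, L) references."""
    optimizer.zero_grad()
    sr_batch = model(lr_batch)                    # predictions, (X, S, K, L)
    loss = torch.abs(sr_batch - hr_batch).mean()  # L1 loss of formula (2)
    loss.backward()                               # back-propagation
    optimizer.step()                              # adjust all parameters
    return loss.item()
```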
step 3.4, after assigning t + 1 to t, judging whether t is greater than T; if so, the finally trained super-resolution network model is obtained; otherwise, returning to step 3.3 for sequential execution;
and step 3.5, inputting the low-resolution images to be tested into the trained super-resolution network model, so as to obtain the predicted super-resolution images.
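Finally, a training and test loop consistent with steps 3.1 to 3.5 (X = 32, Z = 200) might be sketched as follows; the dataset wrapper, the learning rate and the full model are assumed compositions of the modules sketched earlier rather than elements stated in the patent:

```python
import torch
from torch.utils.data import DataLoader

def train(model, dataset, epochs=200, batch_size=32, lr=1e-4):
    loader = DataLoader(dataset, batch_size=batch_size, shuffle=True)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)  # assumed optimizer
    for _ in range(epochs):                  # Z = 200 epochs
        for lr_batch, hr_batch in loader:    # X = 32 groups per iteration
            train_step(model, optimizer, lr_batch, hr_batch)
    return model

@torch.no_grad()
def test(model, lr_images):
    model.eval()
    return model(lr_images)                  # predicted super-resolution images
```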

Claims (1)

1. A multi-modal image super-resolution method based on a convolutional neural network, characterized by comprising the following steps:
step 1, data preparation:
acquiring any group of reference image sets {I_HR^(1), I_HR^(2), ..., I_HR^(S)} of S modalities with resolution K × L, wherein I_HR^(s) represents the reference image of the s-th modality, s = 1, 2, ..., S; obtaining the corresponding group of low-resolution image sets {I_LR^(1), I_LR^(2), ..., I_LR^(S)} with resolution ηK × ηL, wherein I_LR^(s) represents the low-resolution image of the s-th modality, η represents the scaling factor, and 0 < η < 1;
step 2, constructing a multi-modal image super-resolution network comprising: a shallow feature extraction part, a feature refinement part and an image reconstruction part;
step 2.1, the shallow feature extraction part comprises one convolutional layer with kernel size N × N and an activation function;
the low-resolution images {I_LR^(1), ..., I_LR^(S)} of the S modalities are concatenated to obtain a concatenated low-resolution image I_in of size ηK × ηL × S, which is input into the multi-modal image super-resolution network; after processing by the shallow feature extraction part, a shallow feature map F_init of size ηK × ηL × C is output, wherein C is the number of channels set by the network;
step 2.2, the feature refinement part consists of G dense residual attention modules, one M × M convolutional layer and one N × N convolutional layer;
the G dense residual attention modules are denoted DRAB_1, DRAB_2, ..., DRAB_g, ..., DRAB_G, wherein DRAB_g is the g-th dense residual attention module;
the g-th dense residual attention module DRAB_g is formed by cascading the g-th dense residual unit and the g-th channel attention unit;
the g-th dense residual unit consists of Y N × N convolutional layers and one M × M convolutional layer, the Y N × N convolutional layers being densely connected;
the g-th channel attention unit consists of the g-th global pooling layer P_G, the g-th one-dimensional convolutional layer whose kernel size is adaptively adjusted, and the g-th activation function F_A;
when g = 1, the shallow feature map F_init serves as the input feature of the g-th dense residual unit; within the unit, the output feature map of each N × N convolutional layer is concatenated with the input feature map F_init, and the concatenation is passed through the M × M convolutional layer to obtain the g-th intermediate feature F_mid^(g); the g-th intermediate feature F_mid^(g) is added to the input feature of the g-th dense residual unit to obtain the output feature F_DR^(g) of the g-th dense residual unit;
the output feature F_DR^(g) of the g-th dense residual unit serves as the input feature of the g-th channel attention unit, which produces a weight vector L_A^(g); the output of the g-th channel attention unit, which is also the output feature F_DRA^(g) of the g-th dense residual attention module DRAB_g, is then obtained by formula (1):
F_DRA^(g) = L_A^(g) × F_DR^(g)    (1)
when 2 ≤ g ≤ G, the input feature of the g-th dense residual attention module DRAB_g is the output feature F_DRA^(g-1) of the (g-1)-th dense residual attention module DRAB_{g-1}, so that the G dense residual attention modules yield the output features F_DRA^(1), ..., F_DRA^(G); these are concatenated and passed sequentially through the M × M convolutional layer and the N × N convolutional layer of the feature refinement part, outputting the intermediate feature map F'_fine of the feature refinement part with size ηK × ηL × C; the intermediate feature map F'_fine and the shallow feature map F_init are then added through a skip connection to obtain the final low-resolution-space feature map F_LR of size ηK × ηL × C;
Step 2.3, the image reconstruction part comprises an up-sampling layer and S image reconstruction branches, wherein the S-th image reconstruction branch comprises: h, NxN convolutional layers with activation functions and one NxN convolutional layer without activation functions;
final feature F of low resolution spaceLRInputting the image into the image reconstruction part, and obtaining a high-resolution spatial feature F through an up-sampling layerHRAnd outputting a residual error map of the s-th mode after passing through the s-th image reconstruction branch
Figure FDA0003188971800000028
Thus, residual error maps of all modes are obtained by the S image reconstruction branches
Figure FDA0003188971800000029
Step 2.4, for low resolution image set
Figure FDA00031889718000000210
Upsampling to obtain an interpolated low resolution image set
Figure FDA00031889718000000211
Wherein the content of the first and second substances,
Figure FDA00031889718000000212
low resolution image representing the s-th modality
Figure FDA00031889718000000213
Carrying out up-sampling to obtain an interpolated low-resolution image;
step 2.5, the residual map I_res^(s) of the s-th modality and the interpolated low-resolution image I_int^(s) of the s-th modality are added to obtain the super-resolution image I_SR^(s) of the s-th modality, so that the super-resolution images {I_SR^(1), ..., I_SR^(S)} of the S modalities are obtained;
Step 3, training the multi-mode image super-resolution network:
step 3.1, obtaining R groups of reference image sets, and R groups of low-resolution image sets corresponding to them, according to the process of step 1;
step 3.2, defining the current iteration number as t and initializing t = 0; defining the maximum number of iterations as T = ⌈R/X⌉ × Z, wherein Z is the preset maximum number of training epochs of the super-resolution network and X is the number of groups drawn per iteration;
step 3.3, randomly drawing X groups of low-resolution image sets from the R groups at the t-th iteration and inputting them into the multi-modal image super-resolution network for training, obtaining the super-resolution image sets {I_SR^(s,x)} output by the t-th iteration, wherein I_SR^(s,x) represents the super-resolution image of the s-th modality in the x-th group of multi-modal super-resolution images output at the t-th iteration, x = 1, 2, ..., X;
and correspondingly taking the X groups of images from the R groups of reference image sets at the t-th iteration, and constructing the loss function L_t(θ) of the t-th iteration shown in formula (2):
L_t(θ) = (1/X) Σ_{x=1}^{X} Σ_{s=1}^{S} ‖ I_SR^(s,x) − I_HR^(s,x) ‖_1    (2)
in formula (2), I_HR^(s,x) represents the reference image of the s-th modality in the x-th group of reference images; the loss function constructed by formula (2) is optimized by a back-propagation algorithm, thereby adjusting all parameters of the deep learning network;
step 3.4, after assigning t + 1 to t, judging whether t is greater than T; if so, the finally trained super-resolution network model is obtained; otherwise, returning to step 3.3 for sequential execution;
and step 3.5, inputting the low-resolution images to be tested into the trained super-resolution network model, so as to obtain the predicted super-resolution images.
CN202110870612.6A 2021-07-30 2021-07-30 Multi-modal image super-resolution method based on convolutional neural network Active CN113506222B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110870612.6A 2021-07-30 2021-07-30 Multi-modal image super-resolution method based on convolutional neural network

Publications (2)

Publication Number Publication Date
CN113506222A 2021-10-15
CN113506222B 2024-03-01

Family ID: 78014561

Country Status (1)

Country Link
CN (1) CN113506222B (en)


Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170351935A1 (en) * 2016-06-01 2017-12-07 Mitsubishi Electric Research Laboratories, Inc Method and System for Generating Multimodal Digital Images
US20190325621A1 (en) * 2016-06-24 2019-10-24 Rensselaer Polytechnic Institute Tomographic image reconstruction via machine learning
US20200034948A1 (en) * 2018-07-27 2020-01-30 Washington University Ml-based methods for pseudo-ct and hr mr image estimation
CN109325931A (en) * 2018-08-22 2019-02-12 中北大学 Based on the multi-modality images fusion method for generating confrontation network and super-resolution network
CN110415170A (en) * 2019-06-24 2019-11-05 武汉大学 A kind of image super-resolution method based on multiple dimensioned attention convolutional neural networks
CN111192200A (en) * 2020-01-02 2020-05-22 南京邮电大学 Image super-resolution reconstruction method based on fusion attention mechanism residual error network
AU2020100200A4 (en) * 2020-02-08 2020-06-11 Huang, Shuying DR Content-guide Residual Network for Image Super-Resolution
CN111445390A (en) * 2020-02-28 2020-07-24 天津大学 Wide residual attention-based three-dimensional medical image super-resolution reconstruction method
CN111899165A (en) * 2020-06-16 2020-11-06 厦门大学 Multi-task image reconstruction convolution network model based on functional module
CN112200725A (en) * 2020-10-26 2021-01-08 深圳大学 Super-resolution reconstruction method and device, storage medium and electronic equipment
CN113096017A (en) * 2021-04-14 2021-07-09 南京林业大学 Image super-resolution reconstruction method based on depth coordinate attention network model

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
雷鹏程; 刘丛; 唐坚刚; 彭敦陆: "Image super-resolution reconstruction via hierarchical feature fusion attention network" (分层特征融合注意力网络图像超分辨率重建), Journal of Image and Graphics (中国图象图形学报), no. 09

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114049408A (en) * 2021-11-15 2022-02-15 哈尔滨工业大学(深圳) Depth network model for accelerating multi-modality MR imaging
WO2023109719A1 (en) * 2021-12-15 2023-06-22 深圳先进技术研究院 Terahertz single-pixel super-resolution imaging method and system
CN114331849A (en) * 2022-03-15 2022-04-12 之江实验室 Cross-mode nuclear magnetic resonance hyper-resolution network and image super-resolution method
CN114943650A (en) * 2022-04-14 2022-08-26 北京东软医疗设备有限公司 Image deblurring method and device, computer equipment and storage medium
CN117391938A (en) * 2023-12-13 2024-01-12 长春理工大学 Infrared image super-resolution reconstruction method, system, equipment and terminal
CN117391938B (en) * 2023-12-13 2024-02-20 长春理工大学 Infrared image super-resolution reconstruction method, system, equipment and terminal

Also Published As

Publication number Publication date
CN113506222B (en) 2024-03-01


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant