CN113506222A - Multi-mode image super-resolution method based on convolutional neural network - Google Patents
- Publication number: CN113506222A
- Application number: CN202110870612.6A
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
- G06T3/4007—Scaling of whole images or parts thereof, e.g. expanding or contracting based on interpolation, e.g. bilinear interpolation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
- G06T3/4046—Scaling of whole images or parts thereof, e.g. expanding or contracting using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
- G06T3/4053—Scaling of whole images or parts thereof, e.g. expanding or contracting based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention discloses a multi-modal image super-resolution method based on a convolutional neural network, comprising the following steps: first, preparing the data; second, constructing a super-resolution network that cascades multiple dense residual attention modules and comprises a shallow feature extraction part, a feature refining part, and an image reconstruction part; third, performing super-resolution on the input low-resolution images, including super-resolution network training and super-resolution image testing. The invention can fully exploit the complementary and redundant information in medical images of different modalities to reconstruct higher-quality high-resolution images, providing better images for human observation while also supporting computer vision tasks such as image segmentation and classification.
Description
Technical Field
The invention relates to the technical field of image super-resolution, and in particular to a multi-modal image super-resolution method based on a convolutional neural network.
Background
Image super-resolution refers to the process of reconstructing a high-resolution image from one or more given low-resolution images, where the high-resolution image can be obtained only algorithmically. At present, most state-of-the-art super-resolution methods operate on images of a single modality; although they can produce good high-resolution images, they inevitably ignore the complementary and redundant information among multi-modal images, which can be important to the result when reconstructing a high-resolution image. Today, information technology is growing explosively, data resources take increasingly diverse forms, and multi-modal data has become the dominant form of data in use. In general, more information means greater feature expression capability, and tends to yield better reconstructed high-resolution images. Therefore, research into multi-modal learning methods with multiple input modalities, which give the super-resolution network more prior information, has great application prospects and broad research value.
In the field of natural images, multi-modal image data are increasingly common, for example visible-light images and infrared images. Visible-light images have higher resolution, higher contrast, and good visual quality, while infrared images are less affected by environmental factors and are therefore more generally applicable. In many computer vision tasks, such as pedestrian re-identification and face recognition, combining images of different modalities achieves better results. However, in the current super-resolution field, only a few methods combine images of multiple modalities, so the performance of many network structures cannot be further improved.
Medical imaging includes multiple image modalities, and multi-modal Magnetic Resonance Imaging (MRI) is common among medical image data. The more common MRI modalities include T1-weighted imaging (T1) and T2-weighted imaging (T2). Generally, a single-modality MRI image provides only partial medical information; to obtain more complete and accurate information, the mutual complementarity of different MRI modalities plays a crucial role.
Meanwhile, medical image super-resolution has always been a major hot spot in the image super-resolution field. Early approaches used only conventional super-resolution methods, for example nearest-neighbor, bilinear, and bicubic interpolation. Although these methods run fast and are easy to implement, their results often show edge blurring and loss of high-frequency details, which is a fatal problem in the medical image field. Super-resolution methods based on deep learning can extract deep features through long training and thus reconstruct high-resolution images of higher quality than the conventional methods. However, the obtained high-resolution images still suffer from artifacts, loss of detail, and similar problems, making it difficult to obtain reliable high-resolution medical images.
Compared with natural images, the super-resolution problem for medical images is more complex and more demanding, with the following characteristics: 1) medical images generally require high accuracy; the super-resolution result must faithfully reflect the actual situation, and any deviation seriously corrupts subsequent processing steps (such as high-level tasks like segmentation and classification); 2) the anatomical structure and shape of human tissue are complex and vary between individuals, which makes image super-resolution difficult; 3) the acquisition of medical image information is extremely susceptible to various factors, such as external noise, bias-field effects, partial-volume effects, involuntary movement of the subject, and unavoidable tissue activity, which inevitably produce motion artifacts, non-uniformity, and similar problems, greatly hindering the application of deep-learning-based super-resolution methods to medical images. Therefore, it is necessary to study super-resolution methods more deeply with respect to the above characteristics of medical images, and to consider combining information from medical images of different modalities to improve super-resolution performance.
Disclosure of Invention
To overcome the problems of existing image super-resolution techniques, the invention provides a multi-modal image super-resolution method based on a convolutional neural network, which fully exploits the complementary and redundant information among images of different modalities to provide better image feature expression and reconstruct higher-quality high-resolution images, thereby providing better images for human observation while supporting computer vision tasks such as image segmentation and classification.
The invention adopts the following technical scheme for solving the problems:
the invention discloses a multi-modal image super-resolution method based on a convolutional neural network, which is characterized by comprising the following steps of:
Step 1, data preparation: acquiring any group of reference images of S modalities with resolution K × L, in which the s-th image is the reference image of the s-th modality, s = 1, 2, …, S; obtaining the corresponding group of low-resolution images with resolution ηK × ηL, in which the s-th image is the low-resolution image of the s-th modality, where η represents the scaling factor and 0 < η < 1;
step 2, constructing a multi-modal image super-resolution network, comprising: a shallow feature extraction part, a feature refining part, and an image reconstruction part;
step 2.1, the shallow feature extraction part comprises a convolution layer with convolution kernel size of NxN and an activation function;
the low-resolution images of the S different modalities are concatenated to obtain a concatenated low-resolution image I_in of size ηK × ηL × S, which is input into the multi-modal image super-resolution network; after processing by the shallow feature extraction part, a shallow feature map F_init of size ηK × ηL × C is output, where C is the number of channels set by the network;
step 2.2, the feature refining part consists of G dense residual attention modules, an N × N convolutional layer, and an M × M convolutional layer;
the G dense residual attention modules are denoted DRAB_1, DRAB_2, …, DRAB_g, …, DRAB_G, where DRAB_g is the g-th dense residual attention module;
the g-th dense residual attention module DRAB_g is formed by cascading the g-th dense residual unit and the g-th channel attention mechanism unit;
the g-th dense residual unit consists of Y N × N convolutional layers and an M × M convolutional layer, with the Y N × N convolutional layers densely connected;
the g-th channel attention mechanism unit consists of the g-th global pooling layer P_G, the g-th one-dimensional convolutional layer with adaptively adjusted kernel size, and the g-th activation function F_A;
when g = 1, the shallow feature map F_init serves as the input feature of the g-th dense residual unit; inside the unit, the output feature map of each N × N convolutional layer is concatenated with the input shallow feature map F_init, and the result is passed through the M × M convolutional layer to obtain the g-th intermediate feature; the g-th intermediate feature is added to the input feature of the g-th dense residual unit to obtain the output feature F_DRg of the g-th dense residual unit;
the output feature F_DRg of the g-th dense residual unit serves as the input feature of the g-th channel attention mechanism unit, which produces a weight vector L_Ag; the output of the g-th channel attention mechanism unit is then obtained by formula (1) and used as the output feature F_DRAg of the g-th dense residual attention module DRAB_g:
F_DRAg = L_Ag × F_DRg    (1)
when 2 ≤ g ≤ G, the input of the g-th dense residual attention module DRAB_g is the output feature F_DRAg−1 of the (g−1)-th dense residual attention module DRAB_g−1; the output features of all G dense residual attention modules are thus obtained and, after concatenation, passed sequentially through the N × N convolutional layer and the M × M convolutional layer of the feature refining part, outputting an intermediate feature map F'_fine of the feature refining part with size ηK × ηL × C; the intermediate feature map F'_fine and the shallow feature map F_init are added through a skip connection to obtain the final feature map F_LR of the low-resolution space, with size ηK × ηL × C;
Step 2.3, the image reconstruction part comprises an up-sampling layer and S image reconstruction branches, wherein the S-th image reconstruction branch comprises: h, NxN convolutional layers with activation functions and one NxN convolutional layer without activation functions;
final feature F of low resolution spaceLRInputting the image into the image reconstruction part, and obtaining a high-resolution spatial feature F through an up-sampling layerHRAnd outputting a residual error map of the s-th mode after passing through the s-th image reconstruction branchThus, residual error maps of all modes are obtained by the S image reconstruction branches
Step 2.4, for low resolution image setUpsampling to obtain an interpolated low resolution image setWherein the content of the first and second substances,low resolution image representing the s-th modalityCarrying out up-sampling to obtain an interpolated low-resolution image;
step 2.5, residual error map of the s-th modeAnd interpolated low resolution map of the s-th modalityAdding the obtained S-th modal super-resolution imageSo as to obtain S super-resolution images of different modes
Step 3, training the multi-mode image super-resolution network:
step 3.1, obtaining R groups of reference image sets and R groups of low-resolution image sets corresponding to the R groups of reference image sets according to the process of the step 1;
step 3.2, defining the current iteration number as t and initializing t = 0; defining the maximum number of iterations as T = Z·⌈R/X⌉, where Z is the preset maximum number of epochs of super-resolution network training and X is the number of groups drawn each time;
step 3.3, at the t-th iteration, randomly drawing X groups of low-resolution image sets from the R groups and inputting them into the multi-modal image super-resolution network for training, obtaining the super-resolution image sets output by the t-th training, in which the s-th image of the x-th group is the super-resolution image of the s-th modality in the x-th group of multi-modal super-resolution images output by the t-th training, x = 1, 2, …, X;
the corresponding X groups of images are taken from the R groups of reference image sets, and the loss function L_t(θ) of the t-th training shown in formula (2) is constructed:
in formula (2), the reference image of the s-th modality in the x-th group of reference images serves as the training target; the loss function constructed by formula (2) is solved by optimization using the back-propagation algorithm, thereby adjusting all the parameters of the deep learning network;
step 3.4, after assigning t + 1 to t, judge whether t is greater than T; if so, the finally trained super-resolution network model has been obtained; otherwise, return to step 3.3 and continue execution;
step 3.5, the low-resolution images to be tested are input into the trained super-resolution network model to obtain the predicted super-resolution images.
Compared with the prior art, the invention has the following beneficial effects:
1. The invention provides a unified network framework that accomplishes the super-resolution tasks of images in different modalities simultaneously, fully utilizes the redundant and complementary information among images of different modalities, reconstructs high-resolution images of multiple modalities at the same time, and improves the super-resolution performance on every single modality. Compared with prior approaches that train a separate network for each modality's super-resolution task, the method realizes simultaneous super-resolution of multi-modal images with only one round of network training.
2. Compared with most existing deep-learning super-resolution networks, the invention designs a lightweight network for multi-modal image super-resolution, improving computational efficiency, reducing storage cost, and offering stronger practicability. In addition, the method performs up-sampling at the end of the feature extraction stage, so that the bulk of the convolution operations run in the low-resolution space, and adopts a residual-map reconstruction strategy in the image reconstruction stage, making the network easy to train and computationally efficient.
3. The invention designs a basic structure that cascades multiple dense residual attention modules, so that information flows through the network more reasonably, the loss of feature information is reduced, and the network can learn deeper features with complex hierarchical structure, preventing massive loss of the feature information in the original images and greatly improving the quality of the super-resolution results. In addition, by adopting both global residual learning and local residual learning, the method further improves the flow of information through the network, preventing the loss of shallow features while deeper features are extracted, so that the feature information extracted by the network is more comprehensive.
Drawings
FIG. 1 is a flow chart of a multi-modal image super-resolution method based on a convolutional neural network of the present invention;
fig. 2 is a schematic diagram of the specific framework of the present invention, taking S = 2 as an example;
FIG. 3 is a schematic diagram of a dense residual attention module according to the present invention;
FIG. 4 is a diagram illustrating a dense residual error unit structure according to the present invention;
FIG. 5 is a schematic diagram of a channel attention mechanism unit according to the present invention.
Detailed Description
In this embodiment, MRI of two different modalities is taken as an example; the specific network framework is shown in fig. 2. The flow of the multi-modal image super-resolution method based on a convolutional neural network is shown in fig. 1, and includes the following steps:
Step 1, data preparation: acquiring any group of reference images of S modalities with resolution K × L, in which the s-th image is the reference image of the s-th modality, s = 1, 2, …, S; obtaining the corresponding group of low-resolution images with resolution ηK × ηL, in which the s-th image is the low-resolution image of the s-th modality, where η represents the scaling factor and 0 < η < 1;
in this embodiment, the T1 and T2 MR images in MICCAI BraTS_2019, each of size 240 × 240 × 155, are used as raw data; the dataset contains 457 3D MR images. Each 3D MR image is sliced along the Z axis; starting from the 60th slice, one slice out of every 5 is selected as training data, and 10 slices are taken from each 3D image, yielding 4570 groups of 2D MR images in 2 modalities;
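As a sketch, the slice-selection rule above can be expressed as follows; the start index (60), stride (5), and count (10) come from the text, while zero-based indexing is an assumption of this illustration:

```python
import numpy as np

def select_training_slices(depth=155, start=60, stride=5, count=10):
    """Pick `count` slice indices along the Z axis, one out of every
    `stride` slices, beginning at the `start`-th slice (0-based here)."""
    indices = list(range(start, start + stride * count, stride))
    assert indices[-1] < depth, "selected slices must stay inside the volume"
    return indices

# One BraTS-sized volume per modality: 240 x 240 x 155 (dummy data).
volume_t1 = np.zeros((240, 240, 155), dtype=np.float32)
slices = select_training_slices()
training_2d = [volume_t1[:, :, z] for z in slices]  # ten 240 x 240 slices
```

With 457 volumes and 10 slices each, this yields the 4570 training groups stated above.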
the obtained data are used as the reference image sets; bicubic down-sampling with different scaling factors is applied to the reference images to obtain low-resolution image sets at different scale factors. The scaling factor adopted in this embodiment is 2 (i.e., η = 0.5), but other scaling factors also achieve good results with this network;
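The down-sampling step can be sketched as below. The patent specifies bicubic down-sampling; a 2 × 2 area average is used here only as a simple runnable stand-in that likewise halves each spatial dimension (η = 0.5):

```python
import numpy as np

def downsample_2x_area(img):
    """Stand-in for the bicubic down-sampling used in the text:
    2x2 area averaging halves each spatial dimension (eta = 0.5)."""
    h, w = img.shape
    assert h % 2 == 0 and w % 2 == 0
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

ref = np.arange(240 * 240, dtype=np.float64).reshape(240, 240)  # dummy reference slice
lr = downsample_2x_area(ref)  # 120 x 120 low-resolution image
```

Area averaging preserves the image mean exactly, which makes the stand-in easy to sanity-check, but it lacks the smooth interpolation kernel of the bicubic method actually used.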
step 2, constructing a multi-modal image super-resolution network, comprising: a shallow feature extraction part, a feature refining part, and an image reconstruction part;
step 2.1, the shallow feature extraction part comprises a convolutional layer with kernel size N × N and an activation function; in this embodiment, N = 3 and the ReLU activation function is used. A convolutional layer with kernel size 3 achieves good results without introducing so many parameters that network training slows down;
the low-resolution images of the S different modalities are concatenated to obtain a concatenated low-resolution image I_in of size ηK × ηL × S; in this embodiment, 2 images of different modalities are used, each of size 120 × 120, i.e., S = 2, K = L = 240, and η = 0.5;
the concatenated low-resolution image I_in is input into the multi-modal image super-resolution network; after processing by the shallow feature extraction part, a shallow feature map F_init of size ηK × ηL × C is output, where C is the number of channels set by the network; in this embodiment, C = 64;
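The concatenation and shallow feature extraction steps above can be sketched with a naive numpy convolution; the random weights and inputs are placeholders, and only the tensor shapes (2 modalities at 120 × 120 in, 64 channels out) follow the embodiment:

```python
import numpy as np

rng = np.random.default_rng(0)

def conv2d_same(x, w):
    """Naive 3x3 'same' convolution: x is (C_in, H, W), w is (C_out, C_in, 3, 3)."""
    c_in, h, wd = x.shape
    c_out = w.shape[0]
    xp = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros((c_out, h, wd))
    for o in range(c_out):
        for i in range(c_in):
            for dy in range(3):
                for dx in range(3):
                    out[o] += w[o, i, dy, dx] * xp[i, dy:dy + h, dx:dx + wd]
    return out

relu = lambda t: np.maximum(t, 0.0)

# Two LR modalities (T1, T2), each 120 x 120, concatenated along channels.
t1_lr = rng.standard_normal((120, 120))
t2_lr = rng.standard_normal((120, 120))
I_in = np.stack([t1_lr, t2_lr])       # size etaK x etaL x S -> (2, 120, 120)
W0 = rng.standard_normal((64, 2, 3, 3)) * 0.1
F_init = relu(conv2d_same(I_in, W0))  # shallow feature map, (64, 120, 120)
```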
step 2.2, the feature refining part consists of G dense residual attention modules, an M × M convolutional layer, and an N × N convolutional layer; in this embodiment, N = 3, M = 1, and G = 3;
the G dense residual attention modules are denoted DRAB_1, DRAB_2, …, DRAB_g, …, DRAB_G, where DRAB_g is the g-th dense residual attention module; the specific structure of the dense residual attention module is shown in fig. 3;
the g-th dense residual attention module DRAB_g is formed by cascading the g-th dense residual unit and the g-th channel attention mechanism unit;
in this embodiment, Y = 6 and the growth rate of the dense connections is 32; the specific structure of the dense residual unit is shown in fig. 4. Dense connections allow feature maps to be continually reused, so the network achieves better results at a shallower depth, improving its computational efficiency;
the g-th channel attention mechanism unit consists of the g-th global pooling layer P_G, the g-th one-dimensional convolutional layer with adaptively adjusted kernel size, and the g-th activation function F_A; in this embodiment, softmax is used as the activation function F_A. The specific structure of the channel attention mechanism unit is shown in fig. 5;
when g = 1, the shallow feature map F_init serves as the input feature of the g-th dense residual unit; the output feature map of each N × N convolutional layer is concatenated with the input shallow feature map F_init, and the result is passed through the M × M convolutional layer to obtain the g-th intermediate feature. In this embodiment, each N × N convolutional layer outputs a 32-channel feature map and each M × M convolutional layer outputs a 64-channel feature map, so the intermediate feature has 64 channels. The g-th intermediate feature is added to the input feature of the g-th dense residual unit to obtain the output feature F_DRg of the g-th dense residual unit. This local residual learning strategy passes the input feature map directly to the deeper network through a skip connection, so the network keeps extracting deep features without losing the shallow ones, obtaining a more comprehensive feature map;
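The dense residual unit described above can be sketched as follows. The layer count (6), growth rate (32), and channel width (64) follow the embodiment; 1 × 1 channel-mixing convolutions stand in for the N × N convolutions to keep the sketch short, and the weights are random placeholders (a 60 × 60 map is used instead of 120 × 120 purely for speed):

```python
import numpy as np

rng = np.random.default_rng(1)
relu = lambda t: np.maximum(t, 0.0)

def conv1x1(x, w):
    """1x1 convolution as a channel-mixing matmul: x (C_in, H, W), w (C_out, C_in)."""
    return np.einsum('oc,chw->ohw', w, x)

def dense_residual_unit(x, growth=32, layers=6):
    """Sketch of the g-th dense residual unit: `layers` densely connected
    convolutions (1x1 here instead of NxN, for brevity), each emitting
    `growth` channels; a final 1x1 conv compresses back to C channels,
    and a local skip connection adds the unit's input."""
    c = x.shape[0]
    feats = [x]
    for _ in range(layers):
        cat = np.concatenate(feats)  # dense connections: reuse all earlier maps
        w = rng.standard_normal((growth, cat.shape[0])) * 0.05
        feats.append(relu(conv1x1(cat, w)))
    cat = np.concatenate(feats)      # C + layers*growth = 256 channels here
    w_fuse = rng.standard_normal((c, cat.shape[0])) * 0.05
    mid = conv1x1(cat, w_fuse)       # g-th intermediate feature, back to C channels
    return x + mid                   # local residual learning

F_init = rng.standard_normal((64, 60, 60))
F_DR = dense_residual_unit(F_init)   # same shape as the input feature
```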
the output feature F_DRg of the g-th dense residual unit serves as the input feature of the g-th channel attention mechanism unit, which produces a weight vector L_Ag; the output of the g-th channel attention mechanism unit is then obtained by formula (1) and used as the output feature F_DRAg of the g-th dense residual attention module DRAB_g:
F_DRAg = L_Ag × F_DRg    (1)
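The channel attention computation of formula (1) can be sketched as below. Global pooling, a 1-D convolution with adaptively sized kernel, and the softmax activation follow the text; the kernel-size formula with γ = 2, b = 1 follows the ECA-Net convention and is an assumption of this sketch, as are the random weights:

```python
import numpy as np

rng = np.random.default_rng(2)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def adaptive_kernel_size(c, gamma=2, b=1):
    """Kernel size grows with channel count C; gamma and b follow ECA-Net
    conventions and are an assumption of this sketch."""
    t = int(abs(np.log2(c) / gamma + b / gamma))
    return t if t % 2 else t + 1

def channel_attention(x):
    """Sketch of the channel attention unit: global average pooling P_G,
    a 1-D convolution with adaptively sized kernel over the channel
    descriptor, then softmax (F_A) to produce the weight vector L_A,
    which rescales the input as in formula (1)."""
    c = x.shape[0]
    d = x.mean(axis=(1, 2))          # global pooling -> (C,)
    k = adaptive_kernel_size(c)
    w = rng.standard_normal(k) * 0.1 # 1-D conv weights (placeholder)
    dp = np.pad(d, k // 2)
    conv = np.array([dp[i:i + k] @ w for i in range(c)])
    weights = softmax(conv)          # L_A, sums to 1 across channels
    return weights[:, None, None] * x  # F_DRA = L_A x F_DR

F_DR = rng.standard_normal((64, 60, 60))
F_DRA = channel_attention(F_DR)
```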
when 2 ≤ g ≤ G, the input of the g-th dense residual attention module DRAB_g is the output feature F_DRAg−1 of the (g−1)-th module DRAB_g−1; the output features of the G dense residual attention modules are concatenated and then passed sequentially through the N × N convolutional layer and the M × M convolutional layer of the feature refining part, outputting an intermediate feature map F'_fine of size ηK × ηL × C; the intermediate feature map F'_fine and the shallow feature map F_init are added through a skip connection to obtain the final feature map F_LR of the low-resolution space, of size ηK × ηL × C. Concatenating the outputs of the G dense residual attention modules yields a feature map of G × 64 channels, 192 channels in this embodiment. This feature map first passes through a 1 × 1 convolutional layer, which compresses it back to 64 channels while performing some feature fusion, and then through a 3 × 3 convolutional layer for further feature refinement, giving an intermediate feature map F'_fine of size 120 × 120 × 64. The addition of the intermediate feature map F'_fine and the shallow feature map F_init forms global residual learning, further improving the representation capability of the network;
step 2.3, the image reconstruction part comprises an up-sampling layer and S image reconstruction branches, where the s-th image reconstruction branch comprises H N × N convolutional layers with activation functions and one N × N convolutional layer without an activation function; in this embodiment, the up-sampling layer is an efficient sub-pixel convolutional layer, and H = 1;
the final feature map F_LR of the low-resolution space is input into the image reconstruction part; a high-resolution spatial feature F_HR is obtained through the up-sampling layer, and the residual map of the s-th modality is output after the s-th image reconstruction branch, so the residual maps of all modalities are obtained from the S image reconstruction branches. In this embodiment, there are 2 image reconstruction branches, reconstructing residual maps of 2 different modalities; performing feature up-sampling at the rear of the network lets most of the network's convolution operations run in the low-resolution space, saving computational resources;
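The rearrangement performed by the sub-pixel (pixel-shuffle) up-sampling layer can be sketched as follows; the tiny 2 × 2 input is illustrative only:

```python
import numpy as np

def pixel_shuffle(x, r):
    """Sub-pixel (pixel-shuffle) up-sampling: rearranges a (C*r*r, H, W)
    feature map into (C, H*r, W*r), as in efficient sub-pixel convolution:
    out[c, i*r+dy, j*r+dx] = x[c*r*r + dy*r + dx, i, j]."""
    c2, h, w = x.shape
    assert c2 % (r * r) == 0
    c = c2 // (r * r)
    return x.reshape(c, r, r, h, w).transpose(0, 3, 1, 4, 2).reshape(c, h * r, w * r)

x = np.arange(4 * 2 * 2, dtype=np.float64).reshape(4, 2, 2)  # (C*r*r, H, W) with C=1, r=2
y = pixel_shuffle(x, 2)                                      # (1, 4, 4)
```

Each group of r² channels at a low-resolution location becomes an r × r block of high-resolution pixels, so spatial resolution is doubled here without any interpolation.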
step 2.4, the low-resolution image set is up-sampled to obtain an interpolated low-resolution image set, in which the s-th image is obtained by up-sampling the low-resolution image of the s-th modality; in this embodiment, bicubic interpolation is adopted for up-sampling the low-resolution images;
step 2.5, the residual map of the s-th modality is added to the interpolated low-resolution image of the s-th modality to obtain the super-resolution image of the s-th modality, thereby obtaining the super-resolution images of all S different modalities. Reconstructing a residual map reduces the difficulty of the reconstruction task, making the network easier to train;
step 3, training the multi-mode image super-resolution network:
step 3.1, obtaining R groups of reference image sets and R groups of low-resolution image sets corresponding to the R groups of reference image sets according to the process of the step 1;
step 3.2, defining the current iteration number as t and initializing t = 0; defining the maximum number of iterations as T = Z·⌈R/X⌉, where Z is the maximum number of epochs of super-resolution network training and X is the number of groups drawn each time; in this embodiment, X is set to 32 and Z to 200;
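The iteration budget implied by steps 3.2–3.4 works out as below; the assumption that each of the Z epochs makes one full pass over the R training groups in batches of X is this sketch's reading of the loop, since the original formula image was lost:

```python
import math

def max_iterations(R, X=32, Z=200):
    """Total parameter updates T: Z epochs, each drawing batches of X
    groups from the R training groups (one full pass per epoch assumed)."""
    return Z * math.ceil(R / X)

T = max_iterations(R=4570)  # 4570 training groups from the BraTS slices
```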
3.3, randomly taking out X groups of low-resolution image sets from the t time of the R groups of low-resolution image sets, inputting the X groups of low-resolution image sets into a multi-mode image super-resolution network for training, and obtaining the super-resolution image set output by the t time of training Representing the super-resolution image of the s-th mode in the x-th group of multi-mode super-resolution images output by the t-th training; x is 1,2, …, X;
and correspondingly taking X groups of images from the t-th time of the R group of reference image sets, and constructing a loss function L of the t-th training shown in formula (2)t(θ):
In formula (2), the corresponding term denotes the reference image of the s-th modality in the x-th group of reference images; the loss function L_t(θ) constructed by formula (2) is optimized with a back-propagation algorithm, thereby adjusting all parameters of the whole network;
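A single training iteration of step 3.3 might look like the sketch below. Formula (2) itself is not reproduced in the source text, so a mean-absolute-error (L1) loss summed over the S modalities is assumed here purely for illustration; the function name and interfaces are likewise hypothetical:

```python
import torch

def training_step(network, optimizer, lr_batch, ref_batch):
    """One iteration of step 3.3: forward X groups of low-resolution images,
    compare against the reference images, and back-propagate.
    An L1 loss over all modalities is an assumption standing in for formula (2)."""
    sr_batch = network(lr_batch)           # list of S super-resolution tensors
    loss = sum(torch.nn.functional.l1_loss(sr, ref)
               for sr, ref in zip(sr_batch, ref_batch))
    optimizer.zero_grad()
    loss.backward()                        # back-propagation adjusts all parameters theta
    optimizer.step()
    return loss.item()
```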
step 3.4, after assigning t + 1 to t, judging whether t is greater than T; if so, the finally trained super-resolution network model is obtained; otherwise, returning to step 3.3 and continuing;
and step 3.5, inputting the low-resolution images to be tested into the trained super-resolution network model to obtain the predicted super-resolution images.
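The outer loop of steps 3.2 to 3.4 can be outlined as below; how the R groups are partitioned into batches of X across the Z epochs is an assumption, since the text only states that X groups are drawn at random each iteration, and the helper names are hypothetical:

```python
import random

def train_loop(network, optimizer, step_fn, lr_groups, ref_groups,
               max_epochs=200, batch_size=32):
    """Steps 3.2-3.4 in outline: Z = max_epochs rounds (200 in the embodiment),
    X = batch_size groups per iteration (32 in the embodiment). `step_fn`
    performs one optimisation step on a batch, e.g. forward pass, loss, and
    back-propagation as in formula (2). Shuffling without replacement per
    epoch is an assumption."""
    indices = list(range(len(lr_groups)))
    for epoch in range(max_epochs):
        random.shuffle(indices)            # draw groups in random order
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            lr = [lr_groups[i] for i in batch]
            ref = [ref_groups[i] for i in batch]
            step_fn(network, optimizer, lr, ref)
    return network                         # trained super-resolution model (step 3.5 input)
```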
Claims (1)
1. A multi-modal image super-resolution method based on a convolutional neural network is characterized by comprising the following steps:
step 1, data preparation:
acquiring any group of reference image sets with resolution K × L and S modalities, in which each element is the reference image of the s-th modality, s = 1, 2, ..., S; obtaining the corresponding group of low-resolution image sets with resolution ηK × ηL, in which each element is the low-resolution image of the s-th modality, η denotes the scaling factor, and 0 < η < 1;
step 2, constructing a multi-mode image super-resolution network comprising: a shallow feature extraction part, a feature refining part and an image reconstruction part;
step 2.1, the shallow feature extraction part comprises a convolution layer with convolution kernel size of NxN and an activation function;
the low-resolution images of the S different modalities are concatenated to obtain a concatenated low-resolution image I_in of size ηK × ηL × S, which is input into the multi-mode image super-resolution network; after processing by the shallow feature extraction part, a shallow feature map F_init of size ηK × ηL × C is output, where C is the number of channels set by the network;
step 2.2, the feature refining part consists of G dense residual attention modules, one N × N convolutional layer and one M × M convolutional layer;
the G dense residual attention modules are denoted DRAB_1, DRAB_2, ..., DRAB_g, ..., DRAB_G, where DRAB_g is the g-th dense residual attention module;
the g-th dense residual attention module DRAB_g is formed by cascading the g-th dense residual unit and the g-th channel attention mechanism unit;
the g-th dense residual unit consists of Y N × N convolutional layers and one M × M convolutional layer, with the Y N × N convolutional layers densely connected;
the g-th channel attention mechanism unit consists of the g-th global pooling layer P_G, the g-th one-dimensional convolutional layer whose kernel size is adaptively adjusted, and the g-th activation function F_A;
when g = 1, the shallow feature map F_init is used as the input feature of the g-th dense residual unit; inside the unit, the output feature map of each N × N convolutional layer is concatenated with the input shallow feature map F_init and then passed through the M × M convolutional layer to obtain the g-th intermediate feature; the g-th intermediate feature is added to the input feature of the g-th dense residual unit to obtain the output feature F_DRg of the g-th dense residual unit;
the output feature F_DRg of the g-th dense residual unit is used as the input feature of the g-th channel attention mechanism unit, which yields a weight vector L_Ag; the output of the g-th channel attention mechanism unit is then obtained by formula (1) and serves as the output feature F_DRAg of the g-th dense residual attention module DRAB_g:
F_DRAg = L_Ag × F_DRg (1)
when 2 ≤ g ≤ G, the input of the g-th dense residual attention module DRAB_g is the output feature F_DRA(g-1) of the (g-1)-th dense residual attention module DRAB_(g-1); the output features of the G dense residual attention modules are thus obtained, and after concatenation they pass in turn through the N × N convolutional layer and the M × M convolutional layer of the feature refining part, which outputs the intermediate feature map F'_fine of the feature refining part with size ηK × ηL × C; the intermediate feature map F'_fine and the shallow feature map F_init are added through a skip connection to obtain the final feature map F_LR of the low-resolution space with size ηK × ηL × C;
step 2.3, the image reconstruction part comprises an upsampling layer and S image reconstruction branches, wherein the s-th image reconstruction branch comprises: H N × N convolutional layers with activation functions and one N × N convolutional layer without an activation function;
the final feature F_LR of the low-resolution space is input into the image reconstruction part; the high-resolution spatial feature F_HR is obtained through the upsampling layer and, after passing through the s-th image reconstruction branch, the residual map of the s-th modality is output, so that the residual maps of all modalities are obtained from the S image reconstruction branches;
step 2.4, upsampling the low-resolution image set to obtain an interpolated low-resolution image set, in which each element is the interpolated low-resolution image obtained by upsampling the low-resolution image of the s-th modality;
step 2.5, adding the residual map of the s-th modality and the interpolated low-resolution image of the s-th modality to obtain the super-resolution image of the s-th modality, thereby obtaining S super-resolution images of different modalities;
Step 3, training the multi-mode image super-resolution network:
step 3.1, obtaining R groups of reference image sets and R groups of low-resolution image sets corresponding to the R groups of reference image sets according to the process of the step 1;
step 3.2, defining the current iteration number as t and initializing t = 0; defining the maximum number of iterations as T, where Z is the preset maximum number of training epochs of the super-resolution network and X is the number of groups drawn at each iteration;
step 3.3, at the t-th iteration, randomly taking X groups of low-resolution image sets from the R groups, inputting them into the multi-mode image super-resolution network for training, and obtaining the super-resolution image sets output by the t-th training iteration, in which each element denotes the super-resolution image of the s-th modality in the x-th group of multi-modal super-resolution images output by the t-th training; x = 1, 2, ..., X;
correspondingly taking the X matching groups of images from the R groups of reference image sets, and constructing the loss function L_t(θ) of the t-th training iteration as shown in formula (2):
in formula (2), the corresponding term denotes the reference image of the s-th modality in the x-th group of reference images; the loss function constructed by formula (2) is optimized with a back-propagation algorithm, thereby adjusting all parameters of the deep learning network;
step 3.4, after assigning t + 1 to t, judging whether t is greater than T; if so, the finally trained super-resolution network model is obtained; otherwise, returning to step 3.3 and continuing;
and step 3.5, inputting the low-resolution images to be tested into the trained super-resolution network model to obtain the predicted super-resolution images.
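The dense residual attention module of step 2.2 and the channel attention of formula (1) can be sketched as follows (illustrative only, outside the claims; PyTorch, an ECA-style one-dimensional convolution with a fixed kernel size standing in for the adaptively sized kernel, and M = 1, N = 3, the growth rate, and default layer counts are all assumptions):

```python
import torch
import torch.nn as nn

class DRAB(nn.Module):
    """One dense residual attention module: Y densely connected NxN conv layers,
    an MxM fusion conv (1x1 assumed), a local residual connection, then a channel
    attention unit built from global pooling P_G, a one-dimensional convolution
    (fixed kernel here, adaptive in the claim), and a sigmoid activation F_A."""
    def __init__(self, channels=64, growth=32, num_layers=4, eca_kernel=3):
        super().__init__()
        self.dense = nn.ModuleList()
        in_ch = channels
        for _ in range(num_layers):
            self.dense.append(nn.Sequential(
                nn.Conv2d(in_ch, growth, 3, padding=1), nn.ReLU(inplace=True)))
            in_ch += growth                          # dense connection: concatenate outputs
        self.fuse = nn.Conv2d(in_ch, channels, 1)    # MxM fusion convolution
        self.pool = nn.AdaptiveAvgPool2d(1)          # global pooling layer P_G
        self.eca = nn.Conv1d(1, 1, eca_kernel, padding=eca_kernel // 2, bias=False)
        self.gate = nn.Sigmoid()                     # activation function F_A

    def forward(self, x):
        feats = x
        for layer in self.dense:
            feats = torch.cat([feats, layer(feats)], dim=1)
        f_dr = self.fuse(feats) + x                  # dense residual unit output F_DRg
        w = self.pool(f_dr).squeeze(-1).transpose(1, 2)           # (B, 1, C)
        w = self.gate(self.eca(w)).transpose(1, 2).unsqueeze(-1)  # weight vector L_Ag
        return f_dr * w                              # formula (1): F_DRAg = L_Ag x F_DRg
```

Stacking G such modules and adding the skip connection from F_init would give the feature refining part of step 2.2.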
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110870612.6A CN113506222B (en) | 2021-07-30 | 2021-07-30 | Multi-mode image super-resolution method based on convolutional neural network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113506222A true CN113506222A (en) | 2021-10-15 |
CN113506222B CN113506222B (en) | 2024-03-01 |
Family
ID=78014561
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110870612.6A Active CN113506222B (en) | 2021-07-30 | 2021-07-30 | Multi-mode image super-resolution method based on convolutional neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113506222B (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114049408A (en) * | 2021-11-15 | 2022-02-15 | 哈尔滨工业大学(深圳) | Depth network model for accelerating multi-modality MR imaging |
CN114331849A (en) * | 2022-03-15 | 2022-04-12 | 之江实验室 | Cross-mode nuclear magnetic resonance hyper-resolution network and image super-resolution method |
CN114943650A (en) * | 2022-04-14 | 2022-08-26 | 北京东软医疗设备有限公司 | Image deblurring method and device, computer equipment and storage medium |
WO2023109719A1 (en) * | 2021-12-15 | 2023-06-22 | 深圳先进技术研究院 | Terahertz single-pixel super-resolution imaging method and system |
CN117391938A (en) * | 2023-12-13 | 2024-01-12 | 长春理工大学 | Infrared image super-resolution reconstruction method, system, equipment and terminal |
Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20170351935A1 (en) * | 2016-06-01 | 2017-12-07 | Mitsubishi Electric Research Laboratories, Inc | Method and System for Generating Multimodal Digital Images |
CN109325931A (en) * | 2018-08-22 | 2019-02-12 | 中北大学 | Based on the multi-modality images fusion method for generating confrontation network and super-resolution network |
US20190325621A1 (en) * | 2016-06-24 | 2019-10-24 | Rensselaer Polytechnic Institute | Tomographic image reconstruction via machine learning |
CN110415170A (en) * | 2019-06-24 | 2019-11-05 | 武汉大学 | A kind of image super-resolution method based on multiple dimensioned attention convolutional neural networks |
US20200034948A1 (en) * | 2018-07-27 | 2020-01-30 | Washington University | Ml-based methods for pseudo-ct and hr mr image estimation |
CN111192200A (en) * | 2020-01-02 | 2020-05-22 | 南京邮电大学 | Image super-resolution reconstruction method based on fusion attention mechanism residual error network |
AU2020100200A4 (en) * | 2020-02-08 | 2020-06-11 | Huang, Shuying DR | Content-guide Residual Network for Image Super-Resolution |
CN111445390A (en) * | 2020-02-28 | 2020-07-24 | 天津大学 | Wide residual attention-based three-dimensional medical image super-resolution reconstruction method |
CN111899165A (en) * | 2020-06-16 | 2020-11-06 | 厦门大学 | Multi-task image reconstruction convolution network model based on functional module |
CN112200725A (en) * | 2020-10-26 | 2021-01-08 | 深圳大学 | Super-resolution reconstruction method and device, storage medium and electronic equipment |
CN113096017A (en) * | 2021-04-14 | 2021-07-09 | 南京林业大学 | Image super-resolution reconstruction method based on depth coordinate attention network model |
Non-Patent Citations (1)
Title |
---|
雷鹏程;刘丛;唐坚刚;彭敦陆;: "分层特征融合注意力网络图像超分辨率重建", 中国图象图形学报, no. 09 * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||