CN115457359A - PET-MRI image fusion method based on adaptive countermeasure generation network - Google Patents
- Publication number
- CN115457359A (application CN202211094448.5A)
- Authority
- CN
- China
- Prior art keywords
- image
- layer
- fusion
- convolution
- network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/80—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
- G06V10/806—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
- G06T3/4038—Image mosaicing, e.g. composing plane images from plane sub-images
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/50—Image enhancement or restoration using two or more images, e.g. averaging or subtraction
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/56—Extraction of image or video features relating to colour
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10072—Tomographic images
- G06T2207/10088—Magnetic resonance imaging [MRI]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10072—Tomographic images
- G06T2207/10104—Positron emission tomography [PET]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20212—Image combination
- G06T2207/20221—Image fusion; Image merging
Abstract
The invention constructs an adaptive residual dense generative adversarial network combined with a YCbCr-based color space method. The method deepens the generation network with a regional residual learning module and output cascading to avoid feature loss, and an adaptive decision block dynamically guides the generator to produce a fused image with the same distribution as the source images. An adversarial game is played between the gradient map of the fused image and the joint gradient map of the input images, efficiently training the generator and discriminator so as to obtain a fused image with rich details and clear textures. The method is end-to-end and unsupervised: no manual intervention is required, no real data is needed as labels, and images of different resolutions can be fused without introducing a traditional framework. On the Harvard Medical School MRI/PET dataset, the peak signal-to-noise ratio and structural similarity reach 55.2124 and 0.4697 respectively, surpassing the current state-of-the-art algorithms and better supporting clinical diagnosis.
Description
Technical Field
The invention belongs to the technical field of medical imaging, and particularly relates to a PET-MRI image fusion method based on an adaptive generative adversarial network.
Background
Medical images are divided into structural and functional modalities, and the imaging mechanisms of different modalities capture different information about the same region. For example, magnetic resonance imaging (MRI) provides high-resolution structural information about brain soft tissue, while positron emission tomography (PET) reflects, in color, the metabolism and functional condition of tissue. Each modality has its limitations: MRI images lack functional information such as body metabolism, while PET images have low resolution and cannot accurately localize lesions. Because each modality has its own specific characteristics, the limited information in a single-modality image can hardly meet the information requirements of clinical diagnosis and treatment, so images from multiple imaging mechanisms must be fused. In recent years, the clinical success of PET-MRI fusion imaging has drawn great interest to combined non-invasive functional and anatomical imaging.
In the fusion process, the spatial information of the MRI image and the spectral information of the PET image need to be retained, or the spatial information existing in the MRI data needs to be introduced into the PET, so that the limitation of the single-mode medical image is overcome, the imaging quality is improved while the image characteristics are retained, and the clinical applicability of the image in diagnosing and evaluating medical problems is improved.
The most widely applied technique in traditional medical image fusion is pixel-level fusion, which can be divided into spatial-domain and transform-domain methods. The former applies fusion rules directly to pixels; the rules are simple but the fusion effect is poor. For example, in "He C T, Liu Q X, Li H L, et al. Multimodal medical image fusion based on IHS and PCA. Procedia Engineering, 2010, 7: 280-285", the image is converted into intensity, hue, and saturation (IHS) channels, and the IHS transformation causes spectral and spatial distortion. Transform-domain image fusion mostly adopts multi-scale transform (MST) techniques and is divided into three stages: decomposition, fusion, and reconstruction. The source images are first transformed to the frequency domain and fused according to a given rule, and the fused coefficients and the transform basis are then used to reconstruct the image. This approach preserves the detail information of the source images well but neglects spatial consistency, causing distortion of the brightness and color of the fused image. The rules of traditional fusion methods must be designed and selected by hand, and different filter parameters chosen for the MST yield widely varying fusion results. Owing to the diversity of feature extraction and the complexity of fusion rules, manually designing fusion methods becomes difficult, and model robustness is reduced.
With the rise of deep learning in recent years, neural networks have been used to address these problems; existing deep-learning image fusion mostly uses convolutional neural network (CNN) models. Research on deep learning in the image fusion field has gradually become active in recent years, and scholars have successively proposed many fusion methods, gradually forming an important branch. In some approaches, a deep learning framework extracts image features for reconstruction in an end-to-end manner. Typically, "Liu Y, Chen X, Ward R K, et al. Image fusion with convolutional sparse representation. IEEE Signal Processing Letters, 2016, 23(12): 1882-1886" applies convolutional sparse representation (CSR) to image fusion, extracting multi-layer features and generating the fused image from them. IFCNN adds convolutional neural networks to a transform-domain image fusion algorithm (Zhang Y, Liu Y, Sun P, et al. IFCNN: A general image fusion framework based on convolutional neural network. Information Fusion, 2020, 54: 99-118). "Yousif A S, Omar Z, Sheikh U U. An improved approach for medical image fusion using sparse representation and Siamese convolutional neural network. Biomedical Signal Processing and Control, 2022, 72: 103357" proposes a medical image fusion method based on sparse representation and a twin convolutional neural network. "Hou R, Zhou D, Nie R, et al. Brain CT and MRI medical image fusion using convolutional neural networks and a dual-channel spiking cortical model. Medical & Biological Engineering & Computing, 2019, 57(4): 887-900" adds a deep learning technique to a conventional image fusion scheme, applying a CNN to the high-frequency coefficient fusion. DenseFuse comprises convolutional layers and a fusion layer with dense blocks: an encoder provides input to the network, and after the network obtains the feature maps, a decoder reconstructs the fused image (Li H, Wu X J. DenseFuse: A fusion approach to infrared and visible images. IEEE Transactions on Image Processing, 2018, 28(5): 2614-2623). GCF is an unsupervised multi-focus image fusion model based on gradients and connected regions. "Chen M, Zheng H, Lu C, et al. A spatial-temporal fusion method for segmentation in DCE-MRI. International Conference on Neural Information Processing. Springer, Cham, 2018: 358-368" extracts features by combining CNN and RNN and then fuses them for segmentation. Another line of work introduced the generative adversarial network into infrared and visible image fusion for the first time, where the generator aims to produce a fused image dominated by infrared information with a small amount of visible information, and the discriminator forces the fused image to carry more of the detail present in the visible image. DDcGAN constructs a dual-discriminator generative adversarial network (Ma J, Xu H, Jiang J, et al. DDcGAN: A dual-discriminator conditional generative adversarial network for multi-resolution image fusion. IEEE Transactions on Image Processing, 2020, 29: 4980-4995). "Tang W, Liu Y, Zhang C, et al. Green fluorescent protein and phase-contrast image fusion via generative adversarial networks. Computational and Mathematical Methods in Medicine, 2019: 5450373" proposes fusing biological images with generative adversarial networks. PMGI extracts information using image gradient and intensity and reuses features on the same path (Zhang H, Xu H, Xiao Y, et al. Rethinking the image fusion: A fast unified image fusion network based on proportional maintenance of gradient and intensity. Proceedings of the AAAI Conference on Artificial Intelligence, 2020, 34(7): 12797-12804).
Deep-learning-based research has thus become an active topic in the image fusion field in recent years, and many deep-learning fusion methods have been proposed in succession, gradually forming an important branch. Although these methods have achieved good results, most fusion rules are still designed by hand, so the methods as a whole cannot escape the limitations of traditional fusion. The biggest obstacle to image fusion with deep learning is the lack of real label data: for the MRI-PET fusion task, a ground-truth fused image is difficult to obtain directly.
Thus, while these prior efforts have been successful, some disadvantages remain: (1) the deep learning framework is used only to compensate for certain shortcomings of traditional fusion methods, such as feature extraction, while the design of the overall method is still based on the traditional approach, and a traditional fusion framework requiring complex hand-crafted fusion rules cannot produce results end-to-end; (2) because label data is lacking, solutions that rely solely on loss function design are incomplete; since the limitations of the physical imaging process make a real fused image unobtainable as a label, existing deep learning methods rely heavily on artificial priors and hand-made pseudo labels, which greatly limits algorithm performance; (3) solutions based on the traditional generative adversarial network can only make the result resemble a source image, i.e., the generation network is trained only with a pixel-level L1 loss, and owing to the Nash equilibrium, part of the high-frequency detail contained in the source images is lost.
Disclosure of Invention
To avoid the loss of spatial information during image fusion, protect the spatial texture structure of the MRI and PET images, and thereby preserve the texture and detail information of the high-resolution image and the structural information of the low-resolution image simultaneously, a PET-MRI image fusion method based on an adaptive generative adversarial network is provided, comprising the following steps:
a) Mapping the PET image from RGB space to YCbCr space and extracting the Y component;
b) Inputting the Y component of the PET image and the MRI image into a generation network;
c) Respectively extracting the joint gradient map of the input images and the gradient map of the generation network's output using the Laplacian operator;
d) Inputting the two gradient maps into the discrimination network, with the label of the real input set to a probability of 0.7-1.2 (soft label) and the label of the generated result set to 0-0.3 (soft label);
e) Training a generation network and a discrimination network based on the confrontation generation strategy;
f) Optimizing by adopting an Adam optimizer;
g) Obtaining a trained generation network model;
h) And predicting by using the network model.
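The soft-label scheme of step d) can be sketched as follows — a minimal NumPy illustration; the function name and the use of a seeded random generator are illustrative choices, not taken from the patent:

```python
import numpy as np

def soft_labels(batch_size, real, rng=None):
    """Soft discriminator targets: real inputs (the joint gradient map) get
    labels drawn from [0.7, 1.2]; generated inputs (the fused image's
    gradient map) get labels drawn from [0.0, 0.3]."""
    if rng is None:
        rng = np.random.default_rng(0)
    if real:
        return rng.uniform(0.7, 1.2, size=batch_size)
    return rng.uniform(0.0, 0.3, size=batch_size)
```

Soft labels like these are a common trick to keep the discriminator from becoming overconfident early in adversarial training.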
In connection with step a), a decorrelated YCbCr color model is used, which divides the image information into three channels: the Y channel, the Cb channel, and the Cr channel, representing the luminance component of the color and the blue and red chrominance-offset components, respectively. The Y channel stores the luminance information of the image, while the Cb and Cr channels store its blue-difference and red-difference color information. Therefore, in each image fusion iteration, only the Y-channel component of the PET image and the MRI image need to be processed, and both are grayscale images. The forward and inverse transformation equations are shown in formulas (1) and (2), respectively:
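The exact transform matrices of formulas (1) and (2) are not reproduced in this text, so the sketch below uses the standard full-range BT.601 (JPEG) YCbCr coefficients as a stand-in assumption:

```python
import numpy as np

# Assumed full-range BT.601 coefficients; the patent's own formulas (1)/(2)
# may differ in scaling or offset.
def rgb_to_ycbcr(rgb):
    """Map an RGB image (H, W, 3), float in [0, 1], to YCbCr channels."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y  =  0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.168736 * r - 0.331264 * g + 0.5 * b + 0.5
    cr =  0.5 * r - 0.418688 * g - 0.081312 * b + 0.5
    return np.stack([y, cb, cr], axis=-1)

def ycbcr_to_rgb(ycbcr):
    """Inverse transform from YCbCr back to RGB."""
    y = ycbcr[..., 0]
    cb = ycbcr[..., 1] - 0.5
    cr = ycbcr[..., 2] - 0.5
    r = y + 1.402 * cr
    g = y - 0.344136 * cb - 0.714136 * cr
    b = y + 1.772 * cb
    return np.stack([r, g, b], axis=-1)
```

After fusion, the fused grayscale result replaces the Y channel and the PET image's original Cb/Cr channels are inverse-transformed with it to recover a color image.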
the framework of the adaptive countermeasure generation network mainly comprises a generator, a discriminator and a regional residual error learning module, wherein the network can fuse a low-resolution Y component (PET _ Y) of a PET image with a grayscale image MRI with higher spatial resolution to obtain a fused image comprising abundant structural information and higher spatial resolution; in order to simultaneously store the texture and detail information of the high-resolution image and the structure information of the low-resolution image, a mechanism for adjusting a loss function is used for optimizing a prediction result; the general architecture of the adaptive countermeasure generation network is shown in fig. 1; wherein, the adaptation of the network is derived from a decision block shown in a preprocessing stage before the input of the discriminator network in fig. 1, and the decision block can also be referred to as a maximum matching algorithm shown in formula (3);
the maximum matching algorithm process is as follows: inputting a PET _ Y component Y and an MRI image M, extracting a Laplace gradient image of the PET _ Y component Y and the MRI image M through a Laplace operator, comparing pixel values pixel by pixel, taking the pixel value in each pixel of the two images as a gradient pixel after fusion, and finally calculating to obtain a combined gradient image; the decision block can guide the fusion result to approach the brightness and gradient distribution of the source image, and the principle is to evaluate the definition of each pixel so as to generate a screening image with effective information positions.
The structure of the generator is shown in FIG. 2. The generator is a two-branch fusion network that processes the Y component of the PET image and the MRI image along two separate paths. Each branch first uses a group of 3×3 convolutional layers for feature extraction, then deepens the network for feature processing, and finally uses a group of 1×1 convolutional layers for reconstruction. The first convolutional layer extracts shallow features by equation (4):

where H_conv denotes the convolution operation, with a 5×5 kernel, in the shallow feature extraction layer. The second-layer output can be obtained by equation (5):

where H_LRLP is a composite function of the LRLP layer operations, and F_pre makes full use of the convolution of each layer in the block to generate the local features. The third-layer output is shown in equation (6):

where H_RL denotes a residual connection and λ is the weight used when fusing residuals. The fourth layer follows the same principle as the third; its input is the cascade of the outputs of the first three layers. The fourth-layer output is shown in equation (7):

The output features of each layer are then concatenated and fused using a 3×3 convolution, with output given by equation (8):

where H_concat denotes the feature-map concatenation operation. After concatenating the feature maps, the last layer of the extraction module is a 1×1 convolution, and W is the weight matrix of the first four layers of the fusion extraction module. The outputs of the two paths, F_ext,1 and F_ext,2, then enter the fusion module; after the fusion operation, the final fused image is obtained, as shown in formula (9):

where H_fuse denotes the composite operation of the fusion module.
The structure of the discriminator is shown in FIG. 3. There are two input sources: the two input images are passed through the Laplacian operator to compute gradient maps, and then the joint gradient map obtained through the maximum function, together with the gradient map of the fused image computed with the Laplacian operator, serve as the two inputs of the discriminator. Four convolutional layers and one linear layer form the discriminator of the model; the convolution kernel sizes are all set to 3×3, the stride is set to 4, and ELU is used as the activation function. The last layer is a linear layer used to compute a probability for judging whether the generated data is real.
Region residual learning module (LRLP):
in the forward transmission process of the convolutional neural network, with the increase of the network depth, the information contained in the feature map obtained by convolution is gradually reduced, and in order to solve the problems, the method uses a regional residual error learning module, and the features contained in each layer are stored as much as possible through the direct mapping of information among different layers; the LRLP module is as shown in FIG. 4, and firstly obtains image features of different depths by connecting different convolutions c times in series, then performs weight splicing on the features of different depths after convolution, then performs compression reconstruction by using a 1 × 1 convolution layer, and finally performs activation by using an ELU (element-free unit), so that the features contained in each layer are saved as much as possible; if there are c convolutional layers, then its final output, as shown in equation (10):
where F_c is the output of the c-th convolutional layer; H_concat denotes the feature-map concatenation function; W denotes a joint function of each convolutional layer's weights during concatenation; and H_active applies ELU activation to the concatenated data. In the LRLP block, the output of each previous layer is used as the input of the next layer, which not only preserves the feed-forward character but also improves the utilization of the input data.
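A minimal PyTorch sketch of the LRLP idea — serial convolutions, concatenation of all intermediate feature maps, 1×1 compression, ELU activation. The channel count and the number of serial convolutions c are arbitrary assumptions; the patent does not fix them in this passage:

```python
import torch
import torch.nn as nn

class LRLP(nn.Module):
    """Regional residual learning sketch: c serial 3x3 convolutions whose
    intermediate outputs are all kept, concatenated, compressed by a 1x1
    convolution, and passed through ELU."""
    def __init__(self, channels=32, c=3):
        super().__init__()
        self.convs = nn.ModuleList(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1)
            for _ in range(c))
        # 1x1 convolution compresses the concatenated maps back to `channels`
        self.compress = nn.Conv2d(channels * (c + 1), channels, kernel_size=1)
        self.act = nn.ELU()

    def forward(self, x):
        feats = [x]
        for conv in self.convs:
            # each layer's output feeds the next layer (feed-forward kept)
            feats.append(conv(feats[-1]))
        return self.act(self.compress(torch.cat(feats, dim=1)))
```

Because the input and every intermediate map reach the 1×1 layer directly, shallow features are not lost as depth grows, which is the stated purpose of the module.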
In step e), the discriminator defines the joint gradient map as real data and carries out continual adversarial learning against the gradient map of the fused image, which is defined as fake data; the objective function of the GAN is defined as shown in formula (11):
in implementing antagonism learning in GAN, a group of distinguishable semi-supervised loss functions are designed, which are different from the fixed loss functions of the traditional deep learning.
Generator loss function
The loss function of the generator is built from the adversarial loss, the pixel-level Euclidean loss, and the texture loss, as shown in equation (12):

where the first term is the adversarial loss from the generator-discriminator network; the second term is the pixel-level Euclidean loss optimized using the screening maps; the third term represents the texture loss based on the gradient map; and the weights on the pixel-level loss and the texture loss ensure that the three loss terms have the same importance.
loss of antagonism
For the image generated by the generator to be closer to the ideal fused image, a loss must be established between the generator and the discriminator. The traditional adversarial loss reduces to a min-max problem, but at the beginning of the training phase it may saturate, so a maximization formulation is used to train the generator network. To provide a stronger gradient, a squaring operation is added on top of the maximization operation; the adversarial loss is defined as shown in formula (13):
where M is the number of images in a batch during training; c is the label with which the discriminator distinguishes true from false images; the gradient symbol denotes computing a gradient map with the Laplacian operator; and M, Y denote the input MRI image and the Y channel of the PET image.
Pixel-level Euclidean loss
The invention uses the Euclidean distance between the pixels of the fused image and of the source images to constrain their intensity distributions in the sharp regions; the pixel-level Euclidean loss can be formulated as shown in formula (14):
where h, w index the pixel value at row h and column w; H, W are the height and width of the image, respectively; and Map_1, Map_2 denote the screening maps generated by the decision block from the two input images.
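The masked pixel-level loss of formula (14) can be sketched as below. The exact expression appears only as an image in the patent, so this form — an L2 distance between the fused image and each source, restricted by that source's screening map — is an assumption, as is the pairing of Map_1/Map_2 with the sources:

```python
import numpy as np

def pixel_euclidean_loss(fused, src1, src2, map1, map2):
    """Mean squared distance between the fused image and each source,
    weighted by that source's screening map (1 where the source is the
    sharper one at that pixel, 0 elsewhere)."""
    err1 = map1 * (fused - src1) ** 2
    err2 = map2 * (fused - src2) ** 2
    return float(np.mean(err1 + err2))
```

With this masking, each pixel of the fused image is pulled only toward the source that the decision block judged sharp at that location.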
texture loss
The gradient of an image partially characterizes its texture details, especially for sharp MRI images; the fused image is therefore required to have gradients similar to the input images. Combined with the screening maps, the texture loss can be formulated as shown in equation (15):
discriminator loss function
The invention designs a gradient-map-based loss function for the discriminator, in which the "false data" is the gradient map of the fused image, formulated as shown in equation (16):
the "true data" required by the discriminator comes from the joint gradient map of the MRI and PET _ y constructs, formulated as shown in equation (17):
where abs denotes the absolute-value function and maximum denotes the maximization function; based on the above two gradient maps, the loss function is expressed as shown in equation (18):
where a is the label of the "false data", set to 0, and b is the label of the "true data", set to 1, so that the discriminator treats the joint gradient map of the images as true data and the gradient map of the fused image as false data. This constraint guides the generator to adjust Grad_fused toward Grad_union, enhancing the texture of the fused image through the adversarial process.
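The discriminator objective described around formula (18) can be sketched as a least-squares loss. Since the formula itself appears only as an image, the exact expression here is an assumption; the squared form mirrors the squaring added to the adversarial loss in formula (13):

```python
import numpy as np

def discriminator_loss(d_fused_scores, d_joint_scores, a=0.0, b=1.0):
    """Push the discriminator's scores on the fused image's gradient map
    toward the false label a = 0, and its scores on the joint gradient map
    toward the true label b = 1."""
    return float(np.mean((d_fused_scores - a) ** 2)
                 + np.mean((d_joint_scores - b) ** 2))
```

A perfect discriminator (scoring fused gradients 0 and joint gradients 1) incurs zero loss; a fully fooled one (scoring both 0.5) incurs 0.5.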
The invention has the following beneficial effects:
by constructing an adaptive residual dense generative adversarial network and combining it with a YCbCr-based color space method, the invention provides a novel image fusion method. The method allows the generation network to avoid gradient vanishing and gradient explosion and improves the feature-extraction performance of the network; an adversarial game is played between the gradient map of the fused image and the joint gradient map of the input images, and the designed adversarial loss, discriminator loss, pixel-level consistency loss and gradient consistency loss are combined to obtain a fused image with rich details and clear texture. No ground-truth data is required as labels for training, and images of different resolutions can be fused without introducing a traditional framework, which greatly simplifies the fusion-rule design of traditional methods and achieves adaptive fusion without manual intervention; the fused image contains more high-frequency details and largely retains the pseudo-color content information. In tests on the Harvard Medical School MRI/PET dataset, the method achieves a peak signal-to-noise ratio PSNR = 55.2124, structural similarity SSIM = 0.4697, RMSE = 0.1968, Q abf = 0.3635 and Q cv = 2009.348, outperforming the current state-of-the-art algorithms and better assisting clinical diagnosis.
Drawings
FIG. 1 is a schematic diagram of the adaptive residual dense adversarial generation network;
FIG. 2 is a schematic diagram of a generator network;
FIG. 3 is a schematic diagram of a network of discriminators;
FIG. 4 is a schematic diagram of a region residual learning module (LRLP);
fig. 5 is a schematic diagram of the qualitative comparison of the proposed method with other internationally leading methods.
Detailed Description
The invention is further illustrated by the following specific examples.
Example 1
The PET and MRI images used in this example are from the public dataset on the Harvard Medical School website; the MRI images are single-channel images of size 256 × 256, and the PET images are pseudo-color images of size 256 × 256 × 3;
the generator and discriminator are trained iteratively according to an adversarial process, with the batch size set to b, k steps per training iteration, M training iterations, and the ratio of discriminator updates to generator updates set to p; through repeated tests the following settings were obtained: b = 32, p = 2, M = 300, and the parameters of the ADRGAN are updated with the Adam optimizer; to make GAN training more stable, soft labels are used for the loss terms: labels that would normally be set to 1 are set to a random number between 0.7 and 1.2;
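The alternating update schedule described above (p discriminator steps per generator step, k steps per iteration, M iterations) can be sketched as follows; `train_discriminator` and `train_generator` are hypothetical stand-ins for the actual update functions, and k = 100 is an assumed value:

```python
def adversarial_training(train_discriminator, train_generator, M=300, k=100, p=2):
    """Alternate p discriminator updates with one generator update,
    k steps per iteration, for M iterations."""
    for iteration in range(M):
        for step in range(k):
            for _ in range(p):
                train_discriminator()  # discriminator updated p times as often
            train_generator()
```

With p = 2 the discriminator sees exactly twice as many updates as the generator, the ratio the embodiment reports as most stable.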
the images are preprocessed from RGB channels into the YCbCr color space; since the Y channel (luminance channel) represents structural details and brightness variations, only the Y channel needs to be fused; the Cb and Cr channels are fused using a color-space-based approach, and the fused components are then inverse-transformed back into RGB channels; the experimental environment of this example is: Windows 10, CPU AMD R5 5600X, 16 GB memory, GPU RTX 3060 (6G); the software environment is Python 3.7.6 and PyTorch 1.10.0; the training, validation and test sets of the dataset are divided in the ratio 2:1, and the specific training process is shown as Algorithm 1:
quantitative evaluation index
The proposed method and the comparison methods are objectively evaluated using five evaluation indexes: Q abf, Q cv, PSNR, SSIM and RMSE. Q abf uses local metrics to estimate how much of the important input information is preserved in the fused image; the higher the value, the better the quality of the fused image, as shown in equation (19):
wherein W is used to divide local regions; λ(w) represents the local-region weight; A, B and F are the two input images and the fused image, respectively;
Q cv obtains the quality of each local-region image by computing the mean square error of the weighted difference image between the fused region image and the source region image; the final fused-image quality is the weighted sum of the local-region quality measures, as shown in equation (20):
wherein D is the local-region similarity measure function;
the peak signal-to-noise ratio (PSNR) is the ratio of peak power to noise power in the fused image and reflects the distortion of the fused image; it is calculated according to equations (21) to (24):
wherein MSE represents the mean square error, computed over the pixel at row i and column j of the image; R represents the peak value of the fused image; the larger the peak signal-to-noise ratio, the closer the fused image is to the source images;
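Equations (21)–(26) are omitted from the text; the standard MSE/RMSE/PSNR definitions they presumably follow can be sketched with NumPy (R = 255 is assumed as the peak value):

```python
import numpy as np

def mse(src, fused):
    # mean square error over all pixels (i, j)
    return np.mean((src.astype(np.float64) - fused.astype(np.float64)) ** 2)

def rmse(src, fused):
    # root mean square error, as in equation (26)
    return np.sqrt(mse(src, fused))

def psnr(src, fused, peak=255.0):
    # peak signal-to-noise ratio in dB; larger means less distortion
    return 10.0 * np.log10(peak ** 2 / mse(src, fused))
```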
the structural similarity (SSIM) models the loss and distortion of images; the index consists of three parts: correlation loss, contrast loss and luminance loss; the product of these three components is the evaluation result of the fused image, defined as follows:
wherein x and f represent a block in the source image and in the fused image, respectively; σ_xf is the covariance between the two blocks; σ_x and σ_f are the standard deviations (SD); u_x and u_f represent the means of the two blocks; in addition, constants C1, C2 and C3 are added to make the measure more stable;
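A single-window SSIM consistent with the three-component description can be sketched as follows (the usual stabilizing constants C1 = (k1·R)², C2 = (k2·R)² with C3 = C2/2 folded in are an assumption; the patent's exact equation is not reproduced):

```python
import numpy as np

def ssim_global(x, f, peak=255.0, k1=0.01, k2=0.03):
    """Single-window SSIM between source block x and fused block f."""
    c1, c2 = (k1 * peak) ** 2, (k2 * peak) ** 2
    ux, uf = x.mean(), f.mean()
    vx, vf = x.var(), f.var()
    cov = ((x - ux) * (f - uf)).mean()
    # luminance * contrast * structure, with C3 = C2/2 folded into one factor
    return ((2 * ux * uf + c1) * (2 * cov + c2)) / ((ux ** 2 + uf ** 2 + c1) * (vx + vf + c2))
```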
the root mean square error (RMSE) is based on the MSE; the difference between the source image and the fused image is quantified by computing the mean square error between the source image and the result image, as shown in formula (26):
quantitative and qualitative comparison results
In order to verify the effect of the model on PET-MRI image fusion and to verify its robustness, five methods (DDcGAN, DenseFuse, GCF, IFCNN and PMGI) are selected for comparison with the present model; these methods all perform well in traditional medical image fusion;
the visual comparison of the six methods is shown in fig. 5. The result of DDcGAN (third column) suffers from spectral distortion, and its edges are blurred compared with the present model; DenseFuse (fourth column) loses the color intensity of the PET image, losing part of the functional information and making lesions harder to locate; GCF (fifth column) preserves color well, but several of its images contain large noise blocks that directly destroy structural information; such information loss can mislead clinical judgment, indicating poor robustness; IFCNN (sixth column) blurs and loses details near boundaries and is not clear enough where textures are dense; PMGI (seventh column) preserves color intensity and functional information completely, but its background is blurred, high-frequency information is lost and texture details are absent. The fused image of the present method exhibits none of these problems: structural and functional information is well preserved, details are clear with high contrast, especially at edges and in dense-texture regions, and the image contains sufficient information for clinical diagnosis. Since most of these methods attempt to sharpen edges by direct object enhancement and gradients, their fused images still differ considerably in naturalness and realism; in addition, almost all of these methods rely on large datasets, whereas the structural content loss function and the adversarial loss function proposed by the invention protect the high-frequency information and the content information respectively, improving the fusion effect through their respective nonlinear loss constraints.
Qualitative evaluation of the fusion effect relies solely on subjective human visual perception and is therefore limited; in order to objectively verify the superiority of the invention, an objective evaluation mode is selected to quantitatively evaluate the experimental results, as shown in Table 1:
Experimental method | PSNR↑ | SSIM↑ | RMSE↓ | Q abf ↑ | Q cv ↓ |
---|---|---|---|---|---|
DDcGAN | 54.8162 | 0.3000 | 0.2146 | 0.1602 | 2534.607 |
DenseFuse | 55.1830 | 0.3628 | 0.1986 | 0.1368 | 2242.367 |
GCF | 54.4163 | 0.3347 | 0.2367 | 0.3401 | 2521.672 |
IFCNN | 54.4163 | 0.4160 | 0.2083 | 0.3516 | 2226.219 |
PMGI | 54.0151 | 0.1022 | 0.2581 | 0.0460 | 3469.525 |
OURS | 55.2124 | 0.4697 | 0.1968 | 0.3635 | 2009.348 |
It can be seen that each of the five indexes of the proposed method is superior to those of the other five comparison methods. The Q cv index is based on the human visual system (HVS) and regional mean square error; benefiting from the adaptive module, the model adaptively weights pixels, improving regional similarity, and Q cv is reduced by 20.7% compared with DDcGAN, proving that the result agrees better with human perception and has higher regional similarity than the other methods. The adversarial game gives the model excellent denoising ability, which improves the PSNR; compared with the other methods, the proposed method introduces less noise and interference information. The SSIM index mainly verifies structural information and is increased by 11.4% compared with the best competitor, IFCNN; the higher value indicates that the texture structure is preserved completely with few blurred regions, whereas the structural similarity of PMGI is only 21% of that of the proposed method, indicating poor structure preservation. Consistent with the qualitative comparison, the pixel-scale control strategy keeps the pixel-level Euclidean distance well controlled, and the pixel-level fusion index Q abf shows that the visual information is well preserved and the pixel-level difference between the fused image and the source images is small; the smaller RMSE indicates that the fused image of the invention has less error and distortion.
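The relative improvements quoted above can be checked directly against Table 1; the denominators that reproduce the quoted percentages (DDcGAN's score for the Q cv reduction, the proposed method's score for the SSIM figures) are shown in the comments:

```python
ours_qcv, ddcgan_qcv = 2009.348, 2534.607
qcv_reduction = (ddcgan_qcv - ours_qcv) / ddcgan_qcv   # ~0.207, i.e. the quoted 20.7%

ours_ssim, ifcnn_ssim, pmgi_ssim = 0.4697, 0.4160, 0.1022
ssim_gain = (ours_ssim - ifcnn_ssim) / ours_ssim       # ~0.114, i.e. the quoted 11.4%
pmgi_ratio = pmgi_ssim / ours_ssim                      # ~0.218, the quoted "about 21%"
```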
The foregoing detailed description is intended to illustrate rather than limit the invention; any changes and modifications that fall within the true spirit and scope of the invention are intended to be covered by the appended claims.
Claims (1)
1. A PET-MRI image fusion method based on an adaptive countermeasure generation network is characterized by comprising the following steps:
a) Mapping the PET image from RGB space to YCbCr space and extracting the Y component; the transformation equation is shown in equation (1), and the inverse transformation equation is shown in equation (2):
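Equations (1)–(2) are not reproduced in the text; a standard full-range BT.601 RGB/YCbCr conversion, which transformations of this kind typically follow (the exact coefficients here are an assumption), can be sketched as:

```python
import numpy as np

# full-range BT.601 coefficients (assumed; the patent's exact
# equations (1)-(2) are not reproduced in the text)
_FWD = np.array([[ 0.299,     0.587,     0.114   ],
                 [-0.168736, -0.331264,  0.5     ],
                 [ 0.5,      -0.418688, -0.081312]])

def rgb_to_ycbcr(rgb):
    """rgb: (..., 3) float array in [0, 255]; returns stacked Y, Cb, Cr."""
    ycbcr = rgb @ _FWD.T
    ycbcr[..., 1:] += 128.0          # center the chroma channels
    return ycbcr

def ycbcr_to_rgb(ycbcr):
    """Inverse transformation back to RGB channels."""
    shifted = ycbcr.copy()
    shifted[..., 1:] -= 128.0
    return shifted @ np.linalg.inv(_FWD).T
```

Only the Y plane produced by `rgb_to_ycbcr` is fed to the generator; Cb and Cr are fused separately and the result is passed through `ycbcr_to_rgb`.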
b) Inputting the Y component of the PET image and the MRI image into the generator; the generator is a dual-branch fusion network whose inputs are the Y component of the PET image and the MRI grayscale image, respectively; each branch of the dual-branch framework first uses a group of 3 × 3 convolutional layers to extract features, then deepens the network for feature processing, and finally uses a group of 1 × 1 convolutional layers for reconstruction; during feature processing, the deepened network adopts the region residual learning module LRLP, which obtains different image features through c different convolutions on different network branches, performs weighted concatenation of the convolved features, and finally applies ELU activation, so that the features contained in each layer are preserved as much as possible; this process is shown in formula (3):
wherein F_c is the output of the c-th convolutional layer; H_concat represents the concatenation function of the feature maps; W is a set of joint functions representing the weight of each convolutional layer during concatenation; H_active performs ELU activation on the concatenated data. In an LRLP block, the output of each preceding layer serves as the input of the next layer, preserving the feed-forward property and improving the utilization of the input data; the feature-extraction stage is then entered, and the shallow features are extracted by the first convolutional layer through formula (4):
wherein H_conv is a convolution operation with a 5 × 5 kernel in the shallow feature-extraction layer; the extracted shallow features are then sent to the next layer, and the second-layer output is obtained by equation (5):
wherein H_LRLP is the composite function of the LRLP-layer operations; F_pre fully utilizes the convolutions generated by each layer in the block and treats them as local features, after which further convolution achieves deep feature extraction; in the subsequent convolutional layers, the input is the cascade of all previous layers and the output of the LRLP module; meanwhile, parameter sharing is set between the two branches; the third-layer output is shown in equation (6):
wherein H_RL represents a residual connection and λ is the weight used when fusing the residuals; the fourth layer is the same as the third, its input being the cascade of the outputs of the first three layers; its output is shown in formula (7):
the output features of each layer are then concatenated and fused using a 3 × 3 convolution; the output is shown in formula (8):
wherein H_concat represents the concatenation operation of the feature maps; after concatenation, the last layer of the extraction module is set as a 1 × 1 convolution, and W is the weight matrix of the first four layers of the fusion-extraction module; the two branch outputs F_ext,1 and F_ext,2 then enter the fusion module, and the fused image is obtained after the fusion operation, as shown in formula (9):
wherein H_fuse represents the composite operation of the fusion module;
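The LRLP pattern of formula (3), namely c parallel convolutions, weighted concatenation, then ELU activation, can be sketched for a single-channel feature map (treating H_concat as stacking and W as per-map mixing weights, both assumptions about the exact operators):

```python
import numpy as np
from scipy.ndimage import convolve

def elu(x, alpha=1.0):
    # ELU activation: identity for x > 0, alpha*(e^x - 1) otherwise
    return np.where(x > 0, x, alpha * np.expm1(x))

def lrlp_block(x, kernels, mix_weights):
    """Sketch of formula (3): F = H_active(W * H_concat(conv_1(x), ..., conv_c(x)))."""
    feats = [convolve(x, k, mode='nearest') for k in kernels]       # c parallel convolutions
    stacked = np.stack(feats, axis=0)                               # H_concat
    mixed = np.tensordot(np.asarray(mix_weights), stacked, axes=1)  # weighted splice W
    return elu(mixed)                                               # H_active
```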
c) Using the Laplacian operator, respectively extracting the gradient maps of the fused image obtained in step b) and of the actually input Y component of the PET image and the MRI image from step a), and processing the latter two gradient maps with the decision block; the processing result of the decision block is shown in formula (10);
wherein abs represents the absolute-value function and maximum represents the element-wise maximization function; the decision-block processing procedure comprises: inputting the PET_Y component Y and the MRI image M, extracting their Laplacian gradient maps with the Laplacian operator, comparing the pixel values pixel by pixel, taking the larger of the two values at each pixel as the fused gradient pixel, and finally obtaining the joint gradient map;
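The decision block of formula (10) can be sketched directly, using SciPy's Laplacian as the gradient operator described above:

```python
import numpy as np
from scipy.ndimage import laplace

def joint_gradient_map(mri, pet_y):
    """Formula (10): per-pixel maximum of the absolute Laplacian gradients."""
    grad_m = np.abs(laplace(mri.astype(np.float64)))    # abs of Laplacian of M
    grad_y = np.abs(laplace(pet_y.astype(np.float64)))  # abs of Laplacian of Y
    return np.maximum(grad_m, grad_y)                   # element-wise maximum
```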
d) Inputting the gradient map of the fused image extracted in step c) and the computed joint gradient map into the discriminator, with the label probability of real inputs set to 0.7-1.2 and that of generated results set to 0-0.3; the discriminator consists of four convolutional layers and a linear layer; the convolution kernels are all set to 3 × 3 with stride 4, ELU is used as the activation function, and the last layer is a linear layer that computes the probability used to judge whether the generated data is real;
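With 3 × 3 kernels and stride 4 (zero padding assumed, as the text does not state it), the four convolutional layers reduce a 256 × 256 gradient map to a single spatial position before the linear layer; this can be checked with the usual output-size formula:

```python
def conv_out(n, kernel=3, stride=4, padding=0):
    # standard convolution output size: floor((n + 2p - k) / s) + 1
    return (n + 2 * padding - kernel) // stride + 1

size = 256
for layer in range(4):
    size = conv_out(size)  # 256 -> 64 -> 16 -> 4 -> 1
```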
e) Training the generation network and the discrimination network based on the adversarial generation strategy; the discriminator defines the joint gradient map as real data and performs continuous adversarial learning against the gradient map of the fused image, which is defined as fake data; the objective function of the GAN is defined as shown in formula (11):
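Formula (11) is not reproduced; the standard GAN objective it presumably instantiates, with the joint gradient map as real data and the fused image's gradient map as fake data, is:

```latex
\min_G \max_D \; \mathbb{E}\big[\log D(Grad_{union})\big]
             + \mathbb{E}\big[\log\big(1 - D(Grad_{fused})\big)\big]
```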
the loss function of the generator is shown in equation (12):
wherein L_adv is the adversarial loss from the generator-discriminator network; λ_1 and λ_2 are the weights of the pixel-level loss and the texture loss, ensuring that the three loss terms carry comparable importance;
the loss function of the discriminator is shown in equation (16):
wherein a is the label of the "fake data", set to 0-0.3; the "fake data" is the gradient map of the fused image, formulated as equation (17):
b is the label of the "real data", set to 0.7-1.2; the "real data" comes from the joint gradient map of the MRI image and PET_y constructed by the decision block, formulated as equation (10);
f) Optimizing by adopting an Adam optimizer;
g) Obtaining the trained adversarial generation network model;
h) Predicting with the trained network model.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211094448.5A CN115457359A (en) | 2022-09-08 | 2022-09-08 | PET-MRI image fusion method based on adaptive countermeasure generation network |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115457359A true CN115457359A (en) | 2022-12-09 |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116630762A (en) * | 2023-06-25 | 2023-08-22 | 山东卓业医疗科技有限公司 | Multi-mode medical image fusion method based on deep learning |
CN116630762B (en) * | 2023-06-25 | 2023-12-22 | 山东卓业医疗科技有限公司 | Multi-mode medical image fusion method based on deep learning |
CN116862789A (en) * | 2023-06-29 | 2023-10-10 | 广州沙艾生物科技有限公司 | PET-MR image correction method |
CN116862789B (en) * | 2023-06-29 | 2024-04-23 | 广州沙艾生物科技有限公司 | PET-MR image correction method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||