CN115457359A - PET-MRI image fusion method based on an adaptive generative adversarial network


Info

Publication number
CN115457359A
CN115457359A (application CN202211094448.5A)
Authority
CN
China
Prior art keywords
image
layer
fusion
convolution
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211094448.5A
Other languages
Chinese (zh)
Inventor
刘尚旺
杨荔涵
刘国奇
申华磊
张新明
张非
李文凤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Henan Normal University
Original Assignee
Henan Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Henan Normal University filed Critical Henan Normal University
Priority to CN202211094448.5A
Publication of CN115457359A
Legal status: Pending

Classifications

    • G06V 10/806 - Fusion, i.e. combining data from various sources at the sensor, preprocessing, feature extraction or classification level, of extracted features
    • G06N 3/02, G06N 3/08 - Computing arrangements based on biological models; neural networks; learning methods
    • G06T 3/4038 - Scaling of whole images or parts thereof; image mosaicing, e.g. composing plane images from plane sub-images
    • G06T 5/50 - Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • G06V 10/56 - Extraction of image or video features relating to colour
    • G06T 2207/10088 - Image acquisition modality: magnetic resonance imaging [MRI]
    • G06T 2207/10104 - Image acquisition modality: positron emission tomography [PET]
    • G06T 2207/20221 - Image combination: image fusion; image merging

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Multimedia (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Image Analysis (AREA)

Abstract

The invention constructs an adaptive dense-residual generative adversarial network combined with a YCbCr-based colour-space method. A regional residual learning module and output cascading deepen the generation network and avoid feature loss; an adaptive decision block dynamically guides the generator to produce a fused image with the same distribution as the source images; and an adversarial game is played between the gradient map of the fused image and the joint gradient map of the input images, so that the generator and the discriminator are trained efficiently and a fused image with rich detail and clear texture is obtained. The method is end-to-end and unsupervised: it needs no manual intervention, requires no real data as labels, and can fuse images of different resolutions without introducing a traditional fusion framework. In tests on the Harvard Medical School MRI/PET data set, the peak signal-to-noise ratio and structural similarity reach 55.2124 and 0.4697 respectively, which is superior to the most advanced current algorithms and more favourable for clinical diagnosis.

Description

PET-MRI image fusion method based on an adaptive generative adversarial network
Technical Field
The invention belongs to the technical field of medical imaging, and in particular relates to a PET-MRI image fusion method based on an adaptive generative adversarial network.
Background
Medical images fall into structural and functional modalities, and the different imaging mechanisms capture different information about the same anatomy. Magnetic Resonance Imaging (MRI) provides high-resolution structural information about brain soft tissue, while Positron Emission Tomography (PET) reflects, as colour information, the metabolism and functional state of tissue. Each modality has its limitations: MRI lacks dynamic information such as body metabolism, while PET has low spatial resolution and cannot localise lesions precisely. Because each image type has its own characteristics, the limited information of a single-modality image can hardly meet the information requirements of clinical diagnosis and treatment, and images from multiple imaging mechanisms must be fused. In recent years, the clinical success of combined PET-MRI imaging has drawn great interest to non-invasive functional and anatomical imaging.
In the fusion process, the spatial information of the MRI image and the spectral information of the PET image need to be retained, or the spatial information present in the MRI data needs to be introduced into the PET image, so that the limitations of single-modality medical images are overcome, imaging quality is improved while image characteristics are preserved, and the clinical usefulness of the image for diagnosing and evaluating medical problems is increased.
The most widely applied technique in traditional medical image fusion is pixel-level fusion, which can be divided into spatial-domain and transform-domain approaches. In the former, the fusion rules act directly on pixels; the rules are simple, but the fusion effect is poor. For example, in "He C T, Liu Q X, Li H L, et al. Multimodal medical image fusion based on IHS and PCA [J]. Procedia Engineering, 2010, 7: 280-285", the image is converted into intensity, hue and saturation (IHS) channels, and the IHS transform causes spectral and spatial distortion. Transform-domain image fusion mostly adopts multi-scale transform (MST) techniques and comprises three stages: decomposition, fusion and reconstruction. The source images are first transformed to the frequency domain and fused according to a certain rule, and the image is then reconstructed from the fused coefficients and the transform basis. This approach preserves the detail information of the source images well but neglects spatial consistency, which distorts the brightness and colour of the fused image. The rules of traditional fusion methods must be designed and selected manually, and different filter parameters chosen for the MST lead to very different fusion results; because of the diversity of feature extraction and the complexity of the fusion rules, manually designing the fusion method becomes difficult and model robustness is reduced.
With the rise of deep learning in recent years, neural networks have been used to address these problems, and most existing deep-learning-based image fusion uses convolutional neural network (CNN) models. Research on deep learning for image fusion has become increasingly active, and scholars have successively proposed many fusion methods, gradually forming an important branch of the field. In some approaches, a deep learning framework extracts image features for reconstruction in an end-to-end manner. Typically, "Liu Y, Chen X, Ward R K, et al. Image fusion with convolutional sparse representation [J]. IEEE Signal Processing Letters, 2016, 23(12): 1882-1886" applies convolutional sparse representation (CSR) to image fusion, extracting multi-layer features and using them to generate the fused image. IFCNN adds convolutional neural networks to a transform-domain image fusion algorithm (Zhang Y, Liu Y, Sun P, et al. IFCNN: A general image fusion framework based on convolutional neural network [J]. Information Fusion, 2020, 54: 99-118). "Yousif A S, Omar Z, Sheikh U U. An improved approach for medical image fusion using sparse representation and Siamese convolutional neural network [J]. Biomedical Signal Processing and Control, 2022, 72: 103357" proposes a medical image fusion method based on sparse representation and a twin (Siamese) convolutional neural network, while "Hou R, Zhou D, Nie R, et al. Brain CT and MRI medical image fusion using a sparse convolutional neural network [J]. Medical & Biological Engineering & Computing, 2019, 57(4): 887-900" adds deep learning to a conventional image fusion scheme, using the network to fuse the high-frequency coefficients. DenseFuse comprises convolutional layers, a fusion layer and dense blocks; the encoder provides the input to the network, and after the feature maps are obtained a decoder reconstructs the fused image (Li H, Wu X J. DenseFuse: A fusion approach to infrared and visible images [J]. IEEE Transactions on Image Processing, 2018, 28(5): 2614-2623). GCF is an unsupervised multi-focus image fusion model based on gradients and connected regions. "Chen M, Zheng H, Lu C, et al. A spatial-temporal fusion segmentation in DCE-MRI [C]// International Conference on Neural Information Processing. Springer, Cham, 2018: 358-368" combines CNN and RNN to extract features that are then fused for segmentation. A generative adversarial network was first introduced into the fusion of infrared and visible images with a generator aiming to produce a fused image that mainly contains infrared information together with a small amount of visible-light information, while the discriminator forces the fused image to contain more of the detail present in the visible image. DDcGAN constructs a dual-discriminator generative adversarial network (Ma J, Xu H, Jiang J, et al. DDcGAN: A dual-discriminator conditional generative adversarial network for multi-resolution image fusion [J]. IEEE Transactions on Image Processing, 2020, 29: 4980-4995). "Tang W, Liu Y, Zhang C, et al. Green fluorescent protein and phase-contrast image fusion via generative adversarial networks [J]. Computational and Mathematical Methods in Medicine, 2019: 5450373" proposes fusing biological images with generative adversarial networks. PMGI extracts information using image gradient and intensity and performs feature reuse along the same path (Zhang H, Xu H, Xiao Y, et al. Rethinking the image fusion: A fast unified image fusion network based on proportional maintenance of gradient and intensity [C]// Proceedings of the AAAI Conference on Artificial Intelligence, 2020, 34(7): 12797-12804).
Deep-learning-based research has become an active topic in the field of image fusion in recent years; many deep-learning fusion methods have been proposed in succession and have gradually formed an important branch. Although these methods have achieved good results, most fusion rules are still designed manually, so the methods as a whole cannot escape the limitations of traditional fusion approaches. The biggest obstacle to image fusion with deep learning is the lack of real label data, and for the MRI-PET fusion task it is difficult to acquire real label images directly.
Thus, while these prior efforts have been successful, some disadvantages remain: (1) the deep learning framework is only used to compensate for certain shortcomings of traditional fusion methods, such as feature extraction; the design of the overall fusion method is still based on the traditional approach, and a traditional framework that requires complex hand-crafted fusion rules cannot produce results end to end; (2) because label data are missing, solutions that rely solely on loss-function design are incomplete; owing to the limitations of the physical imaging process, the fusion task cannot obtain a real fused image as a label, so existing deep learning methods rely heavily on artificial priors and manually constructed pseudo-labels, which greatly limits algorithm performance; (3) solutions based on the traditional generative adversarial network can only make the result resemble a source image, i.e. the generation network is trained only with a pixel-level L1 loss, and because of the Nash equilibrium between generator and discriminator part of the high-frequency detail contained in the source images is lost.
Disclosure of Invention
In order to avoid the loss of spatial information during image fusion, to protect the spatial texture structure of the MRI and PET images, and thus to preserve simultaneously the texture and detail information of the high-resolution image and the structural information of the low-resolution image, a PET-MRI image fusion method based on an adaptive generative adversarial network is provided, which comprises the following steps:
a) Mapping the PET image from RGB space to YCbCr space and extracting the Y component;
b) Inputting the Y component of the PET image and the MRI image into the generation network;
c) Using the Laplacian operator to extract the joint gradient map of the input images and the gradient map of the generation network's output;
d) Inputting the two gradient maps into the discrimination network, setting the label probability of the real input to 0.7-1.2 (soft label) and the label probability of the generated result to 0-0.3 (soft label);
e) Training the generation network and the discrimination network with an adversarial training strategy;
f) Optimizing with the Adam optimizer;
g) Obtaining the trained generation network model;
h) Predicting with the trained network model (a minimal end-to-end sketch of steps a)-h) is given after this list).
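For orientation only, the following Python/PyTorch-style sketch shows how steps a), b) and h) could be wired together at inference time; the training-time steps c)-g) are sketched in the later sections on the decision block, the losses and the training procedure. The helper names rgb_to_ycbcr, ycbcr_to_rgb and Generator are illustrative placeholders, not names defined by the invention.

    import torch

    def fuse_pet_mri(pet_rgb, mri_gray, generator):
        # pet_rgb:  (1, 3, H, W) PET pseudo-colour image in [0, 1]
        # mri_gray: (1, 1, H, W) MRI grayscale image in [0, 1]
        # a) map the PET image from RGB to YCbCr and keep only the Y component
        y, cb, cr = rgb_to_ycbcr(pet_rgb)
        # b) + h) feed PET_Y and the MRI image into the trained generator
        with torch.no_grad():
            fused_y = generator(y, mri_gray)
        # recombine the fused Y channel with the PET chrominance and invert the transform
        return ycbcr_to_rgb(fused_y, cb, cr)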
In connection with step a), a decorrelated colour model, YCbCr, is used. It divides the image information into three channels: the Y channel, the Cb channel and the Cr channel, which represent the luminance component and the blue and red chrominance-offset components respectively. The Y channel stores the luminance information of the image, while the Cb and Cr channels store its blue-difference and red-difference colour information. In each image fusion iteration, therefore, only the MRI image and the Y-channel component of the PET image need to be processed, and both are grayscale images. The forward transform and the inverse transform are given by equations (1) and (2) respectively:
(1) [RGB to YCbCr forward transform; the equation is rendered as an image in the original publication]
(2) [YCbCr to RGB inverse transform; the equation is rendered as an image in the original publication]
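As an illustration only, the sketch below implements the forward and inverse conversions of equations (1) and (2) using the standard ITU-R BT.601 YCbCr coefficients; the exact matrix entries used by the patent are rendered as images above and are not reproduced here, so the constants below are an assumption.

    import torch

    def rgb_to_ycbcr(img):
        # img: (N, 3, H, W) RGB in [0, 1] -> y, cb, cr each of shape (N, 1, H, W)
        r, g, b = img[:, 0:1], img[:, 1:2], img[:, 2:3]
        y  =  0.299 * r + 0.587 * g + 0.114 * b            # luminance (cf. equation (1))
        cb = -0.169 * r - 0.331 * g + 0.500 * b + 0.5      # blue-difference chrominance
        cr =  0.500 * r - 0.419 * g - 0.081 * b + 0.5      # red-difference chrominance
        return y, cb, cr

    def ycbcr_to_rgb(y, cb, cr):
        # inverse transform (cf. equation (2)); returns an (N, 3, H, W) RGB tensor
        r = y + 1.402 * (cr - 0.5)
        g = y - 0.344 * (cb - 0.5) - 0.714 * (cr - 0.5)
        b = y + 1.772 * (cb - 0.5)
        return torch.cat([r, g, b], dim=1).clamp(0.0, 1.0)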
the framework of the adaptive countermeasure generation network mainly comprises a generator, a discriminator and a regional residual error learning module, wherein the network can fuse a low-resolution Y component (PET _ Y) of a PET image with a grayscale image MRI with higher spatial resolution to obtain a fused image comprising abundant structural information and higher spatial resolution; in order to simultaneously store the texture and detail information of the high-resolution image and the structure information of the low-resolution image, a mechanism for adjusting a loss function is used for optimizing a prediction result; the general architecture of the adaptive countermeasure generation network is shown in fig. 1; wherein, the adaptation of the network is derived from a decision block shown in a preprocessing stage before the input of the discriminator network in fig. 1, and the decision block can also be referred to as a maximum matching algorithm shown in formula (3);
Figure DEST_PATH_IMAGE006
(3)
the maximum matching algorithm process is as follows: inputting a PET _ Y component Y and an MRI image M, extracting a Laplace gradient image of the PET _ Y component Y and the MRI image M through a Laplace operator, comparing pixel values pixel by pixel, taking the pixel value in each pixel of the two images as a gradient pixel after fusion, and finally calculating to obtain a combined gradient image; the decision block can guide the fusion result to approach the brightness and gradient distribution of the source image, and the principle is to evaluate the definition of each pixel so as to generate a screening image with effective information positions.
The structure of the generator is shown in FIG. 2. The generator is a two-branch fusion network that processes the Y component of the PET image and the MRI image along two separate paths. Each branch first uses a group of 3 × 3 convolution layers for feature extraction, then deepens the network for feature processing, and finally uses a group of 1 × 1 convolution layers for reconstruction. The first convolution layer extracts shallow features by equation (4):
(4) [shallow feature extraction; the equation is rendered as an image in the original publication]
where H_conv denotes the convolution operation of the shallow feature extraction layer with a 5 × 5 kernel. The second-layer output is obtained by equation (5):
(5) [second-layer output; the equation is rendered as an image in the original publication]
where H_LRLP is the composite function of the LRLP layer operations, and F_pre makes full use of the convolution of each layer in the block to generate local features. The third-layer output is given by equation (6):
(6) [third-layer output; the equation is rendered as an image in the original publication]
where H_RL denotes a residual connection and λ is the weight used when fusing the residuals. The fourth layer follows the same principle as the third; its input is the cascade of the outputs of the first three layers. The fourth-layer output is given by equation (7):
(7) [fourth-layer output; the equation is rendered as an image in the original publication]
The output features of every layer are then concatenated and fused with a 3 × 3 convolution, as shown in equation (8):
(8) [concatenation and fusion of layer outputs; the equation is rendered as an image in the original publication]
where H_concat denotes the feature-map concatenation operation. The last layer of the extraction module is a 1 × 1 convolution applied after the feature maps are concatenated, and W is the weight matrix of the first four layers of the fusion-extraction module. The outputs of the two paths, F_ext,1 and F_ext,2, then enter the fusion module, and after the fusion operation the fused image is obtained as shown in equation (9):
(9) [fused image produced by the fusion module; the equation is rendered as an image in the original publication]
where H_fuse denotes the composite operation of the fusion module.
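A compact PyTorch-style sketch of a two-branch generator with the layer pattern described above (3 × 3 extraction convolutions, cascaded deepening blocks, 1 × 1 reconstruction, then a fusion module). The channel widths, the stand-in LRLP block and the exact form of H_fuse are assumptions; the regional residual learning module itself is sketched under the LRLP subsection below.

    import torch
    import torch.nn as nn

    def LRLP(in_ch, out_ch):
        # stand-in for the regional residual learning block (see the LRLP sketch below);
        # a plain convolution keeps this snippet self-contained
        return nn.Sequential(nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ELU())

    class Branch(nn.Module):
        # one extraction path: shallow conv -> cascaded blocks -> 1x1 reconstruction
        def __init__(self, ch=32):
            super().__init__()
            self.shallow = nn.Sequential(nn.Conv2d(1, ch, 3, padding=1), nn.ELU())
            self.block1 = LRLP(ch, ch)              # deepening stage
            self.block2 = LRLP(2 * ch, ch)          # fed with the cascade of earlier outputs
            self.reduce = nn.Conv2d(3 * ch, ch, 1)  # 1x1 reconstruction after concatenation
        def forward(self, x):
            f1 = self.shallow(x)
            f2 = self.block1(f1)
            f3 = self.block2(torch.cat([f1, f2], dim=1))
            return self.reduce(torch.cat([f1, f2, f3], dim=1))

    class Generator(nn.Module):
        def __init__(self, ch=32):
            super().__init__()
            self.pet_branch, self.mri_branch = Branch(ch), Branch(ch)
            self.fuse = nn.Sequential(nn.Conv2d(2 * ch, ch, 3, padding=1), nn.ELU(),
                                      nn.Conv2d(ch, 1, 1))
        def forward(self, pet_y, mri):
            return self.fuse(torch.cat([self.pet_branch(pet_y), self.mri_branch(mri)], dim=1))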
The structure of the discriminator is shown in FIG. 3. It takes two sources as input: the gradient maps of the two input images are computed with the Laplacian operator and combined by the maximum function into the joint gradient map, and this joint gradient map together with the Laplacian gradient map of the fused image forms the two inputs of the discriminator. Four convolution layers and one linear layer make up the discriminator of the model; the convolution kernels are all 3 × 3 with a stride of 4, and ELU is used as the activation function. The last layer is a linear layer used to compute a probability with which the authenticity of the generated data is judged.
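A sketch of a discriminator matching the stated layout (four 3 × 3 convolutions with stride 4, ELU activations, one linear scoring layer); the channel widths are assumptions, and for 256 × 256 gradient maps the four stride-4 convolutions reduce the spatial size to 1 × 1 before the linear layer.

    import torch
    import torch.nn as nn

    class Discriminator(nn.Module):
        def __init__(self):
            super().__init__()
            chans = [1, 32, 64, 128, 256]            # illustrative channel widths
            layers = []
            for cin, cout in zip(chans[:-1], chans[1:]):
                layers += [nn.Conv2d(cin, cout, kernel_size=3, stride=4, padding=1), nn.ELU()]
            self.features = nn.Sequential(*layers)
            self.score = nn.Linear(256, 1)           # scalar score for the input gradient map
        def forward(self, grad_map):                 # grad_map: (N, 1, 256, 256)
            f = self.features(grad_map)              # -> (N, 256, 1, 1)
            return self.score(f.flatten(1))          # soft labels are applied in the loss terms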
Regional residual learning module (LRLP):
During forward propagation in a convolutional neural network, the information contained in the convolutional feature maps gradually decreases as the network depth increases. To address this, the method uses a regional residual learning module that preserves as many of the features of each layer as possible through direct mappings of information between different layers. The LRLP module is shown in FIG. 4: image features of different depths are first obtained by connecting c different convolutions in series, the features of the different depths are then concatenated with weights, a 1 × 1 convolution layer performs compression and reconstruction, and finally an ELU activation is applied, so that the features contained in each layer are retained as far as possible. If there are c convolution layers, the final output is given by equation (10):
(10) [output of the LRLP block; the equation is rendered as an image in the original publication]
where F_c is the output of the c-th convolution layer, H_concat denotes the feature-map concatenation function, W is the union function representing the weight of each convolution layer during concatenation, and H_active denotes the ELU activation applied to the concatenated data. In the LRLP block the output of each preceding layer is used as the input of the next layer, which both preserves the feed-forward property and improves the utilisation of the input data.
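A possible PyTorch realisation of the LRLP block as described (c serial convolutions, weighted concatenation of their outputs, 1 × 1 compression, ELU activation); the learnable per-depth weights and the channel sizes are assumptions.

    import torch
    import torch.nn as nn

    class LRLP(nn.Module):
        def __init__(self, in_ch, out_ch, c=3):
            super().__init__()
            self.convs = nn.ModuleList(
                [nn.Conv2d(in_ch if i == 0 else out_ch, out_ch, 3, padding=1) for i in range(c)])
            self.weights = nn.Parameter(torch.ones(c))        # splice weights W (assumed learnable)
            self.compress = nn.Conv2d(c * out_ch, out_ch, 1)  # 1x1 compression / reconstruction
            self.act = nn.ELU()
        def forward(self, x):
            feats = []
            for i, conv in enumerate(self.convs):
                x = conv(x)                     # the output of each layer feeds the next (F_c)
                feats.append(self.weights[i] * x)
            return self.act(self.compress(torch.cat(feats, dim=1)))   # H_active(H_concat(...))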
In step e), the discriminator treats the joint gradient map as real data and the gradient map of the fused image as fake data, and the two are placed in continuous adversarial learning; the objective function of the GAN is defined as shown in equation (11):
(11) [GAN objective function; the equation is rendered as an image in the original publication]
To implement the adversarial learning in the GAN, a group of distinct semi-supervised loss functions is designed, different from the fixed loss functions of traditional deep learning.
Loss function of the generator
The loss function of the generator consists of the adversarial loss, the pixel-level Euclidean loss and the texture loss, as shown in equation (12):
(12) [generator loss: weighted sum of the adversarial, pixel-level Euclidean and texture losses; the equation is rendered as an image in the original publication]
where the first term is the adversarial loss from the generator-discriminator network, the second term is the pixel-level Euclidean loss optimised using the screening maps, and the third term is the texture loss based on the gradient map; the weights of the pixel-level loss and the texture loss are chosen so that the three loss terms have the same importance.
loss of antagonism
In order for the image generated by the generator to be closer to the ideal fused image, a loss needs to be established between the generator and the discriminator, and the traditional countermeasures to reduce the maximum-minimum problem into a maximum-minimum problem
Figure DEST_PATH_IMAGE038
But at the beginning of the training phase,
Figure DEST_PATH_IMAGE040
saturation is possible, so maximization is used to train the generator network; in order to provide stronger gradient, a square operation is added on the basis of the maximum operation,
Figure DEST_PATH_IMAGE042
the definition is shown in formula (13):
Figure DEST_PATH_IMAGE044
(13)
wherein M is the number of images of a batch during training; c, the discriminator identifies the rate-of-change label of the true and false images; the invention is to get
Figure DEST_PATH_IMAGE046
Indicating that the calculation of the gradient map is performed by using the laplacian operator; m, Y represents the Y channel of the input MRI image and PET image;
pixel level euclidean loss
The invention utilizes Euclidean distance between the fused image and the original image pixel to restrict the intensity distribution of the fused image and the original image in a clear area, and pixel level Euclidean loss can be formulated as shown in formula (14):
Figure DEST_PATH_IMAGE048
Figure DEST_PATH_IMAGE050
(14)
wherein the content of the first and second substances,h,wis shown ashGo to the firstwPixel values of the columns; h, W are the height and width of the image, respectively;Map 1 Map 2 representing a sifting map generated by the decision block based on two input images;
texture loss
The gradient of the image may partially characterize the texture details, especially for sharp MRI images, thus requiring the fused image to have similar gradients as the input image, and in conjunction with the screening map, the texture loss may be formulated as shown in equation (15):
Figure DEST_PATH_IMAGE052
Figure DEST_PATH_IMAGE054
(15)。
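The following sketch combines the three generator loss terms in one plausible form. It reuses the laplacian_map helper from the decision-block sketch above; the squared adversarial term, the screening-map weighting of the pixel and texture terms, and the weights xi and eta are assumptions, since equations (12)-(15) are rendered as images in the original.

    import torch

    def generator_loss(d_fake_score, fused, pet_y, mri, map1, map2,
                       c=1.0, xi=10.0, eta=10.0):
        # cf. equation (12): adversarial + pixel-level Euclidean + texture terms
        # adversarial term (cf. equation (13)): push the discriminator score towards the real label c
        l_adv = torch.mean((d_fake_score - c) ** 2)
        # pixel-level Euclidean term (cf. equation (14)), restricted by the screening maps
        l_pixel = torch.mean(map1 * (fused - pet_y) ** 2 + map2 * (fused - mri) ** 2)
        # texture term (cf. equation (15)): gradients of the fused image should follow the inputs
        l_texture = torch.mean(map1 * (laplacian_map(fused) - laplacian_map(pet_y)) ** 2
                               + map2 * (laplacian_map(fused) - laplacian_map(mri)) ** 2)
        return l_adv + xi * l_pixel + eta * l_texture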
discriminator loss function
The invention designs a gradient-map-based loss function for the discriminator, in which the "false data" is the gradient map of the fused image, formulated as shown in equation (16):
(16) [gradient map of the fused image ("false data"); the equation is rendered as an image in the original publication]
The "true data" required by the discriminator comes from the joint gradient map constructed from MRI and PET_Y, formulated as shown in equation (17):
(17) [joint gradient map of MRI and PET_Y ("true data"); the equation is rendered as an image in the original publication]
where abs denotes the absolute-value function and maximum denotes the maximisation function. Based on these two gradient maps, the discriminator loss function is expressed as shown in equation (18):
(18) [discriminator loss function; the equation is rendered as an image in the original publication]
where a is the label of the "false data", set to 0, and b is the label of the "true data", set to 1; the discriminator thus treats the joint gradient map of the images as true data and the gradient map of the fused image as false data. This constraint guides the generator to adjust Grad_fused towards Grad_union, enhancing the texture of the fused image during the adversarial game.
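A sketch of the discriminator loss with the soft labels described in step d) of the method (real label drawn from 0.7-1.2, fake label from 0-0.3); the squared form is an assumption chosen to match the squared adversarial loss, since equation (18) is rendered as an image in the original.

    import torch

    def discriminator_loss(d_real_score, d_fake_score):
        # joint gradient map -> "true data"; gradient map of the fused image -> "false data"
        b = torch.empty_like(d_real_score).uniform_(0.7, 1.2)   # soft label for true data
        a = torch.empty_like(d_fake_score).uniform_(0.0, 0.3)   # soft label for false data
        return torch.mean((d_real_score - b) ** 2) + torch.mean((d_fake_score - a) ** 2)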
The invention has the following beneficial effects:
the invention provides a novel image fusion method by constructing a self-adaptive residual dense generation countermeasure network and combining a color space method based on YCbCr, the method enables the generation network to avoid gradient disappearance and gradient explosion, improves the network characteristic extraction performance, performs a countermeasure game between a fusion image gradient map and an input image combined gradient map, combines the designed countermeasure loss, discriminator loss, pixel level consistency loss and gradient consistency loss to obtain a fusion image with rich details and clear texture, does not need real data as a label for training, can fuse images with different resolutions under the condition of not introducing a traditional frame, greatly optimizes the fusion rule design of the traditional method, realizes the self-adaptive fusion without manual intervention, has more high-frequency details of the fusion image and greatly retains MRI pseudo-color content information, achieves the peak signal-to-noise ratio PSNR =55.2124 in the test of a Harvard institute of medicine/PET data set, achieves the structural similarity SSIM =0.4697, RMSE =0.1968 and the peak signal-to-noise ratio reaches PSNR = 55.4 abf =0.3635 and Q cv =2009.348, is superior to the most advanced algorithm at present, and is more helpful for assisting clinical application diagnosis.
Drawings
FIG. 1 is a schematic diagram of the adaptive dense-residual generative adversarial network;
FIG. 2 is a schematic diagram of a generator network;
FIG. 3 is a schematic diagram of a network of discriminators;
FIG. 4 is a schematic diagram of a region residual learning module (LRLP);
fig. 5 is a schematic diagram of the qualitative comparison of the proposed method with other internationally leading methods.
Detailed Description
The invention is further illustrated by the following specific examples.
Example 1
The PET and MRI images used in this example are from the public data set on the Harvard Medical School website; the MRI images are single-channel images of size 256 × 256, and the PET images are pseudo-colour images of size 256 × 256 × 3.
The generator and the discriminator are trained iteratively in an adversarial process: the batch size is set to b, one training iteration requires k steps, the ratio of discriminator training steps to generator training steps is p, and training is run M times. The following settings were obtained through repeated tests: b = 32, p = 2, M = 300, and the parameters in ADRGAN are updated with the Adam optimizer. To make the GAN training more stable, soft labels are used for the loss terms: a label that would normally be set to 1 is set to a random number between 0.7 and 1.2.
The images are preprocessed from the RGB channels into the YCbCr colour space; since the Y (luminance) channel represents structural detail and brightness variation, only the Y channel needs to be fused by the network. The Cb and Cr channels are fused with a colour-space-based approach, and the fused components are then inverse-transformed back to RGB channels. The experimental environment of this example is: Windows 10, an AMD R5 5600X CPU, 16 GB of memory and an RTX-3060 (6 GB) GPU; the software environment is Python 3.7.6 and PyTorch 1.10.0. The data set is divided into a training set, a validation set and a test set in a ratio of 2:1. The specific training procedure is shown as Algorithm 1:
[Algorithm 1: training procedure of the proposed network; rendered as an image in the original publication]
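A sketch of the adversarial training schedule described above (batch size handled by the data loader, p discriminator updates per generator update, Adam optimisation, M epochs). It reuses the Generator, Discriminator, joint_gradient, laplacian_map, generator_loss and discriminator_loss sketches from the preceding sections; screening_maps is a hypothetical helper standing in for the decision block's screening-map output, and the learning rate is an assumption.

    import torch

    def train(generator, discriminator, loader, m_epochs=300, p=2, lr=1e-4, device="cuda"):
        opt_g = torch.optim.Adam(generator.parameters(), lr=lr)
        opt_d = torch.optim.Adam(discriminator.parameters(), lr=lr)
        for epoch in range(m_epochs):
            for pet_y, mri in loader:                      # each batch: PET_Y and MRI tensors
                pet_y, mri = pet_y.to(device), mri.to(device)
                grad_joint = joint_gradient(pet_y, mri)    # decision-block output ("true data")
                # p discriminator updates per generator update
                for _ in range(p):
                    fused = generator(pet_y, mri).detach()
                    d_loss = discriminator_loss(discriminator(grad_joint),
                                                discriminator(laplacian_map(fused).abs()))
                    opt_d.zero_grad(); d_loss.backward(); opt_d.step()
                # one generator update
                fused = generator(pet_y, mri)
                map1, map2 = screening_maps(pet_y, mri)    # hypothetical decision-block helper
                g_loss = generator_loss(discriminator(laplacian_map(fused).abs()),
                                        fused, pet_y, mri, map1, map2)
                opt_g.zero_grad(); g_loss.backward(); opt_g.step()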
quantitative evaluation index
The method and the comparison method are objectively evaluated by using five evaluation indexes, namely Q abf 、Q cv PSNR, SSIM and Rmse, Q abf The algorithm uses local metrics to estimate the performance of the important information input in the fused image, the higher the value, the better the quality of the fused image, as shown in equation (19):
Figure DEST_PATH_IMAGE064
(19)
wherein the content of the first and second substances,Wused for dividing local areas;λ(w)representing the local area weight;A,B,Ftwo input images and a fused image respectively;
Q_cv measures the quality of local region images by computing the mean square error of the weighted difference between the fused region image and the source region image; the final fused-image quality is the weighted sum of the local-region quality measures, as shown in equation (20):
(20) [definition of Q_cv; the equation is rendered as an image in the original publication]
where D is the local-region similarity measure function.
the peak signal-to-noise ratio (PSNR) is the ratio of peak power to noise power in the fused image, reflects the distortion condition of the fused image, and is calculated according to the following formulas (21) to (24):
Figure DEST_PATH_IMAGE068
(21)
Figure DEST_PATH_IMAGE070
(22)
Figure DEST_PATH_IMAGE072
(23)
Figure DEST_PATH_IMAGE074
(24)
wherein the content of the first and second substances,MSErepresenting in the image as mean square erroriLine ofjPixels of a column; r represents the peak value of the fused image, and the larger the signal-to-noise ratio of the peak value is, the closer the fused image is to the source image;
the Structural Similarity (SSIM) is used to model the loss and distortion of images, and the index consists of three parts, respectively: correlation loss, contrast loss, brightness loss; the product of these three components is the result of the evaluation of the fused image, defined as follows:
Figure DEST_PATH_IMAGE076
(25)
wherein x and f respectively represent a source image and a block in the fused image;
Figure DEST_PATH_IMAGE078
is the covariance between the two blocks;
Figure DEST_PATH_IMAGE080
Figure DEST_PATH_IMAGE082
standard Deviation (SD);u x u y representing the mean value between two blocks, additionally addingC1,C2,C3The loss function is more stable;
the Root Mean Square Error (RMSE) is based on MSE, and the quantitative description of the difference between the source image and the fused image is completed by calculating the mean square error of the source image and the result image, and the calculation is as shown in a formula (26):
Figure DEST_PATH_IMAGE084
(26)
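For reference, common formulations of PSNR and RMSE are sketched below in NumPy; the exact definitions used by the patent (including the choice of the peak value r and of the per-source averaging) are rendered as images above, so treat these as illustrative assumptions.

    import numpy as np

    def psnr(source, fused, peak=1.0):
        # cf. equations (21)-(24): peak signal-to-noise ratio between a source and the fused image
        mse = np.mean((source.astype(np.float64) - fused.astype(np.float64)) ** 2)
        return 10.0 * np.log10(peak ** 2 / mse)

    def rmse(source, fused):
        # cf. equation (26): root mean square error between a source and the fused image
        return float(np.sqrt(np.mean((source.astype(np.float64) - fused.astype(np.float64)) ** 2)))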
quantitative and qualitative comparison results
In order to verify the effect of the model on PET-MRI image fusion and to test its robustness, five methods, DDcGAN, DenseFuse, GCF, IFCNN and PMGI, all of which perform well in traditional medical image fusion, are selected for comparison with the proposed model.
the effect of the vision experiment of the six related methods is shown in fig. 4, in which the result of DDcGan (third column) has a problem of spectral distortion, and the edge is blurred compared with the present model; denseuse (fourth column) loses the intensity of colors in the PET image, loses partial functional information and increases the difficulty of searching for focuses; GCF (fifth column) color is well stored, but a plurality of images have large-area noise blocks, structure information is directly lost, clinical judgment can be misled due to information loss, and robustness is poor; IFCNN (sixth column) blurs lost details near the boundary line, and is not clear enough where the texture is dense; when PMGI (seventh column) is fused, although color intensity is high and functional information is completely stored, background blurring, high-frequency information loss and no texture detail exist; the fusion image has no problems, the structure and function information is well preserved, the details are clear and have high contrast, particularly, the contrast at the edge is obvious, the details at the dense texture position are clear, and the fusion image contains enough image information required by clinical diagnosis; since most of these methods attempt to sharpen the edge by direct object enhancement and gradient, the difference in naturalness and reality of the fused image is still large; in addition, almost all the methods rely on large data sets, and the structural content loss function and the antagonistic loss function provided by the invention respectively protect high-frequency information and content information, and improve the effect of image fusion through respective nonlinear loss constraints.
Qualitative evaluation by subjective human visual impression alone has great limitations, so to verify the superiority of the invention objectively, a quantitative evaluation of the experimental results is also carried out; the results are shown in Table 1:

Method      PSNR↑     SSIM↑    RMSE↓    Q_abf↑   Q_cv↓
DDcGAN      54.8162   0.3000   0.2146   0.1602   2534.607
DenseFuse   55.1830   0.3628   0.1986   0.1368   2242.367
GCF         54.4163   0.3347   0.2367   0.3401   2521.672
IFCNN       54.4163   0.4160   0.2083   0.3516   2226.219
PMGI        54.0151   0.1022   0.2581   0.0460   3469.525
OURS        55.2124   0.4697   0.1968   0.3635   2009.348
It can be seen that the proposed method is superior to the other five comparison methods on all five indices. The Q_cv index is based on the human visual system (HVS) and the region-wise mean square error; thanks to the adaptive module, the model can adaptively weight pixels and thus improves regional similarity, reducing Q_cv by 20.7% compared with DDcGAN, which shows that the result is perceptually stronger and regionally more similar than the other methods. The adversarial game gives the model good denoising ability, which raises the PSNR: compared with the other methods, the proposed method produces less noise and less interfering information. The SSIM index mainly verifies structural information and is 11.4% higher than the best comparison method, IFCNN, showing that the texture structure is preserved intact with few blurred regions, whereas the structural similarity of PMGI is only 21% of that of the proposed method, indicating poor structure preservation. Consistent with the qualitative comparison, the pixel-scale control strategy adopted by the invention keeps the pixel-level Euclidean distance well controlled, and the pixel-level fusion index Q_abf shows that the visual information is well preserved and the pixel-level difference between the fused image and the source images is small, while the smaller RMSE indicates that the fused image of the invention has less error and distortion.
The foregoing detailed description is intended to illustrate rather than limit the invention; any changes and modifications that fall within the spirit and scope of the appended claims are intended to be covered by those claims.

Claims (1)

1. A PET-MRI image fusion method based on an adaptive generative adversarial network, characterized by comprising the following steps:
a) Mapping the PET image from RGB space to YCbCr space and extracting the Y component; the forward transform is shown in equation (1) and the inverse transform in equation (2):
(1) [RGB to YCbCr forward transform; the equation is rendered as an image in the original publication]
(2) [YCbCr to RGB inverse transform; the equation is rendered as an image in the original publication]
b) Inputting the Y component of the PET image and the MRI image into a generator; the generator is a two-branch fusion network whose inputs are the Y component of the PET image and the MRI grayscale image respectively; each branch of the two-branch fusion network uses a group of 3 × 3 convolution layers for feature extraction, then deepens the network for feature processing, and finally uses a group of 1 × 1 convolution layers for reconstruction; the deepening network uses a regional residual learning module LRLP for feature processing, which obtains different image features through c different convolutions on the different network branches, concatenates the convolved features with weights, and finally applies an ELU activation so that the features contained in each layer are retained as far as possible, as shown in equation (3):
(3) [output of the LRLP block; the equation is rendered as an image in the original publication]
where F_c is the output of the c-th convolution layer, H_concat denotes the feature-map concatenation function, W is the union function representing the weight of each convolution layer during concatenation, and H_active denotes the ELU activation applied to the concatenated data; in the LRLP block the output of each preceding layer is used as the input of the next layer, which preserves the feed-forward property and improves the utilisation of the input data; the feature extraction stage then begins, and the first convolution layer extracts shallow features by equation (4):
(4) [shallow feature extraction; the equation is rendered as an image in the original publication]
where H_conv is the convolution operation of the shallow feature extraction layer with a 5 × 5 kernel; the extracted shallow features are then sent to the next layer, and the second-layer output is obtained by equation (5):
(5) [second-layer output; the equation is rendered as an image in the original publication]
where H_LRLP is the composite function of the LRLP layer operations and F_pre makes full use of the convolution generated by each layer in the block as local features; further convolution then realises deep feature extraction, and in each subsequent convolution layer the input is the cascade of all preceding layers and the output of the LRLP module; parameter sharing is also set between the two paths; the third-layer output is given by equation (6):
(6) [third-layer output; the equation is rendered as an image in the original publication]
where H_RL denotes a residual connection and λ is the weight used when fusing the residuals; the fourth layer follows the same principle as the third, its input being the cascade of the outputs of the first three layers, and the fourth-layer output is given by equation (7):
(7) [fourth-layer output; the equation is rendered as an image in the original publication]
the output features of each layer are then concatenated and fused with a 3 × 3 convolution, as shown in equation (8):
(8) [concatenation and fusion of layer outputs; the equation is rendered as an image in the original publication]
where H_concat denotes the feature-map concatenation operation; the last layer of the extraction module is a 1 × 1 convolution applied after the feature maps are concatenated, W is the weight matrix of the first four layers of the fusion-extraction module, and the outputs of the two paths, F_ext,1 and F_ext,2, then enter the fusion module; after the fusion operation the fused image is obtained as shown in equation (9):
(9) [fused image produced by the fusion module; the equation is rendered as an image in the original publication]
where H_fuse denotes the composite operation of the fusion module;
c) Using the Laplacian operator to extract the gradient map of the fused image obtained in step b) and the gradient maps of the actually input Y component of the PET image and of the MRI image from step a), and processing the latter two gradient maps with the decision block, the result of which is shown in equation (10):
(10) [joint gradient map produced by the decision block; the equation is rendered as an image in the original publication]
where abs denotes the absolute-value function and maximum denotes the maximisation function; the decision-block processing comprises: inputting the PET_Y component Y and the MRI image M, extracting their Laplacian gradient maps with the Laplacian operator, comparing the pixel values pixel by pixel, taking the larger value at each pixel of the two maps as the fused gradient pixel, and finally obtaining the joint gradient map;
d) Inputting the gradient map of the fused image extracted in step c) and the computed joint gradient map into a discriminator, setting the label probability of the real input to 0.7-1.2 and the label probability of the generated result to 0-0.3; the discriminator consists of four convolution layers and one linear layer, the convolution kernels are all 3 × 3 with a stride of 4, ELU is used as the activation function, and the last layer is a linear layer used to compute a probability with which the authenticity of the generated data is judged;
e) Training the generation network and the discrimination network with an adversarial generation strategy: the discriminator defines the joint gradient map as real data and carries out continuous adversarial learning against the gradient map of the fused image, which is defined as fake data; the objective function of the GAN is defined as shown in equation (11):
(11) [GAN objective function; the equation is rendered as an image in the original publication]
the loss function of the generator is shown in equation (12):
(12) [generator loss: weighted sum of the adversarial, pixel-level Euclidean and texture losses; the equation is rendered as an image in the original publication]
where the first term is the adversarial loss from the generator-discriminator network, given by equation (13):
(13) [adversarial loss of the generator; the equation is rendered as an image in the original publication]
the second term is the pixel-level Euclidean loss optimised using the screening maps, given by equation (14):
(14) [pixel-level Euclidean loss; the equation is rendered as an image in the original publication]
the third term is the texture loss based on the gradient map, given by equation (15):
(15) [texture loss; the equation is rendered as an image in the original publication]
and the weights of the pixel-level loss and the texture loss are used to ensure that the three loss terms have the same importance;
the loss function of the discriminator is shown in equation (16):
(16) [discriminator loss function; the equation is rendered as an image in the original publication]
where a is the label of the "false data", set to 0-0.3; the "false data" is the gradient map of the fused image, formulated as equation (17):
(17) [gradient map of the fused image ("false data"); the equation is rendered as an image in the original publication]
and b is the label of the "true data", set to 0.7-1.2; the "true data" comes from the joint gradient map of MRI and PET_Y constructed by the decision block, formulated as equation (10);
f) Optimizing by adopting an Adam optimizer;
g) Obtaining the trained generative adversarial network model;
h) And predicting by using the network model.
CN202211094448.5A 2022-09-08 2022-09-08 PET-MRI image fusion method based on adaptive countermeasure generation network Pending CN115457359A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211094448.5A CN115457359A (en) 2022-09-08 2022-09-08 PET-MRI image fusion method based on adaptive countermeasure generation network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211094448.5A CN115457359A (en) 2022-09-08 2022-09-08 PET-MRI image fusion method based on adaptive countermeasure generation network

Publications (1)

Publication Number Publication Date
CN115457359A true CN115457359A (en) 2022-12-09

Family

ID=84302889

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211094448.5A Pending CN115457359A (en) 2022-09-08 2022-09-08 PET-MRI image fusion method based on adaptive countermeasure generation network

Country Status (1)

Country Link
CN (1) CN115457359A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116630762A (en) * 2023-06-25 2023-08-22 山东卓业医疗科技有限公司 Multi-mode medical image fusion method based on deep learning
CN116630762B (en) * 2023-06-25 2023-12-22 山东卓业医疗科技有限公司 Multi-mode medical image fusion method based on deep learning
CN116862789A (en) * 2023-06-29 2023-10-10 广州沙艾生物科技有限公司 PET-MR image correction method
CN116862789B (en) * 2023-06-29 2024-04-23 广州沙艾生物科技有限公司 PET-MR image correction method

Similar Documents

Publication Publication Date Title
Liang et al. MCFNet: Multi-layer concatenation fusion network for medical images fusion
CN115457359A (en) PET-MRI image fusion method based on adaptive countermeasure generation network
CN111882514B (en) Multi-mode medical image fusion method based on double-residual ultra-dense network
CN110060225B (en) Medical image fusion method based on rapid finite shear wave transformation and sparse representation
CN113177882A (en) Single-frame image super-resolution processing method based on diffusion model
Wu et al. FW-GAN: Underwater image enhancement using generative adversarial network with multi-scale fusion
Cheng et al. DDU-Net: A dual dense U-structure network for medical image segmentation
Zhou et al. Multi-modal medical image fusion based on densely-connected high-resolution CNN and hybrid transformer
CN114187214A (en) Infrared and visible light image fusion system and method
CN114140341A (en) Magnetic resonance image non-uniform field correction method based on deep learning
CN113139585A (en) Infrared and visible light image fusion method based on unified multi-scale dense connection network
CN115100093A (en) Medical image fusion method based on gradient filtering
Yan et al. Attention-guided dynamic multi-branch neural network for underwater image enhancement
CN114821259A (en) Zero-learning medical image fusion method based on twin convolutional neural network
Yang et al. Underwater image enhancement with latent consistency learning‐based color transfer
Li et al. Multi-scale transformer network with edge-aware pre-training for cross-modality MR image synthesis
CN111667407A (en) Image super-resolution method guided by depth information
CN111696042A (en) Image super-resolution reconstruction method based on sample learning
CN113762277B (en) Multiband infrared image fusion method based on Cascade-GAN
Chen et al. Multi-level difference information replenishment for medical image fusion
CN117197627B (en) Multi-mode image fusion method based on high-order degradation model
Wang et al. AMFNet: An attention-guided generative adversarial network for multi-model image fusion
CN117333750A (en) Spatial registration and local global multi-scale multi-modal medical image fusion method
CN113284079A (en) Multi-modal medical image fusion method
Yu et al. A multi-band image synchronous fusion method based on saliency

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination