CN116228520A - Image Compression Sensing Reconstruction Method and System Based on Transformer Generative Adversarial Network
- Publication number
- CN116228520A (application CN202211502231.3A)
- Authority
- CN
- China
- Prior art keywords
- image
- network
- network model
- layer
- transformer
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
- 238000000034 method Methods 0.000 title claims abstract description 41
- 230000006835 compression Effects 0.000 title abstract description 16
- 238000007906 compression Methods 0.000 title abstract description 16
- 238000005070 sampling Methods 0.000 claims abstract description 45
- 238000012549 training Methods 0.000 claims abstract description 27
- 230000006870 function Effects 0.000 claims abstract description 22
- 238000005457 optimization Methods 0.000 claims abstract description 20
- 238000011156 evaluation Methods 0.000 claims abstract description 10
- 230000007246 mechanism Effects 0.000 claims abstract description 9
- 238000012360 testing method Methods 0.000 claims abstract description 9
- 238000007781 pre-processing Methods 0.000 claims abstract 3
- 238000010606 normalization Methods 0.000 claims description 12
- 230000004913 activation Effects 0.000 claims description 6
- 238000010276 construction Methods 0.000 claims description 4
- 230000003042 antagonistic effect Effects 0.000 claims 2
- 230000008447 perception Effects 0.000 abstract description 8
- 238000012545 processing Methods 0.000 abstract description 2
- 230000000694 effects Effects 0.000 description 8
- 238000005259 measurement Methods 0.000 description 7
- 238000010586 diagram Methods 0.000 description 4
- 230000008569 process Effects 0.000 description 3
- 238000004364 calculation method Methods 0.000 description 2
- 238000013135 deep learning Methods 0.000 description 2
- 238000003384 imaging method Methods 0.000 description 2
- 238000013528 artificial neural network Methods 0.000 description 1
- 238000013527 convolutional neural network Methods 0.000 description 1
- 238000011161 development Methods 0.000 description 1
- 238000003745 diagnosis Methods 0.000 description 1
- 230000006872 improvement Effects 0.000 description 1
- 238000012986 modification Methods 0.000 description 1
- 230000004048 modification Effects 0.000 description 1
- 238000012795 verification Methods 0.000 description 1
Images
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/04—Context-preserving transformations, e.g. by using an importance map
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T11/00—2D [Two Dimensional] image generation
- G06T11/003—Reconstruction from projections, e.g. tomography
- G06T11/008—Specific post-processing after tomographic reconstruction, e.g. voxelisation, metal artifact correction
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10072—Tomographic images
- G06T2207/10088—Magnetic resonance imaging [MRI]
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Biophysics (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Biomedical Technology (AREA)
- Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Life Sciences & Earth Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Compression Of Band Width Or Redundancy In Fax (AREA)
Abstract
The invention belongs to the technical field of image processing, and in particular relates to an image compressed sensing reconstruction method and system based on a Transformer generative adversarial network. The method comprises: obtaining an image sample set, dividing it proportionally into a training set and a test set, and preprocessing the images; constructing a Transformer generative adversarial network model according to the sampling rate; setting the hyperparameters of the Transformer generative adversarial network model, and selecting a loss function and an optimization method; training the network model on the image dataset at different sampling rates, learning the optimal parameters of the network model through the loss function and the optimization method, and obtaining trained Transformer generative adversarial network models for the different sampling rates; and using the trained Transformer generative adversarial network model to perform image compressed sensing reconstruction, with evaluation metrics used to verify the performance of the network. By constructing a deep generative adversarial network from attention-based Transformer Blocks, the invention significantly improves the quality of the reconstructed images.
Description
Technical Field
The invention belongs to the technical field of image processing, and in particular relates to an image compressed sensing reconstruction method and system based on a Transformer generative adversarial network.
Background Art
With the advent of the big-data era, the drawbacks of sampling data according to the traditional sampling theorem have become increasingly apparent. Compressed sensing, an advanced data sampling theory, exploits the compressibility of signals to reconstruct them from low-dimensional measurements. However, traditional compressed sensing reconstruction algorithms based on iterative optimization have high time complexity, and their reconstruction quality at low sampling rates is unsatisfactory. With the development of deep learning, compressed sensing models based on deep learning have greatly reduced the time complexity of reconstruction and improved the reconstruction quality.
Summary of the Invention
Aiming at the problems in the prior art, the present invention proposes an image compressed sensing reconstruction method and system based on a Transformer generative adversarial network; constructing a deep generative adversarial network from attention-based Transformer Blocks significantly improves the quality of the reconstructed images.

To achieve the above object, the present invention adopts the following technical solution:

The present invention provides an image compressed sensing reconstruction method based on a Transformer generative adversarial network, comprising the following steps:

obtaining an image sample set, dividing it proportionally into a training set and a test set, and preprocessing the images;

constructing a Transformer generative adversarial network model according to the sampling rate;

setting the hyperparameters of the Transformer generative adversarial network model, and selecting a loss function and an optimization method;

training the network model on the image dataset at different sampling rates, learning the optimal parameters of the network model through the loss function and the optimization method, and obtaining trained Transformer generative adversarial network models for the different sampling rates;

using the trained Transformer generative adversarial network model to perform image compressed sensing reconstruction, and using evaluation metrics to verify the performance of the network.
Further, before training, the images in each batch are resized to 64×64 pixels.
Further, the Transformer generative adversarial network model comprises a sampling network, a generator network and a discriminator network.

Further, the sampling network uses a convolutional layer with a kernel size of 32×32 and a stride of 32 to generate measurements, and its number of output channels is set according to the sampling rate.
Further, the generator network comprises a Flatten layer, a fully connected layer, a first hidden layer and a second hidden layer. The measurements produced by the sampling network are first flattened into one dimension by the Flatten layer, then expanded to 24576 (= 8×8×384) nodes by the fully connected layer, after which a reshape operation adjusts the output of the fully connected layer to the specified image size. The first hidden layer is a Transformer Block, which comprises the original image with its corresponding positional encoding, a PixelNorm normalization layer, a multi-head self-attention mechanism, a second PixelNorm normalization layer and an MLP layer. The second hidden layer is a sub-pixel convolution block, which comprises a 3×3 convolutional layer, a batch normalization layer, a SELU activation layer, a sub-pixel convolutional layer and a SELU activation layer.
Further, the discriminator network judges whether the image produced by the generator network is real, and comprises multiple convolutional layers, batch normalization layers, a Flatten layer and a fully connected layer.

Further, setting the hyperparameters of the Transformer generative adversarial network model comprises: setting the initial learning rate to 0.001, and setting the number of training iterations to 20.

Further, the Adam optimization algorithm is used to train and update the parameters of the generator network, and the RMSProp optimization algorithm is used to train and update the parameters of the discriminator network.

Further, the evaluation metrics peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) are used to verify the performance of the network.
The present invention further provides an image compressed sensing reconstruction system based on a Transformer generative adversarial network, comprising:

an image sample set acquisition module, configured to obtain an image sample set, divide it proportionally into a training set and a test set, and preprocess the images;

a network model construction module, configured to construct a Transformer generative adversarial network model according to the sampling rate;

a hyperparameter setting module, configured to set the hyperparameters of the Transformer generative adversarial network model and to select a loss function and an optimization method;

a training module, configured to train the network model on the image dataset at different sampling rates, learn the optimal parameters of the network model through the loss function and the optimization method, and obtain trained Transformer generative adversarial network models for the different sampling rates;

an image reconstruction module, configured to use the trained Transformer generative adversarial network model to perform image compressed sensing reconstruction, and to use evaluation metrics to verify the performance of the network.
Compared with the prior art, the present invention has the following advantages:

1. The present invention uses a convolutional layer as the sampling network to simulate the measurement process of traditional compressed sensing, obtaining measurements of the image and improving the correlation between the measurements and the image; compared with traditional block-based compressed sensing algorithms, the invention eliminates the blocking artifacts of existing block-based sampling. The measurements then pass through a fully connected layer and a reshape operation to complete a preliminary reconstruction from the measurement vector to the original image. A deep generative adversarial network is built from attention-based Transformer Blocks, and the preliminary reconstruction is fed into it; the attention mechanism in the Transformer Block captures the global information of the initial reconstruction and establishes the content dependence between the initial reconstruction and the attention weights, enlarging the receptive field of the network so that it captures more contextual information, and the quality of the reconstructed image is improved iteratively through adversarial training.

2. The image compressed sensing reconstruction method of the present invention can be applied in the field of medical MRI; compared with traditional scanning methods, it greatly accelerates imaging, improves imaging quality and shortens scanning time. Accurate and efficient images are obtained at a low time cost while image detail is preserved; reducing the patient's scanning time also helps doctors reach a diagnosis quickly.
Brief Description of the Drawings
To illustrate the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings required for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are merely some embodiments of the present invention, and those of ordinary skill in the art can obtain other drawings from them without creative effort.

Fig. 1 is a schematic flowchart of the image compressed sensing reconstruction method based on a Transformer generative adversarial network according to an embodiment of the present invention;

Fig. 2 is a structural diagram of the Transformer generative adversarial network model according to an embodiment of the present invention;

Fig. 3 is a structural diagram of the sampling network and the generator network according to an embodiment of the present invention;

Fig. 4 is a structural diagram of the discriminator network according to an embodiment of the present invention.
Detailed Description of the Embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative effort fall within the protection scope of the present invention.

As shown in Fig. 1, this embodiment provides an image compressed sensing reconstruction method based on a Transformer generative adversarial network, comprising the following steps:

Step S1: obtain an image sample set, divide it proportionally into a training set and a test set, and preprocess the images.

Step S2: construct a Transformer generative adversarial network model according to the sampling rate.

Step S3: set the hyperparameters of the Transformer generative adversarial network model, and select a loss function and an optimization method.

Step S4: train the network model on the image dataset at different sampling rates, learn the optimal parameters of the network model through the loss function and the optimization method, and obtain trained Transformer generative adversarial network models for the different sampling rates.

Step S5: use the trained Transformer generative adversarial network model to perform image compressed sensing reconstruction, and use evaluation metrics to verify the performance of the network.
In step S1, the 202599 images of size 178×218 in the public CelebA dataset are divided into 162770 images for the training set, 19867 for the validation set and 19962 for the test set. Before training, the images in each batch are resized to 64×64 pixels.
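By way of illustration only, this preprocessing step might be sketched in PyTorch as follows; the dataset path, the use of ImageFolder as a generic loader and the batch size of 16 are assumptions of the sketch, while the split sizes and the 64×64 resize follow the description above.

```python
from torch.utils.data import DataLoader, random_split
from torchvision import datasets, transforms

# Resize every image to 64x64 before training, as described above.
transform = transforms.Compose([
    transforms.Resize((64, 64)),
    transforms.ToTensor(),  # scales pixel values to [0, 1]
])

# ImageFolder is a generic stand-in loader for the 178x218 CelebA images;
# the split sizes 162770/19867/19962 sum to the 202599 images stated above.
dataset = datasets.ImageFolder("data/celeba", transform=transform)
train_set, val_set, test_set = random_split(dataset, [162770, 19867, 19962])

train_loader = DataLoader(train_set, batch_size=16, shuffle=True)
```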
Further, the Transformer generative adversarial network model in step S2 comprises three parts: a sampling network, a generator network and a discriminator network, as shown in Fig. 2.
As shown in Fig. 3, the sampling network uses a convolutional layer with a kernel size of 32×32 and a stride of 32 to generate measurements, with the number of output channels set according to the sampling rate. For an image of size M×N, a convolutional layer with a kernel size of B×B×1 and a stride of B×B simulates the sampling operation, and the resulting measurement has size (M/B)×(N/B)×⌊rB²⌋, where r denotes the sampling rate.
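A minimal sketch of such a sampling layer, assuming PyTorch, single-channel input (per the B×B×1 kernel above) and a sampling rate of 0.25; the class name and the truncation of r·B² to an integer channel count are illustrative assumptions.

```python
import torch
import torch.nn as nn

class SamplingNet(nn.Module):
    """Convolutional sampling layer simulating compressed sensing measurement.

    For a BxB block and sampling rate r, the layer uses int(r * B * B) filters,
    so an MxN input yields an (M/B) x (N/B) x int(r * B * B) measurement.
    """
    def __init__(self, block_size: int = 32, sampling_rate: float = 0.25):
        super().__init__()
        n_measure = int(sampling_rate * block_size * block_size)
        # Kernel 32x32, stride 32, no padding: one measurement vector per block.
        self.sample = nn.Conv2d(1, n_measure, kernel_size=block_size,
                                stride=block_size, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.sample(x)

y = SamplingNet()(torch.randn(16, 1, 64, 64))  # -> shape (16, 256, 2, 2)
```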
As shown in Fig. 3, the generator network comprises a Flatten layer, a fully connected layer, a first hidden layer and a second hidden layer. The measurements produced by the sampling network are first flattened into one dimension by the Flatten layer, then expanded to 24576 (= 8×8×384) nodes by the fully connected layer, after which a reshape operation reshapes the 24576-dimensional output of the fully connected layer into a feature map of size 8×8 with 384 channels.
The first hidden layer is a Transformer Block, which performs high-quality reconstruction of the image. The Transformer Block comprises the original image with its corresponding positional encoding, a PixelNorm normalization layer, a multi-head self-attention mechanism, a second PixelNorm normalization layer and an MLP layer; its structure can be expressed as [Embedded Patches - PixelNorm - Multi-head Self-Attention - PixelNorm - MLP], with 384 output channels. The multi-head self-attention mechanism in the Transformer Block partitions the image into patches and extracts the global information of the image within each patch; the attention mechanism captures more contextual information, computes weights reflecting the texture complexity of each part of the image, and allocates computational resources according to those weights. Combined with sub-pixel convolution, the image size is enlarged stage by stage, and adversarial training further improves the reconstruction quality.
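The Transformer Block described above might be sketched as follows; PixelNorm and the [Embedded Patches - PixelNorm - Multi-head Self-Attention - PixelNorm - MLP] ordering follow the text, while the number of heads (4), the MLP expansion ratio (4) and the GELU activation are assumptions not specified in the source.

```python
import torch
import torch.nn as nn

class PixelNorm(nn.Module):
    """Normalizes each token by its own RMS across the channel dimension."""
    def forward(self, x: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
        return x * torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + eps)

class TransformerBlock(nn.Module):
    """[Embedded Patches - PixelNorm - Multi-head Self-Attention - PixelNorm - MLP]."""
    def __init__(self, dim: int = 384, heads: int = 4, mlp_ratio: int = 4):
        super().__init__()
        self.norm1 = PixelNorm()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm2 = PixelNorm()
        self.mlp = nn.Sequential(
            nn.Linear(dim, dim * mlp_ratio), nn.GELU(),
            nn.Linear(dim * mlp_ratio, dim),
        )

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (batch, num_patches, dim); the positional encoding is assumed
        # to have been added to the patch embeddings beforehand.
        h = self.norm1(tokens)
        tokens = tokens + self.attn(h, h, h, need_weights=False)[0]
        return tokens + self.mlp(self.norm2(tokens))
```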
The second hidden layer is a sub-pixel convolution block, which multiplies the image size. The sub-pixel convolution block comprises a 3×3 convolutional layer, a batch normalization layer, a SELU activation layer, a sub-pixel convolutional layer and a SELU activation layer; its structure can be expressed as [Conv3×3 - BN - SELU - SubpixelConv3×3]. The Transformer Block outputs an 8×8 image with 384 channels; after sub-pixel convolution this becomes a 16×16 image with 96 channels, and after several further sub-pixel convolution and Transformer Block stages the image size becomes 64×64 with 6 channels. A final convolutional layer then yields the reconstructed image with 3 channels and an image size of 64×64.
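A sketch of the sub-pixel convolution block, assuming PyTorch's PixelShuffle implements the sub-pixel convolution; it reproduces the 384-channel 8×8 to 96-channel 16×16 transition stated above.

```python
import torch
import torch.nn as nn

class SubPixelBlock(nn.Module):
    """[Conv3x3 - BN - SELU - SubpixelConv3x3]: doubles H and W, quarters channels."""
    def __init__(self, channels: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(channels),
            nn.SELU(),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.PixelShuffle(2),   # (C, H, W) -> (C/4, 2H, 2W)
            nn.SELU(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.body(x)

x = torch.randn(16, 384, 8, 8)
print(SubPixelBlock(384)(x).shape)  # torch.Size([16, 96, 16, 16])
```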
As shown in Fig. 4, the discriminator network judges whether the image produced by the generator network is real. The discriminator network comprises multiple convolutional layers, batch normalization layers, a Flatten layer and a fully connected layer; its structure can be expressed as [Conv3×3 - LReLU - BN - … - Conv3×3 - BN - Flatten - DenseLayer]. The input image has 3 channels, which the stacked convolutional layers gradually raise to 512; a fully connected layer then expands the features to 1024, and finally a real/fake result is output. Through adversarial, iterative training between the generator network and the discriminator network, the discriminator helps the generator optimize its parameters better, thereby improving the quality of the reconstructed image.
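A sketch of such a discriminator, assuming 64×64×3 inputs; the exact channel schedule (64-128-256-512), the stride-2 downsampling and the LeakyReLU slope are assumptions, while the rise to 512 channels, the 1024-node dense layer and the real/fake output follow the text.

```python
import torch
import torch.nn as nn

class Discriminator(nn.Module):
    """Stacked Conv3x3-LReLU-BN stages raise channels 3 -> ... -> 512,
    then Flatten and dense layers produce a real/fake score."""
    def __init__(self):
        super().__init__()
        layers, in_c = [], 3
        for out_c in (64, 128, 256, 512):   # channel schedule is an assumption
            layers += [
                nn.Conv2d(in_c, out_c, 3, stride=2, padding=1),
                nn.LeakyReLU(0.2),
                nn.BatchNorm2d(out_c),
            ]
            in_c = out_c
        self.features = nn.Sequential(*layers)
        self.head = nn.Sequential(
            nn.Flatten(),
            nn.Linear(512 * 4 * 4, 1024),   # a 64x64 input is downsampled to 4x4
            nn.LeakyReLU(0.2),
            nn.Linear(1024, 1),             # real/fake logit
        )

    def forward(self, img: torch.Tensor) -> torch.Tensor:
        return self.head(self.features(img))

score = Discriminator()(torch.randn(16, 3, 64, 64))  # -> shape (16, 1)
```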
In step S3, the images in the training set are fed batch by batch into the network model built in step S2. An appropriate batch size is chosen according to the hardware; in this example the batch size is set to 16. The hyperparameters of the Transformer generative adversarial network model are set as follows: the initial learning rate is set to 0.001, and the number of training iterations is set to 20.
(1) The target loss function of the generator network is set as:

[equation image not reproduced]

where n is the number of training images in the training set, xi is the original image, and x̂i (notation assumed here) is the reconstructed image output by the generator network. The Adam optimization algorithm is used to train and update the parameters of the generator network.
(2) The target loss function of the discriminator network is set as:

[equation image not reproduced]

where n is the number of training images in the training set, xi is the original image, and x̂i is the reconstructed image output by the generator network. The RMSProp optimization algorithm is used to train and update the parameters of the discriminator network.
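Since the two loss formulas survive only as equation images in the source, the following training-step sketch substitutes a pixel-wise MSE for the generator and a standard binary cross-entropy adversarial loss for the discriminator as labeled assumptions; the Adam/RMSProp pairing and the 0.001 learning rate follow the text, and G, D and sampler are hypothetical instances of a generator, the Discriminator sketched above and a sampling layer.

```python
import torch
import torch.nn as nn

# Loss choices below (MSE for G, BCE for D) are assumptions made for
# illustration only; the patent's exact loss formulas are not recoverable here.
mse = nn.MSELoss()
bce = nn.BCEWithLogitsLoss()

def make_optimizers(G: nn.Module, D: nn.Module, lr: float = 0.001):
    # Adam for the generator, RMSProp for the discriminator, as stated above.
    return (torch.optim.Adam(G.parameters(), lr=lr),
            torch.optim.RMSprop(D.parameters(), lr=lr))

def train_step(x, G, D, sampler, opt_g, opt_d):
    real = torch.ones(x.size(0), 1)
    fake = torch.zeros(x.size(0), 1)

    x_hat = G(sampler(x))   # measure the image, then reconstruct it

    # Discriminator update: push real images toward 1, reconstructions toward 0.
    opt_d.zero_grad()
    d_loss = bce(D(x), real) + bce(D(x_hat.detach()), fake)
    d_loss.backward()
    opt_d.step()

    # Generator update: pixel-wise fidelity plus adversarial term (assumed form).
    opt_g.zero_grad()
    g_loss = mse(x_hat, x) + bce(D(x_hat), real)
    g_loss.backward()
    opt_g.step()
    return g_loss.item(), d_loss.item()
```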
Specifically, step S4 comprises the following steps:

Step S401: set the number of channels of the sampling convolutional layer according to the sampling rate.

Step S402: save the trained model in .npz format.

Specifically, in step S5 the evaluation metrics peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) are used to verify the performance of the network, which specifically comprises the following steps:

Step S501: select images from the test set and feed them into the trained Transformer generative adversarial network model; the sampling network produces the measurements, which enter the generator network, and the generator network finally outputs the reconstructed image.
Step S502: use PSNR to measure the reconstruction quality of the network model, where a larger PSNR indicates better reconstruction. The calculation formulas are as follows:

Mean squared error (MSE):

MSE = (1/(M×N)) · Σ(i=1..M) Σ(j=1..N) [f′(i,j) − f(i,j)]²

where f′(i,j) denotes the current (reconstructed) image, f(i,j) the reference image, and M and N are the height and width of the image, respectively.
Peak signal-to-noise ratio (PSNR):

PSNR = 10 · log10((2^n − 1)² / MSE)

where n is the number of bits per pixel, generally taken as 8 (i.e. 256 grey levels per pixel); the unit is dB.
Step S503: use SSIM to measure the reconstruction quality of the network model, where a larger SSIM indicates better reconstruction. For given images x and y, the SSIM of the two images is calculated as:

SSIM(x, y) = [(2μxμy + c1)(2σxy + c2)] / [(μx² + μy² + c1)(σx² + σy² + c2)]

where μx is the mean of x, μy is the mean of y, σx² is the variance of x, σy² is the variance of y, σxy is the covariance of x and y, c1 = (k1·L)² and c2 = (k2·L)² are constants used to maintain stability, L is the dynamic range of the pixel values, k1 = 0.01, and k2 = 0.03.
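Steps S501 to S503 might be sketched as a single evaluation loop; scikit-image's PSNR/SSIM implementations (skimage ≥ 0.19 for the channel_axis argument) compute the formulas given above, and G, sampler and test_loader are the hypothetical objects from the earlier sketches.

```python
import numpy as np
import torch
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate(G, sampler, test_loader):
    """Reconstruct every test image and average PSNR/SSIM (steps S501-S503)."""
    G.eval()
    psnrs, ssims = [], []
    with torch.no_grad():
        for x, _ in test_loader:
            x_hat = G(sampler(x))                  # S501: measure, then reconstruct
            for ref, rec in zip(x.numpy(), x_hat.numpy()):
                ref = ref.transpose(1, 2, 0)       # CHW -> HWC for skimage
                rec = rec.transpose(1, 2, 0)
                psnrs.append(peak_signal_noise_ratio(ref, rec, data_range=1.0))
                ssims.append(structural_similarity(ref, rec, data_range=1.0,
                                                   channel_axis=-1))
    return float(np.mean(psnrs)), float(np.mean(ssims))
```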
Compared with the sub-pixel convolutional generative adversarial network SCGAN, the network model used in the present invention improves PSNR by 2.0018 dB and SSIM by 0.0609 on average on the MNIST dataset, improves PSNR by 1.0031 dB and SSIM by 0.0513 on average on the Fashion-MNIST dataset, and improves PSNR by 1.2301 dB and SSIM by 0.1123 on average on the CelebA dataset. The experimental results show that the method reconstructs better than current state-of-the-art deep compressed sensing algorithms.
Corresponding to the above image compressed sensing reconstruction method based on a Transformer generative adversarial network, this embodiment further provides an image compressed sensing reconstruction system based on a Transformer generative adversarial network, comprising an image sample set acquisition module, a network model construction module, a hyperparameter setting module, a training module and an image reconstruction module.

The image sample set acquisition module is configured to obtain an image sample set, divide it proportionally into a training set and a test set, and preprocess the images.

The network model construction module is configured to construct a Transformer generative adversarial network model according to the sampling rate.

The hyperparameter setting module is configured to set the hyperparameters of the Transformer generative adversarial network model and to select a loss function and an optimization method.

The training module is configured to train the network model on the image dataset at different sampling rates, learn the optimal parameters of the network model through the loss function and the optimization method, and obtain trained Transformer generative adversarial network models for the different sampling rates.

The image reconstruction module is configured to use the trained Transformer generative adversarial network model to perform image compressed sensing reconstruction, and to use evaluation metrics to verify the performance of the network.
It should be noted that, in this document, the terms "comprise", "include" or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, article or device comprising a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article or device.
Finally, it should be noted that the above descriptions are merely preferred embodiments of the present invention, intended only to illustrate its technical solution and not to limit its protection scope. Any modification, equivalent replacement, improvement, etc. made within the spirit and principles of the present invention is included in the protection scope of the present invention.
Claims (10)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211502231.3A CN116228520A (en) | 2022-11-28 | 2022-11-28 | Image Compression Sensing Reconstruction Method and System Based on Transformer Generative Adversarial Network |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211502231.3A CN116228520A (en) | 2022-11-28 | 2022-11-28 | Image Compression Sensing Reconstruction Method and System Based on Transformer Generative Adversarial Network |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116228520A true CN116228520A (en) | 2023-06-06 |
Family
ID=86589863
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211502231.3A Pending CN116228520A (en) | 2022-11-28 | 2022-11-28 | Image Compression Sensing Reconstruction Method and System Based on Transformer Generative Adversarial Network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116228520A (en) |
- 2022-11-28: application CN202211502231.3A filed in China; published as CN116228520A (status: Pending)
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117319656A (en) * | 2023-11-30 | 2023-12-29 | 广东工业大学 | Quantized signal reconstruction method based on depth expansion |
CN117319656B (en) * | 2023-11-30 | 2024-03-26 | 广东工业大学 | A quantitative signal reconstruction method based on depth expansion |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110211045B (en) | Super-resolution face image reconstruction method based on SRGAN network | |
CN112001847A (en) | Method for generating high-quality image by relatively generating antagonistic super-resolution reconstruction model | |
CN110021037A (en) | A kind of image non-rigid registration method and system based on generation confrontation network | |
CN110074813A (en) | A kind of ultrasonic image reconstruction method and system | |
CN113538616B (en) | A MRI Image Reconstruction Method Combined with PUGAN and Improved U-net | |
CN109375125B (en) | Compressed sensing magnetic resonance imaging reconstruction method for correcting regularization parameters | |
CN114170088A (en) | Relational reinforcement learning system and method based on graph structure data | |
CN113160057B (en) | RPGAN image super-resolution reconstruction method based on generative confrontation network | |
CN112419203B (en) | Diffusion-weighted image compression sensing restoration method and device based on confrontation network | |
CN113850883A (en) | Magnetic particle imaging reconstruction method based on attention mechanism | |
CN117974693B (en) | Image segmentation method, device, computer equipment and storage medium | |
Wang et al. | IKWI-net: A cross-domain convolutional neural network for undersampled magnetic resonance image reconstruction | |
CN113269256A (en) | Construction method and application of Misrc-GAN model | |
CN114612714A (en) | A no-reference image quality assessment method based on curriculum learning | |
CN110517249A (en) | Imaging method, device, equipment and medium of ultrasonic elasticity image | |
CN110097499A (en) | The single-frame image super-resolution reconstruction method returned based on spectrum mixed nucleus Gaussian process | |
CN114529519B (en) | Image compressed sensing reconstruction method and system based on multi-scale depth cavity residual error network | |
Zhu et al. | Super resolution reconstruction method for infrared images based on pseudo transferred features | |
CN116228520A (en) | Image Compression Sensing Reconstruction Method and System Based on Transformer Generative Adversarial Network | |
CN109544652A (en) | Add to weigh imaging method based on the nuclear magnetic resonance that depth generates confrontation neural network | |
CN112907748B (en) | A 3D Topography Reconstruction Method Based on Non-downsampling Shearlet Transform and Depth Image Texture Feature Clustering | |
CN115399780A (en) | Electrocardiosignal depth compression sensing method | |
CN112184552B (en) | Sub-pixel convolution image super-resolution method based on high-frequency feature learning | |
CN114022362A (en) | Image super-resolution method based on pyramid attention mechanism and symmetric network | |
CN117291803B (en) | PAMGAN lightweight facial super-resolution reconstruction method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication ||
SE01 | Entry into force of request for substantive examination ||