CN113393543A - Hyperspectral image compression method, device and equipment and readable storage medium


Info

Publication number
CN113393543A
Authority
CN
China
Prior art keywords
hyperspectral image
neural network
convolutional neural
image compression
training
Prior art date
Legal status
Granted
Application number
CN202110662427.8A
Other languages
Chinese (zh)
Other versions
CN113393543B (en)
Inventor
种衍文
郭圆圆
潘少明
Current Assignee
Wuhan University WHU
Original Assignee
Wuhan University WHU
Priority date
Filing date
Publication date
Application filed by Wuhan University WHU
Priority to CN202110662427.8A
Publication of CN113393543A
Application granted
Publication of CN113393543B
Legal status: Expired - Fee Related (current)
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 9/00 Image coding
    • G06T 9/002 Image coding using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods


Abstract

The invention provides a hyperspectral image compression method, a hyperspectral image compression device, hyperspectral image compression equipment and a readable storage medium. The method comprises the following steps: training a convolutional neural network through a training set, wherein the convolutional neural network comprises a nonlinear transformation module, a quantization module and an entropy model; and verifying the compression performance of the trained convolutional neural network by using a test set, and compressing the hyperspectral image by using the trained convolutional neural network when the compression performance of the trained convolutional neural network reaches the standard. The method achieves better rate-distortion performance for hyperspectral image compression.

Description

Hyperspectral image compression method, device and equipment and readable storage medium
Technical Field
The invention relates to the technical field of image processing, in particular to a hyperspectral image compression method, a hyperspectral image compression device, hyperspectral image compression equipment and a readable storage medium.
Background
The hyperspectral image has rich and unique spectral information, which brings great convenience to many applications based on hyperspectral images, such as crop classification, quality detection and disaster prediction. However, the resulting large data volume also restricts the further development of hyperspectral imaging under limited transmission bandwidth and storage capacity. Therefore, how to effectively address the various challenges brought by the large data volume of hyperspectral images is a precondition and key for hyperspectral images to be widely applied.
Among hyperspectral image compression algorithms, transform coding is widely applied due to its relatively low computational complexity and good adaptability. An image compression algorithm based on transform coding comprises four parts: transform, quantization, entropy coding and inverse transform, which together implement the encoding and decorrelation processes.
The transform maps an image from the pixel domain to a more compact space in a certain way. Existing transform-coding-based hyperspectral image compression methods generally assume that the hyperspectral image is a Gaussian source; under this assumption, pixels can be mapped into independent latent representations by a reversible linear transformation alone, and the latent variables are compressed into a code stream for storage and transmission through quantization and entropy coding. However, hyperspectral images of actual scenes have obvious non-Gaussian characteristics, so linear transformations are no longer applicable, and the exploration of nonlinear transformations provides new methods and ideas for this problem. In recent years, the development of nonlinear transformations using artificial neural networks, especially deep learning, as a tool has changed the situation of traditional manual parameter setting for image compression. Existing deep-learning-based image compression techniques have great potential, and their performance has exceeded the H.266/VVC (Versatile Video Coding) standard. However, these methods are mostly used to process three-band natural images, and relatively little work addresses the compression of hyperspectral images.
The transform enables quantization and entropy coding to be performed in a compact space. Compared with RGB natural images, the spectra of hyperspectral images have stronger correlation, so the latent representations obtained from hyperspectral images through the same transform have statistical properties different from those of RGB images. After quantization, the latent variable takes a discrete form and is then encoded by an entropy coding algorithm. The entropy coding process depends on a probability distribution model of the latent variables: the closer the designed entropy model is to the true latent-variable distribution, the smaller the code rate, and the closer the solution obtained in the rate-distortion optimization process is to the optimal one.
In view of the above analysis, current deep-learning-based compression techniques need a more flexible and accurate entropy model designed according to the characteristics of hyperspectral images, so as to reduce the mismatch between the entropy model and the true latent-variable distribution and thereby approach optimal rate-distortion performance.
Disclosure of Invention
In order to solve the technical problems, the invention provides a hyperspectral image compression method, a hyperspectral image compression device, hyperspectral image compression equipment and a readable storage medium.
In a first aspect, the present invention provides a hyperspectral image compression method, including:
training a convolutional neural network through a training set, wherein the convolutional neural network comprises a nonlinear transformation module, a quantization module and an entropy model;
and verifying the compression performance of the trained convolutional neural network by using the test set, and compressing the hyperspectral image by using the trained convolutional neural network when the compression performance of the trained convolutional neural network reaches the standard.
Optionally, before the step of training the convolutional neural network through the training set, the method further includes:
cropping a sample hyperspectral image in the spatial dimensions into a plurality of fixed-size cubic blocks;
and dividing the fixed-size cubic blocks into a training set and a test set according to a preset proportion.
Optionally, the nonlinear transformation module performs forward nonlinear transformation on the space and spectrum dimensions of the hyperspectral image to obtain a latent variable; the quantization module quantizes the latent variable by adding uniform noise; the entropy model is used for obtaining the probability distribution of the latent variable, so that the code word allocated to each element in the latent variable is determined based on the probability distribution during entropy coding.
Optionally, the convolutional neural network training process is constrained based on a rate-distortion criterion, and is used to determine parameter values in the nonlinear transformation module and the entropy model.
Optionally, the nonlinear transformation includes a forward transformation Y = g_a(W_a X + b_a) and an inverse transformation X̂ = g_s(W_s Ŷ + b_s), wherein X ∈ R^(H×W×B) represents the input hyperspectral image, X̂ ∈ R^(H×W×B) represents the reconstructed image, H, W and B respectively correspond to the rows, columns and number of bands of the hyperspectral image, Y ∈ R^(h×w×N) represents the latent variables, h, w and N respectively correspond to the rows, columns and number of filters of the latent variables, W_a and b_a represent the forward transformation network parameters, W_s and b_s represent the inverse transformation network parameters, g_a(·) represents the nonlinear forward transformation function, and g_s(·) represents the nonlinear inverse transformation function.
Optionally, the function for quantizing the latent variable by adding uniform noise is expressed as follows:

Ŷ = Y + U(-1/2, 1/2) during training; Ŷ = round(Y) during testing,

wherein training represents the training process, testing represents the testing process, U(-1/2, 1/2) represents unit uniform noise, round represents the rounding operation, and Ŷ represents the quantized latent variable.
Optionally, the statistical characteristics of the latent variables are introduced into the design of the entropy model, and an additional variable is introduced to construct a conditional model, so as to improve the precision of the entropy model.
In a second aspect, the present invention also provides a hyperspectral image compression apparatus comprising:
the training module is used for training a convolutional neural network through a training set, wherein the convolutional neural network comprises a nonlinear transformation module, a quantization module and an entropy model;
and the processing module is used for verifying the compression performance of the trained convolutional neural network by using the test set, and compressing the hyperspectral image by using the trained convolutional neural network when the compression performance of the trained convolutional neural network reaches the standard.
In a third aspect, the present invention further provides a hyperspectral image compression apparatus comprising a processor, a memory, and a hyperspectral image compression program stored on the memory and executable by the processor, wherein the hyperspectral image compression program, when executed by the processor, implements the steps of the hyperspectral image compression method as described above.
In a fourth aspect, the present invention further provides a readable storage medium, on which a hyperspectral image compression program is stored, where the hyperspectral image compression program, when executed by a processor, implements the steps of the hyperspectral image compression method as described above.
In the invention, a convolutional neural network is trained through a training set, wherein the convolutional neural network comprises a nonlinear transformation module, a quantization module and an entropy model; the compression performance of the trained convolutional neural network is verified by using the test set, and the hyperspectral image is compressed by the trained convolutional neural network when the compression performance reaches the standard. The method achieves better rate-distortion performance for hyperspectral image compression.
Drawings
FIG. 1 is a schematic diagram of a hardware structure of a hyperspectral image compression device according to an embodiment of the invention;
FIG. 2 is a schematic flow chart of an embodiment of a hyperspectral image compression method according to the invention;
FIG. 3 is a functional block diagram of an embodiment of the hyperspectral image compression apparatus of the invention.
The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
In a first aspect, an embodiment of the present invention provides a hyperspectral image compression apparatus, where the hyperspectral image compression apparatus may be an apparatus with a data processing function, such as a Personal Computer (PC), a notebook computer, and a server.
Referring to fig. 1, fig. 1 is a schematic diagram of a hardware structure of a hyperspectral image compression device according to an embodiment of the present invention. In an embodiment of the present invention, the hyperspectral image compression apparatus may include a processor 1001 (e.g., a Central Processing Unit, CPU), a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005. The communication bus 1002 is used for realizing connection communication among the components; the user interface 1003 may include a display screen (Display) and an input unit such as a keyboard (Keyboard); the network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a Wi-Fi interface); the memory 1005 may be a Random Access Memory (RAM) or a non-volatile memory, such as a magnetic disk memory, and the memory 1005 may optionally be a storage device independent of the processor 1001. Those skilled in the art will appreciate that the hardware configuration depicted in fig. 1 is not intended to limit the present invention, and may include more or fewer components than those shown, or some components in combination, or a different arrangement of components.
With continued reference to fig. 1, the memory 1005 in fig. 1, which is a type of computer storage medium, may include an operating system, a network communication module, a user interface module, and a hyperspectral image compression program. The processor 1001 may call the hyperspectral image compression program stored in the memory 1005 and execute the hyperspectral image compression method provided by the embodiment of the present invention.
In a second aspect, an embodiment of the present invention provides a hyperspectral image compression method.
In an embodiment, referring to fig. 2, fig. 2 is a schematic flowchart of an embodiment of a hyperspectral image compression method according to the invention. As shown in fig. 2, the hyperspectral image compression method includes:
step S10, training a convolutional neural network through a training set, wherein the convolutional neural network comprises a nonlinear transformation module, a quantization module and an entropy model;
in this embodiment, a training set is pre-constructed, and a convolutional neural network is trained through the training set, where the convolutional neural network includes a nonlinear transformation module, a quantization module, and an entropy model.
Further, in an embodiment, before step S10, the method further includes:
cropping a sample hyperspectral image in the spatial dimensions into a plurality of fixed-size cubic blocks; and dividing the fixed-size cubic blocks into a training set and a test set according to a preset proportion.
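As an illustration only, the cropping and splitting described above can be sketched in Python with NumPy; the 128-pixel block size matches the example given below, while the 9:1 split ratio and function names are assumptions, since the patent only specifies fixed-size blocks and a preset proportion.

import numpy as np

def crop_into_blocks(cube, block=128):
    # cube: one hyperspectral image of shape (H, W, B); crop only along the
    # spatial dimensions, keeping all B bands in every block.
    H, W, B = cube.shape
    blocks = []
    for r in range(0, H - block + 1, block):
        for c in range(0, W - block + 1, block):
            blocks.append(cube[r:r + block, c:c + block, :])
    return np.stack(blocks)

def split_train_test(blocks, train_ratio=0.9, seed=0):
    # Shuffle the fixed-size blocks and split them by a preset proportion.
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(blocks))
    n_train = int(train_ratio * len(blocks))
    return blocks[idx[:n_train]], blocks[idx[n_train:]]

# Example: one synthetic 512 x 512 x 31 scene yields 16 blocks of 128 x 128 x 31.
scene = np.random.rand(512, 512, 31).astype(np.float32)
train_set, test_set = split_train_test(crop_into_blocks(scene))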
In this embodiment, before training, a data set including a training set and a test set needs to be prepared, and the hyper-parameters of the convolutional neural network need to be set. For example, 27 (of 30) KAIST images with a size of 2704 × 3376 × 31 and 28 (of 32) CAVE images with a size of 512 × 512 × 31 are randomly cropped into image blocks with a size of 128 × 128 × 31. The model is trained with the TensorFlow framework: the cropped 128 × 128 × 31 image blocks are fed into the constructed network in batches (batch size 32) and training is iterated 500,000 times. The loss function used in training is

L = R + λ·D,

wherein the approximate posterior q(Ŷ|X) adopts a fully factorized unit uniform density function, so the corresponding first term of the loss function becomes a constant and is dropped; D denotes the distortion term, measured during training by the mean square error (MSE) loss with the parameter λ, whose value ranges from 0.00001 to 0.01 and controls the bppp (the number of bits occupied by each pixel of each band to be coded) to lie between 0.1 and 2 (the larger λ is, the larger the bppp is); and R denotes the total number of coded bits.
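A minimal NumPy sketch of this rate-distortion objective, including how bppp follows from the total bit count, is given below; the placeholder tensors and likelihood values are purely illustrative and do not come from the trained network.

import numpy as np

def rate_distortion_loss(x, x_hat, likelihoods, lam=0.001):
    # R: estimated total number of coded bits, -sum(log2 p) over all quantized
    # latent elements; D: mean square error between input and reconstruction.
    R = -np.sum(np.log2(likelihoods))
    D = np.mean((x - x_hat) ** 2)
    bppp = R / x.size              # bits per pixel per band
    return R + lam * D, bppp

# Toy usage with placeholder tensors (not outputs of the trained network).
x = np.random.rand(128, 128, 31)
x_hat = x + 0.01 * np.random.randn(128, 128, 31)
likelihoods = np.random.uniform(0.3, 0.9, size=(32, 32, 192))
loss, bppp = rate_distortion_loss(x, x_hat, likelihoods)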
Further, in one embodiment, the nonlinear transformation module performs forward nonlinear transformation on the space and spectrum dimensions of the hyperspectral image to obtain a latent variable; the quantization module quantizes the latent variable by adding uniform noise; the entropy model is used for obtaining the probability distribution of the latent variable, so that the code word allocated to each element in the latent variable is determined based on the probability distribution during entropy coding.
In this embodiment, the training set is input into the convolutional neural network, and the nonlinear transformation module in the convolutional neural network performs a forward nonlinear transformation over the spatial and spectral dimensions of the hyperspectral image, so that the image is mapped from the pixel domain to a compact latent space to obtain the latent variables. The quantization module in the convolutional neural network then quantizes the latent variables by adding uniform noise: because rounding makes the back-propagated gradient zero almost everywhere, the quantization is replaced by adding uniform noise during training so that training can proceed smoothly, while rounding is applied directly during testing. Finally, the probability distribution of the latent variables is obtained from the entropy model, so that during entropy coding the code word allocated to each element of the latent variables (namely, how many bits each element uses) is determined based on this probability distribution.
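For intuition, the link between the entropy model's probabilities and the per-element code length can be sketched as follows; the probabilities are placeholders, not outputs of the entropy model.

import numpy as np

# An element whose probability is p ideally costs about -log2(p) bits under
# arithmetic coding, so likely values receive short code words and rare
# values receive long ones.
probs = np.array([0.5, 0.25, 0.05, 0.9])
ideal_bits = -np.log2(probs)      # about [1.00, 2.00, 4.32, 0.15] bits
total_bits = ideal_bits.sum()     # rate contribution of these four elements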
Further, in one embodiment, the convolutional neural network training process is constrained based on a rate-distortion criterion to determine the nonlinear transformation module and the parameter values in the entropy model.
In this embodiment, the rate-distortion criterion is adopted to solve for the parameter values in the nonlinear transformation module and the entropy model. In the optimization process, the idea of variational inference is combined with rate-distortion, and the rate-distortion optimization is explained from a probabilistic perspective:

L = E[-log2 p(Ŷ)] + λ·d(X, X̂),

wherein X̂ represents the reconstructed image and d(·) represents the distortion measurement criterion. For hyperspectral images, the peak signal-to-noise ratio (PSNR) and the structural similarity (SSIM) are generally adopted to measure pixel distortion (the larger the value, the better the pixel reconstruction), the spectral angle (SAM) is used to measure the reconstruction accuracy of the spectrum, and λ represents the Lagrange multiplier.
In order to optimize the loss function, the idea of variational inference is adopted: an approximate posterior is designed to approach the true posterior, and the KL divergence is used to measure the discrepancy between the two posteriors, calculated as

D_KL( q(Ŷ, Ẑ | X) || p(Ŷ, Ẑ | X) ) = E_q[ log q(Ŷ, Ẑ | X) - log p(X | Ŷ) - log p(Ŷ | Ẑ) - log p(Ẑ) ] + const,

wherein q(Ŷ, Ẑ | X) represents the approximate posterior, which can be any simple distribution and in compression is typically taken as a fully factorized unit uniform distribution, so that this term can be removed from the loss function. Of the remaining three terms apart from the constant const, -log p(X | Ŷ) corresponds to the distortion, -log p(Ŷ | Ẑ) corresponds to the code rate, and -log p(Ẑ) corresponds to the additional (side) information.
Further, in an embodiment, the nonlinear transformation comprises a forward transformation Y = g_a(W_a X + b_a) and an inverse transformation X̂ = g_s(W_s Ŷ + b_s), wherein X ∈ R^(H×W×B) represents the input hyperspectral image, X̂ ∈ R^(H×W×B) represents the reconstructed image, H, W and B respectively correspond to the rows, columns and number of bands of the hyperspectral image, Y ∈ R^(h×w×N) represents the latent variables, h, w and N respectively correspond to the rows, columns and number of filters of the latent variables, W_a and b_a represent the forward transformation network parameters, W_s and b_s represent the inverse transformation network parameters, g_a(·) represents the nonlinear forward transformation function, and g_s(·) represents the nonlinear inverse transformation function.
Further, in one embodiment, the function for quantizing the latent variable by adding uniform noise is expressed as follows:

Ŷ = Y + U(-1/2, 1/2) during training; Ŷ = round(Y) during testing,

wherein training represents the training process, testing represents the testing process, U(-1/2, 1/2) represents unit uniform noise, round represents the rounding operation, and Ŷ represents the quantized latent variable.
In this embodiment, the latent variable is quantized: a unit uniform noise approximation is adopted during training, and rounding is adopted during testing.
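A minimal sketch of this quantizer, assuming NumPy; the function name is illustrative.

import numpy as np

def quantize(y, training, rng=None):
    # Training: add unit uniform noise U(-1/2, 1/2) so the operation remains
    # differentiable; testing: round to the nearest integer.
    if training:
        rng = np.random.default_rng() if rng is None else rng
        return y + rng.uniform(-0.5, 0.5, size=y.shape)
    return np.round(y)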
Further, in one embodiment, the statistical properties of the latent variables are introduced into the design of the entropy model, and at the same time an additional variable is introduced to construct a conditional model, so as to improve the accuracy of the entropy model.
In this embodiment, the statistical characteristics of the latent variables are introduced into the design of the entropy model to reduce the difference between the entropy model and the true latent-variable distribution; the smaller this difference is, the smaller the obtained code rate is, and a certain prior knowledge can be added to the latent representation to improve the precision of the entropy model. Here, a conditional model is constructed by introducing an additional variable, and the calculation formula is as follows:

R = E_{Ŷ~m}[ -log2 p(Ŷ | Ẑ) ],

wherein Ŷ represents the quantized latent representation, p(Ŷ | Ẑ) represents the conditional entropy model, Ẑ represents the additional variable serving as the prior information of the entropy model, and m represents the true distribution of the latent representation.
In the design of the entropy model, a statistical prior of the latent representation is added, and the parameters are solved by a convolutional neural network:

p(Ŷ | Ẑ) = f(Ŷ; θ(Ẑ)),

wherein f represents a distribution capable of describing the statistical characteristics of the latent representation: if the Gaussian characteristics are obvious, f can be taken as a Gaussian distribution; if the non-Gaussian characteristics are obvious, a t-distribution, a Laplace distribution or the like can be selected. The choice of f is determined by the statistical properties of the latent representation. The parameters θ of f are obtained by convolutional neural network learning, i.e., on the premise that the type of f is determined, the parameter information of the f distribution is learned from the additional variable.
After the hyperspectral image passes through the nonlinear transformation of the convolutional neural network, the distribution of the latent variables shows obvious non-Gaussian characteristics, so this prior information needs to be taken into account when designing the entropy model. In experiments the t-distribution was found to capture this characteristic well, so the t-distribution is selected to model the latent variables of the hyperspectral image.
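The choice of the t-distribution can be illustrated by fitting both a Gaussian and a t-distribution to latent samples and comparing their log-likelihoods; the synthetic heavy-tailed data below merely stands in for real latent values produced by the transform network.

import numpy as np
from scipy import stats

# Synthetic heavy-tailed samples standing in for real latent values.
latents = stats.t.rvs(df=5, size=10000, random_state=0)

df, loc, scale = stats.t.fit(latents)          # fit a t-distribution
mu, sigma = stats.norm.fit(latents)            # fit a Gaussian

ll_t = np.sum(stats.t.logpdf(latents, df, loc, scale))
ll_gauss = np.sum(stats.norm.logpdf(latents, mu, sigma))
# A clearly larger ll_t indicates heavy tails that the t-distribution captures
# better, which is the motivation for a t-based entropy model.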
In order to make the whole compression process differentiable, the quantization process adopts the additive unit uniform noise approximation; in order to make the entropy model better fit the posterior distribution, the entropy model is designed by convolving the distribution with a unit uniform distribution:

p(Ŷ | Ẑ) = ∏_i c_i(Ŷ_i), with c_i = t(η_i, ν) * U(-1/2, 1/2),

wherein η_i represents the scale parameter of the t-distribution (similar to, but not equal to, the variance), ν represents the degrees of freedom, by which the shape of the t-distribution can be adjusted, * denotes convolution, and c_i represents the analytic form of the probability distribution used by the entropy model. The parameter variables of the probability distribution are obtained from the additional variable Ẑ through the prior network.
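A sketch of how a t-distribution convolved with a unit uniform assigns probability (and hence an estimated bit cost) to quantized latent values, assuming SciPy; the per-element scales and the degrees of freedom below are illustrative values, as if predicted by the prior network.

import numpy as np
from scipy import stats

def t_uniform_likelihood(y_hat, scale, df, loc=0.0):
    # Probability mass of each quantized value under the t density convolved
    # with U(-1/2, 1/2): F(y_hat + 0.5) - F(y_hat - 0.5), F being the t CDF.
    upper = stats.t.cdf(y_hat + 0.5, df, loc=loc, scale=scale)
    lower = stats.t.cdf(y_hat - 0.5, df, loc=loc, scale=scale)
    return np.clip(upper - lower, 1e-9, 1.0)

# Toy check: integer latents with per-element scales.
y_hat = np.array([0.0, 1.0, -2.0, 5.0])
scale = np.array([1.0, 0.5, 2.0, 1.5])
p = t_uniform_likelihood(y_hat, scale, df=20)
estimated_bits = -np.sum(np.log2(p))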
Entropy coding adopts arithmetic coding. In the arithmetic coding process, the entropy model provides the probability distribution for both arithmetic encoding and arithmetic decoding, and the size of the code word allocated to each element (namely, how many bits each symbol occupies) is determined according to the probability distribution of each element of the latent variables. After entropy coding, the latent variables become a binary code stream for storage or transmission.
And step S20, verifying the compression performance of the trained convolutional neural network by using the test set, and compressing the hyperspectral image by the trained convolutional neural network when the compression performance of the trained convolutional neural network reaches the standard.
In this embodiment, the uniform-noise approximation of the quantization process is used during training, and rounding is used directly during testing. Entropy coding adopts common arithmetic coding, and the model is trained by minimizing the rate-distortion loss until convergence. During testing, the whole full-size image is fed in directly. On the CAVE data set, with 21 degrees of freedom, the method achieves a bppp of 0.1219 with a PSNR of 36.74 dB, an SSIM of 0.9175 and a SAM of 0.2137. On the KAIST data set, with 20 degrees of freedom, it achieves a PSNR of 39.99 dB, an SSIM of 0.9524 and a SAM of 0.2331 at a bppp of 0.0885.
If a user needs to use the image information, the binary code stream can be restored to latent variables through arithmetic decoding, and the latent variables are then input into an inverse transformation network composed of two spatial-spectral modules; in the inverse transformation network the spatial-spectral modules are connected by IGDN, and up-sampling restores the original image size. The compression framework is thus divided into four parts: the transformation network, quantization, entropy coding and the inverse transformation network.
In this embodiment, to address the anisotropy of the hyperspectral image, a spatial and spectral convolution module (SS module, including SS module_down for constructing the encoding network and SS module_up for constructing the decoding network) is proposed, and the stages of SS module_down and SS module_up are connected by GDN. In SS module_down, for an image tensor with spectral dimension B (B × H × W), the first down-sampling layer applies 5 × B filters and generates N feature representations; after GDN and another down-sampling, 5 × N convolution layers generate B feature representations, and N feature representations are then generated through 1 × B convolution layers. SS module_up is similar to SS module_down, but down-sampling is replaced with up-sampling. When the spectral module is designed, the spectral dimension of the hyperspectral image is introduced into the network, realizing a rearrangement of the spectral information and thereby reducing the correlation among the spectra. For the non-Gaussian characteristics of the latent representation of the hyperspectral image, the traditional Gaussian assumption is not adopted in the design of the entropy model; instead, non-Gaussian distributions are introduced as the statistical prior of the latent representation to improve the match between the entropy model and the statistical distribution of the latent representation. Fits to the latent variables of the hyperspectral data sets CAVE and KAIST show that the t-distribution performs well; moreover, its shape can be flexibly changed by adjusting the degrees of freedom, and when the degrees of freedom tend to infinity the t-distribution becomes equivalent to the Gaussian distribution. This property enables the t-distribution to capture both the non-Gaussian property of the latent representation and the generality of the Gaussian distribution.
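A rough TensorFlow/Keras sketch of the SS module_down pattern described above is given below; the kernel sizes, the number of filters N = 192, and the use of a plain ReLU in place of GDN (GDN is not in core Keras; implementations exist, e.g., in the tensorflow-compression package) are assumptions, not the patent's exact architecture.

import tensorflow as tf

def ss_module_down(B, N, act="relu"):
    # Sketch of the encoder-side spatial/spectral block: downsample to N
    # feature maps, downsample again to B feature maps, then mix spectrally
    # with a 1x1 convolution back to N feature maps.
    return tf.keras.Sequential([
        tf.keras.layers.Conv2D(N, 5, strides=2, padding="same", activation=act),
        tf.keras.layers.Conv2D(B, 5, strides=2, padding="same", activation=act),
        tf.keras.layers.Conv2D(N, 1, padding="same"),
    ])

# Example: a 128 x 128 x 31 block is mapped to a 32 x 32 x 192 latent tensor.
x = tf.random.uniform((1, 128, 128, 31))
y = ss_module_down(B=31, N=192)(x)    # y.shape == (1, 32, 32, 192)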
In this embodiment, a convolutional neural network is trained through a training set, wherein the convolutional neural network includes a nonlinear transformation module, a quantization module and an entropy model; the compression performance of the trained convolutional neural network is verified by using the test set, and the hyperspectral image is compressed by the trained convolutional neural network when the compression performance reaches the standard. Through this embodiment, better rate-distortion performance is achieved for hyperspectral image compression.
In a third aspect, an embodiment of the present invention further provides a hyperspectral image compression apparatus.
In an embodiment, referring to fig. 3, fig. 3 is a functional module schematic diagram of an embodiment of a hyperspectral image compression device according to the invention. As shown in fig. 3, the hyperspectral image compression apparatus includes:
the training module 10 is used for training a convolutional neural network through a training set, wherein the convolutional neural network comprises a nonlinear transformation module, a quantization module and an entropy model;
and the processing module 20 is configured to verify the compression performance of the trained convolutional neural network by using the test set, and compress the hyperspectral image by using the trained convolutional neural network when the compression performance of the trained convolutional neural network meets the standard.
Further, in an embodiment, the hyperspectral image compression apparatus further includes a construction module configured to:
cropping a sample hyperspectral image in the spatial dimensions into a plurality of fixed-size cubic blocks;
and dividing the fixed-size cubic blocks into a training set and a test set according to a preset proportion.
Further, in one embodiment, the nonlinear transformation module performs forward nonlinear transformation on the space and spectrum dimensions of the hyperspectral image to obtain a latent variable; the quantization module quantizes the latent variable by adding uniform noise; the entropy model is used for obtaining the probability distribution of the latent variable, so that the code word allocated to each element in the latent variable is determined based on the probability distribution during entropy coding.
Further, in one embodiment, the convolutional neural network training process is constrained based on a rate-distortion criterion to determine the nonlinear transformation module and the parameter values in the entropy model.
Further, in an embodiment, the nonlinear transformation comprises a forward transformation Y = g_a(W_a X + b_a) and an inverse transformation X̂ = g_s(W_s Ŷ + b_s), wherein X ∈ R^(H×W×B) represents the input hyperspectral image, X̂ ∈ R^(H×W×B) represents the reconstructed image, H, W and B respectively correspond to the rows, columns and number of bands of the hyperspectral image, Y ∈ R^(h×w×N) represents the latent variables, h, w and N respectively correspond to the rows, columns and number of filters of the latent variables, W_a and b_a represent the forward transformation network parameters, W_s and b_s represent the inverse transformation network parameters, g_a(·) represents the nonlinear forward transformation function, and g_s(·) represents the nonlinear inverse transformation function.
Further, in one embodiment, the function for quantizing the latent variable by adding uniform noise is expressed as follows:

Ŷ = Y + U(-1/2, 1/2) during training; Ŷ = round(Y) during testing,

wherein training represents the training process, testing represents the testing process, U(-1/2, 1/2) represents unit uniform noise, round represents the rounding operation, and Ŷ represents the quantized latent variable.
Further, in one embodiment, the statistical properties of the latent variables are introduced into the design of the entropy model, and at the same time an additional variable is introduced to construct a conditional model, so as to improve the accuracy of the entropy model.
The function implementation of each module in the hyperspectral image compression device corresponds to each step in the hyperspectral image compression method embodiment, and the functions and the implementation process are not described in detail herein.
In a fourth aspect, the embodiment of the present invention further provides a readable storage medium.
The readable storage medium of the present invention stores a hyper-spectral image compression program, wherein the hyper-spectral image compression program, when executed by a processor, implements the steps of the hyper-spectral image compression method as described above.
The method implemented when the hyper-spectral image compression program is executed may refer to each embodiment of the hyper-spectral image compression method of the present invention, and details are not repeated here.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or system that comprises the element.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) as described above and includes instructions for causing a terminal device to execute the method according to the embodiments of the present invention.
The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (10)

1. A hyperspectral image compression method is characterized by comprising the following steps:
training a convolutional neural network through a training set, wherein the convolutional neural network comprises a nonlinear transformation module, a quantization module and an entropy model;
and verifying the compression performance of the trained convolutional neural network by using the test set, and compressing the hyperspectral image by using the trained convolutional neural network when the compression performance of the trained convolutional neural network reaches the standard.
2. The hyperspectral image compression method of claim 1, further comprising, prior to the step of training the convolutional neural network through a training set:
cutting a sample hyperspectral image into a plurality of cubic blocks with fixed sizes in a spatial dimension;
and dividing the cubic blocks with fixed sizes into a training set and a testing set according to a preset proportion.
3. The hyperspectral image compression method according to claim 2, wherein the nonlinear transformation module performs forward nonlinear transformation on the space and spectrum dimensions of the hyperspectral image to obtain latent variables; the quantization module quantizes the latent variable by adding uniform noise; the entropy model is used for obtaining the probability distribution of the latent variable, so that the code word allocated to each element in the latent variable is determined based on the probability distribution during entropy coding.
4. The hyperspectral image compression method of claim 3, wherein the convolutional neural network training process is constrained based on a rate-distortion criterion to determine parameter values in the nonlinear transformation module and the entropy model.
5. The hyperspectral image compression method of claim 4, wherein the nonlinear transformation comprises a forward transformation Y = g_a(W_a X + b_a) and an inverse transformation X̂ = g_s(W_s Ŷ + b_s), wherein X ∈ R^(H×W×B) represents the input hyperspectral image, X̂ ∈ R^(H×W×B) represents the reconstructed image, H, W and B respectively correspond to the rows, columns and number of bands of the hyperspectral image, Y ∈ R^(h×w×N) represents the latent variables, h, w and N respectively correspond to the rows, columns and number of filters of the latent variables, W_a and b_a represent the forward transformation network parameters, W_s and b_s represent the inverse transformation network parameters, g_a(·) represents the nonlinear forward transformation function, and g_s(·) represents the nonlinear inverse transformation function.
6. The hyperspectral image compression method according to claim 5, wherein the function for quantizing the latent variable by adding uniform noise is expressed as follows:

Ŷ = Y + U(-1/2, 1/2) during training; Ŷ = round(Y) during testing,

wherein training represents the training process, testing represents the testing process, U(-1/2, 1/2) represents unit uniform noise, round represents the rounding operation, and Ŷ represents the quantized latent variable.
7. The hyperspectral image compression method according to claim 6, wherein the statistical properties of latent variables are introduced into the design of the entropy model, and simultaneously, additional variables are introduced to construct a conditional model so as to improve the accuracy of the entropy model.
8. A hyperspectral image compression apparatus, characterized in that the hyperspectral image compression apparatus comprises:
the training module is used for training a convolutional neural network through a training set, wherein the convolutional neural network comprises a nonlinear transformation module, a quantization module and an entropy model;
and the processing module is used for verifying the compression performance of the trained convolutional neural network by using the test set, and compressing the hyperspectral image by using the trained convolutional neural network when the compression performance of the trained convolutional neural network reaches the standard.
9. A hyperspectral image compression apparatus comprising a processor, a memory, and a hyperspectral image compression program stored on the memory and executable by the processor, wherein the hyperspectral image compression program when executed by the processor implements the steps of the hyperspectral image compression method according to any of claims 1 to 7.
10. A readable storage medium having stored thereon a hyper-spectral image compression program, wherein the hyper-spectral image compression program when executed by a processor implements the steps of the hyper-spectral image compression method according to any one of claims 1 to 7.
CN202110662427.8A 2021-06-15 2021-06-15 Hyperspectral image compression method, device and equipment and readable storage medium Expired - Fee Related CN113393543B (en)

Priority Applications (1)

Application Number: CN202110662427.8A (granted as CN113393543B); Priority Date: 2021-06-15; Filing Date: 2021-06-15; Title: Hyperspectral image compression method, device and equipment and readable storage medium

Applications Claiming Priority (1)

Application Number: CN202110662427.8A (granted as CN113393543B); Priority Date: 2021-06-15; Filing Date: 2021-06-15; Title: Hyperspectral image compression method, device and equipment and readable storage medium

Publications (2)

CN113393543A, published 2021-09-14
CN113393543B, published 2022-07-01

Family

ID=77621105

Family Applications (1)

Application Number: CN202110662427.8A (granted as CN113393543B); Status: Expired - Fee Related

Country Status (1)

Country Link
CN (1) CN113393543B (en)


Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110007819A1 (en) * 2009-07-10 2011-01-13 Wei Chen Method and System for Compression of Hyperspectral or Multispectral Imagery with a Global Optimal Compression Algorithm (GOCA)
EP2632161A1 (en) * 2012-02-24 2013-08-28 Raytheon Company Hyperspectral image compression
EP3611700A1 (en) * 2018-08-14 2020-02-19 Canon Kabushiki Kaisha Image processing apparatus, image processing method, and storage medium
US20200160565A1 (en) * 2018-11-19 2020-05-21 Zhan Ma Methods And Apparatuses For Learned Image Compression
WO2020199468A1 (en) * 2019-04-04 2020-10-08 平安科技(深圳)有限公司 Image classification method and device, and computer readable storage medium
CN110348487A (en) * 2019-06-13 2019-10-18 武汉大学 A kind of method for compressing high spectrum image and device based on deep learning
CN110880194A (en) * 2019-12-03 2020-03-13 山东浪潮人工智能研究院有限公司 Image compression method based on convolutional neural network
CN111683250A (en) * 2020-05-13 2020-09-18 武汉大学 Generation type remote sensing image compression method based on deep learning
CN112149652A (en) * 2020-11-27 2020-12-29 南京理工大学 Space-spectrum joint depth convolution network method for lossy compression of hyperspectral image
CN112734867A (en) * 2020-12-17 2021-04-30 南京航空航天大学 Multispectral image compression method and system based on space spectrum feature separation and extraction

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
YUN DING: "Global Consistent Graph Convolutional Network for Hyperspectral Image Classification", 《IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT》 *
于恒 et al., "A survey of image compression algorithms based on deep learning", Computer Engineering and Applications *
种衍文 et al., "Hyperspectral image compression based on block sparse representation patterns", Journal of Huazhong University of Science and Technology (Natural Science Edition) *

Also Published As

CN113393543B, published 2022-07-01

Similar Documents

Publication Publication Date Title
US11153566B1 (en) Variable bit rate generative compression method based on adversarial learning
Nash et al. Generating images with sparse representations
Ballé et al. Nonlinear transform coding
Cai et al. Efficient variable rate image compression with multi-scale decomposition network
Jiang et al. Optimizing multistage discriminative dictionaries for blind image quality assessment
US11221990B2 (en) Ultra-high compression of images based on deep learning
US11983906B2 (en) Systems and methods for image compression at multiple, different bitrates
Lu et al. Learning a deep vector quantization network for image compression
WO2020261314A1 (en) Image encoding method and image decoding method
Ororbia et al. Learned neural iterative decoding for lossy image compression systems
Ahanonu Lossless image compression using reversible integer wavelet transforms and convolutional neural networks
CN115361559A (en) Image encoding method, image decoding method, image encoding device, image decoding device, and storage medium
Han et al. Toward variable-rate generative compression by reducing the channel redundancy
Thakker et al. Lossy Image Compression-A Comparison Between Wavelet Transform, Principal Component Analysis, K-Means and Autoencoders
CN113393543B (en) Hyperspectral image compression method, device and equipment and readable storage medium
WO2023118317A1 (en) Method and data processing system for lossy image or video encoding, transmission and decoding
Ororbia et al. Learned iterative decoding for lossy image compression systems
Seiffert ANNIE—Artificial Neural Network-based Image Encoder
Lyu et al. Statistically and perceptually motivated nonlinear image representation
CN115361555A (en) Image encoding method, image encoding device, and computer storage medium
Sinha et al. Self-supervised variable rate image compression using visual attention
Sahu et al. Image compression methods using dimension reduction and classification through PCA and LDA: A review
Aidini et al. Tensor decomposition learning for compression of multidimensional signals
Al Falahi et al. Comparitive Analysis and Findings on Dct & Lbg Compression Techniques
CN117319655B (en) Image compression processing method, system, device and medium

Legal Events

Code: Description
PB01: Publication
SE01: Entry into force of request for substantive examination
GR01: Patent grant
CF01: Termination of patent right due to non-payment of annual fee

Granted publication date: 2022-07-01