CN113962882A - JPEG image compression artifact eliminating method based on controllable pyramid wavelet network - Google Patents

JPEG image compression artifact eliminating method based on controllable pyramid wavelet network

Info

Publication number
CN113962882A
Authority
CN
China
Prior art keywords
network
channel
image
cbcr
wavelet
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111155935.3A
Other languages
Chinese (zh)
Other versions
CN113962882B (en)
Inventor
张译
禹冬晔
牟轩沁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian Jiaotong University
Original Assignee
Xian Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xian Jiaotong University filed Critical Xian Jiaotong University
Priority to CN202111155935.3A priority Critical patent/CN113962882B/en
Publication of CN113962882A publication Critical patent/CN113962882A/en
Application granted granted Critical
Publication of CN113962882B publication Critical patent/CN113962882B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 - Image enhancement or restoration
    • G06T5/77 - Retouching; Inpainting; Scratch removal
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 - Geometric image transformations in the plane of the image
    • G06T3/40 - Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4038 - Image mosaicing, e.g. composing plane images from plane sub-images
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 - Image analysis
    • G06T7/90 - Determination of colour characteristics
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T9/00 - Image coding
    • G06T9/002 - Image coding using neural networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/10 - Image acquisition modality
    • G06T2207/10024 - Color image
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/20 - Special algorithmic details
    • G06T2207/20081 - Training; Learning
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/20 - Special algorithmic details
    • G06T2207/20084 - Artificial neural networks [ANN]
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T - CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 - Road transport of goods or passengers
    • Y02T10/10 - Internal combustion engine [ICE] based vehicles
    • Y02T10/40 - Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Compression Of Band Width Or Redundancy In Fax (AREA)

Abstract

The invention discloses a JPEG image compression artifact elimination method based on a controllable pyramid wavelet network. The method first extracts image features related to the compression level, then uses these features to guide the restoration of the Y-channel image, next uses the features together with the restored Y channel to guide the restoration of the CbCr-channel image, and finally transforms the image back to RGB space to obtain the final restoration result. The method requires no prediction of image coding parameters and restores images of many different compression levels well, and each restoration network needs only a single trained network model. The model uses skip connections to avoid the gradient vanishing and gradient explosion problems that may occur during training, and uses a shared-parameter recursive-module strategy to reduce model complexity, ensuring efficient operation of the algorithm. The method therefore offers a simple model, a small number of parameters, a wide application range and a remarkable restoration effect.

Description

JPEG image compression artifact eliminating method based on controllable pyramid wavelet network
Technical Field
The invention belongs to the field of image processing, and particularly relates to a JPEG image compression artifact elimination method based on a controllable pyramid wavelet network.
Background
Due to limitations of transmission bandwidth and storage capacity, images and videos captured by cameras need to be compressed before use, and lossy compression methods represented by JPEG are now widely applied throughout the image-processing pipeline. Because high-frequency information is lost in the quantization stage, a compressed image may contain compression artifacts such as blocking, ringing and blurring, which not only reduce the perceived quality of the image but also degrade the performance of computer vision algorithms that take compressed images as input. Designing fast and effective restoration algorithms for JPEG-compressed images therefore has broad application prospects and practical value.
Currently, JPEG image compression artifact removal techniques can be roughly classified into the following three types:
1) Filter-based methods, which eliminate compression artifacts by filtering along block boundaries in the spatial or frequency domain. Spatial-domain filtering typically selects an appropriate filter near block boundaries according to the characteristics of different image regions, i.e., spatially adaptive filtering. More sophisticated filtering methods were later developed, including shifted-window filtering of image blocks, nonlinear filtering, adaptive non-local means filtering and adaptive bilateral filtering. Frequency-domain filtering restores image detail mainly by adjusting discrete cosine transform (DCT) coefficients.
2) Inverse-problem-based methods, which treat compression-artifact removal as the optimization and solution of an inverse problem and recover the original image using prior knowledge about images. Typical image priors include low-rank priors, quantization-constraint priors, non-local similarity and sparse-representation priors. Some methods combine several priors to obtain an optimal solution to the inverse problem. Because of the complex optimization involved, most prior-based algorithms are time-consuming.
3) Machine-learning-based methods, which learn an image mapping/transformation from a large number of original and compressed image pairs and use it to map compressed images back to original images. Typical methods employ convolutional neural networks (CNNs), such as ARCNN, TNRD, DnCNN, CAS-CNN, MemNet, S-Net, deep convolutional sparse coding (DCSC) networks and generative adversarial network (GAN) models. Some methods (e.g., DMCNN, DDCN, MWCNN, DPW-SDNet) use CNNs to restore the image in both the spatial and frequency domains, obtaining better restoration performance.
Among the three categories, filter-based methods yield poor restoration performance, while inverse-problem-based methods have high computational complexity and are slow. By comparison, with the development of GPU parallel computing, machine-learning-based methods achieve both better restoration performance and faster speed. However, most current machine-learning methods require predicting the coding information of the compressed image and are effective only for images of certain compression levels, which limits their range of application. DnCNN overcomes these limitations by adjusting its training data, but its restoration performance is mediocre and it works only on grayscale images. Other methods handle multiple compression levels by training multiple network models, but multiple models occupy more storage space. What is needed, therefore, is a unified network model that requires no prior knowledge of the compressed image's coding information, is effective for grayscale and color images across compression levels, and occupies little storage space, making it easy to deploy on small devices.
Disclosure of Invention
The invention aims to overcome the above defects by providing a JPEG image compression artifact elimination method based on a controllable pyramid wavelet network. The Y-channel restoration network of the invention comprises six recursion modules and takes multi-scale, multi-direction wavelet coefficients as input to predict the wavelet coefficients of the original image. The CbCr-channel restoration network comprises six recursive U-Net structures and takes the CbCr channels of the compressed image and the restored Y channel as input to obtain the restored CbCr-channel image.
In order to achieve the above object, the method comprises the following steps:
step one, converting an input JPEG compressed image into the YCbCr color space;
step two, inputting the Y channel of the JPEG compressed image into a QF prediction network, extracting nonlinear feature maps with cascaded convolutional layers, and rescaling the nonlinear feature maps along the third dimension before they are input into the restoration networks;
step three, decomposing the Y channel of the JPEG compressed image into three scales and eight directions through a controllable pyramid wavelet transform to generate one high-pass sub-band, twenty-four band-pass sub-bands and one low-pass sub-band, sending all sub-band coefficients, together with the related feature maps from the QF prediction network and the compressed images at the corresponding scales, into a Y-channel restoration network, and applying the inverse wavelet transform to the sub-band coefficients output by the Y-channel restoration network to obtain the restored Y-channel image in the spatial domain;
step four, inputting the nonlinear feature maps from the QF prediction network, the Y-channel image restored in step three, and the CbCr channels of the compressed image into a CbCr-channel restoration network, thereby obtaining the restored CbCr-channel image;
step five, converting the YCbCr-channel image into the RGB color space to obtain the restored color image.
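For illustration only, the five steps can be sketched in PyTorch-style Python as follows; qf_net, y_net and cbcr_net are hypothetical stand-ins for the three networks described below, while the color conversions follow the standard full-range JPEG YCbCr definition:

    import torch

    def rgb_to_ycbcr(rgb):                         # rgb: (B, 3, H, W), values in [0, 1]
        r, g, b = rgb[:, 0:1], rgb[:, 1:2], rgb[:, 2:3]
        y  =  0.299 * r + 0.587 * g + 0.114 * b
        cb = -0.168736 * r - 0.331264 * g + 0.5 * b + 0.5
        cr =  0.5 * r - 0.418688 * g - 0.081312 * b + 0.5
        return y, torch.cat([cb, cr], dim=1)

    def ycbcr_to_rgb(y, cbcr):
        cb, cr = cbcr[:, 0:1] - 0.5, cbcr[:, 1:2] - 0.5
        r = y + 1.402 * cr
        g = y - 0.344136 * cb - 0.714136 * cr
        b = y + 1.772 * cb
        return torch.cat([r, g, b], dim=1)

    def restore(jpeg_rgb, qf_net, y_net, cbcr_net):
        y, cbcr = rgb_to_ycbcr(jpeg_rgb)                        # step one
        qf_feats = qf_net(y)                                    # step two: QF-related features
        y_restored = y_net(y, qf_feats)                         # step three: wavelet-domain restoration
        cbcr_restored = cbcr_net(cbcr, y_restored, qf_feats)    # step four
        return ycbcr_to_rgb(y_restored, cbcr_restored).clamp(0, 1)  # step five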
The structure of the QF prediction network is as follows:
[The QF prediction network structure is given as Table 1 in the Description; it is reproduced only as an image in this record.]
the size of the input image block is 128 x 128 pixels.
In the second step, the outputs of the fourth, sixth and eighth convolutional layers of the QF prediction network are input into the Y-channel and CbCr-channel restoration networks after channel-number adjustment and rescaling.
In the third step, the Y-channel restoration network comprises six recursion modules; each recursion module comprises four parallel convolutional neural networks corresponding to image inputs of different scales. After feature-map concatenation and two convolutional layers, feature maps of different scales undergo up-sampling and down-sampling operations so that feature fusion is achieved in each network branch; the fused feature maps then pass through two convolutional layers and a parametric rectified linear unit (PReLU) to produce the network output.
In the third step, the nonlinear feature maps extracted by the QF prediction network comprise feature maps of four scales, which are fed into the corresponding four convolutional-network branches by concatenation with the wavelet-coefficient feature maps; compressed images of different scales are concatenated with the corresponding wavelet sub-band coefficients and serve as the input of each network branch of the recursion module. Because the wavelet transform coefficients are complex-valued, the real and imaginary parts are extracted separately, and the real-part and imaginary-part feature maps are concatenated as network input.
The real and imaginary parts of the wavelet coefficients output by the recursion module in the Y-channel restoration network are summed element-wise with the real and imaginary parts of the originally input wavelet coefficients to obtain the restored wavelet coefficients; this calculation is repeated six times, and the six recursion modules share the same network parameters.
In the fourth step, the CbCr-channel restoration network comprises six recursion modules with the same network structure; the input of each recursion module comprises the CbCr channels of the compressed image, the nonlinear feature maps extracted by the QF prediction network, and the restored Y-channel image obtained in step three. The CbCr-channel restoration network repeatedly predicts the CbCr-channel image through residual learning, using the restored Y-channel image and the related features extracted by the QF prediction network.
The network structure of a recursion module in the CbCr-channel restoration network comprises an encoder end and a decoder end; the encoder end performs three down-samplings and the decoder end performs three up-samplings, and feature fusion is achieved between encoder and decoder feature maps of the same scale through skip connections.
Each convolutional layer of the CbCr-channel restoration network outputs 64 channels, and at the encoder end the outputs of convolutional layers at different scales are concatenated with the features of the corresponding scale extracted by the QF prediction network.
The output of the recursion module is summed element-wise with the CbCr channels of the original compressed image to obtain the restored CbCr channels; this calculation is repeated six times, and the six recursion modules share the same network parameters.
Compared with the prior art, the invention uses a QF prediction network to extract image features related to the compression level, then restores the Y channel and the CbCr channels of the image with two restoration networks, and finally converts the image to RGB space to obtain the final restoration result. Because the QF prediction network extracts compression-level-related features that guide the restoration networks to adapt to different compression levels, the method needs no prediction of image coding information and handles images of many different compression levels well. Each restoration network requires only a single trained model; the model uses skip connections to avoid the gradient vanishing and explosion problems that may occur during training, and a shared-parameter recursive-module strategy to reduce model complexity, ensuring efficient operation of the algorithm. Extracting image features with the QF prediction network also increases the nonlinear capacity of the network, allowing it to learn more complex image mappings/transformations. The Y-channel restoration network uses the controllable pyramid wavelet transform to analyze the multi-scale, multi-directional spatial correlations of the image, and simultaneously analyzes the amplitude and phase information of the wavelet sub-band coefficients, yielding better restoration performance. Feeding the restored Y-channel image into the CbCr-channel restoration network allows the structure and texture of the luminance image to assist in restoring the color components.
Furthermore, the restoration networks of the invention use a shared-parameter recursive-module strategy, which reduces model complexity and ensures efficient operation of the algorithm.
Furthermore, the restoration networks of the invention use skip connections to increase network depth while avoiding the gradient vanishing and explosion problems that may occur during training.
Drawings
FIG. 1 is a process framework diagram of the present invention;
FIG. 2 is a schematic diagram of the QF-related feature extraction network of the present invention;
FIG. 3 is a network structure diagram of a recursion module in the Y-channel restoration network of the present invention;
FIG. 4 illustrates the method of stitching wavelet feature maps of different scales;
FIG. 5 is a network structure diagram of a recursion module in the CbCr-channel restoration network;
FIG. 6 compares the restoration performance of the method of the present invention with that of other methods on images of different compression levels; (a) shows the PSNR gain when different algorithms are used to restore the images; (b) shows the SSIM gain when different algorithms are used to restore the images.
Detailed Description
The invention is further described below with reference to the accompanying drawings.
In the description of the present invention, it is to be understood that the embodiments described herein are exemplary; the specific parameters given are intended only to describe the invention and are not intended to be limiting.
The method comprises the following steps:
Step one: the input JPEG compressed image is converted from the RGB color space to the YCbCr color space.
Step two: the Y channel of the compressed image is input into the QF prediction network, nonlinear feature maps are extracted with its cascaded convolutional layers, and these maps are rescaled along the third dimension (the channel number) before being input into the restoration networks.
The size of the input image block is 128 × 128, and the QF prediction network structure is shown in table 1.
TABLE 1 QF prediction network architecture model
[Table 1 is reproduced only as an image in this record.]
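Since Table 1 is not reproduced here, the following is one plausible sketch of the trunk, based solely on the layer dimensions stated in the next paragraph (second/fourth/sixth/eighth-layer outputs of W × H × 64, W/2 × H/2 × 128, W/4 × H/4 × 256 and W/8 × H/8 × 512); the 3 × 3 kernels, the placement of the stride-2 convolutions, the PReLU activations and the pooled regression head are all assumptions:

    import torch
    import torch.nn as nn

    class QFNet(nn.Module):
        # Hypothetical reconstruction: channel widths and downsampling points are
        # chosen so the 2nd/4th/6th/8th layer outputs match the stated dimensions.
        def __init__(self):
            super().__init__()
            cfg = [(1, 64, 1), (64, 64, 1),
                   (64, 128, 2), (128, 128, 1),
                   (128, 256, 2), (256, 256, 1),
                   (256, 512, 2), (512, 512, 1)]
            self.layers = nn.ModuleList(
                nn.Sequential(nn.Conv2d(cin, cout, 3, stride=s, padding=1),
                              nn.PReLU(init=0.1))
                for cin, cout, s in cfg)
            self.head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(512, 1))

        def forward(self, y):                      # y: (B, 1, 128, 128) Y-channel block
            feats = []
            for i, layer in enumerate(self.layers):
                y = layer(y)
                if i in (1, 3, 5, 7):              # 2nd, 4th, 6th, 8th layer outputs
                    feats.append(y)                # QF-related features before channel averaging
            return self.head(y), feats             # predicted QF value and feature maps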
QF-related features are extracted from the second, fourth, sixth and eighth convolutional layers of the QF prediction network, as shown in FIG. 2. For an input image block of W × H, the output feature maps of the second, fourth, sixth and eighth convolutional layers have dimensions W × H × 64, W/2 × H/2 × 128, W/4 × H/4 × 256 and W/8 × H/8 × 512, respectively. The channel numbers of the fourth-, sixth- and eighth-layer outputs are then adjusted to 64: for the W/2 × H/2 × 128 feature maps, every 2 adjacent feature maps are averaged; for the W/4 × H/4 × 256 feature maps, every 4 adjacent feature maps are averaged; and for the W/8 × H/8 × 512 feature maps, every 8 adjacent feature maps are averaged. The extracted QF-related features are denoted C1, f1, f2 and f3, with dimensions W × H × 64, W/2 × H/2 × 64, W/4 × H/4 × 64 and W/8 × H/8 × 64, respectively.
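The channel adjustment just described (averaging each group of 2, 4 or 8 adjacent feature maps so that every output has 64 channels) amounts to a single reshape-and-mean; a minimal sketch:

    import torch

    def reduce_to_64(feat):
        # feat: (B, C, H, W) with C in {64, 128, 256, 512}; averages each group of
        # C // 64 adjacent channels, so the result always has 64 channels.
        b, c, h, w = feat.shape
        return feat.view(b, 64, c // 64, h, w).mean(dim=2)

    f3 = torch.randn(1, 512, 16, 16)           # e.g. the W/8 x H/8 x 512 feature map
    print(reduce_to_64(f3).shape)              # torch.Size([1, 64, 16, 16])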
Step three: the Y-channel of the compressed image undergoes a controlled pyramid wavelet transform, decomposing at 3 scales and 8 directions, producing 1 high-pass sub-band, 24 band-pass sub-bands and 1 low-pass sub-band. The sub-band coefficients, related QF characteristics and original compressed images with different scales are input into a recovery network consisting of 6 recursive modules, and the sub-band coefficients output by the network are subjected to inverse wavelet transformation to a spatial domain, so that a recovered Y-channel image is obtained.
As shown in FIG. 1, the Y-channel image is first decomposed by the steerable pyramid wavelet transform (SPWT) into 3 scales and 8 directions, yielding 1 high-pass sub-band (H0), 24 band-pass sub-bands of different scales and directions (the sub-bands of the three scales are denoted B1, B2 and B3), and 1 low-pass sub-band (L0). H0, B1 and the Y-channel image are concatenated as the first-scale input of the recursion module; B2 is concatenated with the once-downsampled Y-channel image as the second-scale input; B3 is concatenated with the twice-downsampled Y-channel image as the third-scale input; and L0 is the fourth-scale input. For B1, B2 and B3, the real and imaginary parts of the wavelet coefficients are extracted separately, and the real-part and imaginary-part feature maps are concatenated. Thus, the numbers of input channels at the four scales of the recursion module are 18, 17, 17 and 1, respectively.
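For illustration, this per-scale input assembly can be sketched as follows, assuming an SPWT implementation that returns H0 and L0 as real tensors and B1-B3 as complex tensors with 8 orientation channels (the pyramid transform itself is not shown), and assuming average pooling as the downsampling operator for the Y image:

    import torch
    import torch.nn.functional as F

    def assemble_scale_inputs(y, h0, bands, l0):
        # y:  (B, 1, H, W) compressed Y channel; h0: (B, 1, H, W) real high-pass band
        # bands: [B1, B2, B3], complex tensors of shape (B, 8, H/2^k, W/2^k), k = 0, 1, 2
        # l0: (B, 1, H/8, W/8) real low-pass band
        def ri(c):                                 # complex -> stacked real/imag (16 channels)
            return torch.cat([c.real, c.imag], dim=1)
        s1 = torch.cat([h0, ri(bands[0]), y], dim=1)                  # 1 + 16 + 1 = 18
        s2 = torch.cat([ri(bands[1]), F.avg_pool2d(y, 2)], dim=1)     # 16 + 1 = 17
        s3 = torch.cat([ri(bands[2]), F.avg_pool2d(y, 4)], dim=1)     # 16 + 1 = 17
        s4 = l0                                                        # 1
        return s1, s2, s3, s4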
In the recursion module, the inputs of the four scales first undergo convolution and parametric rectified linear unit (PReLU) operations, producing 64-channel feature maps at the four scales. As shown in FIG. 3, these feature maps are concatenated with the corresponding QF features (C1, f1, f2 and f3), and after two convolution-plus-PReLU operations, feature maps of different scales are fused at each scale. As shown in FIG. 4, the feature maps of different scales are brought to a common size by mean pooling and deconvolution: a 1 × 1 convolution kernel keeps the spatial size of a feature map unchanged, a mean-pooling operation with scale factor 2 halves its height and width, a deconvolution with scale factor 2 doubles them, and repeated mean-pooling/deconvolution operations halve or double the size multiple times. Finally, the output of the recursion module is a wavelet-coefficient residual map at four scales, with 17, 16, 16 and 1 channels, respectively.
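A sketch of the three scale-matching operators, with assumed layer parameters (64-channel feature maps throughout):

    import torch
    import torch.nn as nn

    same  = nn.Conv2d(64, 64, kernel_size=1)                     # keeps H x W
    down2 = nn.AvgPool2d(kernel_size=2)                          # H x W -> H/2 x W/2
    up2   = nn.ConvTranspose2d(64, 64, kernel_size=2, stride=2)  # H x W -> 2H x 2W

    x_full = torch.randn(1, 64, 32, 32)        # scale-1 feature map
    x_half = torch.randn(1, 64, 16, 16)        # scale-2 feature map
    fused_at_scale1 = torch.cat([same(x_full), up2(x_half)], dim=1)    # both 32 x 32
    fused_at_scale2 = torch.cat([down2(x_full), same(x_half)], dim=1)  # both 16 x 16
    # larger scale gaps are bridged by chaining down2/up2, e.g. down2(down2(x_full))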
As shown in FIG. 1, the output of the recursion module is added pixel-by-pixel to the original wavelet sub-band coefficients, the result is fed into the recursion module again, and its output is again added pixel-by-pixel to the original wavelet sub-band coefficients. This calculation is repeated 6 times, and the 6 recursion modules share the same network parameters, reducing model complexity. Finally, the restored wavelet sub-band coefficients are transformed back to the spatial domain by the inverse wavelet transform (inverse SPWT), yielding the restored Y-channel image. As for parameter choices, all convolution kernels are 3 × 3 pixels except those of the first and last convolutional layers, which are 7 × 7 and 5 × 5 pixels, respectively. The first and last convolutional layers use mirror (reflection) padding of 3 and 2 pixels, respectively, and the remaining convolutional layers use 1-pixel zero padding, so that the input and output dimensions of every layer match. In addition, a PReLU follows every convolutional layer except the last to provide the nonlinearity.
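The padding and kernel conventions and the six-step shared-parameter residual recursion can be sketched as follows; RecursiveModule is a schematic single-branch stand-in for the four-branch module of FIG. 3, with the QF-feature inputs omitted for brevity:

    import torch
    import torch.nn as nn

    class RecursiveModule(nn.Module):
        def __init__(self, ch_in=18, ch_out=17):
            super().__init__()
            self.body = nn.Sequential(
                nn.ReflectionPad2d(3), nn.Conv2d(ch_in, 64, 7), nn.PReLU(init=0.1),  # first: 7x7, mirror-3
                nn.Conv2d(64, 64, 3, padding=1), nn.PReLU(init=0.1),                 # middle: 3x3, zero-1
                nn.ReflectionPad2d(2), nn.Conv2d(64, ch_out, 5))                     # last: 5x5, mirror-2, no PReLU
        def forward(self, x):
            return self.body(x)

    module = RecursiveModule()
    coeffs = torch.randn(1, 17, 64, 64)        # scale-1 coefficients (H0 + B1 real/imag)
    y_img  = torch.randn(1, 1, 64, 64)
    restored = coeffs
    for _ in range(6):                          # six passes through one shared module,
        restored = coeffs + module(torch.cat([restored, y_img], dim=1))  # residual added to originals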
Step four: and inputting the QF related features extracted in the second step, the Y-channel image restored in the third step and the CbCr channel of the compressed image into a CbCr channel restoration network comprising 6 recursion modules, thereby obtaining a restored CbCr channel image.
As shown in FIG. 1, the CbCr-channel restoration network adopts a structure similar to that of the Y-channel restoration network: the CbCr-channel image is predicted repeatedly through residual learning using the restored Y-channel image and the QF-related features, and all recursion modules share the same network parameters, reducing model complexity. As shown in FIG. 5, each recursion module adopts a U-Net-like structure, except that at the encoder end the outputs of convolutional layers at different scales are concatenated with the QF-related features of the corresponding scale, so that the network can adaptively restore images of different compression levels. Because the human eye is insensitive to color distortion, the outputs of all layers are set to 64 channels, i.e., the CbCr channels are restored with a lightweight network. As for parameter choices, similar to the Y-channel restoration network, all convolution kernels are 3 × 3 pixels except those of the first and last convolutional layers (7 × 7 and 5 × 5 pixels, respectively) and the deconvolution layers (2 × 2 pixels). The first and last convolutional layers use mirror padding of 3 and 2 pixels, respectively, and the remaining convolutional layers use 1-pixel zero padding, so that the input and output dimensions of every layer match. A PReLU follows every convolutional layer except the last.
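A schematic sketch of one CbCr recursion module under the constraints stated above (three mean-pool down-samplings, three 2 × 2 stride-2 deconvolutions, skip connections, 64 channels throughout, and QF features concatenated at each encoder scale); the number of convolutions per scale is an assumption:

    import torch
    import torch.nn as nn

    def conv(cin, cout):
        return nn.Sequential(nn.Conv2d(cin, cout, 3, padding=1), nn.PReLU(init=0.1))

    class CbCrModule(nn.Module):
        def __init__(self):
            super().__init__()
            self.head = nn.Sequential(nn.ReflectionPad2d(3),
                                      nn.Conv2d(2 + 1, 64, 7), nn.PReLU(init=0.1))   # CbCr + restored Y
            self.enc = nn.ModuleList([conv(64 + 64, 64) for _ in range(3)])   # + QF features
            self.down = nn.AvgPool2d(2)
            self.mid = conv(64 + 64, 64)                                      # bottleneck, fuses f3
            self.up = nn.ModuleList([nn.ConvTranspose2d(64, 64, 2, stride=2) for _ in range(3)])
            self.dec = nn.ModuleList([conv(64 + 64, 64) for _ in range(3)])   # + skip connections
            self.tail = nn.Sequential(nn.ReflectionPad2d(2), nn.Conv2d(64, 2, 5))  # CbCr residual

        def forward(self, cbcr, y, qf_feats):      # qf_feats: [C1, f1, f2, f3], 64 channels each
            x = self.head(torch.cat([cbcr, y], dim=1))
            skips = []
            for enc, f in zip(self.enc, qf_feats[:3]):
                x = enc(torch.cat([x, f], dim=1))  # fuse QF features at this encoder scale
                skips.append(x)
                x = self.down(x)
            x = self.mid(torch.cat([x, qf_feats[3]], dim=1))
            for up, dec, skip in zip(self.up, self.dec, reversed(skips)):
                x = dec(torch.cat([up(x), skip], dim=1))
            return self.tail(x)

    module = CbCrModule()
    cbcr = torch.randn(1, 2, 64, 64); y = torch.randn(1, 1, 64, 64)
    qf_feats = [torch.randn(1, 64, s, s) for s in (64, 32, 16, 8)]
    restored = cbcr
    for _ in range(6):                              # shared-parameter residual recursion
        restored = cbcr + module(restored, y, qf_feats)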
The network training method can be summarized as follows:
1) The invention mainly comprises three networks: the QF prediction network, the Y-channel restoration network and the CbCr-channel restoration network. Because of the variable dependencies among them, the QF prediction network is trained first; then the QF prediction network, with parameters fixed, extracts the QF-related features of the compressed image blocks and the Y-channel restoration network is trained; finally, the QF prediction network and the Y-channel restoration network, both with parameters fixed, extract the QF-related features of the compressed image blocks and compute the restored Y-channel images, and the CbCr-channel restoration network is trained.
2) The QF prediction network is trained on the VOC2012 database. Specifically, each RGB image in the database is compressed with a random QF value, where QF is an integer in [5, 95], generating 12700 compressed images; the compressed images are then converted to YCbCr space, and 105234 non-overlapping 128 × 128 pixel image blocks are extracted from the Y-channel images for network training with an L1 loss function.
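This data generation can be sketched with PIL as follows; file handling and storage layout are assumptions:

    import io, random
    import numpy as np
    from PIL import Image

    def make_qf_training_blocks(path, block=128):
        # Compress one image at a random QF in [5, 95] and cut the decoded
        # Y channel into non-overlapping block x block patches.
        qf = random.randint(5, 95)
        buf = io.BytesIO()
        Image.open(path).convert("RGB").save(buf, format="JPEG", quality=qf)
        buf.seek(0)
        y = np.asarray(Image.open(buf).convert("YCbCr"), dtype=np.float32)[:, :, 0] / 255.0
        blocks = [y[i:i + block, j:j + block]
                  for i in range(0, y.shape[0] - block + 1, block)
                  for j in range(0, y.shape[1] - block + 1, block)]
        return qf, blocks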
3) The Y-channel and CbCr-channel restoration networks are trained on the Berkeley Segmentation Database (BSD), the DIV2K database, the Waterloo Exploration Database (WED) and the Flickr2K database, where BSD contributes 400 images (200 training and 200 test images), DIV2K 900 images, WED 4744 images and Flickr2K 2000 images. Specifically, each reference image is compressed with a random QF value, where QF is an integer drawn from {10:20, 22:2:30, 35:5:60, 70:10:90} (MATLAB range notation), generating 8044 compressed images; the reference and compressed images are then converted to YCbCr space, and 583625 non-overlapping 128 × 128 pixel image blocks are extracted for network training. Training the Y-channel restoration network uses Y-channel image blocks as training data, with a loss function that is a linear combination of the pixel mean-square-error loss (l_MSE) and the structural-similarity loss (l_SSIM), i.e., L = l_MSE + λ·l_SSIM with λ = 0.001; training the CbCr-channel restoration network uses CbCr-channel image blocks as training data, with the pixel mean-square-error loss.
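The MATLAB-style range notation {10:20, 22:2:30, 35:5:60, 70:10:90} expands to 25 distinct QF values, e.g. in Python:

    qfs = (list(range(10, 21)) + list(range(22, 31, 2))
           + list(range(35, 61, 5)) + list(range(70, 91, 10)))
    # [10, ..., 20, 22, 24, 26, 28, 30, 35, 40, 45, 50, 55, 60, 70, 80, 90]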
4) The pixel mean-square-error loss (l_MSE) is calculated as

    l_MSE = (1 / (W · H)) · Σ_{i=1..W} Σ_{j=1..H} [ I(i, j) − I_R(i, j) ]²

where I(i, j) and I_R(i, j) denote the pixel values of the reference image I and the restored image I_R at spatial position (i, j), and W and H denote the width and height of the image. The structural-similarity loss (l_SSIM) is calculated as

    l_SSIM = 1 − SSIM(I, I_R)

where SSIM(I, I_R) is calculated as

    SSIM(I, I_R) = [ (2 · μ_I · μ_{I_R} + C1) · (2 · σ_{I,I_R} + C2) ] / [ (μ_I² + μ_{I_R}² + C1) · (σ_I² + σ_{I_R}² + C2) ]

in which μ_I and μ_{I_R} denote the local means of I and I_R, σ_I and σ_{I_R} their local standard deviations, σ_{I,I_R} their local covariance, and C1 and C2 are constants taking the same values as in the SSIM method.
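A sketch of the combined Y-channel loss implementing the formulas above, for image tensors in [0, 1]; the 8 × 8 uniform local window and the standard SSIM constants are assumptions, since the patent defers these details to the SSIM method:

    import torch

    def ssim(x, y, c1=0.01 ** 2, c2=0.03 ** 2, win=8):
        # x, y: (B, 1, H, W); local statistics via a uniform sliding window.
        pool = torch.nn.AvgPool2d(win, stride=1)
        mx, my = pool(x), pool(y)
        vx, vy = pool(x * x) - mx * mx, pool(y * y) - my * my
        cov = pool(x * y) - mx * my
        s = ((2 * mx * my + c1) * (2 * cov + c2)) / ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))
        return s.mean()

    def y_loss(restored, reference, lam=0.001):
        l_mse = torch.mean((restored - reference) ** 2)   # l_MSE
        l_ssim = 1.0 - ssim(restored, reference)          # l_SSIM
        return l_mse + lam * l_ssim                       # L = l_MSE + lambda * l_SSIM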
5) Experiments are performed with the PyTorch deep-learning framework on a workstation with an 8-core Intel i9-9900K 3.60 GHz CPU and an NVIDIA GeForce RTX 2080 SUPER GPU. Network parameters are initialized by sampling the normal distribution N(1, 0.02), and PReLU slopes are initialized to 0.1; optimization uses the Adam algorithm with an initial learning rate of 2 × 10⁻⁴ and first/second-order moment exponential decay rates of 0.9 and 0.999. When training the QF prediction network, the batch size is 64 and the learning rate is multiplied by 0.8 every epoch, for 120 epochs in total. When training the two restoration networks, the batch size is 4 and the learning rate is multiplied by 0.9 every 20000 iterations, for 4 epochs in total.
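The stated optimizer settings translate directly to PyTorch; the stand-in module is only for self-containedness, and the scheduler below is stepped once per iteration, as used for the restoration networks:

    import torch

    net = torch.nn.Conv2d(1, 1, 3, padding=1)     # stand-in for a restoration network
    opt = torch.optim.Adam(net.parameters(), lr=2e-4, betas=(0.9, 0.999))
    sched = torch.optim.lr_scheduler.StepLR(opt, step_size=20000, gamma=0.9)
    # per iteration: opt.zero_grad(); loss.backward(); opt.step(); sched.step()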
The network testing method can be summarized as follows:
1) Recovery of compressed images of known QF values
Reference images from the LIVE, CSIQ, BSD100 (the 100 images of the BSD validation set) and Urban100 databases are selected for performance testing. Specifically, each reference image of each database is JPEG-compressed at eight different compression levels, with QF values of 10, 20, 30, 40, 50, 60, 70 and 80; the compressed images are then restored with different algorithms/network models; finally, the restored images are evaluated with two indices, peak signal-to-noise ratio (PSNR) and structural similarity (SSIM). Table 2 compares the restoration performance of the invention with that of other methods on JPEG-compressed images with known QF values.
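For reference, a minimal PSNR implementation for images in [0, 1] (SSIM can be computed as in the loss sketch above):

    import torch

    def psnr(x, y, peak=1.0):
        # Peak signal-to-noise ratio in dB.
        mse = torch.mean((x - y) ** 2)
        return 10.0 * torch.log10(peak ** 2 / mse)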
TABLE 2 Comparison of test performance of the present invention (SPW-Net) with other methods on the LIVE, BSD100, CSIQ and Urban100 databases
[Table 2 is reproduced only as an image in this record.]
2) Recovery of compressed images of unknown QF values
The SDIVL database is selected for performance testing, with two parts: (1) the JPEG-compressed images with QF ∈ [10, 90] already in the SDIVL database are tested directly; (2) each original image of the SDIVL database is JPEG-compressed with QF from 10 to 90 in steps of 1, and the resulting 1620 compressed images are restored. Table 3 and FIG. 6 compare the performance of the different methods on these two tests, respectively. The experimental results show that, compared with other methods, the proposed method achieves better restoration performance on images of different compression levels.
TABLE 3 Comparison of test performance of the present invention (SPW-Net) with other methods on the SDIVL database
[Table 3 is reproduced only as an image in this record.]
In summary, the JPEG image compression artifact elimination method based on the controllable pyramid wavelet network first uses the QF prediction network to extract image features related to the compression level, then restores the Y channel and the CbCr channels of the image with two restoration networks, and finally converts the image to RGB space to obtain the final restoration result. The method needs no prediction of image coding parameters, achieves good restoration for grayscale and color images of different compression levels, and requires only a single trained model per restoration network; the model uses skip connections to avoid the gradient vanishing and explosion problems that may occur during training, and a shared-parameter recursive-module strategy to reduce model complexity, ensuring efficient operation of the algorithm. The method therefore offers a simple model, few parameters, a wide application range and a remarkable restoration effect.
Although embodiments of the present invention have been described above with reference to the accompanying drawings, the present invention is not limited to the above-described embodiments, which are illustrative rather than restrictive. Those skilled in the art may make many variations without departing from the scope of the claimed invention, and such variations fall within the protection scope of the present invention.

Claims (8)

1. A JPEG image compression artifact eliminating method based on a controllable pyramid wavelet network, characterized by comprising the following steps:
step one, converting an input JPEG compressed image into the YCbCr color space;
step two, inputting the Y channel of the JPEG compressed image into a QF prediction network, extracting nonlinear feature maps with cascaded convolutional layers, and rescaling the nonlinear feature maps along the third dimension before they are input into the restoration networks;
step three, decomposing the Y channel of the JPEG compressed image into three scales and eight directions through a controllable pyramid wavelet transform to generate one high-pass sub-band, twenty-four band-pass sub-bands and one low-pass sub-band, sending all sub-band coefficients, together with the related feature maps from the QF prediction network and the compressed images at the corresponding scales, into a Y-channel restoration network, and applying the inverse wavelet transform to the sub-band coefficients output by the Y-channel restoration network to obtain the restored Y-channel image in the spatial domain;
step four, inputting the nonlinear feature maps from the QF prediction network, the Y-channel image restored in step three, and the CbCr channels of the compressed image into a CbCr-channel restoration network, thereby obtaining the restored CbCr-channel image;
step five, converting the YCbCr-channel image into the RGB color space to obtain the restored color image.
2. The JPEG image compression artifact elimination method based on the controllable pyramid wavelet network as claimed in claim 1, wherein the QF prediction network has the following structure:
[The QF prediction network structure table is reproduced only as an image in this record; see Table 1 of the Description.]
the size of the input image block is 128 x 128 pixels.
3. The JPEG image compression artifact elimination method based on the controllable pyramid wavelet network as claimed in claim 2, wherein in step two, the outputs of the fourth, sixth and eighth convolutional layers of the QF prediction network are input into the Y-channel and CbCr-channel restoration networks after channel-number adjustment and rescaling.
4. The JPEG image compression artifact elimination method based on the controllable pyramid wavelet network as claimed in claim 1, wherein in step three, the Y-channel restoration network comprises six recursion modules with the same network structure, each recursion module comprising four parallel convolutional neural networks corresponding to image inputs of different scales; after feature-map concatenation and two convolutional layers, feature fusion is achieved in each network branch through up-sampling and down-sampling of feature maps of different scales, and the fused feature maps pass through two convolutional layers and a parametric rectified linear unit (PReLU) to produce the network output.
5. The JPEG image compression artifact elimination method based on the controllable pyramid wavelet network as claimed in claim 1, wherein in step three, the nonlinear feature maps extracted by the QF prediction network comprise feature maps of four scales, which are fed into the corresponding four convolutional-network branches by concatenation with the wavelet-coefficient feature maps, and compressed images of different scales are concatenated with the corresponding wavelet sub-band coefficients to serve as the input of each network branch of the recursion module.
6. The JPEG image compression artifact elimination method based on the controllable pyramid wavelet network as claimed in claim 4, wherein the real and imaginary parts of the wavelet coefficients output by a recursion module in the Y-channel restoration network are summed element-wise with the real and imaginary parts of the originally input wavelet coefficients to obtain the restored wavelet coefficients; the calculation is repeated six times, and the six recursion modules share the same network parameters.
7. The JPEG image compression artifact elimination method based on the controllable pyramid wavelet network as claimed in claim 1, wherein in step four, the CbCr-channel restoration network comprises six recursion modules with the same network structure; the input of each recursion module comprises the CbCr channels of the compressed image, the nonlinear feature maps extracted by the QF prediction network, and the restored Y-channel image obtained in step three; and the CbCr-channel restoration network repeatedly predicts the CbCr-channel image through residual learning, using the restored Y-channel image and the related features extracted by the QF prediction network.
8. The JPEG image compression artifact elimination method based on the controllable pyramid wavelet network as claimed in claim 7, wherein the network structure of a recursion module in the CbCr-channel restoration network comprises an encoder end and a decoder end, the encoder end performing three down-samplings and the decoder end performing three up-samplings; feature fusion is achieved between encoder and decoder feature maps of the same scale through skip connections;
each convolutional layer of the CbCr-channel restoration network outputs 64 channels, and at the encoder end the outputs of convolutional layers at different scales are concatenated with the features of the corresponding scale extracted by the QF prediction network;
and the output of the recursion module is summed element-wise with the CbCr channels of the original compressed image to obtain the restored CbCr channels; the calculation is repeated six times, and the six recursion modules share the same network parameters.
CN202111155935.3A 2021-09-29 2021-09-29 JPEG image compression artifact eliminating method based on controllable pyramid wavelet network Active CN113962882B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111155935.3A CN113962882B (en) 2021-09-29 2021-09-29 JPEG image compression artifact eliminating method based on controllable pyramid wavelet network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111155935.3A CN113962882B (en) 2021-09-29 2021-09-29 JPEG image compression artifact eliminating method based on controllable pyramid wavelet network

Publications (2)

Publication Number Publication Date
CN113962882A true CN113962882A (en) 2022-01-21
CN113962882B CN113962882B (en) 2023-08-25

Family

ID=79463365

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111155935.3A Active CN113962882B (en) 2021-09-29 2021-09-29 JPEG image compression artifact eliminating method based on controllable pyramid wavelet network

Country Status (1)

Country Link
CN (1) CN113962882B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116452465A (en) * 2023-06-13 2023-07-18 江苏游隼微电子有限公司 Method for eliminating JPEG image block artifact
CN117291962A (en) * 2023-11-27 2023-12-26 电子科技大学 Deblocking effect method of lightweight neural network based on channel decomposition

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU2020103901A4 (en) * 2020-12-04 2021-02-11 Chongqing Normal University Image Semantic Segmentation Method Based on Deep Full Convolutional Network and Conditional Random Field
CN112419238A (en) * 2020-11-03 2021-02-26 广东机电职业技术学院 Copy-paste counterfeit image evidence obtaining method based on end-to-end deep neural network
CN112509094A (en) * 2020-12-22 2021-03-16 西安交通大学 JPEG image compression artifact elimination algorithm based on cascade residual error coding and decoding network
CN113065558A (en) * 2021-04-21 2021-07-02 浙江工业大学 Lightweight small target detection method combined with attention mechanism
CN113362225A (en) * 2021-06-03 2021-09-07 太原科技大学 Multi-description compressed image enhancement method based on residual recursive compensation and feature fusion

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112419238A (en) * 2020-11-03 2021-02-26 广东机电职业技术学院 Copy-paste counterfeit image evidence obtaining method based on end-to-end deep neural network
AU2020103901A4 (en) * 2020-12-04 2021-02-11 Chongqing Normal University Image Semantic Segmentation Method Based on Deep Full Convolutional Network and Conditional Random Field
CN112509094A (en) * 2020-12-22 2021-03-16 西安交通大学 JPEG image compression artifact elimination algorithm based on cascade residual error coding and decoding network
CN113065558A (en) * 2021-04-21 2021-07-02 浙江工业大学 Lightweight small target detection method combined with attention mechanism
CN113362225A (en) * 2021-06-03 2021-09-07 太原科技大学 Multi-description compressed image enhancement method based on residual recursive compensation and feature fusion

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LIU Wenbin; CUI Xueying; SHANGGUAN Hong; LIU Bin: "Recursive residual encoder-decoder network for low-dose CT image denoising", Journal of Taiyuan University of Science and Technology, no. 04 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116452465A (en) * 2023-06-13 2023-07-18 江苏游隼微电子有限公司 Method for eliminating JPEG image block artifact
CN116452465B (en) * 2023-06-13 2023-08-11 江苏游隼微电子有限公司 Method for eliminating JPEG image block artifact
CN117291962A (en) * 2023-11-27 2023-12-26 电子科技大学 Deblocking effect method of lightweight neural network based on channel decomposition
CN117291962B (en) * 2023-11-27 2024-02-02 电子科技大学 Deblocking effect method of lightweight neural network based on channel decomposition

Also Published As

Publication number Publication date
CN113962882B (en) 2023-08-25

Similar Documents

Publication Publication Date Title
Luo et al. Lattice network for lightweight image restoration
CN111709895A (en) Image blind deblurring method and system based on attention mechanism
CN113962882B (en) JPEG image compression artifact eliminating method based on controllable pyramid wavelet network
Sharma et al. From pyramids to state‐of‐the‐art: a study and comprehensive comparison of visible–infrared image fusion techniques
Perumal et al. A hybrid discrete wavelet transform with neural network back propagation approach for efficient medical image compression
CN112509094A (en) JPEG image compression artifact elimination algorithm based on cascade residual error coding and decoding network
CN114331913B (en) Motion blurred image restoration method based on residual attention block
CN115187455A (en) Lightweight super-resolution reconstruction model and system for compressed image
CN112150356A (en) Single compressed image super-resolution reconstruction method based on cascade framework
CN112991169B (en) Image compression method and system based on image pyramid and generation countermeasure network
Alsayyh et al. A Novel Fused Image Compression Technique Using DFT, DWT, and DCT.
CN117593187A (en) Remote sensing image super-resolution reconstruction method based on meta-learning and transducer
CN115272131B (en) Image mole pattern removing system and method based on self-adaptive multispectral coding
CN116128722A (en) Image super-resolution reconstruction method and system based on frequency domain-texture feature fusion
CN106846286B (en) Video super-resolution algorithm for reconstructing based on a variety of complementary priori
CN114219738A (en) Single-image multi-scale super-resolution reconstruction network structure and method
CN113362241A (en) Depth map denoising method combining high-low frequency decomposition and two-stage fusion strategy
Yu et al. Local excitation network for restoring a jpeg-compressed image
Hilles Spatial Frequency Filtering Using Sofm For Image Compression
Deshmukh Image compression using neural networks
CN111246205B (en) Image compression method based on directional double-quaternion filter bank
Dumitrescu et al. Image compression and noise reduction through algorithms in wavelet domain
Boudechiche et al. Ensemble leaning-CNN for reducing JPEG artifacts
CN115330635B (en) Image compression artifact removing method, device and storage medium
Pushpalatha et al. Interpolative Model on Hueristic Projection Transform for Image Compression in Cloud Services

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant