CN113962882B - JPEG image compression artifact eliminating method based on controllable pyramid wavelet network


Info

Publication number
CN113962882B
CN113962882B (application CN202111155935.3A)
Authority
CN
China
Prior art keywords
network
channel
image
cbcr
wavelet
Prior art date
Legal status
Active
Application number
CN202111155935.3A
Other languages
Chinese (zh)
Other versions
CN113962882A (en)
Inventor
张译 (Zhang Yi)
禹冬晔 (Yu Dongye)
牟轩沁 (Mou Xuanqin)
Current Assignee
Xi'an Jiaotong University
Original Assignee
Xi'an Jiaotong University
Priority date
Filing date
Publication date
Application filed by Xi'an Jiaotong University
Priority to CN202111155935.3A
Publication of CN113962882A
Application granted
Publication of CN113962882B
Legal status: Active


Classifications

    • G06T 5/77: Image enhancement or restoration; retouching, inpainting, scratch removal
    • G06N 3/045: Neural-network architectures; combinations of networks
    • G06N 3/08: Neural-network learning methods
    • G06T 3/4038: Scaling of whole images or parts thereof; image mosaicing
    • G06T 7/90: Image analysis; determination of colour characteristics
    • G06T 9/002: Image coding using neural networks
    • G06T 2207/10024: Image acquisition modality; color image
    • G06T 2207/20081: Special algorithmic details; training, learning
    • G06T 2207/20084: Special algorithmic details; artificial neural networks [ANN]
    • Y02T 10/40: Engine management systems


Abstract

The invention discloses a JPEG image compression artifact elimination method based on a controllable pyramid wavelet network. The method first extracts image features related to the compression level, then uses these features to guide restoration of the Y-channel image, next uses the features together with the restored Y channel to guide restoration of the CbCr-channel images, and finally transforms the result into RGB space to obtain the final restored image. The method needs no prediction of the image coding parameters and restores images at many different compression levels well, and each recovery network needs only a single trained model: skip connections avoid the vanishing- and exploding-gradient problems that may occur during training, while a shared-parameter recursion-module strategy reduces model complexity and ensures efficient operation of the algorithm. The method therefore has the advantages of a simple model, few parameters, a wide application range and a marked restoration effect.

Description

JPEG image compression artifact eliminating method based on controllable pyramid wavelet network
Technical Field
The invention belongs to the field of image processing, and particularly relates to a JPEG image compression artifact eliminating method based on a controllable pyramid wavelet network.
Background
Owing to limits on transmission bandwidth and storage capacity, images and videos shot by cameras must be compressed before use, and lossy methods, with JPEG compression as the representative, are now applied in every link of the image-processing chain. Because high-frequency image information is lost in the quantization stage, a compressed image contains compression artifacts such as blocking, ringing and blurring, which not only degrade the perceived quality of the image but also hurt the performance of computer-vision algorithms that take compressed images as input. Designing a fast and effective restoration algorithm for JPEG-compressed images therefore has broad application prospects and practical value.
Currently, JPEG image compression artifact removal techniques can be broadly divided into three types:
1) Filtering-based methods eliminate compression artifacts by filtering along block boundaries in the spatial or frequency domain. Spatial-domain methods generally select a suitable filter near each block boundary according to the characteristics of different image regions, i.e., spatially adaptive filtering. Later, more sophisticated filtering methods were developed, including shifted-window filtering of image blocks, nonlinear filtering, adaptive non-local means filtering and adaptive bilateral filtering. Frequency-domain methods recover image detail mainly by adjusting discrete cosine transform (DCT) coefficients.
2) Methods based on inverse-problem optimization treat decompression-artifact removal as the optimization and solution of an inverse problem, recovering the original image with the help of certain image priors. Typical priors include low-rank priors, quantization-constraint priors, non-local similarity and sparse-representation priors; some methods combine several priors to obtain the optimal solution of the inverse problem. Because of their complex optimization procedures, these prior-based methods are time-consuming.
3) Machine-learning-based methods learn an image mapping/transformation from a large number of original and compressed image pairs and use it to map the compressed image back to the original. Typical methods implement the mapping with convolutional neural networks (CNNs), such as ARCNN, TNRD, DnCNN, CAS-CNN, MemNet, S-Net, deep convolutional sparse coding (DCSC) networks and generative adversarial network (GAN) models. Some methods (e.g., DMCNN, DDCN, MWCNN, DPW-SDNet) use CNNs to restore the image in the spatial domain and the frequency domain separately, obtaining better restoration performance.
Among the three categories, filtering-based methods restore images poorly, while inverse-problem-based methods have high computational complexity and are time-consuming. In comparison, and aided by the development of GPU parallel computing, machine-learning-based methods achieve both better restoration performance and higher speed. However, most current machine-learning methods require the coding information of the compressed image to be predicted and are effective only for images at some compression levels, which limits their range of application. Although DnCNN overcomes this limitation by adjusting its training data, its restoration performance is only moderate and it works only on grayscale images. Other methods handle multiple compression levels by training multiple network models, which in turn occupies more storage space. What is needed, therefore, is a single unified network model that requires no knowledge of the coding information of the compressed image, is effective for grayscale and color images across compression levels, and occupies little storage space, so that it is easy to deploy on small devices.
Disclosure of Invention
The invention aims to overcome the above defects by providing a JPEG image compression artifact elimination method based on a controllable pyramid wavelet network. The method first extracts image features related to the compression level (represented by the quality factor, QF), then uses these features to guide restoration of the Y-channel image, next uses the features and the restored Y channel to guide restoration of the CbCr-channel images, and finally transforms the result into RGB space to obtain the final restored image. The Y-channel recovery network comprises 6 recursion modules and takes multi-scale, multi-directional wavelet coefficients as input to predict the wavelet coefficients of the original image. The CbCr channel recovery network contains 6 recursive U-Net structures and takes the CbCr channels of the compressed image and the restored Y channel as input to obtain restored CbCr-channel images.
In order to achieve the above object, the method comprises the following steps:
step one, converting an input JPEG compressed image into a YCbCr color space;
step two, inputting the Y channel of the JPEG compressed image into a QF prediction network, extracting nonlinear feature maps using cascaded convolution layers, and rescaling the nonlinear feature maps in the third (channel) dimension before they enter the recovery networks;
step three, decomposing the Y channel of the JPEG compressed image at three scales and in eight directions via the controllable pyramid wavelet transform, generating one high-pass subband, twenty-four band-pass subbands and one low-pass subband; feeding all subband coefficients, the relevant feature maps of the QF prediction network and the compressed image at the corresponding scales together into a Y-channel recovery network; and applying the inverse wavelet transform to the subband coefficients output by the Y-channel recovery network to return to the spatial domain, obtaining the restored Y-channel image;
step four, inputting the nonlinear feature maps of the QF prediction network, the Y-channel image restored in step three and the CbCr channels of the compressed image together into a CbCr channel recovery network, so as to obtain restored CbCr-channel images;
step five, converting the YCbCr image into the RGB color space to obtain the recovered color image.
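The five steps can be summarized in the following sketch, a minimal illustration in PyTorch-style Python. Every callable name here is a hypothetical placeholder for a network or transform described in this disclosure, not an identifier from the invention itself:

```python
import torch

def restore_jpeg(rgb, rgb2ycbcr, qf_net, spwt, y_net, inv_spwt, cbcr_net, ycbcr2rgb):
    """Five-step pipeline sketch; every argument after `rgb` is a callable
    standing in for a network or transform described in the patent."""
    ycbcr = rgb2ycbcr(rgb)                        # step 1: RGB -> YCbCr
    y, cbcr = ycbcr[:, :1], ycbcr[:, 1:]
    qf_feats = qf_net(y)                          # step 2: compression-level features
    subbands = spwt(y)                            # step 3: 1 high-pass, 24 band-pass,
    restored = y_net(subbands, qf_feats, y)       #         1 low-pass subband
    y_rest = inv_spwt(restored)                   #         back to the spatial domain
    cbcr_rest = cbcr_net(cbcr, qf_feats, y_rest)  # step 4: guided CbCr recovery
    return ycbcr2rgb(torch.cat([y_rest, cbcr_rest], dim=1))  # step 5: -> RGB
```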
The structure of the QF prediction network is as follows:
the size of the input image block is 128 x 128 pixels.
In step two, the outputs of the fourth, sixth and eighth convolution layers of the QF prediction network are rescaled by channel-number adjustment and then input into the recovery networks of the Y channel and the CbCr channels.
In step three, the Y-channel recovery network comprises six recursion modules, each containing four parallel convolutional neural networks that correspond to image inputs at four different scales. After feature-map splicing and two convolution layers, the feature maps of the different scales undergo upsampling and downsampling so that feature fusion is realized on every network path; after fusion, the feature maps pass through two further convolution layers and parametric rectified linear units to give the network output.
In step three, the nonlinear feature maps extracted by the QF prediction network comprise feature maps at four scales, which are connected into the corresponding four convolutional network paths by splicing with the wavelet-coefficient feature maps; the compressed image at each scale is likewise spliced with the corresponding wavelet subband coefficients as the input of each network path of the recursion module. Since the wavelet transform coefficients are complex, the real and imaginary parts of the coefficients are extracted separately, and the real-part and imaginary-part feature maps are spliced as the network input.
The real part and the imaginary part of the wavelet coefficient output by the recursion module in the Y channel recovery network are summed with the real part and the imaginary part of the wavelet coefficient input originally element by element, so that the recovered wavelet coefficient is obtained; this calculation process is repeated six times in total, with the six recursion modules sharing the same network parameters.
In the fourth step, the CbCr channel recovery network includes six recursive modules with the same network structure, and the input of each recursive module includes the CbCr channel of the compressed image, the nonlinear feature map extracted by the QF prediction network, and the recovered Y channel image obtained in the third step, and the CbCr channel recovery network predicts the CbCr channel image through residual learning repeatedly using the recovered Y channel image and the relevant features extracted by the QF prediction network.
The network structure of the recursion module in the CbCr channel recovery network comprises an encoder end and a decoder end; the encoder end performs three downsamplings and the decoder end performs three upsamplings, and feature fusion between feature maps of the same scale at the encoder and decoder ends is realized through skip connections;
the output of every convolution layer of the CbCr channel recovery network is 64 channels, and at the encoder end the convolution-layer outputs of the different scales are spliced with the features of the corresponding scales extracted by the QF prediction network;
the recursion module output is summed element by element with the CbCr channels of the original compressed image to obtain the restored CbCr channels; this calculation is repeated six times, and the six recursion modules share the same network parameters.
Compared with the prior art, the invention extracts compression-level-related image features with a QF prediction network, restores the Y channel and the CbCr channels of the image through two separate recovery networks, and finally transforms the image into RGB space to obtain the final result. Because the QF prediction network extracts features related to the compression level, it guides the networks to adaptively restore images at different compression levels; the method therefore needs no prediction of the image coding information and handles images of many compression levels well. Each proposed recovery network needs only a single trained model: skip connections avoid the vanishing- and exploding-gradient problems that may occur during training, while the shared-parameter recursion-module strategy reduces model complexity and ensures efficient operation. Using the QF prediction network to extract image features also increases the nonlinear capacity of the network, allowing more complex image mappings/transformations to be learned. The Y-channel recovery network analyzes the multi-scale, multi-directional spatial correlation of the image via the controllable pyramid wavelet transform and simultaneously exploits the amplitude and phase information of the wavelet subband coefficients, obtaining better restoration performance. Feeding the restored Y-channel image into the CbCr channel recovery network lets the structure and texture of the luminance image assist the restoration of the color components.
Furthermore, the recovery networks share model parameters across their recursion modules, which helps reduce model complexity and ensures efficient operation of the algorithm.
Furthermore, the recovery networks use skip connections to increase network depth while avoiding the vanishing- and exploding-gradient problems that may occur during training.
Drawings
FIG. 1 is a method framework of the present invention;
FIG. 2 is a diagram of a QF-related feature extraction network according to the present invention;
FIG. 3 is a diagram of a recursive modular network in a Y-channel recovery network in accordance with the present invention;
FIG. 4 shows the method of splicing wavelet feature maps of different scales;
fig. 5 is a diagram of a recursive modular network in a CbCr channel recovery network;
FIG. 6 compares the recovery performance of the method of the present invention with other methods on images of different compression levels; (a) shows the PSNR gain when images are restored with the different algorithms; (b) shows the SSIM gain when images are restored with the different algorithms.
Detailed Description
The invention is further described below with reference to the accompanying drawings.
In the description of the present invention, it should be understood that the embodiments described in the present invention are exemplary, and specific parameters are presented in the description of the embodiments for convenience of description of the invention only and should not be construed as limitations on the invention.
Step one: the input JPEG compressed image is converted from the RGB color space to the YCbCr color space.
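One possible implementation of this conversion is sketched below, using the full-range BT.601 matrix that JPEG/JFIF employs; the patent itself does not spell out the coefficients, so this matrix is an assumption based on the JPEG standard:

```python
import torch

def rgb_to_ycbcr(rgb: torch.Tensor) -> torch.Tensor:
    """Convert an (N, 3, H, W) RGB tensor in [0, 1] to YCbCr (JFIF/BT.601)."""
    r, g, b = rgb[:, 0:1], rgb[:, 1:2], rgb[:, 2:3]
    y  =  0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.168736 * r - 0.331264 * g + 0.5 * b + 0.5
    cr =  0.5 * r - 0.418688 * g - 0.081312 * b + 0.5
    return torch.cat([y, cb, cr], dim=1)
```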
Step two: the Y-channel of the compressed image is input into the QF prediction network, the nonlinear feature map is extracted using a cascaded convolutional layer, and the nonlinear feature map is rescaled in a third dimension (channel number) before being input into the recovery network.
The size of the input image block is 128×128 pixels, and the QF prediction network structure is shown in Table 1.
Table 1. QF prediction network architecture model
As shown in fig. 2, QF-related features are extracted from the second, fourth, sixth and eighth convolution layers of the QF prediction network. Assuming a W×H input image block, the output feature maps of these layers have dimensions W×H×64, W/2×H/2×128, W/4×H/4×256 and W/8×H/8×512, respectively. The channel numbers of the fourth-, sixth- and eighth-layer outputs are then adjusted to 64: for the W/2×H/2×128 feature maps, every 2 adjacent maps are averaged; for the W/4×H/4×256 feature maps, every 4 adjacent maps are averaged; for the W/8×H/8×512 feature maps, every 8 adjacent maps are averaged. The extracted QF-related features at the four scales (denoted f0, f1, f2 and f3 below) thus have dimensions W×H×64, W/2×H/2×64, W/4×H/4×64 and W/8×H/8×64, respectively.
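The channel rescaling just described, averaging groups of 2, 4 or 8 adjacent feature maps so that 128, 256 or 512 channels become 64, reduces to a reshape-and-mean. A minimal sketch, under the assumption that channels are grouped contiguously:

```python
import torch

def rescale_channels(feat: torch.Tensor, out_channels: int = 64) -> torch.Tensor:
    """Average each group of adjacent channels so that C becomes out_channels."""
    n, c, h, w = feat.shape
    group = c // out_channels          # 2, 4 or 8 for the 4th/6th/8th layers
    return feat.view(n, out_channels, group, h, w).mean(dim=2)

# e.g. the sixth-layer output (N, 256, H/4, W/4) becomes (N, 64, H/4, W/4)
```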
Step three: the Y-channel of the compressed image is decomposed in 3 dimensions and 8 directions via a controllable pyramid wavelet transform, resulting in 1 high-pass subband, 24 band-pass subbands, and 1 low-pass subband. These sub-band coefficients, together with QF related features and original compressed images of different scales, are input into a restoration network consisting of 6 recursion modules, and sub-band coefficients output from the network are transformed into a spatial domain through inverse wavelet, so that a restored Y-channel image is obtained.
As shown in fig. 1, the Y-channel image is first decomposed at 3 scales and in 8 directions using the steerable pyramid wavelet transform (SPWT), resulting in 1 high-pass subband (H0), 24 band-pass subbands of different scales and directions (the subbands at the three scales are denoted B1, B2 and B3), and 1 low-pass subband (L0). H0, B1 and the Y-channel image are combined as the first-scale input of the recursion module; B2 and the once-downsampled Y-channel image are combined as the second-scale input; B3 and the twice-downsampled Y-channel image are combined as the third-scale input; L0 is the fourth-scale input. For the complex band-pass coefficients B1, B2 and B3, the real and imaginary parts of the wavelet coefficients are extracted and their feature maps are concatenated (the high-pass and low-pass bands are real). The channel numbers of the four scale inputs of the recursion module are therefore 18, 17, 17 and 1, respectively.
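A sketch of how these four scale inputs and their channel counts arise, under the assumptions that the band-pass coefficients are complex-valued tensors with 8 orientation channels per scale, that the high-pass and low-pass bands are real, and that plain average pooling stands in for the unspecified downsampling of the Y channel:

```python
import torch

def assemble_scale_inputs(h0, b1, b2, b3, l0, y):
    """Concatenate subband coefficients with the (downsampled) Y image.

    h0: (N, 1, H, W) real high-pass band
    b1, b2, b3: complex band-pass bands with 8 orientations each,
                (N, 8, H, W), (N, 8, H/2, W/2), (N, 8, H/4, W/4)
    l0: (N, 1, H/8, W/8) real low-pass band
    y:  (N, 1, H, W) compressed Y channel
    """
    def ri(band):   # split complex coefficients into real/imaginary maps
        return torch.cat([band.real, band.imag], dim=1)      # 8 -> 16 channels

    down = torch.nn.functional.avg_pool2d                    # assumed downsampler
    s1 = torch.cat([h0, ri(b1), y], dim=1)                   # 1 + 16 + 1 = 18
    s2 = torch.cat([ri(b2), down(y, 2)], dim=1)              # 16 + 1     = 17
    s3 = torch.cat([ri(b3), down(y, 4)], dim=1)              # 16 + 1     = 17
    s4 = l0                                                  #              1
    return s1, s2, s3, s4
```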
In the recursion module, the four scale inputs first undergo convolution and parametric rectified linear unit (PReLU) operations, giving 64-channel feature maps at four scales. As shown in fig. 3, these feature maps are concatenated with the corresponding QF features (f0, f1, f2 and f3); after two convolutions and a PReLU operation, the feature maps of the different scales are merged at every scale. As shown in fig. 4, feature maps of different scales are brought to a common size by mean pooling and deconvolution. Specifically, a 1×1 convolution keeps the spatial size of a feature map unchanged, a mean-pooling operation with scale factor 2 halves its height and width, a deconvolution with scale factor 2 doubles them, and repeated mean-pooling/deconvolution operations halve/double them further. Finally, the output of the recursion module is a four-scale wavelet-coefficient residual map with channel numbers 17, 16, 16 and 1, respectively.
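The scale-matching operations of fig. 4 might be realized as follows. This is an illustrative sketch: the 64-channel width and the three operations (1×1 convolution, factor-2 mean pooling, factor-2 deconvolution) follow the text, while the module layout itself is an assumption:

```python
import torch.nn as nn

class ScaleAlign(nn.Module):
    """Bring a 64-channel feature map from one pyramid scale to another."""
    def __init__(self, channels: int = 64):
        super().__init__()
        self.same = nn.Conv2d(channels, channels, kernel_size=1)   # keep size
        self.half = nn.AvgPool2d(kernel_size=2)                    # H,W -> H/2,W/2
        self.double = nn.ConvTranspose2d(channels, channels,
                                         kernel_size=2, stride=2)  # H,W -> 2H,2W

    def forward(self, x, src_scale: int, dst_scale: int):
        if src_scale == dst_scale:
            return self.same(x)
        step = self.half if dst_scale > src_scale else self.double
        for _ in range(abs(dst_scale - src_scale)):   # repeat pooling/deconvolution
            x = step(x)
        return x
```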
As shown in fig. 1, the output of the recursion module is added pixel by pixel to the original wavelet subband coefficients, the result is fed into the recursion module again, and its output is again added pixel by pixel to the original coefficients. This process is repeated 6 times, and the 6 recursion modules share the same network parameters, reducing model complexity. Finally, the restored wavelet subband coefficients are transformed back to the spatial domain by the inverse transform (inverse SPWT), giving the restored Y-channel image. As for parameter choices, the convolution kernels of the first and last convolution layers are 7×7 and 5×5 pixels, respectively, and all other kernels are 3×3 pixels. The first and last convolution layers use mirror padding of 3 and 2 pixels, respectively, while the remaining convolution layers use zero padding of 1 pixel, keeping the input and output sizes of every layer consistent. A PReLU follows every convolution layer except the last to provide the nonlinearity.
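The shared-parameter recursion itself reduces to a short loop; in the sketch below, `module` stands in for the four-branch network of fig. 3, and the tuple layout of the subband inputs is an assumption:

```python
def recursive_restore(module, subbands, qf_feats, y_scales, steps: int = 6):
    """Apply one shared-parameter recursion module `steps` times.

    subbands: tuple of the four original scale inputs (wavelet coefficients).
    Each pass predicts a residual that is added to the ORIGINAL coefficients,
    and the sum is fed back into the same module.
    """
    current = subbands
    for _ in range(steps):                     # six passes, identical parameters
        residual = module(current, qf_feats, y_scales)
        current = tuple(r + s for r, s in zip(residual, subbands))
    return current                             # restored subband coefficients
```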
Step four: the QF-related features extracted in step two, the Y-channel image restored in step three and the CbCr channels of the compressed image are input into a CbCr channel recovery network containing 6 recursion modules, so as to obtain restored CbCr-channel images.
As shown in fig. 1, the CbCr channel recovery network adopts a structure similar to the Y-channel recovery network: it repeatedly predicts the CbCr-channel images through residual learning using the restored Y-channel image and the QF-related features, and all recursion modules share the same network parameters, reducing model complexity. As shown in fig. 5, the recursion module adopts a U-Net-like network structure; the difference is that, at the encoder end, the convolution-layer outputs of the different scales are spliced with the QF-related features of the corresponding scales, so that the network can adaptively restore images at different compression levels. Since the human eye is insensitive to image color distortion, the output of every layer is set to 64 channels, i.e., the CbCr channels are restored with a lightweight network. As for parameter choices, similar to the Y-channel recovery network, the convolution kernels of the first and last convolution layers are 7×7 and 5×5 pixels, the deconvolution layers use 2×2 kernels, and all other kernels are 3×3 pixels. The first and last convolution layers use mirror padding of 3 and 2 pixels, respectively, the remaining convolution layers use zero padding of 1 pixel, keeping the input and output sizes of every layer consistent, and a PReLU follows every convolution layer except the last.
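A compact sketch of such a recursion module is given below. The three downsamplings/upsamplings, the 64-channel layers, the skip connections and the QF-feature concatenation at the encoder follow the text; the exact layer counts per block and the choice of average pooling are assumptions:

```python
import torch
import torch.nn as nn

def conv_block(cin, cout, k=3, p=1):
    return nn.Sequential(nn.Conv2d(cin, cout, k, padding=p), nn.PReLU(cout))

class CbCrRecursionModule(nn.Module):
    """U-Net-style recursion module: 3 downsamplings, 3 upsamplings,
    64 channels per layer, QF features concatenated at each encoder scale.
    An illustrative sketch, not the exact layer list of the patent."""
    def __init__(self):
        super().__init__()
        self.inc = conv_block(3, 64)                     # CbCr (2) + restored Y (1)
        self.down = nn.ModuleList([conv_block(64 + 64, 64) for _ in range(3)])
        self.pool = nn.AvgPool2d(2)
        self.up = nn.ModuleList([
            nn.ConvTranspose2d(64, 64, kernel_size=2, stride=2) for _ in range(3)])
        self.fuse = nn.ModuleList([conv_block(64 + 64, 64) for _ in range(3)])
        self.out = nn.Conv2d(64, 2, kernel_size=3, padding=1)   # CbCr residual

    def forward(self, cbcr, y_restored, qf_feats):       # qf_feats: 4 scales, 64 ch
        x = self.inc(torch.cat([cbcr, y_restored], dim=1))
        skips = []
        for i, enc in enumerate(self.down):              # encoder: concat QF feature
            x = enc(torch.cat([x, qf_feats[i]], dim=1))  # of matching scale,
            skips.append(x)                              # then downsample
            x = self.pool(x)
        for i, (up, fuse) in enumerate(zip(self.up, self.fuse)):
            x = up(x)                                    # decoder: upsample, then
            x = fuse(torch.cat([x, skips[-1 - i]], dim=1))  # skip-connection fusion
        return self.out(x)

# Shared-parameter recursion: the same module is applied six times, each time
# adding its residual output to the ORIGINAL compressed CbCr channels:
#   out = cbcr0
#   for _ in range(6):
#       out = module(out, y_rest, qf_feats) + cbcr0
```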
The training method of the network can be summarized as follows:
1) The invention mainly comprises three networks: the QF prediction network, the Y-channel recovery network and the CbCr channel recovery network. Because of the variable dependencies among them, the QF prediction network is trained first; then, with its parameters fixed, it extracts the QF-related features of the compressed image blocks and the Y-channel recovery network is trained; finally, with the parameters of both the QF prediction network and the Y-channel recovery network fixed, the QF-related features are extracted and the restored Y-channel images are computed, and the CbCr channel recovery network is trained.
2) The QF prediction network was trained using the VOC2012 database. Specifically: each RGB image in the database is compressed with a random QF value, where QF is an integer in [5, 95], yielding 12700 compressed images; the compressed images are then converted to YCbCr space, and 105234 non-overlapping 128×128-pixel image blocks are extracted from the Y-channel images for network training with the L1 loss function.
3) The Y-channel and CbCr channel recovery networks were trained using the Berkeley Segmentation Dataset (BSD), the DIV2K database, the Waterloo Exploration Database (WED) and the Flickr2K database; BSD contains 400 images (200 each in its training and test sets), DIV2K 900 images, WED 4744 images and Flickr2K 2000 images. Specifically: each reference image is compressed with a random QF value, where QF is an integer in {10:20, 22:2:30, 35:5:60, 70:10:90} (MATLAB-style start:step:end ranges), generating a total of 8044 compressed images; the reference and compressed images are then converted to YCbCr space, and 583625 mutually non-overlapping 128×128-pixel image blocks are extracted for network training. Training the Y-channel recovery network uses Y-channel image blocks as training data, with a loss function that is a linear combination of the pixel mean squared error loss ($l_{MSE}$) and the structural similarity loss ($l_{SSIM}$), i.e., $l = l_{MSE} + \lambda \cdot l_{SSIM}$ with $\lambda = 0.001$; training the CbCr channel recovery network uses CbCr-channel image blocks as training data, with the pixel mean squared error loss alone.
4) The pixel mean squared error loss ($l_{MSE}$) is calculated as

$$l_{MSE} = \frac{1}{WH}\sum_{i=1}^{W}\sum_{j=1}^{H}\bigl(I(i,j) - I_R(i,j)\bigr)^2$$

where $I(i,j)$ and $I_R(i,j)$ denote the pixel values at spatial position $(i,j)$ of the reference image $I$ and the restored image $I_R$, respectively, and $W$ and $H$ denote the width and height of the image. The structural similarity loss ($l_{SSIM}$) is calculated as

$$l_{SSIM} = 1 - \overline{\mathrm{SSIM}(I, I_R)}$$

where $\overline{\mathrm{SSIM}(I, I_R)}$ denotes the mean of the local SSIM map, computed as

$$\mathrm{SSIM}(I, I_R) = \frac{(2\mu_I\,\mu_{I_R} + C_1)(2\sigma_{I I_R} + C_2)}{(\mu_I^2 + \mu_{I_R}^2 + C_1)(\sigma_I^2 + \sigma_{I_R}^2 + C_2)}$$

where $\mu_I$ ($\mu_{I_R}$) and $\sigma_I$ ($\sigma_{I_R}$) denote the local mean and local standard deviation of $I$ ($I_R$), $\sigma_{I I_R}$ is the local covariance between $I$ and $I_R$, and $C_1$ and $C_2$ are constants taking the same values as in the SSIM method.
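In code, the combined Y-channel loss might be assembled as follows; `ssim_fn` is an assumed SSIM helper (for example from the pytorch-msssim package), since the patent does not prescribe a particular implementation:

```python
import torch.nn.functional as F

def y_channel_loss(restored, reference, ssim_fn, lam: float = 0.001):
    """l = l_MSE + lambda * l_SSIM, with l_SSIM = 1 - mean SSIM.
    `ssim_fn` must return the mean SSIM over the batch (assumed helper)."""
    l_mse = F.mse_loss(restored, reference)       # pixel mean squared error
    l_ssim = 1.0 - ssim_fn(restored, reference)   # structural similarity loss
    return l_mse + lam * l_ssim

# The CbCr recovery network is trained with F.mse_loss alone.
```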
5) The experiments use the PyTorch deep-learning framework on a workstation with an 8-core Intel i9-9900K 3.60 GHz CPU and an NVIDIA GeForce RTX 2080 SUPER GPU. Network parameters are initialized with samples from the normal distribution N(1, 0.02), and the initial PReLU slope is 0.1. Optimization uses the Adam algorithm with an initial learning rate of 2×10⁻⁴ and first/second-moment exponential decay rates of 0.9 and 0.999. When training the QF prediction network, the batch size is 64 and the learning rate is multiplied by 0.8 after every epoch, for 120 epochs in total. When training the two recovery networks, the batch size is 4 and the learning rate is multiplied by 0.9 every 20000 iterations, for 4 epochs in total.
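One reading of this training configuration as PyTorch optimizer/scheduler objects; the `model` below is a stand-in placeholder, and interpreting the two schedules as multiplicative decay is an assumption:

```python
import torch
import torch.nn as nn

model = nn.Conv2d(1, 64, 3, padding=1)   # placeholder for any of the three networks

# Adam with lr 2e-4 and first/second-moment decay rates 0.9 / 0.999
optimizer = torch.optim.Adam(model.parameters(), lr=2e-4, betas=(0.9, 0.999))

# QF prediction network: lr multiplied by 0.8 after every epoch (120 epochs, batch 64)
qf_sched = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.8)

# Recovery networks: lr multiplied by 0.9 every 20000 iterations (4 epochs, batch 4)
rec_sched = torch.optim.lr_scheduler.StepLR(optimizer, step_size=20000, gamma=0.9)
```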
The test method of the network can be summarized as follows:
1) Restoration of a compressed image of known QF values
The reference images of the LIVE, CSIQ, BSD100 (the 100 images of the BSD validation set) and Urban100 databases were selected for the performance tests. Specifically: each reference image of each database is JPEG-compressed at eight different compression levels (QF values of 10, 20, 30, 40, 50, 60, 70 and 80); the compressed images are then restored using the different algorithms/network models; finally, the restored images are evaluated with the peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) indices. Table 2 compares the recovery performance of the invention with other methods on JPEG compressed images of known QF values.
Table 2. Comparison of the test performance of the invention (SPW-Net) with other methods on the LIVE, BSD100, CSIQ and Urban100 databases
2) Restoration of compressed images of unknown QF values
The SDILL database is selected for the performance test, which comprises two parts: (1) the JPEG compressed images with QF ∈ [10, 90] in the SDILL database are tested directly; (2) every original image of the SDILL database is JPEG-compressed at each QF from 10 to 90 (step size 1), and the resulting 1620 compressed images are restored. Table 3 and fig. 6 compare the performance of the different methods on these two tests, respectively. The experimental results show that, compared with other methods, the proposed method restores images of different compression levels better.
Table 3. Comparison of the test performance of the invention (SPW-Net) with other methods on the SDILL database
In summary, the JPEG image compression artifact elimination method based on the controllable pyramid wavelet network first extracts compression-level-related image features with the QF prediction network, then restores the Y channel and the CbCr channels of the image through two recovery networks, and finally transforms the image into RGB space to obtain the final result. The proposed method needs no prediction of the image coding parameters and restores grayscale and color images at different compression levels well, and each recovery network needs only a single trained model: skip connections avoid the vanishing- and exploding-gradient problems that may occur during training, while the shared-parameter recursion-module strategy reduces model complexity and ensures efficient operation of the algorithm. The method therefore has the advantages of a simple model, few parameters, a wide application range and a marked restoration effect.
Although specific embodiments of the present invention have been described above with reference to the accompanying drawings, the invention is not limited to them. The embodiments above are merely instructive and illustrative, not restrictive. Those skilled in the art, enlightened by this disclosure, may devise many further forms of JPEG image compression artifact removal without departing from the scope of the invention as claimed.

Claims (5)

1. The JPEG image compression artifact eliminating method based on the controllable pyramid wavelet network is characterized by comprising the following steps of:
step one, converting an input JPEG compressed image into a YCbCr color space;
inputting a Y channel of the JPEG compressed image into a QF prediction network, extracting a nonlinear feature map by using a cascade convolution layer, and rescaling the nonlinear feature map in a third dimension before inputting a recovery network;
step three, the Y channel of the JPEG compressed image is subjected to controllable pyramid wavelet transformation and is decomposed in three dimensions and eight directions, so that a high-pass sub-band, twenty-four band-pass sub-bands and a low-pass sub-band are generated, all sub-band coefficients, a relevant characteristic diagram of a QF prediction network and compressed images corresponding to different dimensions are sent into a Y channel recovery network together, and sub-band coefficients output by the Y channel recovery network are subjected to inverse wavelet transformation to a spatial domain, so that a recovered Y channel image is obtained;
the Y channel recovery network comprises six recursion modules with the same network structure, the recursion modules comprise four parallel convolutional neural networks, the four convolutional neural networks respectively correspond to image inputs with different scales, after characteristic image splicing and two-layer convolutional layer operation, characteristic images with different scales are subjected to up-sampling and down-sampling operation, characteristic fusion is realized on each path of neural network, and after the characteristic images are fused, network output is obtained after the characteristic images pass through the two-layer convolutional layers and the parameter correction linear units;
inputting the nonlinear feature map in the QF prediction network, the Y channel image restored in the step three and the CbCr channel of the compressed image into a CbCr channel restoration network together, so as to obtain a restored CbCr channel image;
the CbCr channel recovery network comprises six recursion modules with the same network structure, the input of each recursion module comprises a CbCr channel of a compressed image, a nonlinear feature map extracted by a QF prediction network and a recovered Y-channel image obtained in the step three, and the CbCr channel recovery network predicts the CbCr channel image repeatedly through residual error learning by utilizing the recovered Y-channel image and relevant features extracted by the QF prediction network;
the network structure of the recursion module in the CbCr channel recovery network comprises an encoder end and a decoder end, wherein the encoder end finishes three downsampling and the decoder end finishes three upsampling; feature fusion is realized between feature graphs with the same scale at the encoder end and the decoder end through jump connection;
the output of each convolution layer of the CbCr channel recovery network is 64 channels, and at the coding end, the convolution layer outputs with different scales are spliced and combined with the features extracted by the QF prediction network and corresponding to the scales;
the recursion module outputs and the original compressed image CbCr channel is summed element by element to obtain a restored CbCr channel, the calculation process is repeated six times, and the six recursion modules share the same network parameters;
and fifthly, converting the YCbCr channel image into an RGB color space to obtain a recovered color image.
2. The method for eliminating JPEG image compression artifacts based on controllable pyramid wavelet network according to claim 1, wherein the structure of QF prediction network is as follows:
the size of the input image block is 128 x 128 pixels.
3. The method for removing artifacts from JPEG image compression based on controllable pyramid wavelet network according to claim 2, wherein in step two, the outputs of the fourth, sixth and eighth convolution layers of the QF prediction network are input to the restoration network of the Y channel and CbCr channel after being scaled by the channel number adjustment.
4. The method for eliminating JPEG image compression artifacts based on the controllable pyramid wavelet network according to claim 1, wherein in the third step, the nonlinear feature map extracted by the QF prediction network comprises four scale feature maps, the four scale feature maps are respectively connected into corresponding four paths of convolution neural networks in a mode of splicing with the wavelet coefficient feature maps, compressed images with different scales are respectively spliced with corresponding wavelet subband coefficients to serve as inputs of each path of the neural network of the recursion module, and the real part and the imaginary part of the coefficients are respectively extracted due to the fact that wavelet transformation coefficients are complex, and the real part feature map and the imaginary part feature map are spliced to serve as inputs of the network.
5. A method for eliminating JPEG image compression artifacts based on a controllable pyramid wavelet network according to claim 3, wherein the real part and the imaginary part of the wavelet coefficient outputted by the recursion module in the Y-channel restoration network are summed with the real part and the imaginary part of the wavelet coefficient inputted originally element by element, thereby obtaining the restored wavelet coefficient; this calculation process is repeated six times in total, with the six recursion modules sharing the same network parameters.
CN202111155935.3A 2021-09-29 2021-09-29 JPEG image compression artifact eliminating method based on controllable pyramid wavelet network Active CN113962882B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111155935.3A CN113962882B (en) 2021-09-29 2021-09-29 JPEG image compression artifact eliminating method based on controllable pyramid wavelet network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111155935.3A CN113962882B (en) 2021-09-29 2021-09-29 JPEG image compression artifact eliminating method based on controllable pyramid wavelet network

Publications (2)

Publication Number Publication Date
CN113962882A (en) 2022-01-21
CN113962882B (en) 2023-08-25

Family

ID=79463365

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111155935.3A Active CN113962882B (en) 2021-09-29 2021-09-29 JPEG image compression artifact eliminating method based on controllable pyramid wavelet network

Country Status (1)

Country Link
CN (1) CN113962882B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116452465B (en) * 2023-06-13 2023-08-11 江苏游隼微电子有限公司 Method for eliminating JPEG image block artifact
CN117291962B (en) * 2023-11-27 2024-02-02 电子科技大学 Deblocking effect method of lightweight neural network based on channel decomposition


Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112419238A (en) * 2020-11-03 2021-02-26 广东机电职业技术学院 Copy-paste counterfeit image evidence obtaining method based on end-to-end deep neural network
AU2020103901A4 (en) * 2020-12-04 2021-02-11 Chongqing Normal University Image Semantic Segmentation Method Based on Deep Full Convolutional Network and Conditional Random Field
CN112509094A (en) * 2020-12-22 2021-03-16 西安交通大学 JPEG image compression artifact elimination algorithm based on cascade residual error coding and decoding network
CN113065558A (en) * 2021-04-21 2021-07-02 浙江工业大学 Lightweight small target detection method combined with attention mechanism
CN113362225A (en) * 2021-06-03 2021-09-07 太原科技大学 Multi-description compressed image enhancement method based on residual recursive compensation and feature fusion

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Liu Wenbin (刘文斌), Cui Xueying (崔学英), Shangguan Hong (上官宏), Liu Bin (刘斌): "Recursive residual encoder-decoder network for low-dose CT image denoising", Journal of Taiyuan University of Science and Technology, No. 04; full text *

Also Published As

Publication number Publication date
CN113962882A (en) 2022-01-21

Similar Documents

Publication Publication Date Title
CN110969577B (en) Video super-resolution reconstruction method based on deep double attention network
CN109886871B (en) Image super-resolution method based on channel attention mechanism and multi-layer feature fusion
CN112330542A (en) Image reconstruction system and method based on CRCSAN network
CN109272452B (en) Method for learning super-resolution network based on group structure sub-band in wavelet domain
CN113962882B (en) JPEG image compression artifact eliminating method based on controllable pyramid wavelet network
Luo et al. Lattice network for lightweight image restoration
CN111951164B (en) Image super-resolution reconstruction network structure and image reconstruction effect analysis method
Sharma et al. From pyramids to state‐of‐the‐art: a study and comprehensive comparison of visible–infrared image fusion techniques
CN112509094A (en) JPEG image compression artifact elimination algorithm based on cascade residual error coding and decoding network
CN115829834A (en) Image super-resolution reconstruction method based on half-coupling depth convolution dictionary learning
CN114972036A (en) Blind image super-resolution reconstruction method and system based on fusion degradation prior
CN112150356A (en) Single compressed image super-resolution reconstruction method based on cascade framework
Amaranageswarao et al. Residual learning based densely connected deep dilated network for joint deblocking and super resolution
CN114022356A (en) River course flow water level remote sensing image super-resolution method and system based on wavelet domain
Alsayyh et al. A Novel Fused Image Compression Technique Using DFT, DWT, and DCT.
Xin et al. FISTA-CSNet: a deep compressed sensing network by unrolling iterative optimization algorithm
CN116563167A (en) Face image reconstruction method, system, device and medium based on self-adaptive texture and frequency domain perception
CN114549361B (en) Image motion blur removing method based on improved U-Net model
CN114331853B (en) Single image restoration iteration framework based on target vector updating module
Zhang et al. Multi-domain residual encoder–decoder networks for generalized compression artifact reduction
Li et al. Compression artifact removal with stacked multi-context channel-wise attention network
CN114219738A (en) Single-image multi-scale super-resolution reconstruction network structure and method
Abd-Elhafiez Image compression algorithm using a fast curvelet transform
Hilles Spatial Frequency Filtering Using Sofm For Image Compression
Luo et al. Super-resolving compressed images via parallel and series integration of artifact reduction and resolution enhancement

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant