WO2016127271A1 - An apparatus and a method for reducing compression artifacts of a lossy-compressed image - Google Patents


Info

Publication number
WO2016127271A1
Authority
WO
WIPO (PCT)
Prior art keywords
high dimensional
lossy
image
dimensional feature
sub
Prior art date
Application number
PCT/CN2015/000093
Other languages
French (fr)
Inventor
Xiaoou Tang
Chao Dong
Chen Change Loy
Original Assignee
Xiaoou Tang
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xiaoou Tang filed Critical Xiaoou Tang
Priority to CN201580075726.4A priority Critical patent/CN107251053B/en
Priority to PCT/CN2015/000093 priority patent/WO2016127271A1/en
Publication of WO2016127271A1 publication Critical patent/WO2016127271A1/en

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/85Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression
    • H04N19/86Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression involving reduction of coding artifacts, e.g. of blockiness
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06T5/60
    • G06T5/70
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/30Noise filtering
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/24Aligning, centring, orientation detection or correction of the image
    • G06V10/247Aligning, centring, orientation detection or correction of the image by affine transforms, e.g. correction due to perspective effects; Quadrilaterals, e.g. trapezoids

Definitions

  • the present application generally relates to a field of image processing, more particularly, to an apparatus and a method for reducing compression artifacts of a lossy-compressed image.
  • Lossy compression is the class of data encoding methods that uses inexact approximations or partial data discarding for representing the content that has been encoded. Such compression techniques are used to reduce the amount of data that would otherwise be needed to store, handle, and/or transmit the represented content.
  • there are several lossy image compression formats, e.g. JPEG, WebP, JPEG XR, and HEVC-MSP; JPEG remains the most widely adopted format among the various alternatives.
  • Lossy compression introduces compression artifacts, especially when used at low bit rates/quantization levels.
  • JPEG compression artifacts are a complex combination of different specific artifacts comprising blocking artifacts, ringing effects and blurring.
  • Blocking artifacts arise when each block is encoded without considering the correlation with the adjacent blocks, resulting in discontinuities at the borders. Ringing effects along the edges occur due to the coarse quantization of the high-frequency components. Blurring happens due to the loss of high-frequency components.
  • existing algorithms for eliminating the artifacts can be classified into deblocking oriented and restoration oriented methods.
  • the deblocking oriented methods focus on removing blocking and ringing artifacts.
  • most deblocking oriented methods cannot reproduce sharp edges, and tend to over-smooth texture regions.
  • the restoration oriented methods regard the compression operation as distortion and propose restoration algorithms.
  • the restoration oriented methods tend to reconstruct the original image directly, thus the sharpened output is often accompanied by ringing effects around edges and abrupt transitions in smooth regions.
  • an apparatus for reducing compression artifacts of a lossy-compressed image may comprise: a feature extraction device comprising a first set of filters configured to extract patches from the lossy-compressed image and map the extracted patches to a first set of high dimensional feature vectors, and a feature enhancement device electronically communicated with the feature extraction device and comprising a second set of filters configured to denoise each high dimensional feature vector in the first set and map the denoised high dimensional feature vectors to a second set of high dimensional feature vectors.
  • the apparatus further comprises a mapping device coupled to the feature enhancement device and comprising a third set of filters configured to map nonlinearly each high dimensional vector in the second set onto a restored patch-wise representation, and an aggregating device electronically communicated with the mapping device and configured to aggregate patch-wise representations mapped from all high dimensional vectors in the second set to generate a restored clear image.
  • the first set of filters may be configured to extract patches from the lossy-compressed image and map nonlinearly each of the extracted patches as a high dimensional feature vector, and the mapped vectors for all the patches form said first set of high dimensional feature vectors.
  • the second set of filters may be configured to denoise each high dimensional feature vector in the first set and map nonlinearly the denoised high dimensional feature vectors to a second set of high dimensional feature vectors.
  • the first, second and third set of filters and the aggregating device may map the vectors based on predetermined first, second and third parameters, respectively, or may aggregate the patch-wise representations based on fourth parameter.
  • the apparatus may further comprise a comparing device, which may be coupled to the aggregating device and configured to sample a ground truth uncompressed image corresponding to the lossy-compressed image from a predetermined training set and compare the dissimilarity between the aggregated restored clear image received from the aggregating device and the corresponding ground truth uncompressed image to generate a reconstruction error, wherein the reconstruction error is back-propagated in order to optimize the first, second, third and fourth parameters.
  • the apparatus may further comprise a training set preparation device coupled to the comparing device, in which the training set preparation device further comprises: a cropper configured to crop randomly a plurality of sub-images from a randomly selected training image to generate a set of ground truth uncompressed sub-images, and a lossy-compressed sub-image generator electronically communicated with the cropper and configured to generate a set of lossy-compressed sub-images based on the set of ground truth uncompressed sub-images received from the cropper.
  • the training set preparation device comprises a pairing device electronically communicated with the cropper and generator and configured to pair each of the ground truth uncompressed sub-images with a corresponding lossy-compressed sub-image and a collector electronically communicated with the pairing device and configured to collect the paired ground truth uncompressed sub-images and the lossy-compressed sub-image to form the predetermined training set.
  • the lossy-compressed sub-image generator further comprises a compressing device electronically communicated with the cropper and generator and configured to encode and decode the ground truth sub-image with a compression encoder and decoder to generate the set of lossy-compressed sub-images.
  • the reconstruction error comprises a mean squared error.
  • a method for reducing compression artifacts of a lossy-compressed image may comprise: extracting patches from the lossy-compressed image and mapping the extracted patches to a first set of high dimensional feature vectors by a feature extraction device comprising a first set of filters; denoising each high dimensional feature vector in the first set and mapping the denoised high dimensional feature vectors to a second set of high dimensional feature vectors by a feature enhancement device electronically communicated with the feature extraction device and comprising a second set of filters; mapping nonlinearly each high dimensional vector in the second set onto a restored patch-wise representation by a mapping device coupled to the feature enhancement device and comprising a third set of filters; and aggregating patch-wise representations mapped from all high dimensional vectors in the second set to generate a restored clear image by an aggregating device electronically communicated with the mapping device.
  • the apparatus may comprise a reconstructing unit configured to reconstruct the lossy-compressed image to a restored clear image based on predetermined parameters and a training unit configured to train the convolutional neural network system with a predetermined training set so as to determine the parameters used by the reconstructing unit.
  • the reconstructing unit may comprise: a feature extraction device comprising a first set of filters configured to extract patches from the lossy-compressed image and map the extracted patches to a first set of high dimensional feature vectors; a feature enhancement device electronically communicated with the feature extraction device and comprising a second set of filters configured to denoise each high dimensional feature vector in the first set and map the denoised high dimensional feature vectors to a second set of high dimensional feature vectors; a mapping device coupled to the feature enhancement device and comprising a third set of filters configured to map nonlinearly each high dimensional vector in the second set onto a restored patch-wise representation; and an aggregating device electronically communicated with the mapping device and configured to aggregate patch-wise representations mapped from all high dimensional vectors in the second set to generate a restored clear image.
  • the feature extraction device, the feature enhancement device, the mapping device and the aggregating device comprise at least one convolutional layer, respectively.
  • the convolutional layers are sequentially connected to each other to form a convolutional neural network system
  • the system may comprise a memory that stores executable components and a processor executes the executable components to perform operations of the system.
  • the executable components comprise: a feature extraction component configured to extract patches from the lossy-compressed image and map the extracted patches to a first set of high dimensional feature vectors; a feature enhancement component configured to denoise each high dimensional feature vector in the first set and map the denoised high dimensional feature vectors to a second set of high dimensional feature vectors; a mapping component configured to map nonlinearly each high dimensional vector in the second set onto a restored patch-wise representation; and an aggregating component configured to aggregate patch-wise representations mapped from all high dimensional vectors in the second set to generate a restored clear image.
  • Fig. 1 is a schematic diagram illustrating an apparatus for reducing compression artifacts of a lossy-compressed image consistent with an embodiment of the present application.
  • Fig. 2 is a schematic diagram illustrating an apparatus for reducing compression artifacts of a lossy-compressed image consistent with another embodiment of the present application.
  • Fig. 3 is a schematic diagram illustrating a convolutional neural network system, consistent with some disclosed embodiments.
  • Fig. 4. is a schematic diagram illustrating a training unit of the apparatus, consistent with some disclosed embodiments.
  • Fig. 5. is a schematic diagram illustrating a training set preparation device of the training unit, consistent with some disclosed embodiments.
  • Fig. 6 is a schematic flowchart illustrating a method for reducing compression artifacts of a lossy-compressed image, consistent with some disclosed embodiments.
  • Fig. 7 is a schematic flowchart illustrating a method for training a convolutional neural network system for reducing compression artifacts of a lossy-compressed image, consistent with some disclosed embodiments.
  • Fig. 8 is a schematic diagram illustrating a system for reducing compression artifacts of a lossy-compressed image consistent with an embodiment of the present application.
  • the apparatus 1000 may comprise a feature extraction device 100, a feature enhancement device 200, a mapping device 300 and an aggregating device 400.
  • the feature extraction device 100, the feature enhancement device 200, the mapping device 300 and the aggregating device 400 will be further discussed in detail.
  • the lossy-compressed image is denoted by Y
  • the restored clear image is denoted by F(Y), which should be as similar as possible to the ground truth uncompressed image X.
  • the feature extraction device 100 comprises a first set of filters.
  • the first set of filters is configured to extract patches from the lossy-compressed image and map the extracted patches to a first set of high dimensional feature vectors.
  • the first set of filters maps the extracted patches to a first set of high dimensional feature vectors by rule of a function F'(first parameters), where F'(x) is a nonlinear function (e.g., max(0, x), 1/(1+exp(-x)) or tanh(x)) and the first parameters are determined from predetermined parameters associated with the lossy-compressed image.
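  The candidate nonlinearities F'(x) listed above can be written directly; a minimal numpy sketch (the choice among them is left open by the description):

```python
import numpy as np

# The three candidate nonlinearities named in the text, as elementwise functions.
def relu(x):      # max(0, x)
    return np.maximum(0.0, x)

def sigmoid(x):   # 1 / (1 + exp(-x))
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):      # tanh(x)
    return np.tanh(x)

x = np.array([-2.0, 0.0, 2.0])
print(relu(x))       # [0. 0. 2.]
print(sigmoid(0.0))  # 0.5
```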
  • this first set of high dimensional feature vectors may comprise a set of feature maps, whose number equals the dimensionality of the vectors.
  • a popular strategy in image restoration is to densely extract patches and then represent them by a set of pre-trained bases such as PCA (Principal Component Analysis), DCT (Discrete Cosine Transformation), Haar, etc.
  • the operations for the feature extraction device 100 may be formulated as: F1(Y) = F'(W1 * Y + B1)  (1)
  • W1 and B1 represent the filters and biases respectively, and "*" denotes the convolution operation.
  • F'(x) is a nonlinear function (e.g., max(0, x), 1/(1+exp(-x)) or tanh(x)).
  • W1 is of size c × f1 × f1 × n1, where c is the number of channels in the input image, f1 is the spatial size of a filter, and n1 is the number of filters.
  • W1 applies n1 convolutions on the image, and each convolution has a kernel size of c × f1 × f1.
  • the output is composed of n1 feature maps.
  • B1 is an n1-dimensional vector, each element of which is associated with a filter.
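  As an illustration of the feature extraction operation just described, the sketch below implements a single "valid" convolution layer with F'(x) = max(0, x) in plain numpy. The toy sizes (c = 1, f1 = 3, n1 = 4) are placeholders, not values fixed by the application:

```python
import numpy as np

def conv_layer(Y, W, B):
    """Naive 'valid' convolution: Y is (c, H, W_img); W is (n, c, f, f); B is (n,).
    Returns n feature maps of size (H-f+1, W_img-f+1) with ReLU applied."""
    c, H, Wi = Y.shape
    n, _, f, _ = W.shape
    out = np.zeros((n, H - f + 1, Wi - f + 1))
    for k in range(n):
        for i in range(H - f + 1):
            for j in range(Wi - f + 1):
                out[k, i, j] = np.sum(W[k] * Y[:, i:i+f, j:j+f]) + B[k]
    return np.maximum(0.0, out)   # F'(x) = max(0, x)

# Toy sizes: c=1 channel, f1=3, n1=4 filters (a practical system would use larger values).
rng = np.random.default_rng(0)
Y  = rng.standard_normal((1, 8, 8))      # stand-in for the lossy-compressed image
W1 = rng.standard_normal((4, 1, 3, 3)) * 0.1
B1 = np.zeros(4)
F1 = conv_layer(Y, W1, B1)
print(F1.shape)   # (4, 6, 6): n1 feature maps
```

  Each spatial position of the output holds an n1-dimensional feature vector for the patch centered there, matching the "patch extraction as convolution" view in the text.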
  • the feature enhancement device 200 may be electronically communicated with the feature extraction device 100, and may comprise a second set of filters configured to denoise each high dimensional feature vector in the first set and map the denoised high dimensional feature vectors to a second set of high dimensional feature vectors, for example, a set of relatively cleaner feature vectors.
  • the feature enhancement device 200 is configured to denoise each high dimensional feature vector in the first set and map the denoised high dimensional feature vectors to the second set of high dimensional feature vectors by rule of a function F'(second parameters), where F'(x) is a nonlinear function (e.g., max(0, x), 1/(1+exp(-x)) or tanh(x)) and the second parameters are determined from predetermined parameters associated with the first set of high dimensional feature vectors.
  • the feature extraction device 100 extracts an n1-dimensional feature for each patch.
  • the second set of filters maps these n1-dimensional vectors into a set of n2-dimensional vectors.
  • Each mapped vector is conceptually a relatively cleaner feature vector.
  • These vectors comprise another set of feature maps.
  • the feature enhancement may be formulated as: F2(Y) = F'(W2 * F1(Y) + B2)  (2)
  • W2 is of size n1 × f2 × f2 × n2 and B2 is an n2-dimensional vector.
  • the apparatus 1000 may further comprise a mapping device 300.
  • the mapping device 300 may be coupled to the feature enhancement device 200 and comprise a third set of filters configured to map nonlinearly each high dimensional vector in the second set onto a restored patch-wise representation.
  • the mapping device 300 is configured to map nonlinearly each of the high dimensional vectors onto a patch-wise representation by rule of a function F'(third parameters), where F'(x) is a nonlinear function (e.g., max(0, x), 1/(1+exp(-x)) or tanh(x)) and the third parameters are determined from predetermined parameters associated with the second set of high dimensional feature vectors, i.e. the cleaner high dimensional feature vectors.
  • the feature enhancement device 200 generates a set of n2-dimensional feature vectors.
  • the mapping device 300 maps each of these n2-dimensional vectors into an n3-dimensional vector. Each mapped vector is conceptually the representation of a restored patch. These vectors comprise another set of feature maps.
  • the mapping may be formulated as: F3(Y) = F'(W3 * F2(Y) + B3)  (3)
  • W3 is of size n2 × f3 × f3 × n3
  • B3 is an n3-dimensional vector.
  • each of the output n3-dimensional vectors is conceptually a representation of a restored patch that will be used for reconstruction.
  • the apparatus 1000 may further comprise an aggregating device 400.
  • the aggregating device 400 may be electronically communicated with the mapping device 300 and configured to aggregate the patch-wise representations to generate a restored clear image
  • the aggregating device 400 aggregates the restored patch-wise representations to generate a restored clear image.
  • the aggregating may be formulated as: F(Y) = W4 * F3(Y) + B4  (4)
  • W4 is of size n3 × f4 × f4 × c
  • B4 is a c-dimensional vector.
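  The four stages above (feature extraction, feature enhancement, nonlinear mapping, aggregation) can be chained into one forward pass. A hedged numpy sketch follows; the zero padding, the toy widths n1..n3, and the filter sizes f1..f4 are illustrative assumptions, since the application leaves them unspecified:

```python
import numpy as np

def conv(Y, W, B, relu=True):
    """'Same' convolution via zero padding so the restored image keeps the input size.
    Y: (c_in, H, W_img); W: (c_out, c_in, f, f) with odd f; B: (c_out,)."""
    c_in, H, Wi = Y.shape
    c_out, _, f, _ = W.shape
    p = f // 2
    Yp = np.pad(Y, ((0, 0), (p, p), (p, p)))
    out = np.zeros((c_out, H, Wi))
    for k in range(c_out):
        for i in range(H):
            for j in range(Wi):
                out[k, i, j] = np.sum(W[k] * Yp[:, i:i+f, j:j+f]) + B[k]
    return np.maximum(0.0, out) if relu else out

rng = np.random.default_rng(0)
c, n1, n2, n3 = 1, 8, 6, 4          # toy channel/feature widths (illustrative)
f1, f2, f3, f4 = 5, 3, 1, 3         # toy filter sizes (illustrative)
Y = rng.standard_normal((c, 16, 16))            # stand-in lossy-compressed image

W1, B1 = rng.standard_normal((n1, c,  f1, f1)) * 0.1, np.zeros(n1)
W2, B2 = rng.standard_normal((n2, n1, f2, f2)) * 0.1, np.zeros(n2)
W3, B3 = rng.standard_normal((n3, n2, f3, f3)) * 0.1, np.zeros(n3)
W4, B4 = rng.standard_normal((c,  n3, f4, f4)) * 0.1, np.zeros(c)

F1 = conv(Y,  W1, B1)               # feature extraction
F2 = conv(F1, W2, B2)               # feature enhancement
F3 = conv(F2, W3, B3)               # nonlinear mapping
F  = conv(F3, W4, B4, relu=False)   # aggregation (no nonlinearity assumed here)
print(F.shape)                      # (1, 16, 16): restored image, same size as Y
```

  The aggregation step is written without a nonlinearity, matching formula-style descriptions that omit F' for the final layer; whether the last layer is linear is an assumption of this sketch.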
  • the apparatus 1000 may further comprise a comparing device (not shown) which is coupled to the aggregating device 400 and configured to sample a ground truth uncompressed sub-image corresponding to the lossy-compressed sub-image from a predetermined training set and compare the dissimilarity between the aggregated restored clear sub-image received from the aggregating device 400 and the sampled ground truth uncompressed sub-image to generate a reconstruction error.
  • the reconstruction error comprises a mean squared error.
  • the reconstruction error is back-propagated in order to determine the parameters, i.e., W1, W2, W3, W4, B1, B2, B3 and B4.
  • Fig. 2 is a schematic diagram illustrating an apparatus 1000’ for reducing compression artifacts of a lossy-compressed image consistent with another embodiment of the present application.
  • the apparatus 1000’ may comprise a reconstructing unit 100’ and a training unit 200’.
  • the reconstructing unit 100’ is configured to reconstruct the lossy-compressed image to a restored clear image based on predetermined parameters.
  • the reconstructing unit 100’ may further comprise a feature extraction device 110’, a feature enhancement device 120’, a mapping device 130’ and an aggregating device 140’.
  • the feature extraction device 110’, the feature enhancement device 120’, the mapping device 130’ and the aggregating device 140’ may comprise at least one convolutional layer, respectively, and the convolutional layers are sequentially connected to each other to form a convolutional neural network system.
  • Fig. 3 illustrates the layer configuration of the convolutional neural network system as a mathematical simulation model.
  • each of the feature extraction device 110’, the feature enhancement device 120’, the mapping device 130’ and the aggregating device 140’ may be simulated as at least one convolutional layer. Different operations are performed at the different convolutional layers.
  • the feature extraction device 110’ is configured to extract patches from the lossy-compressed image and map the extracted patches to a first set of high dimensional feature vectors. This is equivalent to convolving the image by a set of filters as mentioned above.
  • the feature enhancement device 120’ is configured to be electronically communicated with the feature extraction device 110’ and denoise each high dimensional feature vector in the first set and map the denoised high dimensional feature vectors to a second set of high dimensional feature vectors, for example, a set of relatively cleaner feature vectors. This is equivalent to applying a second set of filters as mentioned above.
  • the mapping device 130’ is configured to be coupled to the feature enhancement device 120’ and map nonlinearly each high dimensional vector in the second set onto a restored patch-wise representation. This is equivalent to applying a third set of filters as mentioned above.
  • the aggregating device 140’ is configured to be electronically communicated with the mapping device 130’ and aggregate patch-wise representations mapped from all high dimensional vectors in the second set to generate a restored clear image.
  • the convolutional neural network system dates back decades and has recently shown explosive popularity, partly due to its success in image classification.
  • the convolutional neural network system is usually applied for natural image denoising and removing noisy patterns (dirt/rain) .
  • the training unit 200’ is configured to train the convolutional neural network system with a predetermined training set so as to optimize the parameters, for example W1, W2, W3, W4, B1, B2, B3, B4, used by the reconstructing unit.
  • the training unit 200’ may comprise a sampling device 210’ , a comparing device 220’ , and a back-propagating device 230’ .
  • the sampling device 210’ may be configured to sample a lossy-compressed sub-image and its corresponding ground truth uncompressed sub-image from a predetermined training set and input the lossy-compressed sub-image to the convolutional neural network system.
  • “sub-images” means these samples are treated as small “images” rather than “patches”, in the sense that “patches” are overlapping and require some averaging as post-processing, whereas “sub-images” need not.
  • the comparing device 220’ may be configured to compare dissimilarity between the reconstructed clear sub-image based on the input lossy-compressed sub-image from the convolutional neural network system and the corresponding ground truth uncompressed sub-image to generate a reconstruction error.
  • the reconstruction error may comprise a mean squared error, and the error is minimized by using stochastic gradient descent with standard back-propagation.
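  The mean squared error and the gradient-descent update can be sketched as follows. The scalar model F(Y) = w·Y below is a hypothetical stand-in for the full network (whose gradients are obtained by back-propagation through all of W1..W4 and B1..B4); the loop only illustrates the update rule, not the actual training procedure:

```python
import numpy as np

def mse(restored, ground_truth):
    """Mean squared error over a batch of sub-image pairs (the reconstruction
    error used during training)."""
    return np.mean((restored - ground_truth) ** 2)

# Toy illustration of stochastic-gradient-descent steps on a single scalar
# parameter w of the linear model F(Y) = w * Y.
rng = np.random.default_rng(1)
Y = rng.standard_normal((4, 8, 8))         # batch of lossy sub-images
X = 2.0 * Y                                # toy "ground truth"
w, lr = 0.0, 0.05
for _ in range(200):
    grad = np.mean(2.0 * (w * Y - X) * Y)  # d(MSE)/dw
    w -= lr * grad                         # gradient-descent update
print(round(w, 3))                         # converges toward 2.0
```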
  • the back-propagating device 230’ is configured to back-propagate the reconstruction error through the convolutional neural network system so as to adjust weights on connections between neurons of the convolutional neural network system.
  • the convolutional neural network system does not preclude the use of other kinds of reconstruction error, as long as the reconstruction error is differentiable. If a better perceptually motivated metric is given during training, the convolutional neural network system can flexibly adapt to that metric.
  • the apparatus 1000 and 1000’ may further comprise a training set preparation device coupled to the comparing device and configured to prepare the predetermined training set for training the convolutional neural network system.
  • Fig. 5 is a schematic diagram illustrating the training set preparation device. As shown, the training set preparation device may comprise a cropper 241’, a lossy-compressed sub-image generator 242’, a pairing device 243’ and a collector 244’.
  • the cropper 241’ may be configured to crop randomly a plurality of sub-images from a randomly selected training image to generate a set of ground truth uncompressed sub-images. For example, the cropper 241’ may crop n sub-images of m × m pixels each.
  • the lossy-compressed sub-image generator 242’ may be electronically communicated with the cropper 241’ and configured to generate a set of lossy-compressed sub-images based on the set of ground truth uncompressed sub-images received from the cropper 241’.
  • the pairing device 243’ may be electronically communicated with the cropper 241’ and generator 242’ and configured to pair each of the ground truth uncompressed sub-images with a corresponding lossy-compressed sub-image.
  • the collector 244’ may be electronically communicated with the pairing device 243’ and configured to collect all the pairs to form the predetermined training set.
  • the lossy-compressed sub-image generator 242’ may comprise a compressing device electronically communicated with the cropper 241’ and configured to encode and decode the ground truth sub-images with a compression encoder and decoder to generate the set of lossy-compressed sub-images.
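  A possible sketch of this preparation pipeline, assuming the Pillow library as the JPEG encoder/decoder (the application does not name a specific codec implementation, and the sizes n, m and the quality setting below are illustrative):

```python
import io
import numpy as np
from PIL import Image  # assumed available; any lossy codec would do

def prepare_pairs(image, n=4, m=32, quality=10, seed=0):
    """Crop n random m×m sub-images from a training image, JPEG-compress each
    at the given quality, and return (ground_truth, lossy) pairs as arrays."""
    rng = np.random.default_rng(seed)
    W, H = image.size
    pairs = []
    for _ in range(n):
        x = rng.integers(0, W - m + 1)
        y = rng.integers(0, H - m + 1)
        gt = image.crop((x, y, x + m, y + m))              # cropper
        buf = io.BytesIO()
        gt.save(buf, format="JPEG", quality=quality)       # lossy encoder
        lossy = Image.open(io.BytesIO(buf.getvalue()))     # decoder
        pairs.append((np.asarray(gt), np.asarray(lossy)))  # pairing device
    return pairs                                           # collector

# Usage with a synthetic grayscale training image:
img = Image.fromarray(
    np.uint8(np.random.default_rng(0).integers(0, 256, (64, 64))), "L")
pairs = prepare_pairs(img)
print(len(pairs), pairs[0][0].shape)   # 4 (32, 32)
```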
  • Fig. 6 is a schematic flowchart illustrating a method 2000 for reducing compression artifacts of a lossy-compressed image, consistent with some disclosed embodiments.
  • the method 2000 may be described in detail with respect to Fig. 6.
  • patches are extracted from the lossy-compressed image and each of the extracted patches is mapped into a high dimensional feature vector, by the feature extraction device comprising the first set of filters, such that a first set of high dimensional feature vectors is formed.
  • these vectors comprise a set of feature maps, whose number equals the dimensionality of the vectors.
  • a popular strategy in image restoration is to densely extract patches and then represent them by a set of pre-trained bases such as PCA, DCT, Haar, etc.
  • each high dimensional feature vector in the first set is denoised and the denoised high dimensional feature vectors are mapped into a second set of high dimensional feature vectors by a feature enhancement device electronically communicated with the feature extraction device and comprising a second set of filters.
  • the feature extraction device extracts an n1-dimensional feature for each patch.
  • the second set of filters maps these n1-dimensional vectors into a set of n2-dimensional vectors.
  • Each mapped vector is conceptually a relatively cleaner feature vector.
  • each high dimensional vector in the second set is mapped nonlinearly onto a restored patch-wise representation by a mapping device coupled to the feature enhancement device and comprising a third set of filters.
  • the feature enhancement device generates a set of n2-dimensional feature vectors.
  • the mapping device maps each of these n2-dimensional vectors into an n3-dimensional vector.
  • Each mapped vector is conceptually the representation of a restored patch. These vectors comprise another set of feature maps.
  • at step S240, patch-wise representations mapped from all high dimensional vectors in the second set are aggregated to generate a restored clear image by an aggregating device electronically communicated with the mapping device.
  • These steps S210-S230 may be expressed by the above-mentioned formulae (1)-(3).
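The three mapping steps S210-S230 can be sketched as successive nonlinear maps of the form F'(W·v + b). Below is a minimal NumPy illustration; the dimensions n1=64, n2=32, n3=16, the 5×5 patch size, the random weights and the choice of max(0, x) for F' are hypothetical placeholders, not values taken from this application.

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)  # F'(x) = max(0, x)

rng = np.random.default_rng(0)

# Hypothetical dimensions: n1 = 64, n2 = 32, n3 = 16.
n1, n2, n3 = 64, 32, 16
patch = rng.standard_normal(25)               # a flattened 5x5 patch from Y

# S210: feature extraction -- patch -> n1-dimensional feature vector.
W1, b1 = rng.standard_normal((n1, 25)), np.zeros(n1)
f1 = relu(W1 @ patch + b1)

# S220: feature enhancement -- n1-dim -> "cleaner" n2-dim vector.
W2, b2 = rng.standard_normal((n2, n1)), np.zeros(n2)
f2 = relu(W2 @ f1 + b2)

# S230: nonlinear mapping -- n2-dim -> restored patch representation.
W3, b3 = rng.standard_normal((n3, n2)), np.zeros(n3)
f3 = relu(W3 @ f2 + b3)

print(f1.shape, f2.shape, f3.shape)   # (64,) (32,) (16,)
```

Step S240 (aggregation) would then combine such per-patch representations into the restored image.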
  • The patches may be extracted from the lossy-compressed image and each of the extracted patches may be mapped as a high dimensional feature vector by rule of a function F'(first parameters), where F'(x) is a nonlinear function (e.g., max(0, x), 1/(1+exp(-x)) or tanh(x)) and the first parameters are determined from predetermined parameters associated with the lossy-compressed image.
  • The first set of high dimensional feature vectors may be denoised and the denoised high dimensional feature vectors may be mapped nonlinearly to a second set of high dimensional feature vectors, i.e. a set of relatively cleaner feature vectors, by rule of a function F'(second parameters), where F'(x) is a nonlinear function (e.g., max(0, x), 1/(1+exp(-x)) or tanh(x)) and the second parameters are determined from predetermined parameters associated with the first set of high dimensional feature vectors.
  • Each high dimensional vector in the second set may be mapped nonlinearly onto a restored patch-wise representation by rule of a function F'(third parameters), where F'(x) is a nonlinear function (e.g., max(0, x), 1/(1+exp(-x)) or tanh(x)) and the third parameters are determined from predetermined parameters associated with the second set of high dimensional vectors.
  • The method 2000 may further comprise a step of sampling a ground truth uncompressed sub-image corresponding to the lossy-compressed sub-image from a predetermined training set and a step of comparing the dissimilarity between the aggregated restored clear sub-image and the corresponding ground truth uncompressed sub-image to generate a reconstruction error.
  • The reconstruction error is back-propagated in order to optimize the parameters, i.e., W1, W2, W3, W4, B1, B2, B3 and B4.
  • The method 2000 further comprises a step of preparing the predetermined training set.
  • A plurality of sub-images is first cropped from a randomly selected training image to generate a set of ground truth uncompressed sub-images. For example, n sub-images of m×m pixels each may be cropped.
  • A set of lossy-compressed sub-images is then generated based on the set of ground truth uncompressed sub-images.
  • Each of the ground truth uncompressed sub-images is paired with a corresponding lossy-compressed sub-image. Then, all the pairs are collected to form the predetermined training set.
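The training set preparation above might be sketched as follows. The crop count n=8, sub-image size m=33, and the coarse quantizer used as a stand-in for a real lossy encoder/decoder (e.g. a JPEG codec) are illustrative assumptions only.

```python
import numpy as np

def lossy_compress(sub_img, step=32):
    # Stand-in for a lossy encoder/decoder pair: coarse quantization of
    # pixel values discards information, mimicking lossy compression.
    return (sub_img // step) * step

def prepare_training_set(image, n=8, m=33, seed=0):
    rng = np.random.default_rng(seed)
    h, w = image.shape
    pairs = []
    for _ in range(n):
        y = rng.integers(0, h - m + 1)          # random crop position
        x = rng.integers(0, w - m + 1)
        gt = image[y:y + m, x:x + m]            # ground truth sub-image
        pairs.append((lossy_compress(gt), gt))  # (compressed, ground truth)
    return pairs

img = np.arange(256 * 256, dtype=np.int64).reshape(256, 256) % 256
training_set = prepare_training_set(img)
print(len(training_set), training_set[0][0].shape)
```

In practice the compressed member of each pair would be produced by the actual target codec at the quality level of interest.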
  • A method 3000 for training a convolutional neural network system for reducing compression artifacts of a lossy-compressed image is also illustrated.
  • The method 3000 is described in detail with respect to Fig. 7.
  • At step S310, a lossy-compressed sub-image and its corresponding ground truth uncompressed sub-image are sampled from a predetermined training set.
  • At step S320, a restored clear sub-image is reconstructed from the lossy-compressed sub-image by the convolutional neural network system.
  • At step S330, a reconstruction error is generated by comparing the dissimilarity between the reconstructed clear sub-image and the ground truth uncompressed sub-image.
  • At step S340, the reconstruction error is back-propagated through the convolutional neural network system so as to adjust weights on connections between neurons of the convolutional neural network system. Steps S310-S340 are repeated until an average value of the reconstruction error falls below a preset threshold, for example, half of the mean square error between the lossy-compressed sub-images and the ground truth uncompressed sub-images in the predetermined training set.
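Steps S310-S340 amount to a standard error back-propagation loop. The sketch below substitutes a one-parameter "network" (a scalar gain w) and a synthetic 0.5× attenuation for the real convolutional system and compressor, purely to show the reconstruct / compare / back-propagate / repeat-until-threshold cycle; all names and values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
gt = [rng.standard_normal((8, 8)) for _ in range(16)]  # ground truth
compressed = [0.5 * x for x in gt]                     # "lossy" inputs

w, lr = 1.0, 0.05
for epoch in range(200):
    errs = []
    for y, x in zip(compressed, gt):
        restored = w * y                          # S320: reconstruct
        err = np.mean((restored - x) ** 2)        # S330: reconstruction error (MSE)
        grad = np.mean(2 * (restored - x) * y)    # S340: gradient of the error
        w -= lr * grad                            # S340: weight update
        errs.append(err)
    if np.mean(errs) < 1e-6:                      # repeat until below threshold
        break
print(f"w = {w:.3f}")  # converges toward 2.0, the inverse of the 0.5 attenuation
```

For the full network, the same loop runs over all of W1-W4 and B1-B4 via back-propagation through the convolutional layers.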
  • The system 4000 comprises a memory 402 that stores executable components and a processor 404, coupled to the memory 402, that executes the executable components to perform operations of the system 4000.
  • The executable components may comprise: a feature extraction component 410 configured to extract patches from the lossy-compressed image and map the extracted patches to a first set of high dimensional feature vectors; and a feature enhancement component 420 configured to denoise each high dimensional feature vector in the first set and map the denoised high dimensional feature vectors to a second set of high dimensional feature vectors.
  • The executable components may further comprise: a mapping component 430 configured to map nonlinearly each high dimensional vector in the second set onto a restored patch-wise representation; and an aggregating component 440 configured to aggregate patch-wise representations mapped from all high dimensional vectors in the second set to generate a restored clear image.
  • The feature extraction component 410 is configured to extract patches from the lossy-compressed image and map nonlinearly each of the extracted patches as a high dimensional feature vector, and the mapped vectors for all the patches form said first set of high dimensional feature vectors.
  • The feature enhancement component 420 is configured to denoise each high dimensional feature vector in the first set and map nonlinearly the denoised high dimensional feature vectors to a second set of high dimensional feature vectors.
  • The feature extraction component 410, feature enhancement component 420 and mapping component 430 map the vectors based on predetermined first, second and third parameters, respectively.
  • The executable components further comprise a comparing component coupled to the aggregating component and configured to sample a ground truth uncompressed image corresponding to the lossy-compressed image from a predetermined training set and compare a dissimilarity between the aggregated restored clear image received from the aggregating component and the corresponding ground truth uncompressed image to generate a reconstruction error, wherein the reconstruction error is back-propagated in order to optimize the first, second and third parameters.
  • The executable components further comprise a training set preparation component coupled to the comparing component.
  • The training set preparation component further comprises: a cropper configured to crop randomly a plurality of sub-images from a randomly selected training image to generate a set of ground truth uncompressed sub-images; a lossy-compressed sub-image generator electronically communicated with the cropper and configured to generate a set of lossy-compressed sub-images based on the set of ground truth uncompressed sub-images received from the cropper; a pairing module electronically communicated with the cropper and the generator and configured to pair each of the ground truth uncompressed sub-images with a corresponding lossy-compressed sub-image; and a collector electronically communicated with the pairing module and configured to collect the paired ground truth uncompressed sub-images and lossy-compressed sub-images to form the predetermined training set.
  • The lossy-compressed sub-image generator further comprises a compressing module electronically communicated with the cropper and the generator and configured to encode and decode the ground truth sub-images with a compression encoder and decoder to generate the set of lossy-compressed sub-images.
  • The present application does not explicitly learn the dictionaries or manifolds for modeling the patch space; these are achieved implicitly via the convolutional layers. Furthermore, the feature extraction, feature enhancement and aggregation are also formulated as convolutional layers, and so are involved in the optimization.
  • The method and apparatus of the present application handle different kinds of compression artifacts and provide an efficient reduction of various compression artifacts in different image regions. In the method and apparatus of the present application, the entire convolutional neural network is obtained fully through training, with no pre-/post-processing. With a lightweight structure, the apparatus and method of the present application achieve superior performance to the state-of-the-art methods.
  • Embodiments within the scope of the present invention can be implemented in digital electronic circuitry, or in computer hardware, firmware, software, or in combinations thereof. Apparatus within the scope of the present invention can be implemented in a computer program product tangibly embodied in a machine-readable storage device for execution by a programmable processor; and method actions within the scope of the present invention can be performed by a programmable processor executing a program of instructions to perform functions of the invention by operating on input data and generating output.
  • Embodiments within the scope of the present invention can be implemented advantageously in one or more computer programs that are executable on a programmable system including at least one programmable processor coupled to receive data and instructions from, and to transmit data and instructions to, a data storage system, at least one input device, and at least one output device.
  • Each computer program can be implemented in a high-level procedural or object oriented programming language, or in assembly or machine language if desired; and in any case, the language can be a compiled or interpreted language.
  • Suitable processors include, by way of example, both general and special purpose microprocessors.
  • a processor will receive instructions and data from a read-only memory and/or a random access memory.
  • a computer will include one or more mass storage devices for storing data files.
  • Embodiments within the scope of the present invention include computer-readable media for carrying or having computer-executable instructions, computer-readable instructions, or data structures stored thereon.
  • Such computer-readable media may be any available media, which is accessible by a general-purpose or special-purpose computer system.
  • Examples of computer-readable media may include physical storage media such as RAM, ROM, EPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other media which can be used to carry or store desired program code means in the form of computer-executable instructions, computer-readable instructions, or data structures and which may be accessed by a general-purpose or special-purpose computer system. Any of the foregoing can be supplemented by, or incorporated in, ASICs (application-specific integrated circuits) . While particular embodiments of the present invention have been shown and described, changes and modifications may be made to such embodiments without departing from the true scope of the invention.

Abstract

Disclosed is an apparatus for reducing compression artifacts of a lossy-compressed image. The apparatus may comprise: a feature extraction device comprising a first set of filters configured to extract patches from the lossy-compressed image and map the extracted patches to a first set of high dimensional feature vectors, a feature enhancement device electronically communicated with the feature extraction device and comprising a second set of filters configured to denoise each high dimensional feature vector in the first set and map the denoised high dimensional feature vectors to a second set of high dimensional feature vectors, a mapping device coupled to the feature enhancement device and comprising a third set of filters configured to map nonlinearly each high dimensional vector in the second set onto a restored patch-wise representation, and an aggregating device electronically communicated with the mapping device and configured to aggregate the patch-wise representations to generate a restored clear image.

Description

AN APPARATUS AND A METHOD FOR REDUCING COMPRESSION ARTIFACTS OF A LOSSY-COMPRESSED IMAGE Technical Field
The present application generally relates to a field of image processing, more particularly, to an apparatus and a method for reducing compression artifacts of a lossy-compressed image.
Background
Lossy compression is the class of data encoding methods that uses inexact approximations or partial data discarding for representing the content that has been encoded. Such compression techniques are used to reduce the amount of data that would otherwise be needed to store, handle, and/or transmit the represented content. There are different kinds of lossy image compression formats, e.g. JPEG, WebP, JPEG XR, and HEVC-MSP. JPEG remains the most widely adopted format among the various alternatives.
Lossy compression introduces compression artifacts, especially when used in low bit rates/quantization levels. For instance, JPEG compression artifacts are a complex combination of different specific artifacts comprising blocking artifacts, ringing effects and blurring. Blocking artifacts arise when each block is encoded without considering the correlation with the adjacent blocks, resulting in discontinuities at the borders. Ringing effects along the edges occur due to the coarse quantization of the high-frequency components. Blurring happens due to the loss of high-frequency components.
Existing algorithms for eliminating the artifacts can be classified into deblocking oriented and restoration oriented methods. The deblocking oriented methods focus on removing blocking and ringing artifacts. However, most deblocking oriented methods cannot reproduce sharp edges, and tend to over-smooth texture regions. The restoration oriented methods regard the compression operation as distortion and propose restoration algorithms. The restoration oriented methods tend to reconstruct the original image directly, thus the sharpened output is often accompanied by ringing effects around edges and abrupt transitions in smooth regions.
Summary
The following presents a simplified summary of the disclosure in order to provide a basic understanding of some aspects of the disclosure. This summary is not an extensive overview of the disclosure. It is intended to neither identify key or critical elements of the disclosure nor delineate any scope of particular embodiments of the disclosure, or any scope of the claims. Its sole purpose is to present some concepts of the disclosure in a simplified form as a prelude to the more detailed description that is presented later.
According to an embodiment of the present application, disclosed is an apparatus for reducing compression artifacts of a lossy-compressed image. The apparatus may comprise: a feature extraction device comprising a first set of filters configured to extract patches from the lossy-compressed image and map the extracted patches to a first set of high dimensional feature vectors, and a feature enhancement device electronically communicated with the feature extraction device and comprising a second set of filters configured to denoise each high dimensional feature vector in the first set and map the denoised high dimensional feature vectors to a second set of high dimensional feature vectors. The apparatus further comprises a mapping device coupled to the feature enhancement device and comprising a third set of filters configured to map nonlinearly each high dimensional vector in the second set onto a restored patch-wise representation, and an aggregating device electronically communicated with the mapping device and configured to aggregate patch-wise representations mapped from all high dimensional vectors in the second set to generate a restored clear image.
In an aspect, the first set of filters may be configured to extract patches from the lossy-compressed image and map nonlinearly each of the extracted patches as a high dimensional feature vector, and the mapped vectors for all the patches form said first set of high dimensional feature vectors.
In yet another aspect, the second set of filters may be configured to denoise each high dimensional feature vector in the first set and map nonlinearly the denoised high  dimensional feature vectors to a second set of high dimensional feature vectors.
In an aspect, the first, second and third sets of filters may map the vectors based on predetermined first, second and third parameters, respectively, and the aggregating device may aggregate the patch-wise representations based on a fourth parameter.
In yet another aspect, the apparatus may further comprise a comparing device coupled to the aggregating device and configured to sample a ground truth uncompressed image corresponding to the lossy-compressed image from a predetermined training set and compare a dissimilarity between the aggregated restored clear image received from the aggregating device and the corresponding ground truth uncompressed image to generate a reconstruction error, wherein the reconstruction error is back-propagated in order to optimize the first, second, third and fourth parameters.
According to an embodiment of the present application, the apparatus may further comprise a training set preparation device coupled to the comparing device, in which the training set preparation device further comprises: a cropper configured to crop randomly a plurality of sub-images from a randomly selected training image to generate a set of ground truth uncompressed sub-images and a lossy-compressed sub-image generator electronically communicated with the cropper and configured to generate a set of lossy-compressed sub-images based on the set of ground truth uncompressed sub-images received from the cropper. Furthermore, the training set preparation device comprises a pairing device electronically communicated with the cropper and generator and configured to pair each of the ground truth uncompressed sub-images with a corresponding lossy-compressed sub-image and a collector electronically communicated with the pairing device and configured to collect the paired ground truth uncompressed sub-images and the lossy-compressed sub-image to form the predetermined training set.
In an aspect, the lossy-compressed sub-image generator further comprises a compressing device electronically communicated with the cropper and the generator and configured to encode and decode the ground truth sub-image with a compression encoder and decoder to generate the set of lossy-compressed sub-images.
In an aspect, the reconstruction error comprises a mean squared error.
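A mean squared reconstruction error can be sketched as follows; the function name and array shapes are illustrative only.

```python
import numpy as np

def reconstruction_error(restored, ground_truth):
    # Mean squared error between the restored image and its ground truth;
    # this is the quantity that is back-propagated during training.
    return np.mean((restored - ground_truth) ** 2)

a = np.zeros((4, 4))
b = np.ones((4, 4))
print(reconstruction_error(a, b))   # 1.0
```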
According to an embodiment of the present application, disclosed is a method for reducing compression artifacts of a lossy-compressed image. The method may comprise: extracting patches from the lossy-compressed image and mapping the extracted patches to a first set of high dimensional feature vectors by a feature extraction device comprising a first set of filters; denoising each high dimensional feature vector in the first set and mapping the denoised high dimensional feature vectors to a second set of high dimensional feature vectors by a feature enhancement device electronically communicated with the feature extraction device and comprising a second set of filters; mapping nonlinearly each high dimensional vector in the second set onto a restored patch-wise representation by a mapping device coupled to the feature enhancement device and comprising a third set of filters; and aggregating patch-wise representations mapped from all high dimensional vectors in the second set to generate a restored clear image by an aggregating device electronically communicated with the mapping device.
According to an embodiment of the present application, disclosed is an apparatus for reducing compression artifacts of a lossy-compressed image. The apparatus may comprise a reconstructing unit configured to reconstruct the lossy-compressed image to a restored clear image based on predetermined parameters and a training unit configured to train the convolutional neural network system with a predetermined training set so as to determine the parameters used by the reconstructing unit. The reconstructing unit may comprise: a feature extraction device comprising a first set of filters configured to extract patches from the lossy-compressed image and map the extracted patches to a first set of high dimensional feature vectors; a feature enhancement device electronically communicated with the feature extraction device and comprising a second set of filters configured to denoise each high dimensional feature vector in the first set and map the denoised high dimensional feature vectors to a second set of high dimensional feature vectors; a mapping device coupled to the feature enhancement device and comprising a third set of filters configured to map nonlinearly each high dimensional vector in the second set onto a restored patch-wise representation; and an aggregating device electronically communicated with the mapping device and configured to aggregate patch-wise representations mapped from all high dimensional vectors in the second set to generate a restored clear image. The feature extraction device, the feature enhancement device, the mapping device and the aggregating device comprise at least one convolutional layer, respectively. The convolutional layers are sequentially connected to each other to form a convolutional neural network system.
According to an embodiment of the present application, disclosed is a system for reducing compression artifacts of a lossy-compressed image. The system may comprise a memory that stores executable components and a processor, coupled to the memory, that executes the executable components to perform operations of the system. The executable components comprise: a feature extraction component configured to extract patches from the lossy-compressed image and map the extracted patches to a first set of high dimensional feature vectors; a feature enhancement component configured to denoise each high dimensional feature vector in the first set and map the denoised high dimensional feature vectors to a second set of high dimensional feature vectors; a mapping component configured to map nonlinearly each high dimensional vector in the second set onto a restored patch-wise representation; and an aggregating component configured to aggregate patch-wise representations mapped from all high dimensional vectors in the second set to generate a restored clear image.
The following description and the annexed drawings set forth certain illustrative aspects of the disclosure. These aspects are indicative, however, of but a few of the various ways in which the principles of the disclosure may be employed. Other aspects of the disclosure will become apparent from the following detailed description of the disclosure when considered in conjunction with the drawings.
Brief Description of the Drawing
Exemplary non-limiting embodiments of the present invention are described below with reference to the attached drawings. The drawings are illustrative and generally not to an exact scale. The same or similar elements on different figures are referenced with the same reference numbers.
Fig. 1 is a schematic diagram illustrating an apparatus for reducing compression artifacts of a lossy-compressed image consistent with an embodiment of the present  application.
Fig. 2 is a schematic diagram illustrating an apparatus for reducing compression artifacts of a lossy-compressed image consistent with another embodiment of the present application.
Fig. 3 is a schematic diagram illustrating a convolutional neural network system, consistent with some disclosed embodiments.
Fig. 4. is a schematic diagram illustrating a training unit of the apparatus, consistent with some disclosed embodiments.
Fig. 5. is a schematic diagram illustrating a training set preparation device of the training unit, consistent with some disclosed embodiments.
Fig. 6 is a schematic flowchart illustrating a method for reducing compression artifacts of a lossy-compressed image, consistent with some disclosed embodiments.
Fig. 7 is a schematic flowchart illustrating a method for training a convolutional neural network system for reducing compression artifacts of a lossy-compressed image, consistent with some disclosed embodiments.
Fig. 8 is a schematic diagram illustrating a system for reducing compression artifacts of a lossy-compressed image consistent with an embodiment of the present application.
Detailed Description
Reference will now be made in detail to some specific embodiments of the invention including the best modes contemplated by the inventors for carrying out the invention. Examples of these specific embodiments are illustrated in the accompanying drawings. While the invention is described in conjunction with these specific embodiments, it will be understood that it is not intended to limit the invention to the described embodiments. On the contrary, it is intended to cover alternatives, modifications, and equivalents as may be included within the spirit and scope of the invention as defined by the appended claims. In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention. The present invention may be practiced without some  or all of these specific details. In other instances, well-known process operations have not been described in detail in order not to unnecessarily obscure the present invention.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising, " when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
Referring to Fig. 1, the apparatus 1000 may comprise a feature extraction device 100, a feature enhancement device 200, a mapping device 300 and an aggregating device 400. Hereinafter, the feature extraction device 100, the feature enhancement device 200, the mapping device 300 and the aggregating device 400 will be further discussed in detail. For convenience of description, the lossy-compressed image is denoted by Y, and the restored clear image is denoted by F (Y) which is as similar as possible to a ground truth uncompressed image X.
According to an embodiment, the feature extraction device 100 comprises a first set of filters. The first set of filters is configured to extract patches from the lossy-compressed image and map the extracted patches to a first set of high dimensional feature vectors. For example, the first set of filters maps the extracted patches to a first set of high dimensional feature vectors by rule of a function F'(first parameters), where F'(x) is a nonlinear function (e.g., max(0, x), 1/(1+exp(-x)) or tanh(x)) and the first parameters are determined from predetermined parameters associated with the lossy-compressed image.
In an embodiment, the first set of high dimensional feature vectors may comprise a set of feature maps, whose number equals the dimensionality of the vectors. A popular strategy in image restoration is to densely extract patches and then represent them by a set of pre-trained bases such as PCA (Principal Component Analysis), DCT (Discrete Cosine Transform), Haar, etc.
According to an embodiment, the operations for the feature extraction device 100 may be formulated as:
F1(Y) = F'(W1*Y + B1),                     (1)
where W1 and B1 represent the filters and biases, respectively, and F'(x) is a nonlinear function (e.g., max(0, x), 1/(1+exp(-x)) or tanh(x)). Here W1 is of a size c×f1×f1×n1, where c is the number of channels in the input image, f1 is the spatial size of a filter, and n1 is the number of filters. Intuitively, W1 applies n1 convolutions on the image, and each convolution has a kernel size c×f1×f1. The output is composed of n1 feature maps. B1 is an n1-dimensional vector, each element of which is associated with a filter.
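Formula (1) can be sketched as a single convolutional layer with F'(x) = max(0, x). The naive NumPy implementation below uses hypothetical sizes (c=1, f1=9, n1=64, a 32×32 input) and random filters purely to show the shapes involved.

```python
import numpy as np

def conv_layer(Y, W, B):
    # Naive 'valid' convolution implementing F1(Y) = max(0, W1*Y + B1).
    # Y: (c, H, W) input image; W: (n1, c, f1, f1) filters; B: (n1,) biases.
    n1, c, f1, _ = W.shape
    _, H, Wd = Y.shape
    out = np.zeros((n1, H - f1 + 1, Wd - f1 + 1))
    for k in range(n1):
        for i in range(out.shape[1]):
            for j in range(out.shape[2]):
                out[k, i, j] = np.sum(W[k] * Y[:, i:i + f1, j:j + f1]) + B[k]
    return np.maximum(0.0, out)  # F'(x) = max(0, x)

rng = np.random.default_rng(0)
c, f1, n1 = 1, 9, 64            # e.g. grayscale input, 9x9 filters
Y = rng.standard_normal((c, 32, 32))
F1 = conv_layer(Y, rng.standard_normal((n1, c, f1, f1)), np.zeros(n1))
print(F1.shape)   # (64, 24, 24): n1 feature maps
```

A production implementation would of course use an optimized convolution routine rather than these explicit loops.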
The feature enhancement device 200 may be electronically communicated with the feature extraction device 100, and may comprise a second set of filters configured to denoise each high dimensional feature vector in the first set and map the denoised high dimensional feature vectors to a second set of high dimensional feature vectors, for example, a set of relatively cleaner feature vectors.
According to an embodiment, the feature enhancement device 200 is configured to denoise each high dimensional feature vector in the first set and map the denoised high dimensional feature vectors to the second set of high dimensional feature vectors by rule of a function F'(second parameters), where F'(x) is a nonlinear function (e.g., max(0, x), 1/(1+exp(-x)) or tanh(x)) and the second parameters are determined from predetermined parameters associated with the first set of high dimensional feature vectors.
In the embodiment, the feature extraction device 100 extracts an n1-dimensional feature for each patch. The second set of filters maps these n1-dimensional vectors into a set of n2-dimensional vectors. Each mapped vector is conceptually a relatively cleaner feature vector. These vectors comprise another set of feature maps.
According to an embodiment, the feature enhancement may be formulated as:
F2(Y) = F'(W2*F1(Y) + B2),                          (2)
where W2 is of a size n1×f2×f2×n2 and B2 is an n2-dimensional vector.
As shown, the apparatus 1000 may further comprise a mapping device 300. The mapping device 300 may be coupled to the feature enhancement device 200 and comprise a third set of filters configured to map nonlinearly each high dimensional vector in the second set onto a restored patch-wise representation.
According to an embodiment, the mapping device 300 is configured to map nonlinearly each of the high dimensional vectors onto a patch-wise representation by rule of a function F'(third parameters), where F'(x) is a nonlinear function (e.g., max(0, x), 1/(1+exp(-x)) or tanh(x)) and the third parameters are determined from predetermined parameters associated with the second set of high dimensional feature vectors, i.e. the cleaner high dimensional feature vectors.
In an embodiment, the feature enhancement device 200 generates a set of n2-dimensional feature vectors. The mapping device 300 maps each of these n2-dimensional vectors into an n3-dimensional vector. Each mapped vector is conceptually the representation of a restored patch. These vectors comprise another set of feature maps.
According to an embodiment, the mapping may be formulated as:
F3(Y) = F'(W3*F2(Y) + B3),                   (3)
where W3 is of a size n2×f3×f3×n3, and B3 is an n3-dimensional vector. Each of the output n3-dimensional vectors is conceptually a representation of a restored patch that will be used for reconstruction.
As shown, the apparatus 1000 may further comprise an aggregating device 400. The aggregating device 400 may be electronically communicated with the mapping device 300 and configured to aggregate the restored patch-wise representations to generate a restored clear image. The aggregating may be formulated as:
F(Y) = W4*F3(Y) + B4,                           (4)
where W4 is of a size n3×f4×f4×c, and B4 is a c-dimensional vector.
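Since formulae (1)-(4) are all of the same convolutional form, the end-to-end pipeline can be illustrated in plain Python under the simplifying assumption of 1×1 filters (f1 = f2 = f3 = f4 = 1), in which case each convolution reduces to a per-pixel linear map. The weights and layer sizes below are arbitrary toy values for illustration, not trained parameters:

```python
def relu(x):
    return max(0.0, x)

def pixel_layer(maps, W, B, act):
    """Apply act(W*Y + B) per pixel: with 1x1 filters, output map k at
    pixel (i, j) is act(sum_c W[k][c] * maps[c][i][j] + B[k])."""
    H, Wd = len(maps[0]), len(maps[0][0])
    return [[[act(sum(W[k][c] * maps[c][i][j] for c in range(len(maps))) + B[k])
              for j in range(Wd)]
             for i in range(H)]
            for k in range(len(W))]

# toy sizes: c = 1 input channel, n1 = n2 = n3 = 2
Y = [[[0.5, 0.25], [0.75, 1.0]]]                 # one-channel 2x2 "image"
W1, B1 = [[1.0], [-1.0]], [0.0, 1.0]             # feature extraction, eq (1)
W2, B2 = [[0.5, 0.5], [1.0, 0.0]], [0.0, 0.0]    # feature enhancement, eq (2)
W3, B3 = [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]    # nonlinear mapping, eq (3)
W4, B4 = [[0.5, 0.5]], [0.0]                     # aggregation, eq (4): linear

F1 = pixel_layer(Y,  W1, B1, relu)
F2 = pixel_layer(F1, W2, B2, relu)
F3 = pixel_layer(F2, W3, B3, relu)
F  = pixel_layer(F3, W4, B4, lambda x: x)        # restored one-channel image
```

The last layer uses the identity instead of a nonlinearity, reflecting that formula (4) aggregates linearly.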
According to the embodiment, the apparatus 1000 may further comprise a comparing device (not shown) which is coupled to the aggregating device 400 and configured to sample a ground truth uncompressed sub-image corresponding to the lossy-compressed sub-image from a predetermined training set and compare dissimilarity between the aggregated restored clear sub-image received from the aggregating device 400 and the sampled ground truth uncompressed sub-image to generate a reconstruction error. For example, the reconstruction error comprises a mean squared error. The reconstruction error is back-propagated in order to determine the parameters, i.e., W1, W2, W3, W4, B1, B2, B3 and B4.
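As a minimal sketch of the comparing device's operation, the mean squared error and its gradient (the quantity that is back-propagated to adjust W1-W4 and B1-B4) can be written as follows, treating the restored and ground truth sub-images as flat lists of pixel values:

```python
def mse(restored, truth):
    """Mean squared error between a restored sub-image and its ground
    truth, both given as flat lists of pixel values."""
    n = len(restored)
    return sum((r - t) ** 2 for r, t in zip(restored, truth)) / n

def mse_grad(restored, truth):
    """Gradient of the MSE with respect to each restored pixel: the
    signal fed backwards through the network during back-propagation."""
    n = len(restored)
    return [2.0 * (r - t) / n for r, t in zip(restored, truth)]
```

A perfectly restored pixel contributes zero gradient, so only pixels that still deviate from the ground truth drive the parameter updates.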
Fig. 2 is a schematic diagram illustrating an apparatus 1000’ for reducing compression artifacts of a lossy-compressed image consistent with another embodiment of the present application. As shown in Fig. 2, the apparatus 1000’ may comprise a reconstructing unit 100’ and a training unit 200’. The reconstructing unit 100’ is configured to reconstruct the lossy-compressed image to a restored clear image based on predetermined parameters.
According to an embodiment shown in Fig. 2, the reconstructing unit 100’ may further comprise a feature extraction device 110’, a feature enhancement device 120’, a mapping device 130’ and an aggregating device 140’. In an embodiment, the feature extraction device 110’, the feature enhancement device 120’, the mapping device 130’ and the aggregating device 140’ may each comprise at least one convolutional layer, and the convolutional layers are sequentially connected to each other to form a convolutional neural network system.
Fig. 3 illustrates the layer configuration of the convolutional neural network system in a mathematical simulation model. In one embodiment, each of the feature extraction device 110’, the feature enhancement device 120’, the mapping device 130’ and the aggregating device 140’ may be simulated as at least one convolutional layer, and different operations are performed at the different convolutional layers.
In the embodiment, the feature extraction device 110’ is configured to extract patches from the lossy-compressed image and map the extracted patches to a first set of high dimensional feature vectors. This is equivalent to convolving the image by a set of filters as mentioned above.
The feature enhancement device 120’ is configured to be electronically communicated with the feature extraction device 110’ and denoise each high dimensional feature vector in the first set and map the denoised high dimensional feature vectors to a second set of high dimensional feature vectors, for example, a set of relatively cleaner feature vectors. This is equivalent to applying a second set of filters as mentioned above.
The mapping device 130’ is configured to be coupled to the feature enhancement device 120’ and map nonlinearly each high dimensional vector in the second set onto a restored patch-wise representation. This is equivalent to applying a third set of filters as mentioned above.
The aggregating device 140’ is configured to be electronically communicated with the mapping device 130’ and aggregate patch-wise representations mapped from all high dimensional vectors in the second set to generate a restored clear image.
In an embodiment, the feature extracting device 110’, the feature enhancement device 120’, the mapping device 130’ and the aggregating device 140’ each comprise at least one convolutional layer, and the convolutional layers are sequentially connected to each other to form a convolutional neural network system. The convolutional neural network dates back decades and has recently seen explosive popularity, partially due to its success in image classification. Convolutional neural networks have also been applied to natural image denoising and to removing noisy patterns (dirt/rain).
Alternatively, it is possible to add more convolutional layers to increase the non-linearity. But this can significantly increase the complexity of the convolutional neural network system, and thus demands more training data and time.
The training unit 200’ is configured to train the convolutional neural network system with a predetermined training set so as to optimize the parameters, for example W1, W2, W3, W4, B1, B2, B3 and B4, used by the reconstructing unit. According to an embodiment as shown in Fig. 4, the training unit 200’ may comprise a sampling device 210’, a comparing device 220’, and a back-propagating device 230’.
The sampling device 210’ may be configured to sample a lossy-compressed sub-image and its corresponding ground truth uncompressed sub-image from a predetermined training set and input the lossy-compressed sub-image to the convolutional neural network system. Here, “sub-images” means that these samples are treated as small “images” rather than “patches”, in the sense that “patches” overlap and require some averaging as post-processing, whereas “sub-images” do not.
The comparing device 220’ may be configured to compare dissimilarity between the reconstructed clear sub-image based on the input lossy-compressed sub-image from the convolutional neural network system and the corresponding ground truth uncompressed sub-image to generate a reconstruction error. For example, the reconstruction error may comprise a mean squared error, and the error is minimized by using stochastic gradient descent with the standard back propagation.
The back-propagating device 230’ is configured to back-propagate the reconstruction error through the convolutional neural network system so as to adjust weights on connections between neurons of the convolutional neural network system.
It should be noted that the convolutional neural network system does not preclude the use of other kinds of reconstruction error, as long as the reconstruction error is differentiable. If a better perceptually motivated metric is given during the training, the convolutional neural network system can flexibly adapt to that metric.
In one embodiment, the apparatuses 1000 and 1000’ may further comprise a training set preparation device coupled to the comparing device and configured to prepare the predetermined training set for training the convolutional neural network system. Fig. 5 is a schematic diagram illustrating the training set preparation device. As shown, the training set preparation device may comprise a cropper 241’, a lossy-compressed sub-image generator 242’, a pairing device 243’ and a collector 244’.
The cropper 241’ may be configured to crop randomly a plurality of sub-images from a randomly selected training image to generate a set of ground truth uncompressed sub-images. For example, the cropper 241’ may crop n sub-images of m×m pixels each. The lossy-compressed sub-image generator 242’ may be electronically communicated with the cropper 241’ and configured to generate a set of lossy-compressed sub-images based on the set of ground truth uncompressed sub-images received from the cropper 241’. The pairing device 243’ may be electronically communicated with the cropper 241’ and the generator 242’ and configured to pair each of the ground truth uncompressed sub-images with a corresponding lossy-compressed sub-image. The collector 244’ may be electronically communicated with the pairing device 243’ and configured to collect all the pairs to form the predetermined training set.
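The cropping and pairing pipeline above can be sketched in plain Python. The `lossy_compress` function here is a hypothetical stand-in: it coarsely quantizes pixel values to mimic the information loss of a real encoder/decoder round trip (e.g., JPEG), which the compressing device would use in practice:

```python
import random

def lossy_compress(sub_image, step=32):
    # stand-in for a real compression encoder/decoder round trip:
    # coarse quantization of pixel values mimics the information loss
    return [[(p // step) * step for p in row] for row in sub_image]

def prepare_training_set(image, n, m, seed=0):
    """Crop n random m x m sub-images from one training image and pair
    each ground truth crop with its lossy-compressed counterpart."""
    rng = random.Random(seed)
    H, W = len(image), len(image[0])
    pairs = []
    for _ in range(n):
        i = rng.randrange(H - m + 1)
        j = rng.randrange(W - m + 1)
        truth = [row[j:j + m] for row in image[i:i + m]]
        pairs.append((lossy_compress(truth), truth))
    return pairs

image = [[(r * 8 + c) for c in range(8)] for r in range(8)]   # toy 8x8 image
training_set = prepare_training_set(image, n=3, m=4)
```

Each entry of `training_set` is a (lossy-compressed, ground truth) pair, the format the sampling device 210’ draws from during training.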
According to an embodiment, the lossy-compressed sub-image generator 242’ may comprise a compressing device electronically communicated with the cropper 241’ and configured to encode and decode the ground truth sub-images with a compression encoder and decoder to generate the set of lossy-compressed sub-images.
Fig. 6 is a schematic flowchart illustrating a method 2000 for reducing compression artifacts of a lossy-compressed image, consistent with some disclosed embodiments. Hereafter, the method 2000 may be described in detail with respect to Fig. 6.
At step S210, patches are extracted from the lossy-compressed image and each of the extracted patches is mapped into a high dimensional feature vector by the feature extraction device comprising the first set of filters, such that a first set of high dimensional feature vectors is formed. In an embodiment, these vectors comprise a set of feature maps, the number of which equals the dimensionality of the vectors. A popular strategy in image restoration is to densely extract patches and then represent them by a set of pre-trained bases such as PCA, DCT, Haar, etc.
At step S220, each high dimensional feature vector in the first set is denoised and the denoised high dimensional feature vectors are mapped into a second set of high  dimensional feature vectors by a feature enhancement device electronically communicated with the feature extraction device and comprising a second set of filters. In the embodiment, the feature extraction device extracts an n1-dimensional feature for each patch. The second set of filters maps these n1-dimensional vectors into a set of n2-dimensional vectors. Each mapped vector is conceptually a relatively cleaner feature vector. These vectors comprise another set of feature maps.
At step S230, each high dimensional vector in the second set is mapped nonlinearly onto a restored patch-wise representation by a mapping device coupled to the feature enhancement device and comprising a third set of filters. In the embodiment, the feature enhancement device generates a set of n2-dimensional feature vectors. The mapping device maps each of these n2-dimensional vectors into an n3-dimensional vector. Each mapped vector is conceptually the representation of a restored patch. These vectors comprise another set of feature maps.
At step S240, patch-wise representations mapped from all high dimensional vectors in the second set are aggregated to generate a restored clear image by an aggregating device electronically communicated with the mapping device. In an embodiment, steps S210-S240 may be simulated by the above-mentioned formulae (1)-(4).
According to an embodiment, the patches may be extracted from the lossy-compressed image and each of the extracted patches may be mapped as a high dimensional feature vector by rule of a function F'(first parameters), where F'(x) is a nonlinear function (e.g., max(0, x), 1/(1+exp(-x)) or tanh(x)) and the first parameters are determined from predetermined parameters associated with the lossy-compressed image.
According to an embodiment, the first set of high dimensional feature vectors may be denoised and the denoised high dimensional feature vectors may be mapped nonlinearly to a second set of high dimensional feature vectors, i.e., a set of relatively cleaner feature vectors, by rule of a function F'(second parameters), where F'(x) is a nonlinear function (e.g., max(0, x), 1/(1+exp(-x)) or tanh(x)) and the second parameters are determined from predetermined parameters associated with the first set of high dimensional feature vectors.
According to an embodiment, each high dimensional vector in the second set may be mapped nonlinearly onto a restored patch-wise representation by rule of a function F'(third parameters), where F'(x) is a nonlinear function (e.g., max(0, x), 1/(1+exp(-x)) or tanh(x)) and the third parameters are determined from predetermined parameters associated with the second set of high dimensional vectors.
According to an embodiment, after the patch-wise representations are aggregated to generate a restored clear image, the method 2000 may further comprise a step of sampling a ground truth uncompressed sub-image corresponding to the lossy-compressed sub-image from a predetermined training set and a step of comparing dissimilarity between the aggregated restored clear sub-image and the corresponding ground truth uncompressed sub-image to generate a reconstruction error. The reconstruction error is back-propagated in order to optimize the parameters, i.e., W1, W2, W3, W4, B1, B2, B3 and B4.
According to an embodiment, before a ground truth uncompressed sub-image corresponding to the lossy-compressed sub-image is sampled from a predetermined training set, the method 2000 further comprises a step of preparing the predetermined training set. In particular, a plurality of sub-images is first cropped from a randomly selected training image to generate a set of ground truth uncompressed sub-images. For example, n sub-images of m×m pixels each may be cropped. Next, a set of lossy-compressed sub-images are generated based on the set of ground truth uncompressed sub-images. Then, each of the ground truth uncompressed sub-images is paired with a corresponding lossy-compressed sub-image. Then, all the pairs are collected to form the predetermined training set.
According to an embodiment, a method 3000 for training a convolutional neural network system for reducing compression artifacts of a lossy-compressed image is illustrated. Hereafter, the method 3000 may be described in detail with respect to Fig. 7.
As shown in Fig. 7, a lossy-compressed sub-image and its corresponding ground truth uncompressed sub-image are sampled from a predetermined training set at step S310. At step S320, a restored clear sub-image is reconstructed from the lossy-compressed sub-image by the convolutional neural network system. At step S330, a reconstruction error is generated by comparing dissimilarity between the reconstructed clear sub-image and the ground truth uncompressed sub-image. At step S340, the reconstruction error is back-propagated through the convolutional neural network system so as to adjust weights on connections between neurons of the convolutional neural network system. Steps S310-S340 are repeated until an average value of the reconstruction error is lower than a preset threshold, for example, half of the mean squared error between the lossy-compressed sub-images and the ground truth uncompressed sub-images in the predetermined training set.
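The loop of steps S310-S340 can be illustrated with a deliberately tiny model: a single weight w standing in for the whole convolutional neural network, trained by gradient descent on the mean squared error until it falls below a preset threshold. The samples and learning rate are arbitrary illustrative values:

```python
def train(pairs, lr=0.01, threshold=1e-4, max_iters=10000):
    """Toy version of the S310-S340 loop: the 'network' is a single
    weight w mapping a compressed value y to a restored value w * y;
    the MSE is reduced by gradient descent until below the threshold."""
    w = 0.0
    err = float("inf")
    for _ in range(max_iters):
        err = sum((w * y - x) ** 2 for y, x in pairs) / len(pairs)   # S320/S330
        if err < threshold:                                          # stop test
            break
        grad = sum(2 * (w * y - x) * y for y, x in pairs) / len(pairs)
        w -= lr * grad                                               # S340
    return w, err

pairs = [(1.0, 2.0), (2.0, 4.0)]   # (compressed, ground truth) samples
w, err = train(pairs)
```

On these samples the ideal mapping is w = 2, and the loop converges to it well within the iteration budget; a real network repeats the same cycle over convolutional weights instead of a scalar.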
Referring to Fig. 8, a system 4000 is illustrated. The system 4000 comprises a memory 402 that stores executable components and a processor 404, coupled to the memory 402, that executes the executable components to perform operations of the system 4000. The executable components may comprise: a feature extraction component 410 configured to extract patches from the lossy-compressed image and map the extracted patches to a first set of high dimensional feature vectors; and a feature enhancement component 420 configured to denoise each high dimensional feature vector in the first set and map the denoised high dimensional feature vectors to a second set of high dimensional feature vectors. In addition, the executable components may further comprise: a mapping component 430 configured to map nonlinearly each high dimensional vector in the second set onto a restored patch-wise representation; and an aggregating component 440 configured to aggregate patch-wise representations mapped from all high dimensional vectors in the second set to generate a restored clear image.
In an aspect, the feature extraction component 410 is configured to extract patches from the lossy-compressed image and map nonlinearly each of the extracted patches as a high dimensional feature vector, with the mapped vectors for all the patches forming said first set of high dimensional feature vectors.
In an embodiment, the feature enhancement component 420 is configured to denoise each high dimensional feature vector in the first set and map nonlinearly the denoised high dimensional feature vectors to a second set of high dimensional feature vectors.
In an embodiment, the feature extraction component 410, feature enhancement component 420 and mapping component 430 map the vectors based on predetermined first, second and third parameters, respectively.
According to another embodiment, the executable components further comprise a comparing component coupled to the aggregating component and configured to sample a ground truth uncompressed image corresponding to the lossy-compressed image from a predetermined training set and compare a dissimilarity between the aggregated restored clear image received from the aggregating component and the corresponding ground truth uncompressed image to generate a reconstruction error, wherein the reconstruction error is back-propagated in order to optimize the first, second and third parameters.
In an embodiment, the executable components further comprise a training set preparation component coupled to the comparing component. The training set preparation component further comprises: a cropper configured to crop randomly a plurality of sub-images from a randomly selected training image to generate a set of ground truth uncompressed sub-images; a lossy-compressed sub-image generator electronically communicated with the cropper and configured to generate a set of lossy-compressed sub-images based on the set of ground truth uncompressed sub-images received from the cropper; a pairing module electronically communicated with the cropper and generator and configured to pair each of the ground truth uncompressed sub-images with a corresponding lossy-compressed sub-image; and a collector electronically communicated with the pairing module and configured to collect the paired ground truth uncompressed sub-images and the lossy-compressed sub-image to form the predetermined training set.
In an embodiment, the lossy-compressed sub-image generator further comprises a compressing module electronically communicated with the cropper and generator and configured to encode and decode the ground truth sub-image with compression encoder and decoder to generate the set of lossy-compressed sub-images.
In contrast to existing methods, the present application does not explicitly learn the dictionaries or manifolds for modeling the patch space. These are implicitly achieved via the convolutional layers. Furthermore, the feature extraction, feature enhancement and aggregation are also formulated as convolutional layers, and are thus involved in the optimization. The method and apparatus of the present application reveal different kinds of compression artifacts and provide an efficient reduction of various compression artifacts in different image regions. In the method and apparatus of the present application, the entire convolutional neural network is fully obtained through training, with no pre/post-processing. With a lightweight structure, the apparatus and method of the present application have achieved superior performance to the state-of-the-art methods.
Embodiments within the scope of the present invention can be implemented in digital electronic circuitry, or in computer hardware, firmware, software, or in combinations thereof. Apparatus within the scope of the present invention can be implemented in a computer program product tangibly embodied in a machine-readable storage device for execution by a programmable processor; and method actions within the scope of the present invention can be performed by a programmable processor executing a program of instructions to perform functions of the invention by operating on input data and generating output.
Embodiments within the scope of the present invention can be implemented advantageously in one or more computer programs that are executable on a programmable system including at least one programmable processor coupled to receive data and instructions from, and to transmit data and instructions to, a data storage system, at least one input device, and at least one output device. Each computer program can be implemented in a high-level procedural or object oriented programming language, or in assembly or machine language if desired; and in any case, the language can be a compiled or interpreted language. Suitable processors include, by way of example, both general and special purpose microprocessors. Generally, a processor will receive instructions and data from a read-only memory and/or a random access memory. Generally, a computer will include one or more mass storage devices for storing data files.
Embodiments within the scope of the present invention include computer-readable media for carrying or having computer-executable instructions, computer-readable instructions, or data structures stored thereon. Such computer-readable media may be any available media, which is accessible by a general-purpose or special-purpose computer system. Examples of computer-readable media may include physical storage media such as RAM, ROM, EPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other media which can be  used to carry or store desired program code means in the form of computer-executable instructions, computer-readable instructions, or data structures and which may be accessed by a general-purpose or special-purpose computer system. Any of the foregoing can be supplemented by, or incorporated in, ASICs (application-specific integrated circuits) . While particular embodiments of the present invention have been shown and described, changes and modifications may be made to such embodiments without departing from the true scope of the invention.
Although the preferred examples of the present invention have been described, those skilled in the art can make variations or modifications to these examples upon knowing the basic inventive concept. The appended claims are intended to be construed as comprising the preferred examples and all the variations or modifications falling within the scope of the present invention.
Obviously, those skilled in the art can make variations or modifications to the present invention without departing from the spirit and scope of the present invention. As such, if these variations or modifications belong to the scope of the claims and equivalent technique, they may also fall within the scope of the present invention.

Claims (20)

  1. An apparatus for reducing compression artifacts of a lossy-compressed image, comprising:
    a feature extraction device comprising a first set of filters configured to extract patches from the lossy-compressed image and map the extracted patches to a first set of high dimensional feature vectors;
    a feature enhancement device electronically communicated with the feature extraction device and comprising a second set of filters configured to denoise each high dimensional feature vector in the first set and map the denoised high dimensional feature vectors to a second set of high dimensional feature vectors;
    a mapping device electronically coupled to the feature enhancement device and comprising a third set of filters configured to map nonlinearly each high dimensional vector in the second set onto a restored patch-wise representation; and
    an aggregating device electronically communicated with the mapping device and configured to aggregate patch-wise representations mapped from all high dimensional vectors in the second set to generate a restored clear image.
  2. The apparatus according to claim 1, wherein the first set of filters is configured to extract patches from the lossy-compressed image and map nonlinearly each of the extracted patches as a high dimensional feature vector, and the mapped vectors for all the patches form said first set of high dimensional feature vectors.
  3. The apparatus according to claim 1, wherein the second set of filters is configured to denoise each high dimensional feature vector in the first set and map nonlinearly the denoised high dimensional feature vectors to a second set of high dimensional feature vectors.
  4. The apparatus according to any of claims 1-3, wherein the first, second and third set of filters map the vectors based on predetermined first, second and third parameters, respectively.
  5. The apparatus according to claim 4, further comprising:
    a comparing device electronically coupled to the aggregating device and configured to sample a ground truth uncompressed image corresponding to the lossy-compressed image from a predetermined training set and compare a dissimilarity between the aggregated restored clear image received from the aggregating device and the corresponding ground truth uncompressed image to generate a reconstruction error, wherein the reconstruction error is back-propagated in order to optimize the first, second and third parameters.
  6. The apparatus according to claim 5, further comprising a training set preparation device electronically coupled to the comparing device, wherein the training set preparation device further comprises:
    a cropper configured to crop randomly a plurality of sub-images from a randomly selected training image to generate a set of ground truth uncompressed sub-images;
    a lossy-compressed sub-image generator electronically communicated with the cropper and configured to generate a set of lossy-compressed sub-images based on the set of ground truth uncompressed sub-images received from the cropper;
    a pairing device electronically communicated with the cropper and generator and configured to pair each of the ground truth uncompressed sub-images with a corresponding lossy-compressed sub-image; and
    a collector electronically communicated with the pairing device and configured to collect the paired ground truth uncompressed sub-images and the lossy-compressed sub-image to form the predetermined training set.
  7. The apparatus according to claim 6, wherein the lossy-compressed sub-image generator further comprises a compressing device electronically communicated with the cropper and configured to encode and decode the ground truth sub-image with a compression encoder and decoder to generate the set of lossy-compressed sub-images.
  8. The apparatus according to claim 5, wherein the reconstruction error comprises a mean squared error.
  9. A method for reducing compression artifacts of a lossy-compressed image, comprising:
    extracting patches from the lossy-compressed image and mapping the extracted patches to a first set of high dimensional feature vectors by a feature extraction device comprising a first set of filters;
    denoising each high dimensional feature vector in the first set and mapping the denoised high dimensional feature vectors to a second set of high dimensional feature vectors by a feature enhancement device electronically communicated with the feature extraction device and comprising a second set of filters;
    mapping nonlinearly each high dimensional vector in the second set onto a restored patch-wise representation by a mapping device electronically coupled to the feature enhancement device and comprising a third set of filters; and
    aggregating patch-wise representations mapped from all high dimensional vectors in the second set to generate a restored clear image by an aggregating device electronically communicated with the mapping device.
  10. The method according to claim 9, wherein the extracting patches from the lossy-compressed image and mapping the extracted patches to a first set of high dimensional feature vectors further comprises:
    extracting patches from the lossy-compressed image and mapping nonlinearly each of the extracted patches as a high dimensional feature vector, and the mapped vectors for all the patches forming said first set of high dimensional feature vectors.
  11. The method according to claim 9, wherein the denoising each high dimensional feature vector in the first set and mapping the denoised high dimensional feature vectors to a second set of high dimensional feature vectors further comprises:
    denoising each high dimensional feature vector in the first set and mapping nonlinearly the denoised high dimensional feature vectors to a second set of high dimensional feature vectors.
  12. The method according to any of claims 9-11, wherein the first, second and third set of filters map the vectors based on predetermined first, second and third parameters, respectively.
  13. The method according to claim 12, after the aggregating, further comprising:
    sampling a ground truth uncompressed image corresponding to the lossy-compressed image from a predetermined training set; and
    comparing a dissimilarity between the aggregated restored clear image and the corresponding ground truth uncompressed image to generate a reconstruction error, wherein the reconstruction error is back-propagated in order to optimize the first, second and third parameters.
  14. The method according to claim 13, wherein before sampling a ground truth uncompressed image corresponding to the lossy-compressed image from a predetermined training set, further comprising:
    cropping randomly a plurality of sub-images from a randomly selected training image to generate a set of ground truth uncompressed sub-images;
    generating a set of lossy-compressed sub-images based on the set of ground truth uncompressed sub-images;
    pairing each of the ground truth uncompressed sub-images with a corresponding lossy-compressed sub-image; and
    collecting the paired ground truth uncompressed sub-images and the lossy-compressed sub-image to form the predetermined training set.
  15. The method according to claim 14, wherein the generating a set of lossy-compressed sub-images based on the set of ground truth uncompressed sub-images further comprises:
    encoding and decoding the ground truth sub-image with a compression encoder and decoder to generate the set of lossy-compressed sub-images.
  16. The method according to claim 13, wherein the reconstruction error comprises a mean squared error.
  17. An apparatus for reducing compression artifacts of a lossy-compressed image, comprising:
    a reconstructing unit configured to reconstruct the lossy-compressed image to a restored clear image based on predetermined parameters, wherein the reconstructing unit comprises:
    a feature extraction device comprising a first set of filters configured to extract patches from the lossy-compressed image and map the extracted patches to a first set of high dimensional feature vectors;
    a feature enhancement device electronically communicated with the feature extraction device and comprising a second set of filters configured to denoise each high dimensional feature vector in the first set and map the denoised high dimensional feature vectors to a second set of high dimensional feature vectors;
    a mapping device electronically coupled to the feature enhancement device and comprising a third set of filters configured to map nonlinearly each high dimensional vector in the second set onto a restored patch-wise representation; and
    an aggregating device electronically communicated with the mapping device and configured to aggregate patch-wise representations mapped from all high dimensional vectors in the second set to generate a restored clear image;
    wherein the feature extracting device, the feature enhancement device, the mapping device and the aggregating device comprise at least one convolutional layer, respectively, and the convolutional layers are sequentially connected to each other to form a convolutional neural network system;
    a training unit electronically communicated with the reconstructing unit and configured to train the convolutional neural network system with a predetermined training set so as to modify the predetermined parameters used by the reconstructing unit.
  18. A system for reducing compression artifacts of a lossy-compressed image, comprising:
    a memory that stores executable components; and
    a processor, electronically coupled to the memory, that executes the executable components to perform operations of the system, the executable components comprising:
    a feature extraction component configured to extract patches from the lossy-compressed image and map the extracted patches to a first set of high dimensional feature vectors;
    a feature enhancement component configured to denoise each high dimensional feature vector in the first set and map the denoised high dimensional feature vectors to a second set of high dimensional feature vectors;
    a mapping component configured to map nonlinearly each high dimensional vector in the second set onto a restored patch-wise representation; and
    an aggregating component configured to aggregate patch-wise representations mapped from all high dimensional vectors in the second set to generate a restored clear image.
  19. The system according to claim 18, wherein the feature extraction component is configured to extract patches from the lossy-compressed image and map nonlinearly each of the extracted patches to a high dimensional feature vector, the mapped vectors for all the patches forming said first set of high dimensional feature vectors.
  20. The system according to claim 18, wherein the feature enhancement component is configured to denoise each high dimensional feature vector in the first set and map nonlinearly the denoised high dimensional feature vectors to the second set of high dimensional feature vectors.
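Claims 17-20 describe a four-stage convolutional pipeline (feature extraction, feature enhancement, nonlinear mapping, aggregation) whose parameters are trained by minimizing a reconstruction error such as the mean squared error of claim 16. The sketch below is an illustrative toy rendering of that data flow in plain Python, not the claimed apparatus: the channel counts, kernel sizes, and random weights are assumptions chosen only to demonstrate how the stages connect.

```python
import random

random.seed(0)

def conv_same(stack, filters, relu=True):
    """'Same'-padded 2-D convolution over a stack of feature maps.

    stack:   list of C_in feature maps, each an H x W list of lists
    filters: list of C_out banks; each bank holds C_in k x k kernels
    """
    c_in, H, W = len(stack), len(stack[0]), len(stack[0][0])
    k = len(filters[0][0])
    pad = k // 2
    out = []
    for bank in filters:
        fmap = [[0.0] * W for _ in range(H)]
        for i in range(H):
            for j in range(W):
                s = 0.0
                for c in range(c_in):
                    for u in range(k):
                        for v in range(k):
                            y, x = i + u - pad, j + v - pad
                            if 0 <= y < H and 0 <= x < W:
                                s += stack[c][y][x] * bank[c][u][v]
                # ReLU nonlinearity on hidden stages; linear output stage
                fmap[i][j] = max(s, 0.0) if relu else s
        out.append(fmap)
    return out

def rand_filters(c_out, c_in, k):
    # Stand-in for learned weights; in the claims these come from training.
    return [[[[random.gauss(0.0, 0.1) for _ in range(k)] for _ in range(k)]
             for _ in range(c_in)] for _ in range(c_out)]

def mse(a, b):
    """Mean squared reconstruction error between two single-channel images."""
    n = len(a) * len(a[0])
    return sum((x - y) ** 2
               for ra, rb in zip(a, b) for x, y in zip(ra, rb)) / n

# Toy 8x8 "lossy-compressed" input and its "ground truth" counterpart.
ground_truth = [[random.random() for _ in range(8)] for _ in range(8)]
compressed = [[[v + random.gauss(0.0, 0.05) for v in row]
               for row in ground_truth]]

# Four sequential convolutional stages mirroring claims 17/18
# (channel counts and kernel sizes are illustrative assumptions):
h1 = conv_same(compressed, rand_filters(4, 1, 5))            # feature extraction
h2 = conv_same(h1, rand_filters(4, 4, 3))                    # feature enhancement
h3 = conv_same(h2, rand_filters(4, 4, 1))                    # nonlinear mapping
restored = conv_same(h3, rand_filters(1, 4, 5), relu=False)  # aggregation

# The reconstruction error that training (claims 13-16) would minimize:
print(mse(restored[0], ground_truth))
```

Because the weights here are random rather than trained, the output is not actually a restored image; in the claimed system the training unit adjusts all four filter banks end-to-end on a training set of paired ground truth and lossy-compressed sub-images.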
PCT/CN2015/000093 2015-02-13 2015-02-13 An apparatus and a method for reducing compression artifacts of a lossy-compressed image WO2016127271A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201580075726.4A CN107251053B (en) 2015-02-13 2015-02-13 A method and device for reducing compression artifacts of a lossy-compressed image
PCT/CN2015/000093 WO2016127271A1 (en) 2015-02-13 2015-02-13 An apparatus and a method for reducing compression artifacts of a lossy-compressed image

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2015/000093 WO2016127271A1 (en) 2015-02-13 2015-02-13 An apparatus and a method for reducing compression artifacts of a lossy-compressed image

Publications (1)

Publication Number Publication Date
WO2016127271A1 true WO2016127271A1 (en) 2016-08-18

Family

ID=56614081

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2015/000093 WO2016127271A1 (en) 2015-02-13 2015-02-13 An apparatus and a method for reducing compression artifacts of a lossy-compressed image

Country Status (2)

Country Link
CN (1) CN107251053B (en)
WO (1) WO2016127271A1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108765338A (en) * 2018-05-28 2018-11-06 西华大学 Space-target image restoration method based on a convolutional auto-encoder convolutional neural network
CN109801218B (en) * 2019-01-08 2022-09-20 南京理工大学 Multispectral remote sensing image Pan-sharpening method based on multilayer coupling convolutional neural network
CN111986278B (en) * 2019-05-22 2024-02-06 富士通株式会社 Image encoding device, probability model generating device, and image compression system

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040120564A1 (en) * 2002-12-19 2004-06-24 Gines David Lee Systems and methods for tomographic reconstruction of images in compressed format
US20070076959A1 (en) * 2005-10-03 2007-04-05 Xerox Corporation JPEG detectors and JPEG image history estimators
US20090067491A1 (en) * 2007-09-07 2009-03-12 Microsoft Corporation Learning-Based Image Compression
US20090238476A1 (en) * 2008-03-24 2009-09-24 Microsoft Corporation Spectral information recovery for compressed image restoration

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB201119206D0 (en) * 2011-11-07 2011-12-21 Canon Kk Method and device for providing compensation offsets for a set of reconstructed samples of an image
CN103517022B (en) * 2012-06-29 2017-06-20 华为技术有限公司 An image data compression and decompression method and device
CN103475876B (en) * 2013-08-27 2016-06-22 北京工业大学 A learning-based super-resolution reconstruction method for low-bit-rate compressed images


Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107871306A (en) * 2016-09-26 2018-04-03 北京眼神科技有限公司 Method and device for denoising picture
CN107871306B (en) * 2016-09-26 2021-07-06 北京眼神科技有限公司 Method and device for denoising picture
CN109120937A (en) * 2017-06-26 2019-01-01 杭州海康威视数字技术股份有限公司 A video encoding method, a video decoding method, a device and electronic equipment
CN109151475A (en) * 2017-06-27 2019-01-04 杭州海康威视数字技术股份有限公司 A video encoding method, a video decoding method, a device and electronic equipment
WO2023111856A1 (en) * 2021-12-14 2023-06-22 Spectrum Optix Inc. Neural network assisted removal of video compression artifacts

Also Published As

Publication number Publication date
CN107251053A (en) 2017-10-13
CN107251053B (en) 2018-08-28

Similar Documents

Publication Publication Date Title
WO2016127271A1 (en) An apparatus and a method for reducing compression artifacts of a lossy-compressed image
Liu et al. Multi-level wavelet-CNN for image restoration
Cui et al. Deep network cascade for image super-resolution
Li et al. An efficient deep convolutional neural networks model for compressed image deblocking
Zhang et al. One-two-one networks for compression artifacts reduction in remote sensing
CN110490832B (en) Magnetic resonance image reconstruction method based on regularized depth image prior method
CN111047516A (en) Image processing method, image processing device, computer equipment and storage medium
CN107463989A (en) A deep-learning-based method for removing compression artifacts from images
CN112801901A (en) Image deblurring algorithm based on block multi-scale convolution neural network
WO2016019484A1 (en) An apparatus and a method for providing super-resolution of a low-resolution image
Marinč et al. Multi-kernel prediction networks for denoising of burst images
Yue et al. CID: Combined image denoising in spatial and frequency domains using Web images
CN107301662B (en) Compression recovery method, device and equipment for depth image and storage medium
CN112053308B (en) Image deblurring method and device, computer equipment and storage medium
CN112150400B (en) Image enhancement method and device and electronic equipment
CN104700440B (en) Partial k-space magnetic resonance image reconstruction method
CN113673675A (en) Model training method and device, computer equipment and storage medium
Korus et al. Content authentication for neural imaging pipelines: End-to-end optimization of photo provenance in complex distribution channels
CN102148986A (en) Method for encoding progressive image based on adaptive block compressed sensing
KR20100016272A (en) Image compression and decompression using the pixon method
US8634671B2 (en) Methods and apparatus to perform multi-focal plane image acquisition and compression
Amaranageswarao et al. Residual learning based densely connected deep dilated network for joint deblocking and super resolution
Najgebauer et al. Fully convolutional network for removing DCT artefacts from images
CN113033616B (en) High-quality video reconstruction method, device, equipment and storage medium
WO2022037146A1 (en) Image processing method, apparatus, device, computer storage medium, and system

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15881433

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 15881433

Country of ref document: EP

Kind code of ref document: A1