WO2021042270A1 - Compression artifacts reduction method based on dual-stream multi-path recursive residual network - Google Patents

Compression artifacts reduction method based on dual-stream multi-path recursive residual network

Info

Publication number: WO2021042270A1
Authority: WO (WIPO, PCT)
Prior art keywords: network, residual, dual, layer, path
Application number: PCT/CN2019/104234
Other languages: French (fr), Chinese (zh)
Inventors: 金枝, 齐银鹤, 谭晓军
Original assignee: 中山大学 (Sun Yat-sen University)
Priority date: 2019-09-03
Filing date: 2019-09-03
Publication date: 2021-03-11
Application filed by 中山大学; priority to PCT/CN2019/104234; publication of WO2021042270A1.

Classifications

    • G06T5/77
    • G06T5/60
    • G06T2207/20081 Training; Learning
    • G06T2207/20084 Artificial neural networks [ANN]
      (the G06T2207 entries fall under G Physics > G06 Computing; Calculating or Counting > G06T Image data processing or generation, in general > G06T2207/00 Indexing scheme for image analysis or image enhancement > G06T2207/20 Special algorithmic details)

Abstract

Disclosed in the present invention is a compression artifacts reduction method based on a dual-stream multi-path recursive residual network. On the basis of a dual-stream multi-path recursive residual network architecture, the high-frequency (HF) and low-frequency (LF) components of an image are used to reduce compression artifacts in a targeted manner. First, the network decomposes a compression-distorted image into a texture layer (containing the HF components) and a structure layer (containing the LF components); second, a multi-path recursive residual network is used to enhance the texture information and structure information, respectively; finally, the structure and texture information are combined and fed into a regression network to generate the final reconstructed image. In the process of reducing the artifacts of a lossy compressed image, the present invention designs a dual-stream multi-path recursive residual network architecture in which the two streams each reduce a specific type of artifact associated with the high-frequency or low-frequency components of the image, and the two outputs are combined by a nonlinear regression network, such that the proposed network can significantly suppress compression artifacts and reduce distortion, further enhancing the reconstructed image.

Description

Compression artifact removal method based on dual-stream multi-path recursive residual network
Technical field
The invention relates to a method for removing the compression artifacts that arise when media files such as images and videos are compressed for storage and transmission.
Background art
Media files such as images and videos are usually compressed to reduce storage and transmission costs. From the perspective of information theory, image compression algorithms fall into two categories: lossy (e.g., JPEG) and lossless (e.g., PNG). Compared with lossless compression, lossy compression achieves a higher compression ratio. However, lossy compression often causes irreversible information loss and compression artifacts such as ringing, blocking, and blurring, especially at low bit rates. These artifacts not only degrade the user experience but also adversely affect many downstream image processing tasks. JPEG, currently the most widely used image compression standard, combines the block-based discrete cosine transform (BDCT) with coarse quantization to reduce statistical redundancy between pixels and achieve a high compression ratio. However, because the correlation between adjacent blocks is not taken into account, intensity discontinuities readily appear at block boundaries. In addition, the truncation of high-frequency DCT coefficients causes ringing and blurring artifacts. Compression artifacts caused by lossy compression can therefore generally be regarded as hybrid artifacts.
Early compression artifacts reduction (CAR) relied mainly on filtering: a hand-designed filter operating on block boundaries was used to reduce blocking. Other approaches employ the wavelet transform, the shape-adaptive discrete cosine transform (SA-DCT), or sparse coding. Artifact-free images recovered from compressed images by these methods are usually accompanied by noisy edges and unnaturally smooth regions. Beyond these traditional methods, deep-learning-based methods achieve the best results by learning the nonlinear mapping between the compressed image and the original image. Constrained by shallow networks, however, the reconstructed images appear over-smoothed. Conversely, although deeper networks offer better performance, they have too many parameters, are difficult to train, and their models cost more to store than traditional methods.
In addition, previous work either focused on removing one specific type of compression artifact, or relied on an end-to-end network to reduce all artifact types indiscriminately, which may reduce one kind of artifact while unintentionally amplifying another (for example, removing blocking may aggravate blurring). Generally speaking, CAR can be divided into three subtasks: deblocking, deringing, and deblurring. Deblocking and deringing require suppressing interfering high-frequency information, whereas deblurring requires enhancing sharp edges and useful high-frequency information.
Because the deblocking and deringing tasks counteract the edge-sharpening (deblurring) task, an end-to-end convolutional network that does not distinguish among these artifacts will enhance one type of artifact while reducing another.
Summary of the invention
The object of the present invention is to provide a compression artifact removal method based on a dual-stream multi-path recursive residual network that can significantly suppress compression artifacts, reduce distortion, and further enhance the reconstructed image.
The present invention achieves the above object as follows:
A compression artifact removal method based on a dual-stream multi-path recursive residual network: on the basis of a dual-stream multi-path recursive residual network architecture, the high-frequency (HF) and low-frequency (LF) components of the image are used to remove compression artifacts in a targeted manner. First, the network decomposes the compression-distorted image into a texture layer (containing the HF component) and a structure layer (containing the LF component); second, a multi-path recursive residual network enhances the texture and structure information separately; finally, the structure and texture information are merged and fed into a regression network to generate the final reconstructed image.
Further, a weight-sharing strategy is adopted within every residual unit in the same stream of the dual stream, so the number of training parameters of each stream is fixed, equal to that of a 4-layer convolutional neural network.
Further, the desired structure layer is first obtained by minimizing the $L_0$ norm; the difference between the compression-distorted image and the structure layer is then computed and taken as the corresponding texture layer, expressed as:

$$I_{lq} = I_s + I_t$$

where $I_{lq}$ denotes the compression-distorted image, $I_s$ denotes the structure layer corresponding to the coarse information of the distorted image, and $I_t$ denotes the texture layer corresponding to the fine information of the distorted image.
Further, the network architecture is composed of several recursive residual units (RRU, Recursive Residual Unit) and intermediate residual blocks (IRB, Intermediate Residual Block), and integrates three kinds of residual learning: global residual learning, local residual learning, and multi-path intermediate residual learning. Recursion means that the same weights are used across the feature maps, i.e., parameter sharing. As the network deepens, the number of learned parameters would otherwise grow linearly; it is bounded by sharing weights among the recursive units.
Further, the RRU consists mainly of two convolutional layers and a ReLU activation layer, and the RRU is expressed as:

$$X_u' = \mathcal{F}_u(X_u) = X_u + f_u^{2}\big(\sigma\big(f_u^{1}(\sigma(X_u))\big)\big)$$

where $X_u$ and $X_u'$ denote the input and output of the $u$-th RRU, $\mathcal{F}_u$ denotes the mapping function of the RRU, $f_u^{i}$ is the mapping function of the $i$-th convolutional layer in the $u$-th RRU, $W_u^{i}$ denotes the weights of the $i$-th convolutional layer in the $u$-th RRU, and the function $\sigma$ is the ReLU activation function.
Further, within each IRB, skip connections pass low-level features to the later layers of the residual block. Let $\mathcal{H}_b$ denote the representation function of each IRB, let $X_b$ and $X_b'$ denote the input and output of the $b$-th IRB, and let $W_b^{i}$ denote the weights of the $i$-th residual unit of the $b$-th recursive block. An IRB composed of two RRUs is then expressed as:

$$X_b' = \mathcal{H}_b(X_b) = X_b + \mathcal{F}\big(\mathcal{F}(X_b; W_b^{1}); W_b^{2}\big)$$

Accordingly, a network with two IRBs is expressed as:

$$X' = \mathcal{G}(X) = f_{rec}\Big(\mathcal{H}_2\big(\mathcal{H}_1(f(X))\big)\Big) + X$$

where $X$ and $X'$ denote the input and output of the network, $f$ and $f_{rec}$ denote the mapping functions of the first and last convolutional layers of the entire dual-stream network, and $\mathcal{G}$ denotes the overall mapping function of the proposed network.
The beneficial effects of the present invention are as follows: in removing the artifacts of lossy compressed images, the present invention designs a dual-stream multi-path recursive residual network architecture in which the two streams reduce the specific types of artifacts associated with the high-frequency and low-frequency image components, respectively, and a nonlinear regression network combines the two outputs, so that the proposed network can significantly suppress compression artifacts, reduce distortion, and further enhance the reconstructed image.
Description of the drawings
The present invention is further described below with reference to the drawings and embodiments:
Figure 1 is a schematic diagram of the structure of the dual-stream multi-path recursive residual network of the present invention;
Figure 2 is a schematic diagram of the structure of the recursive residual unit of the present invention.
Detailed description
Because the deblocking and deringing tasks counteract the edge-sharpening (deblurring) task, an end-to-end convolutional network that does not distinguish among these artifacts will enhance one type of artifact while reducing another. The present invention therefore designs a dual-stream multi-path recursive residual network architecture that uses the high-frequency (HF) and low-frequency (LF) components of an image to remove compression artifacts in a targeted manner. First, the network decomposes the compression-distorted image into a texture layer (containing the HF component) and a structure layer (containing the LF component); second, a multi-path recursive residual network enhances the texture and structure information separately; finally, the structure and texture information are merged and fed into a regression network to generate the final reconstructed image. In addition, to ease network training and reduce the number of training parameters, a weight-sharing strategy is adopted within every residual unit in the same stream. The number of training parameters of each stream is therefore fixed, equal to that of a 4-layer convolutional neural network.
The specific method and steps are as follows:
1. Structure-texture decomposition
First, the desired structure layer is obtained by minimizing the $L_0$ norm. Second, the difference between the compression-distorted image and the structure layer is computed and taken as the corresponding texture layer.
The compression-distorted image is thus decomposed into a structure layer (containing the LF component) and a texture layer (containing the HF component), expressed as:

$$I_{lq} = I_s + I_t$$

where $I_{lq}$ denotes the compression-distorted image, $I_s$ denotes the structure layer corresponding to the coarse information of the distorted image, and $I_t$ denotes the texture layer corresponding to the fine information of the distorted image.
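As a minimal illustrative sketch (not the patented solver), the decomposition can be prototyped in Python as follows; a Gaussian low-pass filter stands in for the $L_0$-norm minimization so the example stays self-contained, and the function name is hypothetical:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def structure_texture_split(i_lq: np.ndarray, sigma: float = 2.0):
    """Split a compression-distorted image I_lq into I_s + I_t.

    The patent obtains I_s by L0-norm minimization; a Gaussian
    low-pass is used here purely as an illustrative stand-in.
    """
    i_lq = i_lq.astype(np.float32)
    i_s = gaussian_filter(i_lq, sigma=sigma)  # structure layer (LF, coarse)
    i_t = i_lq - i_s                          # texture layer (HF, fine)
    return i_s, i_t

# By construction the decomposition is exact: I_lq == I_s + I_t.
img = np.random.rand(64, 64).astype(np.float32)
s, t = structure_texture_split(img)
assert np.allclose(img, s + t, atol=1e-5)
```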
2. Structure stream
To enhance the detail information of the structure layer of the compressed image, a multi-path recursive residual network is designed. The entire network is composed of several recursive residual units (RRUs) and intermediate residual blocks (IRBs), and integrates three kinds of residual learning: global residual learning, local residual learning, and multi-path intermediate residual learning. Such a combined structure not only aids the propagation of gradients and low-level features, but also significantly reduces the number of network parameters.
1) Recursive residual unit
Recursion means that the same weights are used across the feature maps, i.e., parameter sharing. As the network deepens, the number of learned parameters would otherwise grow linearly; it can therefore be bounded by sharing weights among the recursive units. The RRU consists mainly of two convolutional layers and ReLU activation layers; its structure is shown in Figure 2. It has the same structure as the residual unit in ResNet, differing only in the activation order: in each residual unit of ResNet the activation function is applied after the convolutional layer, whereas the RRU applies the activation layers (BN and ReLU) before the convolutional layers. The RRU is therefore expressed as:

$$X_u' = \mathcal{F}_u(X_u) = X_u + f_u^{2}\big(\sigma\big(f_u^{1}(\sigma(X_u))\big)\big)$$

where $X_u$ and $X_u'$ denote the input and output of the $u$-th RRU, $\mathcal{F}_u$ denotes the mapping function of the RRU, $f_u^{i}$ is the mapping function of the $i$-th convolutional layer in the $u$-th RRU, $W_u^{i}$ denotes the weights of the $i$-th convolutional layer in the $u$-th RRU, and the function $\sigma$ is the ReLU activation function.
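A minimal PyTorch sketch of such a pre-activation unit is given below; the 3×3 kernels and the 64-channel width are assumptions, since the text does not fix them:

```python
import torch
import torch.nn as nn

class RRU(nn.Module):
    """Recursive residual unit: BN and ReLU are applied before each of
    the two convolutional layers (pre-activation), and the input is
    added back through a local skip connection."""

    def __init__(self, channels: int = 64):
        super().__init__()
        self.body = nn.Sequential(
            nn.BatchNorm2d(channels), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(channels), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.body(x)  # local residual learning
```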
2) Intermediate residual block
Within each IRB, skip connections pass low-level features to the later layers of the residual block. There are therefore multiple skip connections between the global connection and the local connections, as indicated by the curved arrows at the lower middle of Figure 1. Let $\mathcal{H}_b$ denote the representation function of each IRB, let $X_b$ and $X_b'$ denote the input and output of the $b$-th IRB, and let $W_b^{i}$ denote the weights of the $i$-th residual unit of the $b$-th recursive block. An IRB composed of two RRUs is then expressed as:

$$X_b' = \mathcal{H}_b(X_b) = X_b + \mathcal{F}\big(\mathcal{F}(X_b; W_b^{1}); W_b^{2}\big)$$

Accordingly, a network with two IRBs is expressed as:

$$X' = \mathcal{G}(X) = f_{rec}\Big(\mathcal{H}_2\big(\mathcal{H}_1(f(X))\big)\Big) + X$$

where $X$ and $X'$ denote the input and output of the network, $f$ and $f_{rec}$ denote the mapping functions of the first and last convolutional layers of the entire dual-stream network, and $\mathcal{G}$ denotes the overall mapping function of the proposed network.
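Continuing the sketch, an IRB can be modeled as U recursive applications of a single RRU instance; passing the same instance to every IRB in a stream also realizes the stream-wide weight sharing described earlier (U defaults to an assumed 3, and the class reuses the RRU sketch above):

```python
class IRB(nn.Module):
    """Intermediate residual block: U recursive passes through one
    shared RRU, plus an intermediate skip connection that carries
    the block input forward."""

    def __init__(self, rru: RRU, num_rru: int = 3):
        super().__init__()
        self.rru = rru            # one shared module => shared weights
        self.num_rru = num_rru

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = x
        for _ in range(self.num_rru):
            out = self.rru(out)   # recursion: same weights on every pass
        return x + out            # multi-path intermediate residual

# Example: one IRB recursing 3 times over a shared unit.
irb = IRB(RRU(channels=64), num_rru=3)
```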
3) Network structure
Owing to the three kinds of residual learning and the design of the RRU and IRB, the proposed network has a flexible combinatorial structure. Given a specific number of network layers, the numbers of RRUs and IRBs can be adjusted freely. Denoting the number of RRUs by U and the number of IRBs by B, the number of network layers d is computed as:

d = 2 + 2 × U × B

Assuming the designed number of network layers is d = 20, the network structure can take the following three forms:
A. 1B9U: a single IRB containing 9 RRUs;
B. 3B3U: 3 IRBs in total, each containing 3 RRUs;
C. 9B1U: 9 IRBs in total, each containing a single RRU.
Because the 3B3U configuration incorporates all three kinds of residual learning, it is applied to both the structure stream and the texture stream designed in the present invention. In particular, as the number of network layers increases, the combinatorial structure becomes more flexible, i.e., more distinct combinations exist.
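The layer-count formula and the three d = 20 configurations can be sanity-checked in a few lines (continuing the Python sketch):

```python
def num_layers(b: int, u: int) -> int:
    # d = 2 + 2*U*B: two convolutions per RRU, U RRUs per IRB,
    # B IRBs, plus the first and last convolutional layers.
    return 2 + 2 * u * b

for name, (b, u) in {"1B9U": (1, 9), "3B3U": (3, 3), "9B1U": (9, 1)}.items():
    assert num_layers(b, u) == 20, name  # each configuration yields d = 20
```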
3. Texture stream, structure stream, and regression network
The purpose of the structure layer is to restore the high-frequency information lost from the image. Conversely, the processing of the texture layer aims to remove compression artifacts while preserving detail such as the edges of the original image. Supervised learning is performed against the texture layer of the original ground-truth image, and the designed network structure, comprising recursive residual units and intermediate residual blocks, can greatly suppress the strong blocking and ringing artifacts in the texture layer.
In the network structure designed by the present invention, the structure stream and the texture stream operate in parallel and respectively output the enhanced structure layer $\hat{I}_s$ and the enhanced texture layer $\hat{I}_t$. The corresponding outputs of the two streams are then added pixel by pixel to obtain the enhanced image, i.e., $\hat{I} = \hat{I}_s + \hat{I}_t$. Finally, the enhanced image $\hat{I}$ is fed into a nonlinear regression network to further improve the reconstructed image; the structure of the regression network is the same as that of the network applied to the structure or texture stream.
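Putting the pieces together, one plausible sketch of the full dual-stream pipeline follows; it reuses the RRU and IRB classes above, and the grayscale input, 64-channel width, and 3B3U depth are assumptions. As stated, the regression network reuses the same stream architecture:

```python
class Stream(nn.Module):
    """One stream: first conv (f), B IRBs built around one shared RRU,
    last conv (f_rec), and a global residual connection. Sharing the
    RRU stream-wide leaves roughly a 4-layer CNN's worth of trainable
    convolutions: head, the RRU's two layers, and tail."""

    def __init__(self, channels: int = 64, num_irb: int = 3, num_rru: int = 3):
        super().__init__()
        self.head = nn.Conv2d(1, channels, 3, padding=1)      # f
        shared_rru = RRU(channels)
        self.blocks = nn.Sequential(
            *[IRB(shared_rru, num_rru) for _ in range(num_irb)])
        self.tail = nn.Conv2d(channels, 1, 3, padding=1)      # f_rec

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.tail(self.blocks(self.head(x)))       # global residual

class DualStreamCAR(nn.Module):
    """Structure and texture streams run in parallel; their outputs are
    added pixel by pixel and refined by a regression stream."""

    def __init__(self):
        super().__init__()
        self.structure_stream = Stream()  # enhances the structure layer
        self.texture_stream = Stream()    # enhances the texture layer
        self.regression = Stream()        # same architecture, as stated

    def forward(self, i_s: torch.Tensor, i_t: torch.Tensor) -> torch.Tensor:
        enhanced = self.structure_stream(i_s) + self.texture_stream(i_t)
        return self.regression(enhanced)

# Shape check on a dummy 64x64 grayscale structure/texture pair.
model = DualStreamCAR()
out = model(torch.randn(1, 1, 64, 64), torch.randn(1, 1, 64, 64))
print(out.shape)  # torch.Size([1, 1, 64, 64])
```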

Claims (6)

  1. A compression artifact removal method based on a dual-stream multi-path recursive residual network, characterized in that: on the basis of a dual-stream multi-path recursive residual network architecture, the high-frequency (HF) and low-frequency (LF) components of the image are used to remove compression artifacts in a targeted manner; first, the network decomposes the compression-distorted image into a texture layer (containing the HF component) and a structure layer (containing the LF component); second, a multi-path recursive residual network enhances the texture and structure information separately; finally, the structure and texture information are merged and fed into a regression network to generate the final reconstructed image.
  2. The compression artifact removal method based on a dual-stream multi-path recursive residual network according to claim 1, characterized in that: a weight-sharing strategy is adopted within every residual unit in the same stream of the dual stream, so that the number of training parameters of each stream is fixed, equal to that of a 4-layer convolutional neural network.
  3. The compression artifact removal method based on a dual-stream multi-path recursive residual network according to claim 1, characterized in that: the desired structure layer is first obtained by minimizing the $L_0$ norm; the difference between the compression-distorted image and the structure layer is then computed and taken as the corresponding texture layer, expressed as:

    $$I_{lq} = I_s + I_t$$

    where $I_{lq}$ denotes the compression-distorted image, $I_s$ denotes the structure layer corresponding to the coarse information of the distorted image, and $I_t$ denotes the texture layer corresponding to the fine information of the distorted image.
  4. The compression artifact removal method based on a dual-stream multi-path recursive residual network according to claim 1, characterized in that: the network architecture is composed of several recursive residual units (RRU, Recursive Residual Unit) and intermediate residual blocks (IRB, Intermediate Residual Block), and integrates three kinds of residual learning: global residual learning, local residual learning, and multi-path intermediate residual learning; recursion means that the same weights are used across the feature maps, i.e., parameter sharing; as the network deepens, the number of learned parameters would otherwise grow linearly, and it is bounded by sharing weights among the recursive units.
  5. The compression artifact removal method based on a dual-stream multi-path recursive residual network according to claim 4, characterized in that: the RRU consists mainly of two convolutional layers and a ReLU activation layer, and the RRU is expressed as:

    $$X_u' = \mathcal{F}_u(X_u) = X_u + f_u^{2}\big(\sigma\big(f_u^{1}(\sigma(X_u))\big)\big)$$

    where $X_u$ and $X_u'$ denote the input and output of the $u$-th RRU, $\mathcal{F}_u$ denotes the mapping function of the RRU, $f_u^{i}$ is the mapping function of the $i$-th convolutional layer in the $u$-th RRU, $W_u^{i}$ denotes the weights of the $i$-th convolutional layer in the $u$-th RRU, and the function $\sigma$ is the ReLU activation function.
  6. The compression artifact removal method based on a dual-stream multi-path recursive residual network according to claim 4, characterized in that: within each IRB, skip connections pass low-level features to the later layers of the residual block; let $\mathcal{H}_b$ denote the representation function of each IRB, let $X_b$ and $X_b'$ denote the input and output of the $b$-th IRB, and let $W_b^{i}$ denote the weights of the $i$-th residual unit of the $b$-th recursive block; an IRB composed of two RRUs is then expressed as:

    $$X_b' = \mathcal{H}_b(X_b) = X_b + \mathcal{F}\big(\mathcal{F}(X_b; W_b^{1}); W_b^{2}\big)$$

    and a network with two IRBs is accordingly expressed as:

    $$X' = \mathcal{G}(X) = f_{rec}\Big(\mathcal{H}_2\big(\mathcal{H}_1(f(X))\big)\Big) + X$$

    where $X$ and $X'$ denote the input and output of the network, $f$ and $f_{rec}$ denote the mapping functions of the first and last convolutional layers of the entire dual-stream network, and $\mathcal{G}$ denotes the overall mapping function of the proposed network.

Priority Applications (1)

Application Number: PCT/CN2019/104234 (WO2021042270A1, en)
Priority Date: 2019-09-03
Filing Date: 2019-09-03
Title: Compression artifacts reduction method based on dual-stream multi-path recursive residual network

Applications Claiming Priority (1)

Application Number: PCT/CN2019/104234 (WO2021042270A1, en)
Priority Date: 2019-09-03
Filing Date: 2019-09-03
Title: Compression artifacts reduction method based on dual-stream multi-path recursive residual network

Publications (1)

Publication Number: WO2021042270A1 (en)
Publication Date: 2021-03-11

Family

ID=74852002

Family Applications (1)

Application Number: PCT/CN2019/104234 (WO2021042270A1, en)
Priority Date: 2019-09-03
Filing Date: 2019-09-03
Title: Compression artifacts reduction method based on dual-stream multi-path recursive residual network

Country Status (1)

Country Link
WO (1) WO2021042270A1 (en)


Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107798665A (en) * 2017-11-07 2018-03-13 天津大学 Underwater picture Enhancement Method based on structural texture layering
CN108460726A (en) * 2018-03-26 2018-08-28 厦门大学 A kind of magnetic resonance image super-resolution reconstruction method based on enhancing recurrence residual error network
CN108921789A (en) * 2018-06-20 2018-11-30 华北电力大学 Super-resolution image reconstruction method based on recurrence residual error network
CN109509152A (en) * 2018-12-29 2019-03-22 大连海事大学 A kind of image super-resolution rebuilding method of the generation confrontation network based on Fusion Features

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ZHOU DENG-WEN, ZHAO LI-JUAN, DUAN RAN, CHAI XIAO-LIAN: "Image Super-resolution Based on Recursive Residual Networks", ACTA AUTOMATICA SINICA, vol. 45, no. 6, 30 June 2019 (2019-06-30), pages 1157-1165, XP055787602, ISSN: 0254-4156, DOI: 10.16383/j.aas.c180334 *


Legal Events

Date Code Title Description

121: Ep: the epo has been informed by wipo that ep was designated in this application
     Ref document number: 19944171
     Country of ref document: EP
     Kind code of ref document: A1

NENP: Non-entry into the national phase
     Ref country code: DE

122: Ep: pct application non-entry in european phase
     Ref document number: 19944171
     Country of ref document: EP
     Kind code of ref document: A1

32PN: Ep: public notification in the ep bulletin as address of the addressee cannot be established
     Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 19.10.2022)