CN108520505B - Loop filtering implementation method based on multi-network combined construction and self-adaptive selection - Google Patents

Loop filtering implementation method based on multi-network combined construction and self-adaptive selection

Info

Publication number
CN108520505B
Authority
CN
China
Prior art keywords
network
video frame
video
filter
classification
Prior art date
Legal status
Active
Application number
CN201810341067.XA
Other languages
Chinese (zh)
Other versions
CN108520505A (en)
Inventor
Lin Weiyao (林巍峣)
He Xiaoyi (何晓艺)
Current Assignee
Shanghai Jiaotong University
Original Assignee
Shanghai Jiaotong University
Priority date
Filing date
Publication date
Application filed by Shanghai Jiaotong University filed Critical Shanghai Jiaotong University
Priority to CN201810341067.XA
Publication of CN108520505A
Application granted
Publication of CN108520505B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/10 Image enhancement or restoration using non-spatial domain filtering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T9/00 Image coding
    • G06T9/002 Image coding using neural networks
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/172 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

A method for implementing loop filtering based on multi-network combined construction and self-adaptive selection includes the following steps: jointly constructing a convolutional neural network containing a multi-classification network and multiple filter networks, iteratively training the convolutional neural network using video frames of compressed video as training data, and performing adaptively selected loop filtering in the video compression process.

Description

Loop filtering implementation method based on multi-network combined construction and self-adaptive selection
Technical Field
The invention relates to a technology in the field of digital image processing, in particular to a video compression coding loop filtering implementation method based on multi-network joint construction and self-adaptive selection.
Background
Existing video compression algorithms all adopt lossy compression, meaning that a certain amount of distortion exists between the frames of the compressed video and those of the original video. When the compression rate is high, this distortion becomes more severe. Loop filtering of the compressed video is therefore important for improving image quality while maintaining a high compression rate. Loop filters based on conventional methods, such as the deblocking filter and sample adaptive offset (SAO) in HEVC (High Efficiency Video Coding), already exist in current video coding standards. Loop filters based on convolutional neural networks have also been proposed and achieve better results than the traditional filters. However, the existing convolutional-neural-network-based loop filtering methods all rely on a single convolutional neural network, whose robustness is insufficient when the coding conditions and image distortion are more complex.
Disclosure of Invention
To address the defects in the prior art, the invention provides a loop filtering implementation method based on multi-network joint construction and self-adaptive selection. The method uses multiple convolutional neural networks to perform loop filtering within a video compression coding algorithm, offers stronger robustness and extensibility, further improves the performance of existing convolutional-neural-network-based compressed-video loop filters, and increases the coding efficiency of the video compression algorithm.
The invention is realized by the following technical scheme:
the invention relates to a loop filtering implementation method based on multi-network combined construction and self-adaptive selection.
The convolutional neural network comprises: a classification network and a plurality of filter networks, wherein: the classification network adopts, but is not limited to, the VGG-16 network described by K. Simonyan et al. in Very Deep Convolutional Networks for Large-Scale Image Recognition or the ResNet classification network proposed by K. He et al. in Deep Residual Learning for Image Recognition; the filter network adopts, but is not limited to, the VRCNN network proposed by Y. Dai et al. in A Convolutional Neural Network Approach for Post-Processing in HEVC Intra Coding or the QECNN network proposed by R. Yang et al. in Enhancing Quality for HEVC Compressed Videos.
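As a concrete point of reference for the filter-network component, the following is a minimal TensorFlow/Keras sketch of a VRCNN-style filter network with variable filter sizes and residual learning. The kernel sizes and channel counts are illustrative assumptions loosely following Dai et al.; they are not a configuration mandated by the invention, and a QECNN-style or other restoration network could be substituted.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

def build_vrcnn_filter():
    """VRCNN-style restoration network: variable filter sizes + residual learning.
    Channel counts are illustrative assumptions, not values fixed by the patent."""
    x_in = layers.Input(shape=(None, None, 1))                      # Y channel only
    c1 = layers.Conv2D(64, 5, padding="same", activation="relu")(x_in)
    # Second stage: two parallel filter sizes, concatenated along channels.
    c2a = layers.Conv2D(16, 5, padding="same", activation="relu")(c1)
    c2b = layers.Conv2D(32, 3, padding="same", activation="relu")(c1)
    c2 = layers.Concatenate()([c2a, c2b])
    # Third stage: another pair of parallel filter sizes.
    c3a = layers.Conv2D(16, 3, padding="same", activation="relu")(c2)
    c3b = layers.Conv2D(32, 1, padding="same", activation="relu")(c2)
    c3 = layers.Concatenate()([c3a, c3b])
    residue = layers.Conv2D(1, 3, padding="same")(c3)
    # Residual learning: the network predicts a correction added to its input.
    return Model(x_in, layers.Add()([x_in, residue]), name="vrcnn_filter")
```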
The number of classification classes of the multi-classification network matches the number of filter networks and is preferably a power of 2.
In each iteration of the iterative training, a video frame from the training data is first input into the multi-classification network to predict its category i. The video frame is then input into all N filter networks, the outputs are compared, and the index j of the filter network with the best filtering effect is recorded as the category label of the video frame and used to update the parameters of the classification network. The parameters of the i-th filter network are then updated using the video frame and its corresponding uncompressed original video frame.
The filtering effect adopts, but is not limited to, the peak signal-to-noise ratio (PSNR) as the evaluation index of image quality.
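As a minimal illustration of this evaluation step, the snippet below computes the PSNR of a filtered Y-channel frame against the uncompressed original with TensorFlow's built-in tf.image.psnr; the 8-bit peak value of 255 is an assumption about the pixel representation, not something fixed by the invention.

```python
import tensorflow as tf

def frame_psnr(filtered, original, max_val=255.0):
    """PSNR of a filtered Y-channel frame against the uncompressed original.
    Both inputs are (H, W, 1) arrays or tensors with pixel values in [0, max_val]."""
    filtered = tf.cast(filtered, tf.float32)
    original = tf.cast(original, tf.float32)
    return float(tf.image.psnr(filtered, original, max_val=max_val))
```

During training, the filter network whose output gives the highest PSNR supplies the class label j for the multi-classification network.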
The loop filtering in the video compression process is implemented in either of the following ways:
1) In the coding and decoding loop of video compression, the compressed video frame is first input into the trained N-class classification network to obtain the predicted class i; the video frame is then input into the i-th trained filter network, whose output is the final filtered video frame;
2) In the coding and decoding loop of video compression, the compressed video frame is input into the N trained filter networks; the filtered video frames output by the N filter networks are compared according to the image-quality evaluation index, the output of the j-th filter network with the best quality is selected as the final filtered video frame, and j is expressed in binary and written into the coded bitstream (a sketch of this mode is given below).
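The sketch below illustrates mode 2): each compressed frame is passed through all N trained filter networks, the output with the highest PSNR is kept, and the winning index j is returned as a binary string of log2(N) bits. How those bits are actually written into the code stream is left to the codec's own bitstream and entropy-coding interface, which is not modelled here.

```python
import tensorflow as tf

def adaptive_filter_mode2(compressed_frame, original_frame, filter_nets):
    """Mode 2: run every trained filter network, keep the output with the
    highest PSNR, and return the index j as a binary string for the bitstream."""
    x = tf.cast(compressed_frame, tf.float32)[tf.newaxis, ...]   # (1, H, W, 1)
    y = tf.cast(original_frame, tf.float32)
    best_j, best_psnr, best_frame = 0, float("-inf"), None
    for j, net in enumerate(filter_nets):
        candidate = net(x)[0]
        psnr = float(tf.image.psnr(candidate, y, max_val=255.0))
        if psnr > best_psnr:
            best_j, best_psnr, best_frame = j, psnr, candidate
    # With N a power of 2, j fits exactly in log2(N) bits (2 bits when N = 4).
    n_bits = max(1, (len(filter_nets) - 1).bit_length())
    return best_frame, format(best_j, f"0{n_bits}b")
```

Mode 1) replaces the exhaustive comparison with a single forward pass through the classification network, so no index needs to be signalled, at the cost of relying on the classifier's prediction.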
Technical effects
Compared with the prior art, the method performs loop filtering of compressed video with multiple jointly constructed convolutional neural networks and achieves better robustness and enhancement than conventional approaches based on a single neural network. A loop filter built on a single neural network cannot efficiently learn the complex image distortions of varying degrees present in compressed video, whereas the multiple jointly trained models of the invention capture the complex distortion introduced by the compression algorithm more effectively and therefore achieve a better loop filtering effect.
Drawings
FIG. 1 is a block diagram of the multi-network joint construction module in an embodiment;
FIGS. 2a and 2b are schematic diagrams of two loop filtering embodiments, respectively;
FIG. 3 is a schematic diagram of a system according to an embodiment.
Detailed Description
As shown in fig. 3, the system for implementing loop filtering in this embodiment comprises: a multi-network joint construction module and an adaptively selecting loop filtering module connected to it, wherein the multi-network joint construction module outputs the trained network models to the adaptively selecting loop filtering module, and the adaptively selecting loop filtering module performs loop filtering on compressed video frames in the video compression coding algorithm according to these network models.
The multi-network joint construction module comprises: a network generation unit, which constructs the convolutional neural network comprising a classification network and a plurality of filter networks, and a network joint training unit connected to the network generation unit.
The loop filtering module comprises a filtering selection unit, which selects the filtering mode and is implemented with the convolutional neural network constructed and trained by the multi-network joint construction module.
The loop filter module is preferably embedded in a video coding algorithm.
The specific implementation steps of the embodiment include:
and step 1.1) performing compression coding on the videos in the data set by using video coding and decoding software HM-16.0 to finally obtain a plurality of decoded compressed videos. For each compressed video, its video frame and its corresponding video frame before compression are used as training data, and only the Y channel of the image is used.
Step 1.2) The neural networks are built with the TensorFlow open-source software: the N = 4 classification network adopts the VGG-16 network and the 4 filter networks all adopt the VRCNN network; supervised training is then carried out on these networks with the training data to complete the optimization of the network parameters.
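The sketch below mirrors this construction step: an N = 4 classification network and four filter networks are instantiated together. The classifier shown here is a reduced VGG-style stack adapted to single-channel Y input, which is an assumption made for brevity (the embodiment itself uses the full VGG-16); build_vrcnn_filter refers to the VRCNN-style builder sketched earlier in this description.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

N = 4   # number of filter networks == number of classification classes

def build_classifier(num_classes=N):
    """Reduced VGG-style classifier over the Y channel (layer sizes are illustrative)."""
    x_in = layers.Input(shape=(None, None, 1))
    x = x_in
    for filters in (32, 64, 128):
        x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
        x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
        x = layers.MaxPooling2D()(x)
    x = layers.GlobalAveragePooling2D()(x)        # handles arbitrary frame sizes
    return Model(x_in, layers.Dense(num_classes, activation="softmax")(x))

# Joint construction: one multi-classification network plus N filter networks.
classifier = build_classifier()
filter_nets = [build_vrcnn_filter() for _ in range(N)]   # builder from the earlier sketch
```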
The training is an iterative loop, and specifically comprises the following steps:
i) initializing network parameters randomly;
ii) in each iteration of the training stage, the class i of the input video frame of the training data is predicted by the 4-class classification network; the video frame is then input into the 4 filter networks, the PSNR of each output video frame is calculated, and the network with the highest PSNR gain is recorded as the j-th network;
iii) the parameters of the classification network are updated using j as the class label of the video frame, and the parameters of the i-th filter network are then updated using the video frame and the corresponding uncompressed original video frame.
For the parameter updates, the cost function (loss) of each network is computed from a data pair: for the classification network, its prediction i and the label class j; for the filter network, the filtered video frame it outputs and the uncompressed original video frame. After the cost function is computed, the gradients are back-propagated as in ordinary neural network training and the network parameters are updated.
The cost function of the classification network adopts, but is not limited to, Softmax Loss, specifically:
$$L_{\mathrm{Softmax}} = -\sum_{i=1}^{N} y_i \log p_i$$
wherein: y_i = 1 if i = j (the label category), otherwise y_i = 0; p_i denotes the network's predicted probability for class i; N is the number of classification classes.
The cost function of the filter network adopts, but is not limited to, the mean squared error (MSE), specifically:
$$L_{\mathrm{MSE}} = \frac{1}{M} \sum_{i=1}^{M} (X_i - Y_i)^2$$
wherein: X_i and Y_i denote the i-th pixel values of the output filtered video frame and the original frame, respectively, and M is the total number of pixels in the video frame, which depends on the frame size.
Step 1.3) The trained networks are deployed into the video coding algorithm; in this embodiment, loop filtering is performed in either of two ways, as shown in fig. 2a and fig. 2b:
as shown in fig. 1, in the encoding algorithm, a compressed video frame is input into a 4-class network VGG-16 obtained by training, so as to obtain a prediction class i; and then inputting the video frame into the filter network VRCNN obtained by the i x training to obtain a filtered video frame.
As shown in fig. 2b, in the coding algorithm the compressed video frame is input into the N trained filter networks; the filtered video frames output by the N filter networks are compared according to the image-quality evaluation index, the output of the j-th filter network with the best quality is selected as the final filtered video frame, and j is then expressed in binary and written into the coded bitstream.
The foregoing embodiments may be modified in many different ways by those skilled in the art without departing from the spirit and scope of the invention, which is defined by the appended claims and all changes that come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein.

Claims (1)

1. A loop filtering implementation system based on multi-network joint construction and adaptive selection, characterized by comprising: a multi-network joint construction module and an adaptively selecting loop filtering module connected with the multi-network joint construction module, wherein the multi-network joint construction module outputs a network model to the adaptively selecting loop filtering module, and the adaptively selecting loop filtering module carries out loop filtering on compressed video frames in a video compression coding algorithm according to the network model;
the loop filtering means that: a convolutional neural network comprising a multi-classification network and a plurality of filter networks is first constructed jointly, the convolutional neural network is then trained iteratively using video frames of compressed video as training data, and adaptively selected loop filtering is finally performed in the video compression process;
the training data is obtained by compression-coding the videos in a data set with the video codec software HM-16.0 to obtain a number of decoded compressed videos; for each compressed video, its video frames and the corresponding pre-compression video frames are used as training data, and only the Y channel of each image is used;
the number of classification classes of the multi-classification network matches the number of filter networks and is a power of 2;
the multi-classification network adopts a VGG-16 network or a ResNet classification network; the filter network adopts a VRCNN network or a QECNN network;
in each iteration of the iterative training, a video frame from the training data is first input into the multi-classification network to predict its category i; the video frame is then input into the N filter networks, the outputs are compared, and the index j of the filter network with the best filtering effect is recorded as the category label of the video frame and used to update the parameters of the classification network; the parameters of the i-th filter network are then updated using the video frame and its corresponding uncompressed original video frame;
the filtering effect adopts the peak signal-to-noise ratio as an evaluation index of the image quality;
the parameter updating computes the cost function of each network from a data pair, the data pairs comprising: for the classification network, its prediction class i and the label class j; for the filter network, the filtered video frame it outputs and the uncompressed original video frame; after the cost function is computed, the gradients are back-propagated as in ordinary neural network training and the network parameters are updated;
the cost function of the classification network adopts Softmax Loss, specifically:
$$L_{\mathrm{Softmax}} = -\sum_{i=1}^{N} y_i \log p_i$$
wherein: y_i = 1 if the prediction class i equals the label class j, otherwise y_i = 0; p_i denotes the network's predicted probability for class i; N is the number of classification classes; the cost function adopted by the filter network is the mean squared error, specifically:
$$L_{\mathrm{MSE}} = \frac{1}{M} \sum_{i=1}^{M} (X_i - Y_i)^2$$
wherein: X_i and Y_i denote the i-th pixel values of the output filtered video frame and the original frame, respectively, and M is the total number of pixels in the video frame, which depends on the frame size;
the loop filtering in the video compression process is implemented in either of the following ways:
1) in the coding and decoding loop of video compression, the compressed video frame is first input into the trained N-class classification network to obtain the predicted class i, and the video frame is then input into the i-th trained filter network, whose output is the final filtered video frame;
2) in the coding and decoding loop of video compression, the compressed video frame is input into the N trained filter networks, the filtered video frames output by the N filter networks are compared according to the image-quality evaluation index, the output of the j-th filter network with the best quality is selected as the final filtered video frame, and j is expressed in binary and written into the coded bitstream.
CN201810341067.XA 2018-04-17 2018-04-17 Loop filtering implementation method based on multi-network combined construction and self-adaptive selection Active CN108520505B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810341067.XA CN108520505B (en) 2018-04-17 2018-04-17 Loop filtering implementation method based on multi-network combined construction and self-adaptive selection

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810341067.XA CN108520505B (en) 2018-04-17 2018-04-17 Loop filtering implementation method based on multi-network combined construction and self-adaptive selection

Publications (2)

Publication Number Publication Date
CN108520505A CN108520505A (en) 2018-09-11
CN108520505B true CN108520505B (en) 2021-12-03

Family

ID=63428705

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810341067.XA Active CN108520505B (en) 2018-04-17 2018-04-17 Loop filtering implementation method based on multi-network combined construction and self-adaptive selection

Country Status (1)

Country Link
CN (1) CN108520505B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110351568A (en) * 2019-06-13 2019-10-18 天津大学 A kind of filtering video loop device based on depth convolutional network
CN112422993B (en) * 2019-08-21 2021-12-03 四川大学 HEVC video quality enhancement method combined with convolutional neural network
WO2021051369A1 (en) * 2019-09-20 2021-03-25 Intel Corporation Convolutional neural network loop filter based on classifier
EP4049236A1 (en) * 2019-11-14 2022-08-31 Huawei Technologies Co., Ltd. Spatially adaptive image filtering
EP4107947A4 (en) * 2020-02-21 2024-03-06 Nokia Technologies Oy A method, an apparatus and a computer program product for video encoding and video decoding
WO2022257049A1 (en) * 2021-06-09 2022-12-15 Oppo广东移动通信有限公司 Encoding method, decoding method, code stream, encoder, decoder and storage medium
WO2022257130A1 (en) * 2021-06-11 2022-12-15 Oppo广东移动通信有限公司 Encoding method, decoding method, code stream, encoder, decoder, system and storage medium

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102413323A (en) * 2010-01-13 2012-04-11 中国移动通信集团广东有限公司中山分公司 H.264-based video compression method
CN103141094A (en) * 2010-10-05 2013-06-05 联发科技股份有限公司 Method and apparatus of adaptive loop filtering
CN103096060A (en) * 2011-11-08 2013-05-08 乐金电子(中国)研究开发中心有限公司 Intra-frame image prediction coding and decoding self-adaption loop filtering method and device
US20130272624A1 (en) * 2012-04-11 2013-10-17 Texas Instruments Incorporated Virtual Boundary Processing Simplification for Adaptive Loop Filtering (ALF) in Video Coding
CN106101711A (en) * 2016-08-26 2016-11-09 成都杰华科技有限公司 A kind of quickly real-time video codec compression algorithm

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
A Convolutional Neural Network Approach for Post-Processing in HEVC Intra Coding; Yuanying Dai et al.; arXiv:1608.06690v2 [cs.MM]; 2016-10-29; pp. 1-12 *
CNN-based in-loop filtering for coding efficiency improvement; Woon-Sung Park et al.; 2016 IEEE 12th Image, Video, and Multidimensional Signal Processing Workshop (IVMSP); 2016-08-04; pp. 1-5 *
Multi-modal/multi-scale convolutional neural network based in-loop filter design for next generation video codec; Jihong Kang et al.; 2017 IEEE International Conference on Image Processing (ICIP); 2018-02-22; pp. 26-30 *
ThiNet: A Filter Level Pruning Method for Deep Neural Network Compression; Jian-Hao Luo et al.; arXiv:1707.06342v1 [cs.CV]; 2017-07-20; pp. 1-9 *
Analysis of loop filtering techniques in the video coding standard HEVC; Tang Huamin et al.; Video Engineering (电视技术); 2014-12-31; Vol. 38, No. 11; pp. 1-4 *

Also Published As

Publication number Publication date
CN108520505A (en) 2018-09-11

Similar Documents

Publication Publication Date Title
CN108520505B (en) Loop filtering implementation method based on multi-network combined construction and self-adaptive selection
CN108174225B (en) Video coding and decoding in-loop filtering implementation method and system based on countermeasure generation network
CN109120937B (en) Video encoding method, decoding method, device and electronic equipment
US20230062752A1 (en) A method, an apparatus and a computer program product for video encoding and video decoding
CN108134932B (en) Method and system for realizing video coding and decoding loop internal filtering based on convolutional neural network
US11062210B2 (en) Method and apparatus for training a neural network used for denoising
US20230291909A1 (en) Coding video frame key points to enable reconstruction of video frame
CN113822147B (en) Deep compression method for semantic tasks of collaborative machine
Lu et al. Learning a deep vector quantization network for image compression
WO2020183059A1 (en) An apparatus, a method and a computer program for training a neural network
WO2020008104A1 (en) A method, an apparatus and a computer program product for image compression
Klopp et al. How to exploit the transferability of learned image compression to conventional codecs
CN113379858A (en) Image compression method and device based on deep learning
Canh et al. Rate-distortion optimized quantization: A deep learning approach
Fujihashi et al. Wireless 3D point cloud delivery using deep graph neural networks
CN110351558B (en) Video image coding compression efficiency improving method based on reinforcement learning
WO2022266578A1 (en) Content-adaptive online training method and apparatus for deblocking in block- wise image compression
WO2022251828A1 (en) Content-adaptive online training method and apparatus for post-filtering
US20230110503A1 (en) Method, an apparatus and a computer program product for video encoding and video decoding
CN116935292B (en) Short video scene classification method and system based on self-attention model
KR102245682B1 (en) Apparatus for compressing image, learning apparatus and method thereof
Shen et al. Dec-adapter: Exploring efficient decoder-side adapter for bridging screen content and natural image compression
CN115278249B (en) Video block-level rate distortion optimization method and system based on visual self-attention network
CN113766250B (en) Compressed image quality improving method based on sampling reconstruction and feature enhancement
US11683515B2 (en) Video compression with adaptive iterative intra-prediction

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant