CN111064958A - Low-complexity neural network filtering algorithm for B frame and P frame - Google Patents

Low-complexity neural network filtering algorithm for B frame and P frame

Info

Publication number
CN111064958A
CN111064958A (application CN201911382700.0A)
Authority
CN
China
Prior art keywords
filtering
neural network
frame
frames
syntax element
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911382700.0A
Other languages
Chinese (zh)
Other versions
CN111064958B (en)
Inventor
Yibo Fan (范益波)
Chao Liu (刘超)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fudan University
Original Assignee
Fudan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fudan University filed Critical Fudan University
Priority to CN201911382700.0A
Publication of CN111064958A
Application granted
Publication of CN111064958B
Legal status: Active (current)
Anticipated expiration

Classifications

    • Section H: Electricity
    • H04: Electric communication technique
    • H04N: Pictorial communication, e.g. television
    • H04N 19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/13: Adaptive entropy coding, e.g. adaptive variable length coding [AVLC] or context adaptive binary arithmetic coding [CABAC]
    • H04N 19/117: Filters, e.g. for pre-processing or post-processing
    • H04N 19/61: Transform coding in combination with predictive coding
    • H04N 19/70: Syntax aspects related to video coding, e.g. related to compression standards
    • H04N 19/82: Filtering operations specially adapted for video compression, involving filtering within a prediction loop

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The invention belongs to the technical field of video coding, and specifically relates to a low-complexity neural network filtering algorithm for B frames and P frames. The method tests fusions, in different proportions, of the reconstructed pixels output by a neural network filter and those produced by the standard video encoder, and encodes the optimal filtering strength into the code stream to achieve the best filtering effect. A new syntax element is designed, called a frame-level syntax element: one n-bit syntax element exists for each component of each frame and characterizes the degree of fusion, in the current frame, between the neural network output and the reconstructed pixels of the video encoder. This syntax element provides an adaptive filtering strength and effectively solves the over-blurring and over-smoothing caused by applying the filter directly. Compared with earlier CTU-level syntax elements, filtering directly at the frame level introduces no extra artificial boundaries, yielding an algorithm with excellent performance and low complexity.

Description

Low-complexity neural network filtering algorithm for B frame and P frame
Technical Field
The invention belongs to the technical field of video coding, and specifically relates to a low-complexity neural network filtering algorithm for B frames and P frames.
Background
In the field of video coding, filtering techniques based on convolutional neural networks have been widely adopted. Neural network filtering achieves a better filtering effect than the traditional DB (deblocking), SAO (sample adaptive offset) and ALF (adaptive loop filter) filters, but its high complexity limits practical application. In B frames and P frames in particular, repeated application of the neural network filter causes an over-blurring problem: a block region is smoothed repeatedly, and the filtering destroys the details and high-frequency information of the current block.
In recent years many researchers have proposed solutions, mainly based on CTU-level filtering. B frames and P frames in video coding are predicted and transformed on a block basis, so some blocks carry large residuals or motion vectors while others are almost identical to the already-filtered reference frames. CTU-level filtering makes a filtering decision for every CTU, so almost every block can be given its best filtering result. Of course, an extra bit must be signaled to indicate whether each CTU uses the filter, which consumes a comparatively large part of the bitstream; some scholars have therefore proposed adding an extra classifier to help decide whether the current CU should use neural network filtering. Others have iteratively retrained the neural network to reduce this over-smoothing, so that a globally optimal filtering effect is finally reached.
In fact, the drawback of CTU-level filtering is also obvious: for convolutional neural networks, CTU-level filtering requires extra padding, either with zeros or with reconstructed pixels. If zeros are padded, the error they introduce clearly degrades filtering performance; if reconstructed pixels are padded, the computational complexity of the neural network increases significantly. We therefore propose frame-level filtering.
Disclosure of Invention
The invention aims to provide a low-complexity neural network filtering algorithm for B frames and P frames.
Unlike CTU-level filtering, the low-complexity neural network filtering algorithm for B frames and P frames filters the whole frame, giving lower complexity and a better filtering effect. It fuses the reconstructed pixels output by the video encoder with the neural network filtering result, achieving optimal filtering and solving the over-smoothing caused by repeated use of a neural network filter.
The invention provides a low-complexity neural network filtering algorithm for B frames and P frames, which comprises the following specific steps:
(1) at the encoding end, close the DB and SAO options in the HM configuration file and encode the target video to obtain the reconstructed pixels X. After a frame is coded, a conventional encoder filters the whole-frame reconstructed pixels R with filters such as DB and SAO; since DB and SAO are disabled in this step, R is left unfiltered by the traditional filters;
(2) filter the unprocessed reconstructed pixels R with a neural network filter to obtain the filtered pixels Y; the structure of the neural network filter follows [Chao Liu, Heming Sun, Junan Chen, Zhengxue Cheng, Masaru Takeuchi, Jiro Katto, Xiaoyang Zeng, and Yibo Fan, "Dual learning based video coding with admission pitch blocks," arXiv preprint arXiv:1911.09857, 2019];
(3) unlike in an I frame, where Y can directly replace R, filtering a B/P frame directly with Y leads to over-blurring. The present invention therefore replaces the original reconstructed pixels R with a combination X̂ of the filtered pixels Y and R, thereby realizing the filtering of the B/P frame. X̂ is computed by equation (1), which depends on a new syntax element λ; λ determines the strength and mixing ratio of the filter: λ_i = 0 means no filtering at all, while λ_i = 1 means filtering entirely with the neural network filter;

X̂_i = λ_i · Y + (1 − λ_i) · R   (1)
(4) traverse different values of the hyper-parameter λ_i in the encoder, computed by equation (2) and illustrated in FIG. 2. λ_i can be understood as dividing the interval [0, 1] evenly into 2^n − 1 steps, so each time i increases by 1, λ_i increases by 1/(2^n − 1), giving a filtering strength that increases uniformly over the interval;

λ_i = i / (2^n − 1),  i = 0, 1, …, 2^n − 1   (2)
(5) different λ_i yield different fused values X̂_i; for each X̂_i, compute the mean squared error MSE_i between it and the original pixels O:

MSE_i = (1/N) Σ_p (X̂_i(p) − O(p))²   (3)

where O denotes the original (uncompressed) pixels and N the number of pixels in the frame;
(6) find the minimum MSE_i, record the corresponding λ_i, and encode the binary form of the corresponding i into the code stream; send the filtering result X̂_i to the frame buffer (an encoder-side code sketch follows this step list);
(7) the decoder does not need to traverse λ_i over different i; it decodes the filtering strength directly from i in the code stream, thereby realizing the optimal filtering effect.
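The encoder-side search of steps (3) to (6) can be summarized in a short sketch. The following Python/NumPy code is illustrative only: the argument names R (unfiltered reconstruction), Y (neural network output) and O (original frame), and the function name itself, are our assumptions rather than identifiers from the HM implementation; only the blend, the λ_i grid and the MSE minimization follow equations (1) to (3).

```python
import numpy as np

def best_filter_strength(R, Y, O, n_bits):
    """Frame-level search of equations (1)-(3): try every lambda_i and
    keep the fusion closest to the original frame O.

    R, Y, O are float arrays of identical shape. Returns the index i to
    signal (n_bits binary digits) and the fused frame for the frame
    buffer. Interface and names are illustrative assumptions.
    """
    levels = 2 ** n_bits                        # i ranges over 0 .. 2^n - 1
    best_i, best_mse, best_frame = 0, float("inf"), R
    for i in range(levels):
        lam = i / (levels - 1)                  # equation (2)
        fused = lam * Y + (1.0 - lam) * R       # equation (1)
        mse = float(np.mean((fused - O) ** 2))  # equation (3)
        if mse < best_mse:
            best_i, best_mse, best_frame = i, mse, fused
    return best_i, best_frame
```

At the encoder, best_i would then be written to the bitstream as n bypass-coded bits and best_frame placed in the frame buffer as the reference for subsequent frames.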
Specifically, at the decoding end, i is decoded from the code stream by the corresponding entropy decoder and λ_i is calculated from it; the result Y output by the neural network and the original reconstructed pixels R are then combined accordingly to obtain X̂_i, the desired filtering result. As at the encoder, X̂_i is sent to the frame buffer so that subsequently coded frames can reference it, realizing a complete match between the encoding and decoding ends.
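The decoder side needs no search. A minimal sketch under the same illustrative naming assumptions, with the entropy decoding of i abstracted into an argument:

```python
def apply_filter_strength(R, Y, i, n_bits):
    """Decoder-side fusion: rebuild the fused frame from the signaled i.

    Mirrors equations (1) and (2); R and Y must match the encoder's
    unfiltered reconstruction and neural network output exactly, so
    both ends place the same reference frame in the frame buffer.
    """
    lam = i / (2 ** n_bits - 1)       # equation (2)
    return lam * Y + (1.0 - lam) * R  # equation (1)
```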
In the invention, a new syntax element is designed, called a frame-level syntax element: one syntax element exists for each component of each frame, consists of n bits, and characterizes the degree of fusion between the neural network output and the video encoder's reconstructed pixels in the current frame. The larger its value, the more the neural network filtering result is used as the final output; the smaller its value, the more the video encoder tends to keep its own original reconstructed pixels as the final output. Through this syntax element the invention achieves an adaptive filtering strength, effectively solving the over-blurring and over-smoothing caused by applying the filter directly. Compared with the CTU-level syntax elements designed in previous methods, filtering directly at the frame level introduces no extra artificial boundaries; the algorithm combines excellent performance with low complexity.
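For example, with n = 2 the frame-level syntax element can signal four fusion levels, λ ∈ {0, 1/3, 2/3, 1}, for each component of a frame at a cost of only two bits.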
Drawings
FIG. 1 is a flow chart of the algorithm of the present invention.
FIG. 2 shows the relationship between i and λ_i.
Detailed Description
The present invention is further described below, taking an HEVC video encoder (HM) as an example.
First, the DB and SAO options in the HM configuration file are turned off and the target video is encoded; after each frame is encoded it is placed into the frame buffer for reference by the next frame, yielding the reconstructed pixels X (denoted R in the steps above). X is then filtered with the neural network filter to obtain the filtered picture Y.
A parameter λ_i is set whose magnitude depends on the value of i, as shown in FIG. 2; the λ_i corresponding to each i is calculated according to equation (2). For every λ_i, equation (1) is evaluated to obtain an intermediate result X̂_i representing a temporary filtering result. Each temporary result is then compared with the original pixels through the mean squared error of equation (3), and the i with the minimum MSE_i, together with its X̂_i, is found. Since i ranges from 0 to 2^n − 1, it can be represented exactly in n-bit binary and coded into the code stream as the new syntax element; the entropy coding model can use the bypass coding mode, with the MPS probability set to 0.5 (a sketch of this signaling follows). The value of X̂_i is sent to the frame buffer as the output filtering result, serving as the reference for subsequent frames.
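As a sketch of this signaling, assuming a hypothetical one-bin write_bit/read_bit interface rather than HM's actual CABAC API, the n-bit index can be written and read as fixed-length bypass bins:

```python
def write_frame_filter_index(write_bit, i, n_bits):
    """Write the frame-level syntax element i as n fixed-length bypass
    bins, most significant bit first. write_bit is assumed to emit one
    equiprobable (p = 0.5) bin; HM's real entropy-coder API differs."""
    for k in range(n_bits - 1, -1, -1):
        write_bit((i >> k) & 1)

def read_frame_filter_index(read_bit, n_bits):
    """Decoder mirror: read n bypass bins back into the index i."""
    i = 0
    for _ in range(n_bits):
        i = (i << 1) | read_bit()
    return i
```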
In fact, this process is a rate-distortion optimization (RDO) process, i.e. the process of finding the smallest J, where J is calculated as follows:

J = D + k·R   (4)
where D represents distortion, R represents the bit rate, and k is a hyper-parameter weighing bit rate against distortion. This loop-filtering problem does not affect R: every frame spends the same number of extra bits regardless of which i is chosen, so the optimization only needs to minimize D to realize the RDO process. Thus the minimum mean squared error MSE_i and its corresponding i and X̂_i are the values to retain.
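Restated in symbols (a direct consequence of equation (4), with R_i denoting the signaling cost of candidate i, constant at n bits):

```latex
J_i = D_i + k R_i, \qquad R_i \equiv n
\;\Longrightarrow\;
\arg\min_i J_i = \arg\min_i D_i = \arg\min_i \mathrm{MSE}_i
```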
For this frame-level filtering syntax element, its position can be placed after the original DB filtering syntax element, since DB and the proposed algorithm share the same frame-level filtering concept. After reading the DB bit, the frame-level neural network filtering syntax element i can be read and λ_i calculated from it to control the final filtering strength.
At the decoding end, i is decoded from the code stream by the corresponding entropy decoder, λ_i is calculated according to i, and the result Y output by the neural network is combined with the original reconstructed pixels R accordingly to obtain X̂_i, the desired filtering result. X̂_i is likewise sent to the frame buffer so that subsequently coded frames can reference it, realizing a complete match between the encoding and decoding ends.

Claims (2)

1. A low-complexity neural network filtering algorithm for B frames and P frames is characterized by comprising the following specific steps:
(1) at the encoding end, closing DB and SAO options in the configuration file of the HM, and encoding the target video to obtain a reconstructed pixel X;
(2) filtering the unprocessed reconstructed pixel R by using a neural network filter to obtain a filtered pixel Y;
(3) using a combination X̂ of the filtered pixels Y and R to replace the original reconstructed pixels R, thereby realizing the filtering of the B/P frame; X̂ is given by formula (1), which introduces a new syntax element λ that determines the filtering strength: λ_i = 0 means no filtering at all, and λ_i = 1 means filtering entirely with the neural network filter;

X̂_i = λ_i · Y + (1 − λ_i) · R   (1)
(4) traversing different values of the hyper-parameter λ_i in the encoder, given by formula (2); λ_i divides the interval [0, 1] evenly into 2^n − 1 steps, so each time i increases by 1, λ_i increases by 1/(2^n − 1), realizing a filtering strength that increases uniformly over the interval;

λ_i = i / (2^n − 1),  i = 0, 1, …, 2^n − 1   (2)
(5) different λ_i correspond to different values X̂_i; for each X̂_i, computing the mean squared error MSE_i between it and the original pixels O:

MSE_i = (1/N) Σ_p (X̂_i(p) − O(p))²   (3)

where O denotes the original (uncompressed) pixels and N the number of pixels in the frame;
(6) finding the minimum MSE_i, recording the corresponding λ_i, and coding the binary form of the corresponding i into the code stream; and sending the filtering result X̂_i to the frame buffer;
(7) at the decoding end, not traversing λ_i under different i, but directly decoding the filtering strength according to i in the code stream, thereby realizing the optimal filtering effect.
2. The low-complexity neural network filtering algorithm for B frames and P frames according to claim 1, wherein at the decoding end, i is decoded from the code stream by a corresponding entropy decoder, λ_i is calculated according to i, and the result Y output by the neural network is combined with the original reconstructed pixels R accordingly to obtain X̂_i, from which the desired filtering result is calculated.
CN201911382700.0A 2019-12-28 2019-12-28 Low-complexity neural network filtering algorithm for B frame and P frame Active CN111064958B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911382700.0A CN111064958B (en) 2019-12-28 2019-12-28 Low-complexity neural network filtering algorithm for B frame and P frame


Publications (2)

Publication Number Publication Date
CN111064958A (en) 2020-04-24
CN111064958B CN111064958B (en) 2021-03-30

Family

ID=70304317

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911382700.0A Active CN111064958B (en) 2019-12-28 2019-12-28 Low-complexity neural network filtering algorithm for B frame and P frame

Country Status (1)

Country Link
CN (1) CN111064958B (en)


Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1558680A (en) * 2004-01-16 2004-12-29 Beijing University of Technology A simplified loop filtering method for video coding
CN110199521A (en) * 2016-12-23 2019-09-03 Huawei Technologies Co., Ltd. Low-complexity hybrid-domain collaborative in-loop filter for lossy video coding
CN109151475A (en) * 2017-06-27 2019-01-04 Hangzhou Hikvision Digital Technology Co., Ltd. Video encoding method, decoding method, apparatus and electronic device
CN108174225A (en) * 2018-01-11 2018-06-15 Shanghai Jiao Tong University In-loop filter implementation method and system for video coding and decoding based on generative adversarial networks
CN110062246A (en) * 2018-01-19 2019-07-26 Hangzhou Hikvision Digital Technology Co., Ltd. Method and apparatus for processing video frame data
CN110300301A (en) * 2018-03-22 2019-10-01 Huawei Technologies Co., Ltd. Image coding and decoding method and apparatus
JP2019201256A (en) * 2018-05-14 2019-11-21 Sharp Corporation Image filter device
CN110619607A (en) * 2018-06-20 2019-12-27 Zhejiang University Image denoising method and apparatus based on neural networks, and image coding and decoding method and apparatus based on neural network image denoising
CN110519606A (en) * 2019-08-22 2019-11-29 Tianjin University Intelligent intra-frame coding method for depth video

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022052533A1 (en) * 2020-09-10 2022-03-17 Guangdong Oppo Mobile Telecommunications Corp., Ltd. Encoding method, decoding method, encoder, decoder, and encoding system
WO2022116165A1 (en) * 2020-12-04 2022-06-09 SZ DJI Technology Co., Ltd. Video encoding method, video decoding method, encoder, decoder, and AI accelerator
US11979591B2 (en) 2021-04-06 2024-05-07 Lemon Inc. Unified neural network in-loop filter
WO2022218385A1 (en) * 2021-04-14 2022-10-20 Beijing Bytedance Network Technology Co., Ltd. Unified neural network filter model
US11949918B2 (en) 2021-04-15 2024-04-02 Lemon Inc. Unified neural network in-loop filter signaling
CN113422966A (en) * 2021-05-27 2021-09-21 Shaoxing Beida Information Technology Innovation Center Multi-model CNN loop filtering method
CN113422966B (en) * 2021-05-27 2024-05-24 Shaoxing Beida Information Technology Innovation Center Multi-model CNN loop filtering method
WO2023130226A1 (en) * 2022-01-04 2023-07-13 Oppo广东移动通信有限公司 Filtering method, decoder, encoder and computer-readable storage medium

Also Published As

Publication number Publication date
CN111064958B (en) 2021-03-30

Similar Documents

Publication Publication Date Title
CN111064958B (en) Low-complexity neural network filtering algorithm for B frame and P frame
CN108184129B (en) Video coding and decoding method and device and neural network for image filtering
JP5854439B2 (en) Video coding system and method using adaptive segmentation
CN109889852B (en) HEVC intra-frame coding optimization method based on adjacent values
WO2021203394A1 (en) Loop filtering method and apparatus
CN112544081B (en) Loop filtering method and device
CN113766247B (en) Loop filtering method and device
WO2013067949A1 (en) Matrix encoding method and device thereof, and matrix decoding method and device thereof
CN105306957A (en) Adaptive loop filtering method and device
CN113068028A (en) Method and apparatus for predicting video image component, and computer storage medium
CN110944179B (en) Video data decoding method and device
CN113422966A (en) Multi-model CNN loop filtering method
CN113810715B (en) Video compression reference image generation method based on cavity convolutional neural network
CN115914654A (en) Neural network loop filtering method and device for video coding
US10764577B2 (en) Non-MPM mode coding for intra prediction in video coding
CN114793282A (en) Neural network based video compression with bit allocation
CN113709459B (en) Intra-frame prediction method, device and computer storage medium
WO2023197230A1 (en) Filtering method, encoder, decoder and storage medium
WO2024016156A1 (en) Filtering method, encoder, decoder, code stream and storage medium
WO2023245544A1 (en) Encoding and decoding method, bitstream, encoder, decoder, and storage medium
CN117459737B (en) Training method of image preprocessing network and image preprocessing method
WO2024077576A1 (en) Neural network based loop filter methods, video coding method and apparatus, video decoding method and apparatus, and system
WO2024007157A1 (en) Multi-reference line index list sorting method and device, video coding method and device, video decoding method and device, and system
CN116347108A (en) Loop filtering decision method of deep neural network based on coding information
Sheng et al. Prediction and Reference Quality Adaptation for Learned Video Compression

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant