CN111064958A - Low-complexity neural network filtering algorithm for B frame and P frame - Google Patents
- Publication number
- CN111064958A (application CN201911382700.0A)
- Authority
- CN
- China
- Prior art keywords
- filtering
- neural network
- frame
- frames
- syntax element
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- H—ELECTRICITY; H04—ELECTRIC COMMUNICATION TECHNIQUE; H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION; H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/13—Adaptive entropy coding, e.g. adaptive variable length coding [AVLC] or context adaptive binary arithmetic coding [CABAC]
- H04N19/117—Filters, e.g. for pre-processing or post-processing
- H04N19/61—Transform coding in combination with predictive coding
- H04N19/70—Characterised by syntax aspects related to video coding, e.g. related to compression standards
- H04N19/82—Details of filtering operations specially adapted for video compression, involving filtering within a prediction loop
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Abstract
The invention belongs to the technical field of video coding and particularly relates to a low-complexity neural network filtering algorithm for B frames and P frames. The method tests fusions, in different proportions, of the reconstructed pixels output by the neural network filter and by the video coding standard, and selects the optimal filtering strength to encode into the code stream, so as to achieve the optimal filtering effect. A new syntax element, called a frame-level syntax element, is designed: a syntax element present for each component of each frame, composed of n bits, which characterizes the degree of fusion between the neural network output and the video encoder's reconstructed pixels in the current frame. Through this syntax element the method achieves an adaptive filtering strength and effectively solves the over-blurring and over-smoothing caused by using the filter directly. Compared with the original CTU-level syntax elements, filtering directly at the frame level introduces no extra artificial boundaries; the method is an algorithm with excellent performance and low complexity.
Description
Technical Field
The invention belongs to the technical field of video coding, and particularly relates to a low-complexity neural network filtering algorithm for B frames and P frames.
Background
In the field of video coding, filtering techniques based on convolutional neural networks are now widely applied. Neural network filtering achieves a better filtering effect than the traditional DB, SAO and ALF filters, but its high complexity limits practical application. In B frames and P frames especially, repeated use of the neural network filter causes an over-blurring problem: a block region is repeatedly smoothed, and the filtering loses the details and high-frequency information of the current block.
In the last few years, researchers have proposed a variety of solutions, mainly based on the CTU-level filtering concept. B frames and P frames in video coding are predicted and transformed on a block basis, so some blocks have large residuals or motion vectors while others are almost identical to the already-filtered reference frames. CTU-level filtering makes a filtering decision for every CTU, so almost every block can select the best filtering result. Of course, an additional bit is needed to indicate whether each CTU uses the filter, which consumes a comparatively large part of the bitstream; scholars have therefore proposed adding an extra classifier to help the current CU decide whether to use the neural network filtering method. Others have used iterative training of the neural network to reduce this over-smoothing, finally achieving a globally optimal filtering effect.
In fact, the drawback of CTU-level filtering is also obvious: for convolutional neural networks, CTU filtering requires additional padding of the input, either with zeros or with reconstructed pixels. If zeros are padded, the filtering performance clearly drops because of the error the zeros introduce; if reconstructed pixels are padded, the computational complexity of the neural network increases significantly. We therefore propose frame-level filtering.
Disclosure of Invention
The invention aims to provide a low-complexity neural network filtering algorithm for B frames and P frames.
Unlike CTU-level filtering, the low-complexity neural network filtering algorithm for B frames and P frames filters the whole frame, giving lower complexity and a better filtering effect. It fuses the reconstructed pixels output by the video encoder with the neural network filtering result, realizing optimal filtering and solving the over-smoothing caused by repeatedly applying a neural network filter.
The invention provides a low-complexity neural network filtering algorithm for B frames and P frames, which comprises the following specific steps:
(1) at the encoding end, close the DB and SAO options in the HM configuration file and encode the target video to obtain the reconstructed pixels R; after a frame is coded, a conventional encoder filters the reconstructed pixels R of the whole frame with filters such as DB and SAO, but in this step DB and SAO are turned off, so the traditional filters are not applied to R;
(2) filter the unprocessed reconstructed pixels R with a neural network filter to obtain the filtered pixels Y. The structure of the neural network filter follows [Chao Liu, Heming Sun, Junan Chen, Zhengxue Cheng, Masaru Takeuchi, Jiro Katto, Xiaoyang Zeng, and Yibo Fan, "Dual learning-based video coding with inception dense blocks," arXiv preprint arXiv:1911.09857, 2019];
(3) unlike in I frames, where Y can directly replace R, filtering B/P frames directly with Y leads to the over-blurring problem; the invention therefore replaces the original reconstructed pixels R with a fusion R̂_i of the filtered pixels Y and R, thereby realizing the filtering of the B/P frame;
R̂_i is calculated as equation (1), which depends on a new syntax element λ:
R̂_i = λ_i·Y + (1 − λ_i)·R  (1)
λ_i determines the strength and ratio of the filtering: λ_i = 0 means no filtering at all, and λ_i = 1 means the neural network filter output is used completely;
(4) traversing the different hyper-parameter values λ_i in the encoder, calculated as equation (2) and illustrated in FIG. 2:
λ_i = i / (2^n − 1)  (2)
λ_i can be understood as dividing 1 equally into 2^n − 1 steps, so each time i increases by 1, λ_i increases by 1/(2^n − 1), realizing a uniformly increasing filtering strength over the interval;
(5) different λ_i give different fusion results R̂_i; for each R̂_i, the mean square error MSE_i between R̂_i and the original pixels is computed;
(6) finding the minimum MSE_i, recording the corresponding λ_i, and coding the binary form of the corresponding i into the code stream; the filtering result R̂_i is sent to the frame buffer;
(7) for the decoder, there is no need to traverse λ_i over different i; it directly decodes the filtering strength from i in the code stream, thereby realizing the optimal filtering effect.
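As a concrete illustration of steps (3)–(6), the λ_i search can be sketched as follows. This is a minimal Python sketch, not the patent's implementation; the function name and the list-based pixel representation are assumptions made for illustration:

```python
def choose_filter_strength(recon, nn_out, orig, n_bits=4):
    """Traverse lambda_i = i / (2^n - 1) (equation (2)), fuse R and Y per
    equation (1), and keep the i whose fusion has the smallest mean square
    error (equation (3)) against the original pixels."""
    levels = (1 << n_bits) - 1                  # 2^n - 1 equal steps over [0, 1]
    best = None
    for i in range(levels + 1):
        lam = i / levels                        # equation (2)
        fused = [lam * y + (1.0 - lam) * r      # equation (1)
                 for y, r in zip(nn_out, recon)]
        mse = sum((f - o) ** 2 for f, o in zip(fused, orig)) / len(orig)
        if best is None or mse < best[1]:
            best = (i, mse, fused)
    i, _, fused = best
    return i, fused  # i is coded into the stream; fused goes to the frame buffer
```

The returned i is what the frame-level syntax element carries; the decoder only needs i to recompute λ_i and repeat the fusion.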
Specifically, at the decoding end, i is decoded from the code stream by the corresponding entropy decoder, λ_i is calculated from i, and the result Y output by the neural network and the original reconstructed pixels R are combined accordingly to obtain R̂_i, from which the desired filtering result is finally calculated. R̂_i is likewise sent to the frame buffer so that subsequently coded frames can reference it, keeping the encoding and decoding ends completely matched.
In the invention, a new syntax element is designed, called a frame-level syntax element: a syntax element present for each component of each frame, composed of n bits, used to represent the degree of fusion between the neural network output and the video encoder's reconstructed pixels in the current frame. The larger the value of this syntax element, the more the neural network filtering result is used as the final output; the smaller the value, the more the encoder tends to use its own original reconstructed pixels as the final output. Through this syntax element the invention achieves an adaptive filtering strength, effectively solving the over-blurring and over-smoothing caused by using the filter directly. Compared with the CTU-level syntax elements designed by previous methods, filtering directly at the frame level introduces no extra artificial boundaries, so the algorithm has excellent performance and low complexity.
Drawings
FIG. 1 is a flow chart of the algorithm of the present invention.
FIG. 2 shows the relationship between i and λ_i.
Detailed Description
The present invention is further described below, taking an HEVC video encoder (HM) as an example.
First, close the DB and SAO options in the HM configuration file and encode the target video; after each frame is encoded it is placed into the frame buffer for reference by the next frames, yielding the reconstructed pixels R, which are then filtered by the neural network filter to obtain the filtered picture Y.
Set the parameter λ_i, whose magnitude depends on the value of i as shown in FIG. 2; λ_i for each i is calculated according to equation (2). For every λ_i, equation (1) is evaluated to obtain an intermediate result R̂_i, which represents a temporary filtering effect. Each temporarily stored filtering result is then compared with the original pixels through the mean square error of equation (3), and the i and R̂_i corresponding to the minimum mean square error MSE_i are found. Since i ranges from 0 to 2^n − 1, it can be represented exactly in n-bit binary and can therefore be coded into the code stream as the new syntax element; the entropy coding model can use the bypass coding mode, with the MPS probability set to 0.5. The value of R̂_i is sent to the frame buffer as the output filtering result, to serve subsequent frames as a reference frame.
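Because i ∈ [0, 2^n − 1] fits exactly in n bits and bypass coding treats every bit as equiprobable, the syntax element effectively reduces to n fixed-length bits. A minimal Python sketch of this serialization (function names are hypothetical, not from the patent):

```python
def write_syntax_element(i, n_bits):
    """Serialize the frame-level syntax element i as n bypass bits, MSB first.
    With bypass coding no context modeling is needed (MPS probability 0.5)."""
    assert 0 <= i < (1 << n_bits)
    return [(i >> b) & 1 for b in reversed(range(n_bits))]

def read_syntax_element(bits):
    """Decoder side: rebuild i from its bits, then lambda_i = i / (2^n - 1)."""
    i = 0
    for bit in bits:
        i = (i << 1) | bit
    lam = i / ((1 << len(bits)) - 1)
    return i, lam
```

For example, with n = 4 the value i = 11 is written as the bits 1 0 1 1 and decodes back to λ_i = 11/15.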
In fact, this process is a rate-distortion optimization (RDO) process, i.e. a process of finding the smallest J, where J is calculated as follows:
J = D + kR  (4)
where D represents distortion, R represents the code rate, and k is a hyper-parameter weighing the trade-off between code rate and distortion. The loop filtering decision does not affect R, since every frame spends the same fixed number of extra bits on the syntax element; optimizing this problem therefore only requires minimizing D to realize the RDO process. Thus the i and R̂_i corresponding to the minimum mean square error MSE_i are exactly the values to retain.
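This simplification can be checked numerically: when the rate term is the same fixed n extra bits for every candidate, minimizing J is the same as minimizing D. A tiny sketch with hypothetical distortion values (the numbers are illustrative, not measurements from the patent):

```python
# Hypothetical distortions D_i for five candidate filter strengths; every
# candidate costs the same n extra bits, so the rate term in equation (4)
# is a constant offset and cannot change which candidate wins.
k, extra_bits = 0.5, 4
D = [12.0, 9.5, 8.1, 8.7, 10.2]
J = [d + k * extra_bits for d in D]          # equation (4) with constant R
assert J.index(min(J)) == D.index(min(D))    # minimizing J == minimizing D
```

This is why the encoder can select i purely by the minimum MSE_i without evaluating the full rate-distortion cost.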
For this frame-level filtering syntax element, its position can be placed after the original DB filtering syntax element, since both DB and the proposed algorithm use the same frame-level filtering concept. After reading the DB bit, the frame-level neural network filtering syntax element i can be read and λ_i calculated to control the final filtering strength.
At the decoding end, i is decoded from the code stream by the corresponding entropy decoder, λ_i is calculated from i, and the result Y output by the neural network and the original reconstructed pixels R are combined accordingly to obtain R̂_i, from which the desired filtering result is finally calculated. R̂_i is likewise sent to the frame buffer so that subsequently coded frames can reference it, keeping the encoding and decoding ends completely matched.
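The decoder-side reconstruction described above can be sketched as follows. Again a minimal Python sketch with a hypothetical function name and list-based pixels, not the patent's code:

```python
def decode_frame_filter(bits, nn_out, recon):
    """Rebuild i from its n bypass bits (MSB first), derive lambda_i via
    equation (2), and fuse the neural network output Y with the unfiltered
    reconstruction R via equation (1)."""
    i = 0
    for bit in bits:
        i = (i << 1) | bit
    lam = i / ((1 << len(bits)) - 1)         # lambda_i = i / (2^n - 1)
    return [lam * y + (1.0 - lam) * r for y, r in zip(nn_out, recon)]
```

With all-one bits (λ_i = 1) the output is exactly the neural network result Y; with all-zero bits (λ_i = 0) it is exactly the unfiltered reconstruction R, matching the two extremes described in step (3).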
Claims (2)
1. A low-complexity neural network filtering algorithm for B frames and P frames is characterized by comprising the following specific steps:
(1) at the encoding end, closing the DB and SAO options in the configuration file of the HM, and encoding the target video to obtain the reconstructed pixels R;
(2) filtering the unprocessed reconstructed pixel R by using a neural network filter to obtain a filtered pixel Y;
(3) replacing the original reconstructed pixels R with a fusion R̂_i of the filtered pixels Y and R, thereby realizing the filtering of the B/P frame; wherein R̂_i is given by formula (1), R̂_i = λ_i·Y + (1 − λ_i)·R, in which a new syntax element λ is introduced that determines the filtering strength of the filter: when λ_i = 0, no filtering is performed at all; when λ_i = 1, the neural network filter output is used completely;
(4) traversing the different hyper-parameter values λ_i in the encoder, given by formula (2), λ_i = i/(2^n − 1); λ_i equally divides 1 into 2^n − 1 steps, so each time i increases by 1, λ_i increases by 1/(2^n − 1), realizing a uniformly increasing filtering strength over the interval;
(5) different λ_i give different fusion results R̂_i; for each R̂_i, comparing it with the original pixels to obtain the mean square error value MSE_i;
(6) finding the minimum MSE_i, recording the corresponding λ_i, and coding the binary form of the corresponding i into the code stream; and sending the filtering result R̂_i to the frame buffer;
(7) at the decoding end, without traversing λ_i over different i, directly decoding the filtering strength according to i in the code stream, thereby realizing the optimal filtering effect.
2. The low-complexity neural network filtering algorithm for B frames and P frames according to claim 1, wherein at the decoding end, i is decoded from the code stream by a corresponding entropy decoder, λ_i is calculated according to i in the code stream, and the result Y output by the neural network and the original reconstructed pixels R are correspondingly combined to obtain R̂_i, from which the desired filtering result is calculated.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911382700.0A CN111064958B (en) | 2019-12-28 | 2019-12-28 | Low-complexity neural network filtering algorithm for B frame and P frame |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111064958A true CN111064958A (en) | 2020-04-24 |
CN111064958B CN111064958B (en) | 2021-03-30 |
Family
ID=70304317
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911382700.0A Active CN111064958B (en) | 2019-12-28 | 2019-12-28 | Low-complexity neural network filtering algorithm for B frame and P frame |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111064958B (en) |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1558680A (en) * | 2004-01-16 | 2004-12-29 | 北京工业大学 | A simplified loop filtering method for video coding |
CN108174225A (en) * | 2018-01-11 | 2018-06-15 | 上海交通大学 | Filter achieving method and system in coding and decoding video loop based on confrontation generation network |
CN109151475A (en) * | 2017-06-27 | 2019-01-04 | 杭州海康威视数字技术股份有限公司 | A kind of method for video coding, coding/decoding method, device and electronic equipment |
CN110062246A (en) * | 2018-01-19 | 2019-07-26 | 杭州海康威视数字技术股份有限公司 | The method and apparatus that video requency frame data is handled |
CN110199521A (en) * | 2016-12-23 | 2019-09-03 | 华为技术有限公司 | Low complex degree hybrid domain for damaging Video coding cooperates with in-loop filter |
CN110300301A (en) * | 2018-03-22 | 2019-10-01 | 华为技术有限公司 | Image coding/decoding method and device |
JP2019201256A (en) * | 2018-05-14 | 2019-11-21 | シャープ株式会社 | Image filter device |
CN110519606A (en) * | 2019-08-22 | 2019-11-29 | 天津大学 | Intelligent coding method in a kind of deep video frame |
CN110619607A (en) * | 2018-06-20 | 2019-12-27 | 浙江大学 | Image denoising method and device based on neural network and image coding and decoding method and device based on neural network image denoising |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2022052533A1 (en) * | 2020-09-10 | 2022-03-17 | Oppo广东移动通信有限公司 | Encoding method, decoding method, encoder, decoder, and encoding system |
WO2022116165A1 (en) * | 2020-12-04 | 2022-06-09 | 深圳市大疆创新科技有限公司 | Video encoding method, video decoding method, encoder, decoder, and ai accelerator |
US11979591B2 (en) | 2021-04-06 | 2024-05-07 | Lemon Inc. | Unified neural network in-loop filter |
WO2022218385A1 (en) * | 2021-04-14 | 2022-10-20 | Beijing Bytedance Network Technology Co., Ltd. | Unified neural network filter model |
US11949918B2 (en) | 2021-04-15 | 2024-04-02 | Lemon Inc. | Unified neural network in-loop filter signaling |
CN113422966A (en) * | 2021-05-27 | 2021-09-21 | 绍兴市北大信息技术科创中心 | Multi-model CNN loop filtering method |
CN113422966B (en) * | 2021-05-27 | 2024-05-24 | 绍兴市北大信息技术科创中心 | Multi-model CNN loop filtering method |
WO2023130226A1 (en) * | 2022-01-04 | 2023-07-13 | Oppo广东移动通信有限公司 | Filtering method, decoder, encoder and computer-readable storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111064958B (en) | Low-complexity neural network filtering algorithm for B frame and P frame | |
CN108184129B (en) | Video coding and decoding method and device and neural network for image filtering | |
JP5854439B2 (en) | Video coding system and method using adaptive segmentation | |
CN109889852B (en) | HEVC intra-frame coding optimization method based on adjacent values | |
WO2021203394A1 (en) | Loop filtering method and apparatus | |
CN112544081B (en) | Loop filtering method and device | |
CN113766247B (en) | Loop filtering method and device | |
WO2013067949A1 (en) | Matrix encoding method and device thereof, and matrix decoding method and device thereof | |
CN105306957A (en) | Adaptive loop filtering method and device | |
CN113068028A (en) | Method and apparatus for predicting video image component, and computer storage medium | |
CN110944179B (en) | Video data decoding method and device | |
CN113422966A (en) | Multi-model CNN loop filtering method | |
CN113810715B (en) | Video compression reference image generation method based on cavity convolutional neural network | |
CN115914654A (en) | Neural network loop filtering method and device for video coding | |
US10764577B2 (en) | Non-MPM mode coding for intra prediction in video coding | |
CN114793282A (en) | Neural network based video compression with bit allocation | |
CN113709459B (en) | Intra-frame prediction method, device and computer storage medium | |
WO2023197230A1 (en) | Filtering method, encoder, decoder and storage medium | |
WO2024016156A1 (en) | Filtering method, encoder, decoder, code stream and storage medium | |
WO2023245544A1 (en) | Encoding and decoding method, bitstream, encoder, decoder, and storage medium | |
CN117459737B (en) | Training method of image preprocessing network and image preprocessing method | |
WO2024077576A1 (en) | Neural network based loop filter methods, video coding method and apparatus, video decoding method and apparatus, and system | |
WO2024007157A1 (en) | Multi-reference line index list sorting method and device, video coding method and device, video decoding method and device, and system | |
CN116347108A (en) | Loop filtering decision method of deep neural network based on coding information | |
Sheng et al. | Prediction and Reference Quality Adaptation for Learned Video Compression |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||