WO2022072245A1 - Multiple neural network models for filtering during video coding - Google Patents

Multiple neural network models for filtering during video coding

Info

Publication number
WO2022072245A1
Authority
WO
WIPO (PCT)
Prior art keywords
neural network
picture
level
syntax element
value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/US2021/052008
Other languages
English (en)
French (fr)
Inventor
Hongtao Wang
Venkata Meher Satchit Anand Kotra
Jianle Chen
Marta Karczewicz
Dana KIANFAR
Auke Joris WIGGERS
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qualcomm Inc
Original Assignee
Qualcomm Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qualcomm Inc
Priority to KR1020237009249A (KR20230079360A)
Priority to BR112023004814A (BR112023004814A2)
Priority to PH1/2023/550197A (PH12023550197A1)
Priority to EP21795101.1A (EP4222960A1)
Priority to JP2023515167A (JP7795528B2)
Priority to CN202180064854.4A (CN116349226A)
Publication of WO2022072245A1
Anticipated expiration
Ceased

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/70 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T9/00 Image coding
    • G06T9/002 Image coding using neural networks
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/117 Filters, e.g. for pre-processing or post-processing
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/119 Adaptive subdivision aspects, e.g. subdivision of a picture into rectangular or non-rectangular coding blocks
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/129 Scanning of coding units, e.g. zig-zag scan of transform coefficients or flexible macroblock ordering [FMO]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/13 Adaptive entropy coding, e.g. adaptive variable length coding [AVLC] or context adaptive binary arithmetic coding [CABAC]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/146 Data rate or code amount at the encoder output
    • H04N19/147 Data rate or code amount at the encoder output according to rate distortion criteria
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/172 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/176 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/186 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a colour or a chrominance component
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/189 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding
    • H04N19/192 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding the adaptation method, adaptation tool or adaptation type being iterative or recursive
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/44 Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/60 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/80 Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/80 Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation
    • H04N19/82 Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation involving filtering within a prediction loop
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/85 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression
    • H04N19/86 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression involving reduction of coding artifacts, e.g. of blockiness
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]

Definitions

  • Digital video capabilities can be incorporated into a wide range of devices, including digital televisions, digital direct broadcast systems, wireless broadcast systems, personal digital assistants (PDAs), laptop or desktop computers, tablet computers, e-book readers, digital cameras, digital recording devices, digital media players, video gaming devices, video game consoles, cellular or satellite radio telephones, so-called “smart phones,” video teleconferencing devices, video streaming devices, and the like.
  • Digital video devices implement video coding techniques, such as those described in the standards defined by MPEG-2, MPEG-4, ITU-T H.263, ITU-T H.264/MPEG-4, Part 10, Advanced Video Coding (AVC), ITU-T H.265/High Efficiency Video Coding (HEVC), and extensions of such standards.
  • the video devices may transmit, receive, encode, decode, and/or store digital video information more efficiently by implementing such video coding techniques.
  • a computer-readable storage medium has stored thereon instructions that, when executed, cause a processor to: decode a picture of video data; code a value for a syntax element representing a neural network model to be used to filter a portion of the decoded picture, the value representing an index into a set of pre-defined neural network models, the index corresponding to the neural network model in the set of pre-defined neural network models; and filter the portion of the decoded picture using the neural network model corresponding to the index.
  • a device for filtering decoded video data includes means for decoding a picture of video data; means for coding a value for a syntax element representing a neural network model to be used to filter a portion of the decoded picture, the value representing an index into a set of pre-defined neural network models, the index corresponding to the neural network model in the set of pre-defined neural network models; and means for filtering the portion of the decoded picture using the neural network model corresponding to the index.
  • VVC Versatile Video Coding
  • ITU-T H.266, which has been developed by the Joint Video Experts Team (JVET) of ITU-T Video Coding Experts Group (VCEG) and ISO/IEC Motion Picture Experts Group (MPEG).
  • JVET Joint Video Experts Team
  • MPEG ISO/IEC Motion Picture Experts Group
  • Version 1 of the VVC specification, referred to as “VVC FDIS” hereinafter, is available from http://phenix.int-evry.fr/jvet/doc_end_user/documents/19_Teleconference/wg11/JVET-S2001-v17.zip.
  • Video source 104 of source device 102 may include a video capture device, such as a video camera, a video archive containing previously captured raw video, and/or a video feed interface to receive video from a video content provider.
  • video source 104 may generate computer graphics-based data as the source video, or a combination of live video, archived video, and computer-generated video.
  • video encoder 200 encodes the captured, pre-captured, or computer-generated video data.
  • Video encoder 200 may rearrange the pictures from the received order (sometimes referred to as “display order”) into a coding order for coding.
  • Video encoder 200 may generate a bitstream including encoded video data.
  • Source device 102 may then output the encoded video data via output interface 108 onto computer-readable medium 110 for reception and/or retrieval by, e.g., input interface 122 of destination device 116.
  • Computer-readable medium 110 may represent any type of medium or device capable of transporting the encoded video data from source device 102 to destination device 116.
  • computer-readable medium 110 represents a communication medium to enable source device 102 to transmit encoded video data directly to destination device 116 in real-time, e.g., via a radio frequency network or computer-based network.
  • Output interface 108 may modulate a transmission signal including the encoded video data, and input interface 122 may demodulate the received transmission signal, according to a communication standard, such as a wireless communication protocol.
  • the communication medium may comprise any wireless or wired communication medium, such as a radio frequency (RF) spectrum or one or more physical transmission lines.
  • RF radio frequency
  • source device 102 may output encoded data from output interface 108 to storage device 112.
  • destination device 116 may access encoded data from storage device 112 via input interface 122.
  • Storage device 112 may include any of a variety of distributed or locally accessed data storage media such as a hard drive, Blu-ray discs, DVDs, CD-ROMs, flash memory, volatile or non-volatile memory, or any other suitable digital storage media for storing encoded video data.
  • source device 102 may output encoded video data to file server 114 or another intermediate storage device that may store the encoded video data generated by source device 102. Destination device 116 may access stored video data from file server 114 via streaming or download.
  • video encoder 200 may generate the prediction block using one or more motion vectors.
  • Video encoder 200 may generally perform a motion search to identify a reference block that closely matches the CU, e.g., in terms of differences between the CU and the reference block.
  • Video encoder 200 may calculate a difference metric using a sum of absolute difference (SAD), sum of squared differences (SSD), mean absolute difference (MAD), mean squared differences (MSD), or other such difference calculations to determine whether a reference block closely matches the current CU.
  • SAD sum of absolute difference
  • SSD sum of squared differences
  • MAD mean absolute difference
  • MSD mean squared differences
  • video encoder 200 may predict the current CU using uni-directional prediction or bi-directional prediction.
  • Quantization generally refers to a process in which transform coefficients are quantized to possibly reduce the amount of data used to represent the transform coefficients, providing further compression.
  • video encoder 200 may reduce the bit depth associated with some or all of the transform coefficients. For example, video encoder 200 may round an n-bit value down to an m-bit value during quantization, where n is greater than m. In some examples, to perform quantization, video encoder 200 may perform a bitwise right-shift of the value to be quantized.
  • Block partitioning is used to divide the image into smaller blocks for operation of the prediction and transform processes.
  • the early video coding standards used a fixed block size, typically 16x16 samples.
  • Recent standards, such as HEVC and VVC, employ tree-based partitioning structures to provide flexible partitioning.
  • Quantization aims to reduce the precision of an input value or a set of input values in order to decrease the amount of data needed to represent the values.
  • quantization is typically applied to individual transformed residual samples, i.e., to transform coefficients, resulting in integer coefficient levels.
  • the step size is derived from a so-called quantization parameter (QP) that controls the fidelity and bit rate.
  • QP quantization parameter
  • a larger step size lowers the bit rate but also deteriorates the quality, which, e.g., results in video pictures exhibiting blocking artifacts and blurred details.
  • Context-adaptive binary arithmetic coding is a form of entropy coding used in recent video codecs, e.g., AVC, HEVC, and VVC, due to its high efficiency.
  • the NN-based filtering process may take the reconstructed samples as inputs, and the intermediate outputs are residual samples, which are added back to the input to refine the input samples.
  • the NN filter may use all color components (e.g., Y, U, and V, or Y, Cb, and Cr, i.e., luminance, blue-hue chrominance, and red-hue chrominance) as input to exploit cross-component correlations.
  • Different color components may share the same filters (including network structure and model parameters) or each color component may have its own specific filters.
  • M can be any value that is pre-defined or signaled in bitstreams at a lower level (e.g., slice header, picture header, CTU level, grid level, etc.). If M > 1, one of the M candidates is selected and signaled in the bitstream as a syntax element.
  • each of the filter models may be associated with a QP value, and for each picture, video encoder 200 and video decoder 300 may derive a ‘model selection QP’ and select the models with the QP value that is closest to the ‘model selection QP’ for the current picture. In this case, no additional information needs to be signaled.
  • video encoder 200 includes video data memory 230, mode selection unit 202, residual generation unit 204, transform processing unit 206, quantization unit 208, inverse quantization unit 210, inverse transform processing unit 212, reconstruction unit 214, filter unit 216, decoded picture buffer (DPB) 218, and entropy encoding unit 220.
  • Any or all of video data memory 230, mode selection unit 202, residual generation unit 204, transform processing unit 206, quantization unit 208, inverse quantization unit 210, inverse transform processing unit 212, reconstruction unit 214, filter unit 216, DPB 218, and entropy encoding unit 220 may be implemented in one or more processors or in processing circuitry.
  • video decoder 300 may be implemented as one or more circuits or logic elements as part of hardware circuitry, or as part of a processor, ASIC, or FPGA.
  • video decoder 300 may include additional or alternative processors or processing circuitry to perform these and other functions.
  • video decoder 300 reconstructs a picture on a block-by-block basis.
  • Video decoder 300 may perform a reconstruction operation on each block individually (where the block currently being reconstructed, i.e., decoded, may be referred to as a “current block”).
  • Entropy decoding unit 302 may entropy decode syntax elements defining quantized transform coefficients of a quantized transform coefficient block, as well as transform information, such as a quantization parameter (QP) and/or transform mode indication(s).
  • Inverse quantization unit 306 may use the QP associated with the quantized transform coefficient block to determine a degree of quantization and, likewise, a degree of inverse quantization for inverse quantization unit 306 to apply.
  • Inverse quantization unit 306 may, for example, perform a bitwise left-shift operation to inverse quantize the quantized transform coefficients. Inverse quantization unit 306 may thereby form a transform coefficient block including transform coefficients.
  • FIG. 7 represents an example of a device for filtering decoded video data including a memory configured to store video data; and one or more processors implemented in circuitry and configured to: decode a picture of video data; code a value for a syntax element representing a neural network model to be used to filter a portion of the decoded picture, the value representing an index into a set of pre-defined neural network models, the index corresponding to the neural network model in the set of pre-defined neural network models; and filter the portion of the decoded picture using the neural network model corresponding to the index.
  • FIG. 8 is a flowchart illustrating an example method for encoding a current block in accordance with the techniques of this disclosure.
  • The current block may comprise a current CU.
  • Video decoder 300 may receive entropy encoded data for the current block, such as entropy encoded prediction information and entropy encoded data for coefficients of a residual block corresponding to the current block (370). Video decoder 300 may entropy decode the entropy encoded data to determine prediction information for the current block, a neural network (NN) model for a portion of the picture including the current block, and to reproduce coefficients of the residual block (372). Video decoder 300 may predict the current block (374), e.g., using an intra- or inter-prediction mode as indicated by the prediction information for the current block, to calculate a prediction block for the current block.
  • Clause 1 A method of filtering decoded video data, the method comprising: determining one or more neural network models to be used to filter a portion of a decoded picture of video data; and filtering the portion of the decoded picture using the one or more neural network models.
  • Clause 37 The method of clause 23, wherein the value for the syntax element comprises a quantization parameter (QP) for the portion of the picture.
  • Clause 38 The method of clause 23, further comprising encoding the picture prior to decoding the picture, wherein coding the value for the syntax element comprises encoding the value for the syntax element.
  • Clause 78 The method of clause 77, further comprising coding a syntax element that jointly represents filtering using the neural network model for each of the color components of the decoded picture.
  • Clause 81 The method of clause 80, further comprising determining the neural network model according to a rate-distortion computation.
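
The two ways of choosing a model described in the definitions above (an index signaled as a syntax element, or a model derived from a 'model selection QP' with nothing signaled), together with the residual-style filtering, can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the toy `smoothing_model` stands in for a trained neural network, and the names `MODELS_BY_QP`, `select_by_index`, `closest_model_qp`, and `nn_filter`, the four QP values, and the tie-breaking rule are all hypothetical.

```python
from typing import Callable, Dict, List, Sequence

# A "model" here maps reconstructed samples to residual samples.
ResidualModel = Callable[[Sequence[int]], List[int]]


def smoothing_model(weight: int) -> ResidualModel:
    """Toy stand-in for a trained NN: predicts a residual that pulls each
    sample toward its two neighbours (edges reuse the nearest sample)."""
    def predict(samples: Sequence[int]) -> List[int]:
        last = len(samples) - 1
        residual = []
        for i, s in enumerate(samples):
            neighbours = samples[max(i - 1, 0)] + samples[min(i + 1, last)]
            residual.append(weight * (neighbours - 2 * s) // 16)
        return residual
    return predict


# Hypothetical pre-defined model set; each model is associated with a QP value.
MODELS_BY_QP: Dict[int, ResidualModel] = {
    22: smoothing_model(1),
    27: smoothing_model(2),
    32: smoothing_model(3),
    37: smoothing_model(4),
}
MODEL_QPS = sorted(MODELS_BY_QP)


def select_by_index(index: int) -> ResidualModel:
    """Explicit case: the coded syntax element value is an index into the set."""
    return MODELS_BY_QP[MODEL_QPS[index]]


def closest_model_qp(model_selection_qp: int) -> int:
    """Implicit case: pick the model QP closest to the derived QP.
    Ties go to the lower QP (an assumption; the source does not say)."""
    return min(MODEL_QPS, key=lambda qp: (abs(qp - model_selection_qp), qp))


def nn_filter(samples: Sequence[int], model: ResidualModel,
              bit_depth: int = 8) -> List[int]:
    """Add the predicted residual back to the input and clip to the sample range."""
    max_val = (1 << bit_depth) - 1
    residual = model(samples)
    return [min(max(s + r, 0), max_val) for s, r in zip(samples, residual)]


if __name__ == "__main__":
    row = [100, 100, 140, 100, 100]
    print(nn_filter(row, select_by_index(3)))
    print(nn_filter(row, MODELS_BY_QP[closest_model_qp(30)]))
```

In the explicit case an encoder would write the index to the bitstream and a decoder would parse it before calling `select_by_index`; in the implicit case both sides would run `closest_model_qp` on the same derived QP, so no additional information is signaled.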

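The shift-based quantization mentioned in the definitions (a bitwise right-shift to go from an n-bit value to an m-bit value, and a bitwise left-shift for inverse quantization) can be illustrated with a short sketch. The function names `quantize_shift` and `dequantize_shift` are hypothetical, and practical codecs derive a step size from the QP rather than using a plain shift; this only shows why the round trip is lossy.

```python
def quantize_shift(value: int, n_bits: int, m_bits: int) -> int:
    """Reduce an n-bit value to m bits by discarding the low (n - m) bits."""
    if not n_bits > m_bits > 0:
        raise ValueError("n_bits must be greater than m_bits, and m_bits positive")
    # Python's >> is an arithmetic shift, so negative values round toward
    # minus infinity rather than toward zero.
    return value >> (n_bits - m_bits)


def dequantize_shift(level: int, n_bits: int, m_bits: int) -> int:
    """Approximate inverse: scale the level back up with a left-shift."""
    if not n_bits > m_bits > 0:
        raise ValueError("n_bits must be greater than m_bits, and m_bits positive")
    return level << (n_bits - m_bits)


if __name__ == "__main__":
    coeff = 203
    level = quantize_shift(coeff, n_bits=10, m_bits=8)
    recon = dequantize_shift(level, n_bits=10, m_bits=8)
    # The discarded low bits are the quantization error.
    print(coeff, level, recon, coeff - recon)
```

A larger shift plays the role of a larger step size: fewer bits are needed per coefficient, but the reconstruction error grows.
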
Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
PCT/US2021/052008 2020-09-29 2021-09-24 Multiple neural network models for filtering during video coding Ceased WO2022072245A1 (en)

Priority Applications (6)

Application Number Priority Date Filing Date Title
KR1020237009249A KR20230079360A (ko) 2020-09-29 2021-09-24 Multiple neural network models for filtering during video coding
BR112023004814A BR112023004814A2 (pt) 2020-09-29 2021-09-24 Multiple neural network models for filtering during video coding
PH1/2023/550197A PH12023550197A1 (en) 2020-09-29 2021-09-24 Multiple neural network models for filtering during video coding
EP21795101.1A EP4222960A1 (en) 2020-09-29 2021-09-24 Multiple neural network models for filtering during video coding
JP2023515167A JP7795528B2 (ja) 2020-09-29 2021-09-24 Multiple neural network models for filtering during video coding
CN202180064854.4A CN116349226A (zh) 2020-09-29 2021-09-24 Multiple neural network models for filtering during video coding

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US202063085092P 2020-09-29 2020-09-29
US63/085,092 2020-09-29
US17/448,658 US11930215B2 (en) 2020-09-29 2021-09-23 Multiple neural network models for filtering during video coding
US17/448,658 2021-09-23

Publications (1)

Publication Number Publication Date
WO2022072245A1 (en) 2022-04-07

Family

ID=80821639

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2021/052008 Ceased WO2022072245A1 (en) 2020-09-29 2021-09-24 Multiple neural network models for filtering during video coding

Country Status (8)

Country Link
US (3) US11930215B2
EP (1) EP4222960A1
JP (1) JP7795528B2
KR (1) KR20230079360A
CN (1) CN116349226A
BR (1) BR112023004814A2
PH (1) PH12023550197A1
WO (1) WO2022072245A1

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20240305829A1 (en) * 2021-03-23 2024-09-12 Sharp Kabushiki Kaisha Systems and methods for signaling neural network-based in-loop filter parameter information in video coding
WO2025009295A1 (ja) * 2023-07-03 2025-01-09 Panasonic Intellectual Property Corporation of America Decoding device, encoding device, decoding method, and encoding method
EP4554227A4 (en) * 2022-07-05 2025-11-12 Panasonic Ip Corp America DECODING DEVICE, CODING DEVICE, DECODING METHOD, AND CODING METHOD

Families Citing this family (29)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11930215B2 (en) 2020-09-29 2024-03-12 Qualcomm Incorporated Multiple neural network models for filtering during video coding
US11716469B2 (en) * 2020-12-10 2023-08-01 Lemon Inc. Model selection in neural network-based in-loop filter for video coding
US12113995B2 (en) * 2021-04-06 2024-10-08 Lemon Inc. Neural network-based post filter for video coding
US11979591B2 (en) * 2021-04-06 2024-05-07 Lemon Inc. Unified neural network in-loop filter
US12323608B2 (en) * 2021-04-07 2025-06-03 Lemon Inc On neural network-based filtering for imaging/video coding
US12095988B2 (en) 2021-06-30 2024-09-17 Lemon, Inc. External attention in neural network-based video coding
US12603998B2 (en) 2021-07-07 2026-04-14 Lemon Inc. Configurable neural network model depth in neural network-based video coding
US20230051066A1 (en) * 2021-07-27 2023-02-16 Lemon Inc. Partitioning Information In Neural Network-Based Video Coding
WO2023158127A1 (ko) * 2022-02-21 2023-08-24 Hyundai Motor Company Method and apparatus for video coding using a transformer-based in-loop filter
CN119422374A (zh) * 2022-06-16 2025-02-11 Douyin Vision Co., Ltd. Variable-rate neural network based compression
WO2023245194A1 (en) * 2022-06-17 2023-12-21 Bytedance Inc. Partitioning information in neural network-based video coding
WO2023245544A1 (zh) * 2022-06-23 2023-12-28 Guangdong Oppo Mobile Telecommunications Corp., Ltd. Encoding and decoding method, bitstream, encoder, decoder, and storage medium
US12598314B2 (en) * 2022-07-05 2026-04-07 Qualcomm Incorporated Neural network based filtering process for multiple color components in video coding
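TW202404371A (zh) * 2022-07-05 2024-01-16 Qualcomm Incorporated Neural network based filtering process for multiple color components in video coding
WO2024011386A1 (zh) * 2022-07-11 2024-01-18 Zhejiang University Encoding and decoding method, apparatus, encoder, decoder, and storage medium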
WO2024078599A1 (en) * 2022-10-13 2024-04-18 Douyin Vision Co., Ltd. Method, apparatus, and medium for video processing
WO2024080904A1 (en) * 2022-10-13 2024-04-18 Telefonaktiebolaget Lm Ericsson (Publ) Selective application of neural network based filtering to picture regions
WO2024178432A1 (en) * 2023-02-24 2024-08-29 Google Llc Switchable machine-learning models on common inference hardware for video coding
WO2024245299A1 (en) * 2023-05-29 2024-12-05 Douyin Vision Co., Ltd. Method, apparatus, and medium for video processing
US12581126B2 (en) * 2023-06-12 2026-03-17 Qualcomm Incorporated Low complexity NN-based in loop filter architectures with separable convolution
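CN120345242A (zh) * 2023-07-05 2025-07-18 LG Electronics Inc. Image encoding/decoding method, recording medium storing bitstream, and method for transmitting bitstream
WO2025048289A1 (ko) * 2023-08-25 2025-03-06 Hyundai Motor Company Method for controlling neural network-based video coding tools according to decoder capability
WO2025071276A1 (ko) * 2023-09-27 2025-04-03 LG Electronics Inc. Image encoding/decoding method, method for transmitting bitstream, and recording medium storing bitstream
WO2025071272A1 (ko) * 2023-09-27 2025-04-03 LG Electronics Inc. Image encoding/decoding method, method for transmitting bitstream, and recording medium storing bitstream
WO2025071278A1 (ko) * 2023-09-27 2025-04-03 LG Electronics Inc. Image encoding/decoding method, method for transmitting bitstream, and recording medium storing bitstream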
WO2025129910A1 (en) * 2023-12-21 2025-06-26 Guangdong Oppo Mobile Telecommunications Corp., Ltd. Dynamic mesh base geometry coding position adjustment method
WO2025131105A1 (en) * 2023-12-22 2025-06-26 Douyin Vision Co., Ltd. Method, apparatus, and medium for video processing
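CN120238656A (zh) * 2023-12-28 2025-07-01 Vivo Mobile Communication Co., Ltd. Video compression processing method, apparatus, and electronic device
CN119854487A (zh) * 2025-01-06 2025-04-18 Shenzhen Transsion Holdings Co., Ltd. Image processing method, processing device, and storage medium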

Citations (1)

Publication number Priority date Publication date Assignee Title
US20190273948A1 (en) * 2019-01-08 2019-09-05 Intel Corporation Method and system of neural network loop filtering for video coding

Family Cites Families (14)

Publication number Priority date Publication date Assignee Title
US8325797B2 (en) * 2005-04-11 2012-12-04 Maxim Integrated Products, Inc. System and method of reduced-temporal-resolution update for video coding and quality control
WO2019009448A1 (ko) 2017-07-06 2019-01-10 Samsung Electronics Co., Ltd. Method and apparatus for encoding or decoding image
US11631199B2 (en) 2017-08-10 2023-04-18 Sharp Kabushiki Kaisha Image filtering apparatus, image decoding apparatus, and image coding apparatus
EP3451293A1 (en) 2017-08-28 2019-03-06 Thomson Licensing Method and apparatus for filtering with multi-branch deep learning
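CN109819253B (zh) * 2017-11-21 2022-04-22 Tencent Technology (Shenzhen) Co., Ltd. Video encoding method, apparatus, computer device, and storage medium
CN108184129B (zh) * 2017-12-11 2020-01-10 Peking University Video encoding and decoding method and apparatus, and neural network for image filtering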
WO2019244116A1 (en) * 2018-06-21 2019-12-26 Beijing Bytedance Network Technology Co., Ltd. Border partition in video coding
US11720997B2 (en) 2018-10-19 2023-08-08 Samsung Electronics Co., Ltd. Artificial intelligence (AI) encoding device and operating method thereof and AI decoding device and operating method thereof
US11282172B2 (en) * 2018-12-11 2022-03-22 Google Llc Guided restoration of video data using neural networks
EP3706046A1 (en) 2019-03-04 2020-09-09 InterDigital VC Holdings, Inc. Method and device for picture encoding and decoding
TWI747339B (zh) * 2019-06-27 2021-11-21 MediaTek Inc. Method and apparatus of video coding
US11363307B2 (en) * 2019-08-08 2022-06-14 Hfi Innovation Inc. Video coding with subpictures
US20220295116A1 (en) * 2019-09-20 2022-09-15 Intel Corporation Convolutional neural network loop filter based on classifier
US11930215B2 (en) 2020-09-29 2024-03-12 Qualcomm Incorporated Multiple neural network models for filtering during video coding

Non-Patent Citations (3)

Title
BROSS ET AL.: "Versatile Video Coding (Draft 9)", JOINT VIDEO EXPERTS TEAM (JVET) OF ITU-T SG 16 WP 3 AND ISO/IEC JTC 1/SC 29/WG 11, 18TH MEETING
PARK WOON-SUNG ET AL: "CNN-based in-loop filtering for coding efficiency improvement", 2016 IEEE 12TH IMAGE, VIDEO, AND MULTIDIMENSIONAL SIGNAL PROCESSING WORKSHOP (IVMSP), IEEE, 11 July 2016 (2016-07-11), pages 1 - 5, XP032934608, DOI: 10.1109/IVMSPW.2016.7528223 *
ZHOU (HIKVISION) L ET AL: "Convolutional Neural Network Filter (CNNF) for intra frame", no. JVET-I0022, 24 January 2018 (2018-01-24), XP030248070, Retrieved from the Internet <URL:http://phenix.int-evry.fr/jvet/doc_end_user/documents/9_Gwangju/wg11/JVET-I0022-v4.zip JVET-I0022-v4.doc> [retrieved on 20180124] *

Also Published As

Publication number Publication date
JP2023542841A (ja) 2023-10-12
JP7795528B2 (ja) 2026-01-07
PH12023550197A1 (en) 2024-06-24
KR20230079360A (ko) 2023-06-07
US20250317604A1 (en) 2025-10-09
EP4222960A1 (en) 2023-08-09
US20220103864A1 (en) 2022-03-31
CN116349226A (zh) 2023-06-27
BR112023004814A2 (pt) 2023-04-18
US20240244265A1 (en) 2024-07-18
TW202218422A (zh) 2022-05-01
US11930215B2 (en) 2024-03-12
US12356014B2 (en) 2025-07-08

Similar Documents

Publication Publication Date Title
US12356014B2 (en) Multiple neural network models for filtering during video coding
US12341959B2 (en) Filtering process for video coding
EP3959891B1 (en) Adaptive loop filter set index signaling
SG11202112627SA (en) Transform and last significant coefficient position signaling for low-frequency non-separable transform in video coding
SG11202111553RA (en) Reference picture resampling and inter-coding tools for video coding
EP4035390A1 (en) Low-frequency non-separable transform (lfnst) simplifications
KR20230038709A (ko) Multiple adaptive loop filter sets
WO2022072684A1 (en) Activation function design in neural network-based filtering process for video coding
EP4082211A1 (en) Lfnst signaling for chroma based on chroma transform skip
EP4186235B1 (en) Deblocking filter parameter signaling
KR20230129015A (ko) Multiple neural network models for filtering during video coding
WO2021055746A1 (en) Transform unit design for video coding
WO2021072215A1 (en) Signaling coding scheme for residual values in transform skip for video coding
WO2022221829A1 (en) Intra-mode dependent multiple transform selection for video coding
EP4085623A1 (en) Chroma transform skip and joint chroma coding enabled block in video coding
WO2020072781A1 (en) Wide-angle intra prediction for video coding
EP4226613A1 (en) Adaptively deriving rice parameter values for high bit-depth video coding

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application. Ref document number: 21795101; Country of ref document: EP; Kind code of ref document: A1
WWE WIPO information: entry into national phase. Ref document number: 202347006344; Country of ref document: IN
ENP Entry into the national phase. Ref document number: 2023515167; Country of ref document: JP; Kind code of ref document: A
REG Reference to national code. Ref country code: BR; Ref legal event code: B01A; Ref document number: 112023004814; Country of ref document: BR
ENP Entry into the national phase. Ref document number: 112023004814; Country of ref document: BR; Kind code of ref document: A2; Effective date: 20230315
NENP Non-entry into the national phase. Ref country code: DE
ENP Entry into the national phase. Ref document number: 2021795101; Country of ref document: EP; Effective date: 20230502