WO2022072245A1 - Multiple neural network models for filtering during video coding - Google Patents
Multiple neural network models for filtering during video coding
- Publication number
- WO2022072245A1 (application PCT/US2021/052008)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- neural network
- picture
- level
- syntax element
- value
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/70—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T9/00—Image coding
- G06T9/002—Image coding using neural networks
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/117—Filters, e.g. for pre-processing or post-processing
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/119—Adaptive subdivision aspects, e.g. subdivision of a picture into rectangular or non-rectangular coding blocks
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/129—Scanning of coding units, e.g. zig-zag scan of transform coefficients or flexible macroblock ordering [FMO]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/13—Adaptive entropy coding, e.g. adaptive variable length coding [AVLC] or context adaptive binary arithmetic coding [CABAC]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/146—Data rate or code amount at the encoder output
- H04N19/147—Data rate or code amount at the encoder output according to rate distortion criteria
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
- H04N19/172—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
- H04N19/176—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/186—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a colour or a chrominance component
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/189—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding
- H04N19/192—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding the adaptation method, adaptation tool or adaptation type being iterative or recursive
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/44—Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/60—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/80—Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/80—Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation
- H04N19/82—Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation involving filtering within a prediction loop
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/85—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression
- H04N19/86—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression involving reduction of coding artifacts, e.g. of blockiness
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
Definitions
- Digital video capabilities can be incorporated into a wide range of devices, including digital televisions, digital direct broadcast systems, wireless broadcast systems, personal digital assistants (PDAs), laptop or desktop computers, tablet computers, e-book readers, digital cameras, digital recording devices, digital media players, video gaming devices, video game consoles, cellular or satellite radio telephones, so-called “smart phones,” video teleconferencing devices, video streaming devices, and the like.
- Digital video devices implement video coding techniques, such as those described in the standards defined by MPEG-2, MPEG-4, ITU-T H.263, ITU-T H.264/MPEG-4, Part 10, Advanced Video Coding (AVC), ITU-T H.265/High Efficiency Video Coding (HEVC), and extensions of such standards.
- the video devices may transmit, receive, encode, decode, and/or store digital video information more efficiently by implementing such video coding techniques.
- a computer-readable storage medium has stored thereon instructions that, when executed, cause a processor to: decode a picture of video data; code a value for a syntax element representing a neural network model to be used to filter a portion of the decoded picture, the value representing an index into a set of pre-defined neural network models, the index corresponding to the neural network model in the set of pre-defined neural network models; and filter the portion of the decoded picture using the neural network model corresponding to the index.
- a device for filtering decoded video data includes means for decoding a picture of video data; means for coding a value for a syntax element representing a neural network model to be used to filter a portion of the decoded picture, the value representing an index into a set of pre-defined neural network models, the index corresponding to the neural network model in the set of pre-defined neural network models; and means for filtering the portion of the decoded picture using the neural network model corresponding to the index.
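The index-based selection described above can be sketched as follows. This is a minimal illustration, not the claimed implementation: the two toy "models", the names `PREDEFINED_MODELS` and `filter_portion`, and the representation of a picture portion as a flat list of samples are all assumptions made for the example. Real neural network models would replace the toy callables.

```python
from typing import Callable, List, Sequence

# A "model" here is any callable mapping samples to filtered samples.
Model = Callable[[Sequence[int]], List[int]]


def identity_model(samples: Sequence[int]) -> List[int]:
    """Hypothetical model 0: leaves the samples unchanged."""
    return list(samples)


def smoothing_model(samples: Sequence[int]) -> List[int]:
    """Hypothetical model 1: 3-tap integer average with edge replication."""
    n = len(samples)
    return [
        (samples[max(i - 1, 0)] + samples[i] + samples[min(i + 1, n - 1)]) // 3
        for i in range(n)
    ]


# The set of pre-defined models, assumed known to both encoder and decoder.
PREDEFINED_MODELS: List[Model] = [identity_model, smoothing_model]


def filter_portion(decoded_portion: Sequence[int], model_index: int) -> List[int]:
    """Filter a decoded portion with the model that the coded index selects."""
    if not 0 <= model_index < len(PREDEFINED_MODELS):
        raise ValueError(f"model index {model_index} is outside the pre-defined set")
    return PREDEFINED_MODELS[model_index](decoded_portion)
```

Because only the index is coded, the bitstream cost of the choice is independent of the size of the models themselves.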
- VVC Versatile Video Coding
- ITU-T H.266, which has been developed by the Joint Video Experts Team (JVET) of ITU-T Video Coding Experts Group (VCEG) and ISO/IEC Motion Picture Experts Group (MPEG).
- JVET Joint Video Experts Team
- MPEG ISO/IEC Motion Picture Experts Group
- Version 1 of the VVC specification, referred to as “VVC FDIS” hereinafter, is available from http://phenix.int-evry.fr/jvet/doc_end_user/documents/19_Teleconference/wg11/JVET-S2001-v17.zip.
- Video source 104 of source device 102 may include a video capture device, such as a video camera, a video archive containing previously captured raw video, and/or a video feed interface to receive video from a video content provider.
- video source 104 may generate computer graphics-based data as the source video, or a combination of live video, archived video, and computer-generated video.
- video encoder 200 encodes the captured, pre-captured, or computer-generated video data.
- Video encoder 200 may rearrange the pictures from the received order (sometimes referred to as “display order”) into a coding order for coding.
- Video encoder 200 may generate a bitstream including encoded video data.
- Source device 102 may then output the encoded video data via output interface 108 onto computer-readable medium 110 for reception and/or retrieval by, e.g., input interface 122 of destination device 116.
- Computer-readable medium 110 may represent any type of medium or device capable of transporting the encoded video data from source device 102 to destination device 116.
- computer-readable medium 110 represents a communication medium to enable source device 102 to transmit encoded video data directly to destination device 116 in real-time, e.g., via a radio frequency network or computer-based network.
- Output interface 108 may modulate a transmission signal including the encoded video data, and input interface 122 may demodulate the received transmission signal, according to a communication standard, such as a wireless communication protocol.
- the communication medium may comprise any wireless or wired communication medium, such as a radio frequency (RF) spectrum or one or more physical transmission lines.
- RF radio frequency
- source device 102 may output encoded data from output interface 108 to storage device 112.
- destination device 116 may access encoded data from storage device 112 via input interface 122.
- Storage device 112 may include any of a variety of distributed or locally accessed data storage media such as a hard drive, Blu-ray discs, DVDs, CD-ROMs, flash memory, volatile or non-volatile memory, or any other suitable digital storage media for storing encoded video data.
- source device 102 may output encoded video data to file server 114 or another intermediate storage device that may store the encoded video data generated by source device 102. Destination device 116 may access stored video data from file server 114 via streaming or download.
- video encoder 200 may generate the prediction block using one or more motion vectors.
- Video encoder 200 may generally perform a motion search to identify a reference block that closely matches the CU, e.g., in terms of differences between the CU and the reference block.
- Video encoder 200 may calculate a difference metric using a sum of absolute difference (SAD), sum of squared differences (SSD), mean absolute difference (MAD), mean squared differences (MSD), or other such difference calculations to determine whether a reference block closely matches the current CU.
- SAD sum of absolute difference
- SSD sum of squared differences
- MAD mean absolute difference
- MSD mean squared differences
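The four difference metrics named above can be written out directly. The sketch below assumes blocks are given as equal-length flat sequences of sample values; the helper `best_reference` is a hypothetical name for picking the candidate with the smallest metric and does not model an actual motion search.

```python
from typing import Callable, Sequence


def sad(a: Sequence[int], b: Sequence[int]) -> int:
    """Sum of absolute differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def ssd(a: Sequence[int], b: Sequence[int]) -> int:
    """Sum of squared differences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mad(a: Sequence[int], b: Sequence[int]) -> float:
    """Mean absolute difference: SAD normalised by the sample count."""
    return sad(a, b) / len(a)


def msd(a: Sequence[int], b: Sequence[int]) -> float:
    """Mean squared difference: SSD normalised by the sample count."""
    return ssd(a, b) / len(a)


def best_reference(
    current: Sequence[int],
    candidates: Sequence[Sequence[int]],
    metric: Callable[[Sequence[int], Sequence[int]], float] = sad,
) -> int:
    """Return the index of the candidate block that minimises the metric."""
    return min(range(len(candidates)), key=lambda i: metric(current, candidates[i]))
```

SAD avoids multiplications, while SSD penalises large individual errors more heavily; the mean variants make blocks of different sizes comparable.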
- video encoder 200 may predict the current CU using uni-directional prediction or bi-directional prediction.
- Quantization generally refers to a process in which transform coefficients are quantized to possibly reduce the amount of data used to represent the transform coefficients, providing further compression.
- video encoder 200 may reduce the bit depth associated with some or all of the transform coefficients. For example, video encoder 200 may round an n-bit value down to an m-bit value during quantization, where n is greater than m. In some examples, to perform quantization, video encoder 200 may perform a bitwise right-shift of the value to be quantized.
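As a small illustration of the right-shift form of quantization, the sketch below reduces an unsigned n-bit value to m bits by discarding the low n − m bits. The function name and the restriction to non-negative values are assumptions made for the example.

```python
def quantize_by_shift(value: int, n_bits: int, m_bits: int) -> int:
    """Round an unsigned n-bit value down to m bits with a bitwise right-shift."""
    if not n_bits > m_bits > 0:
        raise ValueError("n_bits must be greater than m_bits, and both positive")
    if not 0 <= value < (1 << n_bits):
        raise ValueError("value does not fit in n_bits")
    # Dropping the low (n - m) bits is equivalent to floor division by 2**(n - m).
    return value >> (n_bits - m_bits)
```

For instance, the 10-bit value 517 becomes the 8-bit value 129, since 517 divided by 4 and rounded down is 129.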
- Block partitioning is used to divide the image into smaller blocks for operation of the prediction and transform processes.
- the early video coding standards used a fixed block size, typically 16x16 samples.
- Recent standards, such as HEVC and VVC, employ tree-based partitioning structures to provide flexible partitioning.
- Quantization aims to reduce the precision of an input value or a set of input values in order to decrease the amount of data needed to represent the values.
- quantization is typically applied to individual transformed residual samples, i.e., to transform coefficients, resulting in integer coefficient levels.
- the step size is derived from a so-called quantization parameter (QP) that controls the fidelity and bit rate.
- QP quantization parameter
- a larger step size lowers the bit rate but also deteriorates the quality, which, e.g., results in video pictures exhibiting blocking artifacts and blurred details.
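The text above does not state how the step size is derived from QP. As an assumption for illustration, the sketch below uses the approximate relationship known from HEVC and VVC, in which the step size doubles for every increase of 6 in QP (roughly 2^((QP − 4) / 6)). It ignores scaling lists, rounding offsets, and the integer arithmetic that real codecs use.

```python
def step_size(qp: int) -> float:
    """Approximate quantization step size; doubles for every +6 in QP."""
    return 2.0 ** ((qp - 4) / 6.0)


def quantize(coeff: float, qp: int) -> int:
    """Map a transform coefficient to an integer level (round half away from zero)."""
    sign = -1 if coeff < 0 else 1
    return sign * int(abs(coeff) / step_size(qp) + 0.5)


def dequantize(level: int, qp: int) -> float:
    """Reconstruct an approximate coefficient from its integer level."""
    return level * step_size(qp)
```

With QP 28 the step size is 16, so a coefficient of 100 is coded as level 6 and reconstructed as 96. The difference of 4 is the quantization error that a larger step size makes worse.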
- Context-adaptive binary arithmetic coding is a form of entropy coding used in recent video codecs, e.g., AVC, HEVC, and VVC, due to its high efficiency.
- the NN-based filtering process may take the reconstructed samples as inputs, and the intermediate outputs are residual samples, which are added back to the input to refine the input samples.
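The residual structure described above (the network predicts a correction that is added back to its own input) can be sketched as follows. The toy residual model, the flat-list sample layout, and the final clipping to the sample bit depth are assumptions for illustration; a trained network would take the place of `toy_residual_model`.

```python
from typing import Callable, List, Sequence


def toy_residual_model(samples: Sequence[int]) -> List[int]:
    """Hypothetical stand-in for a network: pulls each sample halfway
    toward the average of its two neighbours (edges replicated)."""
    n = len(samples)
    residual = []
    for i in range(n):
        left = samples[max(i - 1, 0)]
        right = samples[min(i + 1, n - 1)]
        residual.append(((left + right) // 2 - samples[i]) // 2)
    return residual


def apply_nn_filter(
    reconstructed: Sequence[int],
    residual_model: Callable[[Sequence[int]], List[int]],
    bit_depth: int = 8,
) -> List[int]:
    """Add the predicted residual to the input samples, then clip to range."""
    residual = residual_model(reconstructed)
    max_val = (1 << bit_depth) - 1
    return [min(max(s + r, 0), max_val) for s, r in zip(reconstructed, residual)]
```

Predicting a residual rather than the filtered samples directly means a model that outputs zeros leaves the picture unchanged.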
- the NN filter may use all color components (e.g., Y, U, and V, or Y, Cb, and Cr, i.e., luminance, blue-hue chrominance, and red-hue chrominance) as input to exploit cross-component correlations.
- Different color components may share the same filters (including network structure and model parameters) or each color component may have its own specific filters.
- M can be any value that is pre-defined or signaled in bitstreams at a lower level (e.g., slice header, picture header, CTU level, grid level, etc.). If M > 1, one of the M candidates is selected and signaled in the bitstream as a syntax element.
- each of the filter models may be associated with a QP value, and for each picture, video encoder 200 and video decoder 300 may derive a ‘model selection QP’ and select the models with the QP value that is closest to the ‘model selection QP’ for the current picture. In this case, no additional information needs to be signaled.
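The implicit selection by closest QP can be sketched as below. How the "model selection QP" is derived is not specified in the text, so it is taken as an input here; the function name and the tie-breaking rule (lowest index wins) are assumptions. Whatever rule is used, encoder and decoder would need to apply the same one so that both arrive at the same model without signaling.

```python
from typing import Sequence


def select_model_by_qp(model_qps: Sequence[int], selection_qp: int) -> int:
    """Return the index of the model whose associated QP is closest to
    the model selection QP; ties go to the lowest index (assumed rule)."""
    if not model_qps:
        raise ValueError("at least one model is required")
    return min(
        range(len(model_qps)),
        key=lambda i: (abs(model_qps[i] - selection_qp), i),
    )
```

For example, with models associated with QP values 22, 27, 32, and 37, a model selection QP of 30 selects the model at index 2.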
- video encoder 200 includes video data memory 230, mode selection unit 202, residual generation unit 204, transform processing unit 206, quantization unit 208, inverse quantization unit 210, inverse transform processing unit 212, reconstruction unit 214, filter unit 216, decoded picture buffer (DPB) 218, and entropy encoding unit 220.
- Any or all of video data memory 230, mode selection unit 202, residual generation unit 204, transform processing unit 206, quantization unit 208, inverse quantization unit 210, inverse transform processing unit 212, reconstruction unit 214, filter unit 216, DPB 218, and entropy encoding unit 220 may be implemented in one or more processors or in processing circuitry.
- video decoder 300 may be implemented as one or more circuits or logic elements as part of hardware circuitry, or as part of a processor, ASIC, or FPGA.
- video decoder 300 may include additional or alternative processors or processing circuitry to perform these and other functions.
- video decoder 300 reconstructs a picture on a block-by-block basis.
- Video decoder 300 may perform a reconstruction operation on each block individually (where the block currently being reconstructed, i.e., decoded, may be referred to as a “current block”).
- Entropy decoding unit 302 may entropy decode syntax elements defining quantized transform coefficients of a quantized transform coefficient block, as well as transform information, such as a quantization parameter (QP) and/or transform mode indication(s).
- Inverse quantization unit 306 may use the QP associated with the quantized transform coefficient block to determine a degree of quantization and, likewise, a degree of inverse quantization for inverse quantization unit 306 to apply.
- Inverse quantization unit 306 may, for example, perform a bitwise left-shift operation to inverse quantize the quantized transform coefficients. Inverse quantization unit 306 may thereby form a transform coefficient block including transform coefficients.
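A left-shift restores the magnitude of a value that was quantized by a right-shift, but not the low-order bits that were discarded. The sketch below, with hypothetical function names and unsigned values assumed, makes that loss explicit.

```python
def dequantize_by_shift(level: int, shift: int) -> int:
    """Inverse quantize a level with a bitwise left-shift."""
    if shift < 0:
        raise ValueError("shift must be non-negative")
    return level << shift


def round_trip_error(value: int, shift: int) -> int:
    """Difference between an unsigned value and its reconstruction after a
    right-shift (quantize) followed by a left-shift (inverse quantize)."""
    reconstructed = dequantize_by_shift(value >> shift, shift)
    return value - reconstructed
```

The error always equals the discarded low bits of the original value, so it is bounded by 2^shift − 1.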
- FIG. 7 represents an example of a device for filtering decoded video data including a memory configured to store video data; and one or more processors implemented in circuitry and configured to: decode a picture of video data; code a value for a syntax element representing a neural network model to be used to filter a portion of the decoded picture, the value representing an index into a set of pre-defined neural network models, the index corresponding to the neural network model in the set of pre-defined neural network models; and filter the portion of the decoded picture using the neural network model corresponding to the index.
- FIG. 8 is a flowchart illustrating an example method for encoding a current block in accordance with the techniques of this disclosure.
- The current block may comprise a current CU.
- Video decoder 300 may receive entropy encoded data for the current block, such as entropy encoded prediction information and entropy encoded data for coefficients of a residual block corresponding to the current block (370). Video decoder 300 may entropy decode the entropy encoded data to determine prediction information for the current block, a neural network (NN) model for a portion of the picture including the current block, and to reproduce coefficients of the residual block (372). Video decoder 300 may predict the current block (374), e.g., using an intra- or inter-prediction mode as indicated by the prediction information for the current block, to calculate a prediction block for the current block.
- Clause 1 A method of filtering decoded video data, the method comprising: determining one or more neural network models to be used to filter a portion of a decoded picture of video data; and filtering the portion of the decoded picture using the one or more neural network models.
- Clause 37 The method of clause 23, wherein the value for the syntax element comprises a quantization parameter (QP) for the portion of the picture.
- Clause 38 The method of clause 23, further comprising encoding the picture prior to decoding the picture, wherein coding the value for the syntax element comprises encoding the value for the syntax element.
- Clause 78 The method of clause 77, further comprising coding a syntax element that jointly represents filtering using the neural network model for each of the color components of the decoded picture.
- Clause 81 The method of clause 80, further comprising determining the neural network model according to a rate-distortion computation.
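Clause 81 refers only to "a rate-distortion computation" without giving its form. One common encoder-side formulation, assumed here purely for illustration, is the Lagrangian cost J = D + λ·R, where D is the distortion after filtering with a candidate model, R is the number of bits needed to signal the choice, and λ trades one against the other. The function names and the sample numbers in the usage note are hypothetical.

```python
from typing import Sequence, Tuple


def rd_cost(distortion: float, rate_bits: float, lam: float) -> float:
    """Lagrangian rate-distortion cost J = D + lambda * R."""
    return distortion + lam * rate_bits


def choose_model(candidates: Sequence[Tuple[float, float]], lam: float) -> int:
    """Return the index of the (distortion, rate_bits) candidate with the
    lowest cost; the earliest candidate wins ties."""
    return min(
        range(len(candidates)),
        key=lambda i: rd_cost(candidates[i][0], candidates[i][1], lam),
    )
```

With candidates `[(100.0, 0), (60.0, 4), (55.0, 10)]`, a λ of 5 favours index 1, while a small λ of 0.1 favours the lowest-distortion candidate at index 2, and a large λ of 20 favours the cheapest-to-signal candidate at index 0.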
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Computation (AREA)
- Data Mining & Analysis (AREA)
- Computational Linguistics (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Life Sciences & Earth Sciences (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Priority Applications (6)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| KR1020237009249A KR20230079360A (ko) | 2020-09-29 | 2021-09-24 | 비디오 코딩 동안 필터링을 위한 다중 뉴럴 네트워크 모델들 |
| BR112023004814A BR112023004814A2 (pt) | 2020-09-29 | 2021-09-24 | Múltiplos modelos de rede neural para filtragem durante a codificação de vídeo |
| PH1/2023/550197A PH12023550197A1 (en) | 2020-09-29 | 2021-09-24 | Multiple neural network models for filtering during video coding |
| EP21795101.1A EP4222960A1 (en) | 2020-09-29 | 2021-09-24 | Multiple neural network models for filtering during video coding |
| JP2023515167A JP7795528B2 (ja) | 2020-09-29 | 2021-09-24 | ビデオコーディング中にフィルタ処理するための複数のニューラルネットワークモデル |
| CN202180064854.4A CN116349226A (zh) | 2020-09-29 | 2021-09-24 | 用于视频编解码期间进行滤波的多神经网络模型 |
Applications Claiming Priority (4)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US202063085092P | 2020-09-29 | 2020-09-29 | |
| US63/085,092 | 2020-09-29 | | |
| US17/448,658 US11930215B2 (en) | 2020-09-29 | 2021-09-23 | Multiple neural network models for filtering during video coding |
| US17/448,658 | 2021-09-23 | | |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2022072245A1 (en) | 2022-04-07 |
Family
ID=80821639
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/US2021/052008 Ceased WO2022072245A1 (en) | 2020-09-29 | 2021-09-24 | Multiple neural network models for filtering during video coding |
Country Status (8)
| Country | Link |
|---|---|
| US (3) | US11930215B2 |
| EP (1) | EP4222960A1 |
| JP (1) | JP7795528B2 |
| KR (1) | KR20230079360A |
| CN (1) | CN116349226A |
| BR (1) | BR112023004814A2 |
| PH (1) | PH12023550197A1 |
| WO (1) | WO2022072245A1 |
Cited By (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20240305829A1 (en) * | 2021-03-23 | 2024-09-12 | Sharp Kabushiki Kaisha | Systems and methods for signaling neural network-based in-loop filter parameter information in video coding |
| WO2025009295A1 (ja) * | 2023-07-03 | 2025-01-09 | パナソニック インテレクチュアル プロパティ コーポレーション オブ アメリカ | 復号装置、符号化装置、復号方法及び符号化方法 |
| EP4554227A4 (en) * | 2022-07-05 | 2025-11-12 | Panasonic Ip Corp America | DECODING DEVICE, CODING DEVICE, DECODING METHOD, AND CODING METHOD |
Families Citing this family (29)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US11930215B2 (en) | 2020-09-29 | 2024-03-12 | Qualcomm Incorporated | Multiple neural network models for filtering during video coding |
| US11716469B2 (en) * | 2020-12-10 | 2023-08-01 | Lemon Inc. | Model selection in neural network-based in-loop filter for video coding |
| US12113995B2 (en) * | 2021-04-06 | 2024-10-08 | Lemon Inc. | Neural network-based post filter for video coding |
| US11979591B2 (en) * | 2021-04-06 | 2024-05-07 | Lemon Inc. | Unified neural network in-loop filter |
| US12323608B2 (en) * | 2021-04-07 | 2025-06-03 | Lemon Inc | On neural network-based filtering for imaging/video coding |
| US12095988B2 (en) | 2021-06-30 | 2024-09-17 | Lemon, Inc. | External attention in neural network-based video coding |
| US12603998B2 (en) | 2021-07-07 | 2026-04-14 | Lemon Inc. | Configurable neural network model depth in neural network-based video coding |
| US20230051066A1 (en) * | 2021-07-27 | 2023-02-16 | Lemon Inc. | Partitioning Information In Neural Network-Based Video Coding |
| WO2023158127A1 (ko) * | 2022-02-21 | 2023-08-24 | 현대자동차주식회사 | 트랜스포머 기반 인루프 필터를 이용하는 비디오 코딩을 위한 방법 및 장치 |
| CN119422374A (zh) * | 2022-06-16 | 2025-02-11 | 抖音视界有限公司 | 基于可变速率神经网络的压缩 |
| WO2023245194A1 (en) * | 2022-06-17 | 2023-12-21 | Bytedance Inc. | Partitioning information in neural network-based video coding |
| WO2023245544A1 (zh) * | 2022-06-23 | 2023-12-28 | Oppo广东移动通信有限公司 | 编解码方法、码流、编码器、解码器以及存储介质 |
| US12598314B2 (en) * | 2022-07-05 | 2026-04-07 | Qualcomm Incorporated | Neural network based filtering process for multiple color components in video coding |
| TW202404371A (zh) * | 2022-07-05 | 2024-01-16 | 美商高通公司 | 視訊譯碼中用於多種顏色分量的基於神經網路的濾波程序 |
| WO2024011386A1 (zh) * | 2022-07-11 | 2024-01-18 | 浙江大学 | 一种编解码方法、装置、编码器、解码器及存储介质 |
| WO2024078599A1 (en) * | 2022-10-13 | 2024-04-18 | Douyin Vision Co., Ltd. | Method, apparatus, and medium for video processing |
| WO2024080904A1 (en) * | 2022-10-13 | 2024-04-18 | Telefonaktiebolaget Lm Ericsson (Publ) | Selective application of neural network based filtering to picture regions |
| WO2024178432A1 (en) * | 2023-02-24 | 2024-08-29 | Google Llc | Switchable machine-learning models on common inference hardware for video coding |
| WO2024245299A1 (en) * | 2023-05-29 | 2024-12-05 | Douyin Vision Co., Ltd. | Method, apparatus, and medium for video processing |
| US12581126B2 (en) * | 2023-06-12 | 2026-03-17 | Qualcomm Incorporated | Low complexity NN-based in loop filter architectures with separable convolution |
| CN120345242A (zh) * | 2023-07-05 | 2025-07-18 | Lg电子株式会社 | 图像编码/解码方法、存储比特流的记录介质和发送比特流的方法 |
| WO2025048289A1 (ko) * | 2023-08-25 | 2025-03-06 | 현대자동차주식회사 | 디코더 역량에 따라 신경망 기반 비디오 코딩 툴들을 제어하기 위한 방법 |
| WO2025071276A1 (ko) * | 2023-09-27 | 2025-04-03 | 엘지전자 주식회사 | 영상 부호화/복호화 방법, 비트스트림을 전송하는 방법 및 비트스트림을 저장한 기록 매체 |
| WO2025071272A1 (ko) * | 2023-09-27 | 2025-04-03 | 엘지전자 주식회사 | 영상 부호화/복호화 방법, 비트스트림을 전송하는 방법 및 비트스트림을 저장한 기록 매체 |
| WO2025071278A1 (ko) * | 2023-09-27 | 2025-04-03 | 엘지전자 주식회사 | 영상 부호화/복호화 방법, 비트스트림을 전송하는 방법 및 비트스트림을 저장한 기록 매체 |
| WO2025129910A1 (en) * | 2023-12-21 | 2025-06-26 | Guangdong Oppo Mobile Telecommunications Corp., Ltd. | Dynamic mesh base geometry coding position adjustment method |
| WO2025131105A1 (en) * | 2023-12-22 | 2025-06-26 | Douyin Vision Co., Ltd. | Method, apparatus, and medium for video processing |
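| CN120238656A (zh) * | 2023-12-28 | 2025-07-01 | 维沃移动通信有限公司 | Video compression processing method and apparatus, and electronic device |
| CN119854487A (zh) * | 2025-01-06 | 2025-04-18 | 深圳传音控股股份有限公司 | Image processing method, processing device, and storage medium |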
Citations (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20190273948A1 (en) * | 2019-01-08 | 2019-09-05 | Intel Corporation | Method and system of neural network loop filtering for video coding |
Family Cites Families (14)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US8325797B2 (en) * | 2005-04-11 | 2012-12-04 | Maxim Integrated Products, Inc. | System and method of reduced-temporal-resolution update for video coding and quality control |
| WO2019009448A1 (ko) | 2017-07-06 | 2019-01-10 | 삼성전자 주식회사 | Method and apparatus for encoding or decoding an image |
| US11631199B2 (en) | 2017-08-10 | 2023-04-18 | Sharp Kabushiki Kaisha | Image filtering apparatus, image decoding apparatus, and image coding apparatus |
| EP3451293A1 (en) | 2017-08-28 | 2019-03-06 | Thomson Licensing | Method and apparatus for filtering with multi-branch deep learning |
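| CN109819253B (zh) * | 2017-11-21 | 2022-04-22 | 腾讯科技(深圳)有限公司 | Video encoding method and apparatus, computer device, and storage medium |
| CN108184129B (zh) * | 2017-12-11 | 2020-01-10 | 北京大学 | Video encoding and decoding method and apparatus, and neural network for image filtering |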
| WO2019244116A1 (en) * | 2018-06-21 | 2019-12-26 | Beijing Bytedance Network Technology Co., Ltd. | Border partition in video coding |
| US11720997B2 (en) | 2018-10-19 | 2023-08-08 | Samsung Electronics Co., Ltd. | Artificial intelligence (AI) encoding device and operating method thereof and AI decoding device and operating method thereof |
| US11282172B2 (en) * | 2018-12-11 | 2022-03-22 | Google Llc | Guided restoration of video data using neural networks |
| EP3706046A1 (en) | 2019-03-04 | 2020-09-09 | InterDigital VC Holdings, Inc. | Method and device for picture encoding and decoding |
| TWI747339B (zh) * | 2019-06-27 | 2021-11-21 | 聯發科技股份有限公司 | Method and apparatus for video coding |
| US11363307B2 (en) * | 2019-08-08 | 2022-06-14 | Hfi Innovation Inc. | Video coding with subpictures |
| US20220295116A1 (en) * | 2019-09-20 | 2022-09-15 | Intel Corporation | Convolutional neural network loop filter based on classifier |
| US11930215B2 (en) | 2020-09-29 | 2024-03-12 | Qualcomm Incorporated | Multiple neural network models for filtering during video coding |
2021
- 2021-09-23 US US17/448,658 patent/US11930215B2/en active Active
- 2021-09-24 CN CN202180064854.4A patent/CN116349226A/zh active Pending
- 2021-09-24 KR KR1020237009249A patent/KR20230079360A/ko active Pending
- 2021-09-24 JP JP2023515167A patent/JP7795528B2/ja active Active
- 2021-09-24 PH PH1/2023/550197A patent/PH12023550197A1/en unknown
- 2021-09-24 EP EP21795101.1A patent/EP4222960A1/en active Pending
- 2021-09-24 WO PCT/US2021/052008 patent/WO2022072245A1/en not_active Ceased
- 2021-09-24 BR BR112023004814A patent/BR112023004814A2/pt unknown
2024
- 2024-02-06 US US18/433,946 patent/US12356014B2/en active Active
2025
- 2025-06-19 US US19/243,453 patent/US20250317604A1/en active Pending
Patent Citations (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20190273948A1 (en) * | 2019-01-08 | 2019-09-05 | Intel Corporation | Method and system of neural network loop filtering for video coding |
Non-Patent Citations (3)
| Title |
|---|
| BROSS ET AL.: "Versatile Video Coding (Draft 9)", JOINT VIDEO EXPERTS TEAM (JVET) OF ITU-T SG 16 WP 3 AND ISO/IEC JTC 1/SC 29/WG 11, 18TH MEETING |
| PARK WOON-SUNG ET AL: "CNN-based in-loop filtering for coding efficiency improvement", 2016 IEEE 12TH IMAGE, VIDEO, AND MULTIDIMENSIONAL SIGNAL PROCESSING WORKSHOP (IVMSP), IEEE, 11 July 2016 (2016-07-11), pages 1 - 5, XP032934608, DOI: 10.1109/IVMSPW.2016.7528223 * |
| ZHOU (HIKVISION) L ET AL: "Convolutional Neural Network Filter (CNNF) for intra frame", no. JVET-I0022, 24 January 2018 (2018-01-24), XP030248070, Retrieved from the Internet <URL:http://phenix.int-evry.fr/jvet/doc_end_user/documents/9_Gwangju/wg11/JVET-I0022-v4.zip JVET-I0022-v4.doc> [retrieved on 20180124] * |
Cited By (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20240305829A1 (en) * | 2021-03-23 | 2024-09-12 | Sharp Kabushiki Kaisha | Systems and methods for signaling neural network-based in-loop filter parameter information in video coding |
| EP4554227A4 (en) * | 2022-07-05 | 2025-11-12 | Panasonic Ip Corp America | DECODING DEVICE, CODING DEVICE, DECODING METHOD, AND CODING METHOD |
| WO2025009295A1 (ja) * | 2023-07-03 | 2025-01-09 | パナソニック インテレクチュアル プロパティ コーポレーション オブ アメリカ | Decoding device, encoding device, decoding method, and encoding method |
Also Published As
| Publication number | Publication date |
|---|---|
| JP2023542841A (ja) | 2023-10-12 |
| JP7795528B2 (ja) | 2026-01-07 |
| PH12023550197A1 (en) | 2024-06-24 |
| KR20230079360A (ko) | 2023-06-07 |
| US20250317604A1 (en) | 2025-10-09 |
| EP4222960A1 (en) | 2023-08-09 |
| US20220103864A1 (en) | 2022-03-31 |
| CN116349226A (zh) | 2023-06-27 |
| BR112023004814A2 (pt) | 2023-04-18 |
| US20240244265A1 (en) | 2024-07-18 |
| TW202218422A (zh) | 2022-05-01 |
| US11930215B2 (en) | 2024-03-12 |
| US12356014B2 (en) | 2025-07-08 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US12356014B2 (en) | Multiple neural network models for filtering during video coding |
| US12341959B2 (en) | Filtering process for video coding |
| EP3959891B1 (en) | Adaptive loop filter set index signaling |
| SG11202112627SA (en) | Transform and last significant coefficient position signaling for low-frequency non-separable transform in video coding |
| SG11202111553RA (en) | Reference picture resampling and inter-coding tools for video coding |
| EP4035390A1 (en) | Low-frequency non-separable transform (lfnst) simplifications |
| KR20230038709A (ko) | Multiple adaptive loop filter sets |
| WO2022072684A1 (en) | Activation function design in neural network-based filtering process for video coding |
| EP4082211A1 (en) | Lfnst signaling for chroma based on chroma transform skip |
| EP4186235B1 (en) | Deblocking filter parameter signaling |
| KR20230129015A (ko) | Multiple neural network models for filtering during video coding |
| WO2021055746A1 (en) | Transform unit design for video coding |
| WO2021072215A1 (en) | Signaling coding scheme for residual values in transform skip for video coding |
| WO2022221829A1 (en) | Intra-mode dependent multiple transform selection for video coding |
| EP4085623A1 (en) | Chroma transform skip and joint chroma coding enabled block in video coding |
| WO2020072781A1 (en) | Wide-angle intra prediction for video coding |
| EP4226613A1 (en) | Adaptively deriving rice parameter values for high bit-depth video coding |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 21795101; Country of ref document: EP; Kind code of ref document: A1 |
| | WWE | Wipo information: entry into national phase | Ref document number: 202347006344; Country of ref document: IN |
| | ENP | Entry into the national phase | Ref document number: 2023515167; Country of ref document: JP; Kind code of ref document: A |
| | REG | Reference to national code | Ref country code: BR; Ref legal event code: B01A; Ref document number: 112023004814; Country of ref document: BR |
| | ENP | Entry into the national phase | Ref document number: 112023004814; Country of ref document: BR; Kind code of ref document: A2; Effective date: 20230315 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | ENP | Entry into the national phase | Ref document number: 2021795101; Country of ref document: EP; Effective date: 20230502 |