WO2016193949A1 - Advanced video coding method, system, apparatus, and storage medium - Google Patents

Advanced video coding method, system, apparatus, and storage medium

Info

Publication number
WO2016193949A1
Authority
WO
WIPO (PCT)
Prior art keywords
region
interest
video
bitrate
data
Prior art date
Application number
PCT/IB2016/053284
Other languages
English (en)
Inventor
Tod BRYANT
Original Assignee
New Cinema, LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US14/731,135 (US20150312575A1)
Application filed by New Cinema, LLC
Publication of WO2016193949A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/115 Selection of the code volume for a coding unit prior to coding
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/136 Incoming video signal characteristics or properties
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/167 Position within a video image, e.g. region of interest [ROI]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/186 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a colour or a chrominance component

Definitions

  • This disclosure relates to video compression.
  • H.264 is an industry standard for video compression, the process of converting digital video into a format that takes up less capacity when stored or less bandwidth when transmitted.
  • Video compression (or video coding) is a technology incorporated in applications such as digital television, DVD-Video, mobile TV, videoconferencing and Internet video streaming, among others.
  • An encoder converts video into a compressed format and a decoder converts compressed video back into an uncompressed format. Standardizing video compression makes it possible for products from different manufacturers (e.g. encoders, decoders and storage media) to inter-operate.
  • Video coding may include, without limitation, the following: video compression, coding, encoding, decoding, processing, preprocessing, and the performing of functions and actions relating, accompanying, preceding or following video compression, such as may occur for transmission or storage of compressed video.
  • Video compression may be video compression, coding, encoding, decoding, processing, preprocessing and functions relating, accompanying, preceding, following, aiding, preparing, overlapping, or simultaneous with video compression in accordance with the H.264 standard or H.264/AVC standard.
  • video coding may include any function or action occurring in close relationship to video compression, coding, encoding, decoding or processing, and may include any functions or actions preceding, following, aiding, preparing, overlapping or simultaneous with video compression.
  • video coding may include any function or action performed in accordance with, or by functioning of, systems, methods, apparatus, and computer readable storage mediums for video processing, and may be embodied in, encompass, or include hardware, software, executable code or executable instructions for video processing.
  • Figure 1 depicts an embodiment wherein an H.264 video encoder carries out prediction, transform and encoding processes.
  • Figure 2 depicts an embodiment including intra prediction using 16x16 and 4x4 block sizes to predict a macroblock from surrounding, previously-coded pixels within the same frame.
  • Figure 3 depicts an embodiment including inter prediction using a range of block sizes (from 16x16 down to 4x4) to predict pixels in the current frame from similar regions in previously-coded frames.
  • Figure 4 depicts an embodiment including a DCT transform providing an image block wherein each basis pattern is weighted according to a coefficient value, and the weighted basis patterns are combined.
  • Figure 5 depicts an exemplary embodiment having an aspect relating to a blind spot.
  • Figure 6 depicts an exemplary embodiment in an aspect thereof providing compressed video, wherein objective assessment of video quality of compressed video is provided by comparing a Microsoft Media Room file produced by encoding a reference video sample (designated by Microsoft Corporation) with the Microsoft Media Room encoder product, and a file produced by encoding the same reference video source according to the embodiment, and illustrating objective video quality metrics for comparison.
  • Figure 7 depicts for an embodiment shown generally in Figure 6 a photograph of a reference video sample.
  • Figure 10 depicts for an embodiment a graph of test results representing quality metrics for compressed video produced by encoding a reference video sample of Figure 6 according to the embodiment at a bitrate of 4.0 Mbps and for comparison thereto quality metrics for compressed video produced by encoding a reference video sample of Figure 6 with a Microsoft Media Room encoder at a bitrate of 9.0 Mbps.
  • Figure 11 depicts for an embodiment a graph of test results representing quality metrics for compressed video produced by encoding a reference video sample of Figure 6 according to the embodiment at a bitrate of 3.5 Mbps and for comparison thereto quality metrics for compressed video produced by encoding a reference video sample of Figure 6 with a Microsoft Media Room encoder at a bitrate of 9.0 Mbps.
  • Figure 12 depicts for an embodiment a graph of test results representing quality metrics for compressed video produced by encoding a reference video sample of Figure 6 according to the embodiment at a bitrate of 3.0 Mbps and for comparison thereto quality metrics for compressed video produced by encoding a reference video sample of Figure 6 with a Microsoft Media Room encoder at a bitrate of 9.0 Mbps.
  • Figure 13 depicts for an embodiment a graph of test results representing quality metrics for compressed video produced by encoding a reference video sample of Figure 6 according to the embodiment at a bitrate of 2.5 Mbps and for comparison thereto quality metrics for compressed video produced by encoding a reference video sample of Figure 6 with a Microsoft Media Room encoder at a bitrate of 9.0 Mbps.
  • Figure 14 depicts for an embodiment a graph of test results representing quality metrics for compressed video produced by encoding a reference video sample of Figure 6 according to the embodiment at a bitrate of 2.0 Mbps and for comparison thereto quality metrics for compressed video produced by encoding a reference video sample of Figure 6 with a Microsoft Media Room encoder at a bitrate of 9.0 Mbps.
  • an encoder may include and perform prediction methods supported by H.264 that may be more flexible and that may enable accurate prediction and more efficient video compression.
  • intra prediction may, for example, use 16x16 and 4x4 block sizes to predict a macroblock from surrounding, previously-coded pixels within the same frame (see particularly, for example, Figure 2).
  • Inter prediction may use a range of block sizes (from 16x16 down to 4x4) to predict pixels in the current frame from similar regions in previously-coded frames (see particularly, for example, Figure 3).
  • an encoder may perform transform and quantization. It will be understood that identifying a suitable inter-coding prediction may be described as motion estimation, and subtracting an inter-coding prediction from a current macroblock to produce a difference block, or block of residuals, may be described as motion compensation.
  • both a motion estimation algorithm and bitrate control for processing video data of an object of interest, or region of interest can be improved and performed to provide greater image quality, reduced processing load, or both, and to further compress video data of a background region other than the object of interest or region of interest.
  • the preceding may be performed to provide a target bitrate.
  • Such a target bitrate may be constant or may change.
  • Figure 9 depicts for an embodiment a graph of test results representing quality metrics for compressed video produced by encoding a reference video sample of Figure 6 according to the embodiment at a bitrate of 4.5 Mbps and for comparison thereto quality metrics for compressed video produced by encoding a reference video sample of Figure 6 with a Microsoft Media Room encoder at a bitrate of 9.0 Mbps.
  • a decoder is deemed to be conformant to a given profile at a given level of a standard if such a decoder is capable of properly decoding all allowed values of all syntactic elements specified by that profile at that level.
  • embodiments of disclosed subject matter may provide methods, systems, apparatus, and storage medium for coding compressed video that are compliant with one or more hybrid coding standards including, without limitation, H.264, H.264/AVC, MPEG-x, and HEVC standards. It will be understood that according to embodiments an encoder may create a compressed bitstream on the fly or stored in memory.
  • H.264, H.264/AVC, HEVC and similar standards describe data processing and manipulation techniques that are suited to compression of video, audio and other information using fixed or variable length source coding techniques.
  • these and other hybrid coding standards and techniques compress video information using intra-frame coding techniques (such as, for example, run-length coding, Huffman coding and the like) and inter-frame coding techniques (such as, for example, forward and backward predictive coding, motion compensation, and the like).
  • hybrid video processing systems are characterized by prediction-based compression encoding of video frames with intra-frame and/or inter-frame motion compensation encoding.
  • inter-frame coding refers to encoding a picture (a field or frame) with reference to another picture. Compared to the intra-coded frame, the inter-coded or predicted frame (or P-frame) may be coded with greater efficiency.
  • P-frame: an inter-coded or predicted frame, i.e., a picture (a field or frame) encoded with reference to another picture.
  • B-frames: bi-directional predicted frames.
  • Other terms that may be used by those of skill for video objects formed with inter-coding may include high-pass coding, residual coding, motion compensated interpolation, and other names known to those of ordinary skill in the art.
  • decoder 1555 may provide a reconstructed bitstream to a suitable display device 1515.
  • decoder 1555 may provide a reconstructed bitstream to a memory or storage, a network interface, or to a computing device for other processing such as, for example, object detection.
  • a hybrid coding scheme may comply with the H.264 standard.
  • an H.264 video encoder may perform prediction, transform and encoding processes (see Figure 3) to produce an H.264 compressed bitstream.
  • An H.264 video decoder may perform complementary processes of decoding, inverse transform and reconstruction to produce a decoded video sequence.
  • FIG. 16 illustrates a block diagram of an exemplary embodiment of a video processing system 1600 for perceptual filtering of video image data.
  • Video processing system 1600 may be, for example, a spatial filter for processing each frame in a video sequence independently as an image.
  • the filter receives an input video image 1605.
  • the filter may also receive additional inputs such as viewing distance between a display and a viewer/user, an effective contrast ratio of the display, and pixel density of the display.
  • the input video image 1605 is first converted to linear RGB space 1610. Next, the luminance channel is computed 1615. The black level is then adjusted 1620.
  • the contrast ratio 1625 may be the effective contrast ratio of a display.
  • Local DC values are estimated 1630 by, for example, applying a Gaussian low pass filter.
  • Amplitude estimation 1635 is then performed on the difference image.
  • the difference image is obtained by taking the absolute difference between the DC estimation 1630 and the luminance image (the output of black level adjustment 1620).
  • the contrast sensitivity is subsequently estimated 1650.
  • the cutoff frequency is estimated 1655.
  • the cutoff frequency may be estimated 1655, for example, by employing the Movshon and Kiorpes CSF model, which yields an algorithm for computing the highest visible frequency.
  • once the cutoff frequency is estimated 1655, the estimated frequency is passed as a parameter to an adaptive low pass filter 1660.
  • a conversion to the desired output color format 1665 is performed, and the video or image signal is output 1670.
  • method 1700 may further include: the compensating 1750 further comprising referring 1755 to a salience indicator for the motion of video data for the region of interest, the salience indicator relating to inter-frame video data.
  • Method 1700 may further comprise outputting the video data 1760.
  • the method for encoding video frames 1800 may determine a region of attention or an object of interest in accordance with a visual perception model 1810 by, for example, analyzing motion in the objects rendered or displayed in the video.
  • the visual perception model 1810 may include or function by considering several video metrics such as luminance and motion vectors.
  • Visual perception model 1810 may also consider salience and the results of edge detection within individual video frames.
  • visual perception model 1810 may include or function by considering the following video metrics: luminance, motion vectors and salience.
  • the region of attention or object of interest may be identified in the frame and that portion of the frame may be processed 1815 at a first bitrate.
  • the processing 1815 may be accomplished without physically isolating the region of attention or object of interest from the remainder of the frame.
  • the processing 1815 may be accomplished or performed without isolating a representation, such as a data model, of the region of interest or object of interest, from the remainder of the frame.
  • the region of attention or object of interest may be processed 1815 with limits based on a first bitrate such as, for example, the bitrate of the video source.
  • the background may be processed 1820 with the second or remaining bitrate, where the second or remaining bitrate is provided by deducting from a target bitrate the first bitrate selected for processing the region of attention or object of interest.
  • the region of attention or object of interest may be processed by, for example, determining the maximum quality at a given bitrate based on the ratio or percentage of the size of the region of attention or object of interest compared to the original video source frame.
  • the remainder of the original video source frame (i.e., the original video source frame excluding the region of attention or object of interest)
  • both the visual attention region and the remainder of the original video source may be blended together 1825 to ensure or provide a resulting full frame of video.
  • the blending 1825 may include performing a dithering function on the background or the region of attention or object of interest portions of the video.
  • the visual quality of the resulting full frame provided by blending 1825 may be improved by applying a "clean up" or "stitching" process.
  • a video filter such as a deblocking filter may be applied to the resulting full frame to improve the visual quality and prediction performance by smoothing the sharp edges which may form between macroblocks when block coding techniques are used.
  • the resulting full frame may be recombined 1830 with corresponding audio to produce a final video wrapper.
  • the video coding method may accomplish the recombination 1830 by, for example, a multiplexer.
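
The bitrate bookkeeping described above for processing 1815 and 1820 (a first bitrate for the region of interest, and a second bitrate obtained by deducting the first bitrate from the target bitrate) can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the names `BitrateSplit`, `roi_area_fraction` and `split_bitrate` are hypothetical, and the rule that the region of interest is budgeted in proportion to its share of the frame area is an assumption chosen only to make the example concrete. The numeric values in the demo are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BitrateSplit:
    roi_kbps: float         # first bitrate, spent on the region of interest
    background_kbps: float  # second (remaining) bitrate, spent on the background


def roi_area_fraction(roi_w: int, roi_h: int, frame_w: int, frame_h: int) -> float:
    """Ratio of the region-of-interest size to the full source frame size."""
    if min(roi_w, roi_h) < 0 or min(frame_w, frame_h) <= 0:
        raise ValueError("ROI must be non-negative and the frame non-empty")
    return min(1.0, (roi_w * roi_h) / (frame_w * frame_h))


def split_bitrate(target_kbps: float, source_kbps: float,
                  roi_fraction: float) -> BitrateSplit:
    """Hypothetical allocation rule (an assumption, not from the disclosure):
    budget the region of interest with the share of the source bitrate that
    matches its share of the frame area, capped at the target bitrate.
    The background receives the target bitrate minus that first bitrate."""
    if target_kbps <= 0 or source_kbps <= 0:
        raise ValueError("bitrates must be positive")
    if not 0.0 <= roi_fraction <= 1.0:
        raise ValueError("roi_fraction must be in [0, 1]")
    roi_kbps = min(source_kbps * roi_fraction, target_kbps)
    return BitrateSplit(roi_kbps=roi_kbps,
                        background_kbps=target_kbps - roi_kbps)


if __name__ == "__main__":
    # Illustrative numbers only: a 640x360 region inside a 1920x1080 frame.
    fraction = roi_area_fraction(640, 360, 1920, 1080)
    split = split_bitrate(target_kbps=4000.0, source_kbps=9000.0,
                          roi_fraction=fraction)
    print(f"ROI: {split.roi_kbps:.0f} kbps, "
          f"background: {split.background_kbps:.0f} kbps")
```

Whatever rule selects the first bitrate, the two budgets always sum to the target bitrate, which is the property the deduction step relies on.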
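
The definitions above describe motion estimation as identifying a suitable inter-coding prediction, and motion compensation as subtracting that prediction from the current macroblock to produce a block of residuals. The sketch below illustrates both steps on small integer frames. It assumes an exhaustive (full) search with a sum-of-absolute-differences cost, which is one common choice and is not stated in the disclosure; the function names are hypothetical.

```python
from typing import List, Tuple

Frame = List[List[int]]  # rows of luma samples


def block(frame: Frame, top: int, left: int, size: int) -> Frame:
    """Copy the size x size block whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in frame[top:top + size]]


def sad(a: Frame, b: Frame) -> int:
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def motion_estimate(cur: Frame, ref: Frame, top: int, left: int,
                    size: int, search: int) -> Tuple[int, int]:
    """Full search: return the (dy, dx) offset into `ref` with the lowest SAD."""
    target = block(cur, top, left, size)
    height, width = len(ref), len(ref[0])
    best, best_cost = (0, 0), sad(target, block(ref, top, left, size))
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            # Skip candidates that would fall outside the reference frame.
            if 0 <= y <= height - size and 0 <= x <= width - size:
                cost = sad(target, block(ref, y, x, size))
                if cost < best_cost:
                    best, best_cost = (dy, dx), cost
    return best


def motion_compensate(cur: Frame, ref: Frame, top: int, left: int,
                      size: int, mv: Tuple[int, int]) -> Frame:
    """Subtract the prediction from the current block to get the residuals."""
    pred = block(ref, top + mv[0], left + mv[1], size)
    target = block(cur, top, left, size)
    return [[c - p for c, p in zip(rc, rp)] for rc, rp in zip(target, pred)]


if __name__ == "__main__":
    # Reference frame: a simple gradient. Current frame: same content moved
    # one sample to the right (left column repeated).
    ref = [[y * 8 + x for x in range(8)] for y in range(8)]
    cur = [[row[max(x - 1, 0)] for x in range(8)] for row in ref]
    mv = motion_estimate(cur, ref, top=2, left=2, size=4, search=2)
    residual = motion_compensate(cur, ref, 2, 2, 4, mv)
    print("motion vector:", mv)
    print("residual energy:", sum(v * v for row in residual for v in row))
```

A good prediction leaves a residual block with little energy, which is what makes the subsequent transform and quantization stages efficient.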

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

Certain embodiments of the invention relate to video coding methods and systems that comprise identifying a region of interest and compressing data of the region of interest at a first bitrate.
PCT/IB2016/053284 2015-06-04 2016-06-03 Advanced video coding method, system, apparatus, and storage medium WO2016193949A1

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US14/731,135 US20150312575A1 (en) 2012-04-16 2015-06-04 Advanced video coding method, system, apparatus, and storage medium
US14/731,135 2015-06-04

Publications (1)

Publication Number Publication Date
WO2016193949A1 2016-12-08

Family

ID=56117912

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IB2016/053284 WO2016193949A1 2015-06-04 2016-06-03 Advanced video coding method, system, apparatus, and storage medium

Country Status (1)

Country Link
WO (1) WO2016193949A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
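CN113824996A * 2021-09-26 2021-12-21 深圳市商汤科技有限公司 (Shenzhen SenseTime Technology Co., Ltd.) Information processing method and apparatus, electronic device, and storage medium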

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120177121A1 (en) * 2009-09-04 2012-07-12 Stmicroelectronics Pvt. Ltd. Advance video coding with perceptual quality scalability for regions of interest
US20140133554A1 (en) * 2012-04-16 2014-05-15 New Cinema Advanced video coding method, apparatus, and storage medium

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
RICHARDSON, IAIN E.: "H.264 and MPEG-4 Video Compression: Video Coding for Next Generation Multimedia", 2004, JOHN WILEY & SONS, pages: 306
RICHARDSON, IAIN E.: "The H.264 Advanced Video Compression Standard", 2011, JOHN WILEY & SONS, pages: 346
WIEN, MATHIAS: "High Efficiency Video Coding: Coding Tools and Specification", 2014, SPRINGER-VERLAG, pages: 314

Similar Documents

Publication Publication Date Title
US20150312575A1 (en) Advanced video coding method, system, apparatus, and storage medium
US11743475B2 (en) Advanced video coding method, system, apparatus, and storage medium
JP7269257B2 Frame-level super-resolution-based video coding
KR102270095B1 Selection of motion vector precision
US10291934B2 (en) Modified HEVC transform tree syntax
US20150139303A1 (en) Encoding device, encoding method, decoding device, and decoding method
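KR102606414B1 Encoder, decoder, and corresponding method for deriving the boundary strength of a deblocking filter
CN105359531A Encoder-side decisions for screen content coding
CN113508592A Encoder, decoder, and corresponding inter prediction method
KR102558495B1 Image encoding/decoding method and apparatus for signaling HLS, and computer-readable recording medium storing a bitstream
CN113785573A Encoder, decoder, and corresponding methods using an adaptive loop filter
CN114902661A Filtering method and apparatus for cross-component linear model prediction
KR20130103140A Preprocessing method performed before image compression, adaptive motion estimation method for improving image compression ratio, and method for providing image data by image type
JP7247349B2 Method, apparatus, decoder, encoder, and program for cross-component linear modeling for intra prediction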
US20140133554A1 (en) Advanced video coding method, apparatus, and storage medium
EP3278558B1 Motion vector selection and prediction in video coding systems and methods
US10742979B2 (en) Nonlinear local activity for adaptive quantization
WO2016193949A1 Advanced video coding method, system, apparatus, and storage medium
Jung Comparison of video quality assessment methods
Milicevic et al. HEVC vs. H.264/AVC standard approach to coder’s performance evaluation
RU2786086C1 Method and apparatus for cross-component linear modeling for intra prediction
KR20180113868A Image re-encoding method based on decoding information of camera images, and image re-encoding system using the same
KR20230169986A Intra prediction method and apparatus based on multiple DIMD modes
KR20230017819A Image coding method and apparatus therefor
Makris et al. Digital Video Coding Principles from H.261 to H.265/HEVC

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16728399

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16728399

Country of ref document: EP

Kind code of ref document: A1