KR102600727B1

KR102600727B1 - Binary arithmetic coding using parameterized probability estimation finite state machines

Info

Publication number: KR102600727B1
Application number: KR1020197030906A
Authority: KR
Inventors: 아미르 사이드; 마르타 카르체비츠; 리 장
Original assignee: 퀄컴 인코포레이티드
Priority date: 2017-03-22
Filing date: 2018-03-22
Publication date: 2023-11-09
Also published as: CN110419216B; SG11201907259YA; KR20190128224A; BR112019019170A2; WO2018175716A1; US10554988B2; ES2878325T3; US20180278946A1; AU2018237342A1; CN110419216A; EP3603062B1; EP3603062A1; AU2018237342B2

Abstract

For at least one respective bin of a bin stream, a decoder may determine a value of the respective bin based on a state for the respective bin, an interval for the respective bin, and an offset value. Additionally, the decoder determines one or more finite state machine (FSM) parameters for a next bin of the bin stream. The one or more FSM parameters for the next bin control how probability estimates for the next bin are computed from the state for the respective bin. The decoder determines a state for the next bin of the bin stream using a parameterized state updating function that takes as input the state for the respective bin, the one or more FSM parameters for the next bin of the bin stream, and the value of the respective bin. The decoder may debinarize the bin stream to form a decoded syntax element.

Description

Binary arithmetic coding using parameterized probability estimation finite state machines

This application claims the benefit of U.S. Provisional Patent Application No. 62/474,919, filed March 22, 2017, and U.S. Provisional Patent Application No. 62/475,609, filed March 23, 2017, the entire content of each of which is incorporated herein by reference.

Technical Field

This disclosure relates to video coding, such as video encoding and video decoding.

Background

Digital video capabilities can be incorporated into a wide range of devices, including digital televisions, digital direct broadcast systems, wireless broadcast systems, personal digital assistants (PDAs), laptop or desktop computers, tablet computers, e-book readers, digital cameras, digital recording devices, digital media players, video gaming devices, video game consoles, cellular or satellite radio telephones, so-called "smart phones," video teleconferencing devices, video streaming devices, and the like. Digital video devices implement video compression techniques, such as those described in the standards defined by MPEG-2, MPEG-4, ITU-T H.263, ITU-T H.264/MPEG-4, Part 10, Advanced Video Coding (AVC), ITU-T H.265, the High Efficiency Video Coding (HEVC) standard, and extensions of such standards. By implementing such video compression techniques, video devices may transmit, receive, encode, decode, and/or store digital video information more efficiently.

Video compression techniques may perform spatial (intra-picture) prediction and/or temporal (inter-picture) prediction to reduce or remove redundancy inherent in video sequences. For block-based video coding, a video slice (e.g., a video frame or a portion of a video frame) may be partitioned into video blocks, such as coding tree blocks and coding blocks. Spatial or temporal prediction results in a predictive block for a block to be coded. Residual data represents pixel differences between the original block to be coded and the predictive block. For further compression, the residual data may be transformed from the pixel domain to a transform domain, resulting in residual transform coefficients, which then may be quantized.

In general, this disclosure describes techniques related to arithmetic coding. As described herein, the techniques of this disclosure may improve compression of video data by potentially improving adaptive probability estimation.

In one example, this disclosure describes a method of decoding video data, the method comprising: determining a decoded syntax element by applying binary arithmetic decoding to an offset value included in a bitstream, wherein applying the binary arithmetic decoding comprises: generating a bin stream, wherein generating the bin stream comprises, for at least one respective bin of the bin stream: determining a value of the respective bin based on a state for the respective bin, an interval for the respective bin, and the offset value; determining one or more finite state machine (FSM) parameters for a next bin of the bin stream, the one or more FSM parameters for the next bin controlling how probability estimates for the next bin are computed from the state for the respective bin, the next bin of the bin stream following the respective bin in the bin stream; and determining a state for the next bin of the bin stream using a parameterized state updating function that takes as input the state for the respective bin, the one or more FSM parameters for the next bin of the bin stream, and the value of the respective bin; and debinarizing the bin stream to form the decoded syntax element; and reconstructing a picture of the video data based in part on the decoded syntax element.

In another example, this disclosure describes a method of encoding video data, the method comprising: generating a syntax element based on the video data; determining an offset value at least in part by applying binary arithmetic encoding to the syntax element, wherein applying the binary arithmetic encoding comprises generating a bin stream at least in part by: binarizing the syntax element; and, for at least one respective bin of the bin stream: determining an interval for a next bin of the bin stream based on a state for the respective bin, an interval for the respective bin, and a value of the respective bin; determining one or more finite state machine (FSM) parameters for the next bin of the bin stream, the one or more FSM parameters for the next bin controlling how probability estimates for the next bin are computed from the state for the respective bin; and determining a state for the next bin of the bin stream using a parameterized state updating function that takes as input the state for the respective bin, the one or more FSM parameters for the next bin of the bin stream, and the value of the respective bin, wherein the offset value is equal to a value in the interval for a last bin of the bin stream; and outputting a bitstream comprising the offset value.

In another example, this disclosure describes an apparatus for decoding video data, the apparatus comprising: one or more storage media configured to store video data; and one or more processors configured to: determine a decoded syntax element by applying binary arithmetic decoding to an offset value included in a bitstream, wherein, as part of applying the binary arithmetic decoding, the one or more processors: generate a bin stream, wherein, as part of generating the bin stream, the one or more processors, for at least one respective bin of the bin stream: determine a value of the respective bin based on a state for the respective bin, an interval for the respective bin, and the offset value; determine one or more finite state machine (FSM) parameters for a next bin of the bin stream, the one or more FSM parameters for the next bin controlling how probability estimates for the next bin are computed from the state for the respective bin, the next bin of the bin stream following the respective bin in the bin stream; and determine a state for the next bin of the bin stream using a parameterized state updating function that takes as input the state for the respective bin, the one or more FSM parameters for the next bin of the bin stream, and the value of the respective bin; and debinarize the bin stream to form the decoded syntax element; and reconstruct a picture of the video data based in part on the decoded syntax element.

In another example, this disclosure describes an apparatus for encoding video data, the apparatus comprising: one or more storage media configured to store video data; and one or more processing circuits coupled to the one or more storage media, the one or more processing circuits configured to: generate a syntax element based on the video data; determine an offset value at least in part by applying binary arithmetic encoding to the syntax element, wherein, as part of applying the binary arithmetic encoding, the one or more processing circuits generate a bin stream at least in part by: binarizing the syntax element; and, for at least one respective bin of the bin stream: determining an interval for a next bin of the bin stream based on a state for the respective bin, an interval for the respective bin, and a value of the respective bin; determining one or more finite state machine (FSM) parameters for the next bin of the bin stream, the one or more FSM parameters for the next bin controlling how probability estimates for the next bin are computed from the state for the respective bin; and determining a state for the next bin of the bin stream using a parameterized state updating function that takes as input the state for the respective bin, the one or more FSM parameters for the next bin of the bin stream, and the value of the respective bin, wherein the offset value is equal to a value in the interval for a last bin of the bin stream; and output a bitstream comprising the offset value.

In another example, this disclosure describes an apparatus for decoding video data, the apparatus comprising: means for determining a decoded syntax element by applying binary arithmetic decoding to an offset value included in a bitstream, wherein applying the binary arithmetic decoding comprises: generating a bin stream, wherein generating the bin stream comprises, for at least one respective bin of the bin stream: determining a value of the respective bin based on a state for the respective bin, an interval for the respective bin, and the offset value; determining one or more finite state machine (FSM) parameters for a next bin of the bin stream, the one or more FSM parameters for the next bin controlling how probability estimates for the next bin are computed from the state for the respective bin, the next bin of the bin stream following the respective bin in the bin stream; and determining a state for the next bin of the bin stream using a parameterized state updating function that takes as input the state for the respective bin, the one or more FSM parameters for the next bin of the bin stream, and the value of the respective bin; and debinarizing the bin stream to form the decoded syntax element; and means for reconstructing a picture of the video data based in part on the decoded syntax element.

In another example, this disclosure describes an apparatus for encoding video data, the apparatus comprising: means for generating a syntax element based on the video data; means for determining an offset value at least in part by applying binary arithmetic encoding to the syntax element, wherein applying the binary arithmetic encoding comprises generating a bin stream at least in part by: binarizing the syntax element; and, for at least one respective bin of the bin stream: determining an interval for a next bin of the bin stream based on a state for the respective bin, an interval for the respective bin, and a value of the respective bin; determining one or more finite state machine (FSM) parameters for the next bin of the bin stream, the one or more FSM parameters for the next bin controlling how probability estimates for the next bin are computed from the state for the respective bin; and determining a state for the next bin of the bin stream using a parameterized state updating function that takes as input the state for the respective bin, the one or more FSM parameters for the next bin of the bin stream, and the value of the respective bin, wherein the offset value is equal to a value in the interval for a last bin of the bin stream; and means for outputting a bitstream comprising the offset value.

In another example, this disclosure describes a computer-readable storage medium storing instructions that, when executed, cause one or more processors to: determine a decoded syntax element by applying binary arithmetic decoding to an offset value included in a bitstream, wherein, as part of causing the one or more processors to apply the binary arithmetic decoding, execution of the instructions causes the one or more processors to: generate a bin stream, wherein, as part of causing the one or more processors to generate the bin stream, execution of the instructions causes the one or more processors to, for at least one respective bin of the bin stream: determine a value of the respective bin based on a state for the respective bin, an interval for the respective bin, and the offset value; determine one or more finite state machine (FSM) parameters for a next bin of the bin stream, the one or more FSM parameters for the next bin controlling how probability estimates for the next bin are computed from the state for the respective bin, the next bin of the bin stream following the respective bin in the bin stream; and determine a state for the next bin of the bin stream using a parameterized state updating function that takes as input the state for the respective bin, the one or more FSM parameters for the next bin of the bin stream, and the value of the respective bin; and debinarize the bin stream to form the decoded syntax element; and reconstruct a picture of the video data based in part on the decoded syntax element.

In another example, this disclosure describes a computer-readable storage medium storing instructions that, when executed, cause one or more processors to: generate a syntax element based on video data; determine an offset value at least in part by applying binary arithmetic encoding to the syntax element, wherein, as part of causing the one or more processors to apply the binary arithmetic encoding, execution of the instructions causes the one or more processors to generate a bin stream at least in part by causing the one or more processors to: binarize the syntax element; and, for at least one respective bin of the bin stream: determine an interval for a next bin of the bin stream based on a state for the respective bin, an interval for the respective bin, and a value of the respective bin; determine one or more finite state machine (FSM) parameters for the next bin of the bin stream, the one or more FSM parameters for the next bin controlling how probability estimates for the next bin are computed from the state for the respective bin; and determine a state for the next bin of the bin stream using a parameterized state updating function that takes as input the state for the respective bin, the one or more FSM parameters for the next bin of the bin stream, and the value of the respective bin, wherein the offset value is equal to a value in the interval for a last bin of the bin stream; and output a bitstream comprising the offset value.

The details of one or more aspects of the disclosure are set forth in the accompanying drawings and the description below. Other features, objects, and advantages of the techniques described in this disclosure will be apparent from the description, drawings, and claims.

Brief Description of Drawings
FIG. 1 is a block diagram illustrating an example video encoding and decoding system that may utilize one or more techniques described in this disclosure.
FIG. 2 is a block diagram illustrating an example video encoder that may implement one or more techniques described in this disclosure.
FIG. 3 is a block diagram illustrating an example video decoder that may implement one or more techniques described in this disclosure.
FIG. 4 is a block diagram of an example general finite state machine.
FIG. 5 is an example block diagram for context-based binary arithmetic encoding using many finite state machines (FSMs) for bin probability estimation.
FIG. 6 is an example block diagram for context-based binary arithmetic decoding using many FSMs for bin probability estimation.
FIG. 7 is an example block diagram for context-based binary arithmetic encoding considering a single selected context.
FIG. 8 is an example block diagram for context-based binary arithmetic decoding considering a single selected context.
FIG. 9 is an example block diagram for context-based binary arithmetic encoding, in accordance with one or more aspects of this disclosure.
FIG. 10 is an example block diagram for context-based binary arithmetic decoding, in accordance with one or more aspects of this disclosure.
FIG. 11A is a block diagram showing that FSM parameters can be derived from neighboring blocks (e.g., CTUs, CUs) in the same picture following a scan order.
FIG. 11B is a block diagram showing that FSM parameters used in blocks of a current picture can be determined based on information associated with blocks in a previously coded picture.
FIG. 12 is a block diagram of an example probability estimation filter.
FIG. 13 is a block diagram of another example probability estimation filter.
FIG. 14 illustrates an example of a probability estimation filter using cascade filters.
FIG. 15 is a block diagram illustrating an example entropy encoding unit.
FIG. 16 is a block diagram illustrating an example entropy decoding unit.
FIG. 17 is a flowchart illustrating an example operation of a video encoder, in accordance with one or more techniques of this disclosure.
FIG. 18 is a flowchart illustrating an example operation of a video decoder, in accordance with one or more techniques of this disclosure.

Detailed Description

Arithmetic coding is frequently used in video coding to provide data compression. In a typical arithmetic encoding process, a video encoder selects a coding context that is associated with initial probability estimates for a first binary symbol and a second binary symbol. The video encoder uses the probability estimates and the value of a bin of a bin stream to determine an offset value. Additionally, the video encoder may use a state update function to update the probability estimates based on the bin value. The video encoder may then use the updated probability estimates and the next bin value of the bin stream to update the offset value. This process of updating the probability estimates and the offset value may continue until the video encoder reaches the end of the bin stream.
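The encoding loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the encoder of any particular codec: it uses exact rational arithmetic instead of fixed-point registers with renormalization, it assigns the lower sub-interval to bin value 0, and `update_probability` with its `shift` argument is a hypothetical update rule chosen only for illustration.

```python
from fractions import Fraction


def update_probability(p_one, bin_value, shift=4):
    """Hypothetical state update: move the estimate of P(bin == 1)
    a fraction 1/2**shift of the way toward the observed bin value."""
    return p_one + (Fraction(bin_value) - p_one) / (1 << shift)


def encode_bins(bins, p_one_init=Fraction(1, 2)):
    """Narrow the interval [low, low + width) once per bin.

    Returns (low, width) of the final interval; any value inside it,
    for example `low`, can serve as the offset value.
    """
    low, width = Fraction(0), Fraction(1)
    p_one = p_one_init
    for b in bins:
        width_zero = width * (1 - p_one)  # lower sub-interval is for bin value 0
        if b == 0:
            width = width_zero
        else:
            low += width_zero
            width -= width_zero
        p_one = update_probability(p_one, b)  # adapt the estimate after each bin
    return low, width
```

For instance, `encode_bins([1, 1])` narrows the unit interval twice toward its upper part, and the second step uses the probability estimate already adapted by the first bin.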

Conversely, a video decoder may receive a byte stream that includes the offset value. The video decoder selects the same coding context that was selected by the video encoder and uses the probability estimates specified by the coding context to determine a pair of sub-intervals, each of which corresponds to a different bin value. If the offset value is in the first sub-interval, the video decoder decodes a first bin value. If the offset value is in the second sub-interval, the video decoder decodes a second bin value. The video decoder may then use the state update function to update the probability estimates based on the decoded bin value. The video decoder uses the updated probability estimates to determine the sub-intervals again, where the low end of the updated lower sub-interval is equal to the low end of the previous lower sub-interval if the offset value was in the previous lower sub-interval, and is equal to the low end of the previous upper sub-interval if the offset value was in the previous upper sub-interval. The video decoder may continue this process until the video decoder reaches the end of the byte stream. The video decoder may de-binarize the resulting bin stream to determine the values of one or more syntax elements.
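The decoder side of this process can be sketched in the same idealized way. The code below is an assumption-laden illustration: it uses exact rational arithmetic, takes the number of bins as an explicit argument instead of detecting the end of a byte stream, and uses a hypothetical `update_probability` rule that a matching encoder would also have to use.

```python
from fractions import Fraction


def update_probability(p_one, bin_value, shift=4):
    """Hypothetical state update shared by encoder and decoder."""
    return p_one + (Fraction(bin_value) - p_one) / (1 << shift)


def decode_bins(offset, num_bins, p_one_init=Fraction(1, 2)):
    """Recover bins by testing which sub-interval contains the offset."""
    low, width = Fraction(0), Fraction(1)
    p_one = p_one_init
    bins = []
    for _ in range(num_bins):
        split = low + width * (1 - p_one)  # boundary between the two sub-intervals
        if offset < split:
            b = 0                          # offset lies in the lower sub-interval
            width = split - low
        else:
            b = 1                          # offset lies in the upper sub-interval
            width = low + width - split
            low = split                    # new low end is the upper sub-interval's low end
        bins.append(b)
        p_one = update_probability(p_one, b)
    return bins
```

Because the decoder applies the same update after every decoded bin, its sub-interval boundaries track those of an encoder that used the same rule and initial estimate.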

In the arithmetic encoding and arithmetic decoding processes described above, the video encoder and the video decoder update the probability estimates using a state update function. The video encoder and the video decoder use the same state update function for all coding contexts and when coding all types of syntax elements. However, as described in this disclosure, using different state update functions in different situations may result in improved coding efficiency and bit rate reduction. This, in turn, may result in improved picture quality and/or reduced bandwidth consumption.

In one example of this disclosure, a video encoder generates a syntax element based on video data and determines an offset value by applying binary arithmetic encoding to the syntax element. As part of applying the binary arithmetic encoding, the video encoder generates a bin stream by binarizing the syntax element. Additionally, for at least one respective bin of the bin stream, the video encoder determines an interval for a next bin of the bin stream based on a state for the respective bin, an interval for the respective bin, and a value of the respective bin. The video encoder also determines one or more finite state machine (FSM) parameters for the next bin of the bin stream. The one or more FSM parameters for the next bin control how probability estimates for the next bin are computed from the state for the respective bin. Additionally, the video encoder may determine a state for the next bin of the bin stream using a parameterized state updating function. The parameterized state updating function takes as input the state for the respective bin, the one or more FSM parameters for the next bin of the bin stream, and the value of the respective bin. In this example, the offset value is equal to a value in the interval for the last bin of the bin stream. The video encoder may output a bitstream that includes the offset value.
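One way to picture a parameterized state updating function is sketched below. Everything specific here is an assumption made for illustration: the state is taken to be a 15-bit fixed-point estimate of the probability that a bin equals 1, and the single FSM parameter is taken to be a shift amount (`fsm_shift`) that sets how quickly the estimate adapts. The passage above does not fix either choice.

```python
PROB_BITS = 15                     # assumed fixed-point precision of the state
MAX_STATE = (1 << PROB_BITS) - 1


def update_state(state, fsm_shift, bin_value):
    """Parameterized update: inputs are the current state, an FSM parameter,
    and the bin value; the output is the state for the next bin."""
    target = bin_value << PROB_BITS
    new_state = state + ((target - state) >> fsm_shift)
    # Clamp so that neither bin value is ever assigned probability zero.
    return max(1, min(MAX_STATE, new_state))


def probability_of_one(state):
    """Probability estimate derived from the state."""
    return state / (1 << PROB_BITS)


def run(bins, fsm_shift, state=1 << (PROB_BITS - 1)):
    """Apply the update over a bin sequence with one fixed FSM parameter."""
    for b in bins:
        state = update_state(state, fsm_shift, b)
    return state
```

With these assumptions, a small shift such as 4 reacts quickly to the observed bins, while a larger shift such as 7 changes the estimate slowly; choosing the parameter per context or per bin is what lets different situations use different adaptation behavior.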

In another example of this disclosure, a video decoder may determine a decoded syntax element by applying binary arithmetic decoding to an offset value included in a bitstream. As part of applying binary arithmetic decoding, the video decoder may generate a bin stream. As part of generating the bin stream, for at least one respective bin of the bin stream, the video decoder may determine the value of the respective bin. The video decoder may make this determination based on the state for the respective bin, the interval for the respective bin, and the offset value. Additionally, the video decoder may determine one or more FSM parameters for the next bin of the bin stream. The one or more FSM parameters for the next bin control how probability estimates for the next bin are computed from the state for the respective bin. The next bin of the bin stream follows the respective bin in the bin stream. Furthermore, the video decoder determines the state for the next bin of the bin stream using a parameterized state updating function. The parameterized state updating function takes as input the state for the respective bin, the one or more FSM parameters for the next bin of the bin stream, and the value of the respective bin. The video decoder may debinarize the bin stream to form the decoded syntax element. Additionally, the video decoder may reconstruct a picture of the video data based in part on the decoded syntax element.
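
The encoder and decoder examples above can be illustrated with a minimal sketch. It is not the disclosed implementation: it assumes a 15-bit probability state, a single FSM parameter in the form of a shift amount, a made-up parameter schedule (`fsm_shift`), and exact rational arithmetic in place of the finite-precision interval arithmetic a practical coder would use. All function names are hypothetical.

```python
from fractions import Fraction

ONE = 1 << 15  # assumed probability scale: state / ONE estimates P(bin == 1)


def update_state(state, shift, bin_value):
    """Parameterized state update; `shift` stands in for the FSM parameter."""
    if bin_value:
        return state + ((ONE - state) >> shift)
    return state - (state >> shift)


def fsm_shift(bin_index):
    """Hypothetical parameter schedule: adapt quickly at first, slowly later."""
    return 4 if bin_index < 8 else 7


def split_point(low, width, state):
    """[low, split) codes a 0 bin; [split, low + width) codes a 1 bin."""
    return low + width * (1 - Fraction(state, ONE))


def encode(bins, state=ONE // 2):
    low, width = Fraction(0), Fraction(1)
    for k, b in enumerate(bins):
        split = split_point(low, width, state)
        if b:
            low, width = split, low + width - split
        else:
            width = split - low
        # State for the next bin depends on the current state, the FSM
        # parameter chosen for the next bin, and the current bin value.
        state = update_state(state, fsm_shift(k + 1), b)
    return low  # offset: a value inside the interval for the last bin


def decode(offset, num_bins, state=ONE // 2):
    low, width = Fraction(0), Fraction(1)
    bins = []
    for k in range(num_bins):
        split = split_point(low, width, state)
        b = 1 if offset >= split else 0
        if b:
            low, width = split, low + width - split
        else:
            width = split - low
        state = update_state(state, fsm_shift(k + 1), b)
        bins.append(b)
    return bins


bins = [1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1]
offset = encode(bins)
recovered = decode(offset, len(bins))
```

Because the encoder and decoder apply the same state update with the same parameters, they track identical probability estimates, which is what lets the decoder recover every bin from the single offset value.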

FIG. 1 is a block diagram illustrating an example video encoding and decoding system 10 that may utilize the techniques of this disclosure. As shown in FIG. 1, system 10 includes a source device 12 that provides encoded video data to be decoded at a later time by a destination device 14. Source device 12 provides the encoded video data to destination device 14 via a computer-readable medium 16. Source device 12 and destination device 14 may comprise any of a wide range of devices, including desktop computers, notebook (i.e., laptop) computers, tablet computers, set-top boxes, telephone handsets such as so-called "smart" phones, televisions, cameras, display devices, digital media players, video gaming consoles, video streaming devices, and the like. In some cases, source device 12 and destination device 14 are equipped for wireless communication. Accordingly, source device 12 and destination device 14 may be wireless communication devices. The techniques described in this disclosure may be applied to wireless and/or wired applications. Source device 12 is an example video encoding device (i.e., a device for encoding video data). Destination device 14 is an example video decoding device (i.e., a device for decoding video data).

The illustrated system 10 of FIG. 1 is merely one example. Techniques for processing video data may be performed by any digital video encoding and/or decoding device. In some examples, the techniques may be performed by a video encoder/decoder, typically referred to as a "CODEC." Source device 12 and destination device 14 are examples of such coding devices, in which source device 12 generates coded video data for transmission to destination device 14. In some examples, source device 12 and destination device 14 operate in a substantially symmetrical manner such that each of source device 12 and destination device 14 includes video encoding and decoding components. Hence, system 10 may support one-way or two-way video transmission between source device 12 and destination device 14, e.g., for video streaming, video playback, video broadcasting, or video telephony.

In the example of FIG. 1, source device 12 includes a video source 18, a storage medium 19 configured to store video data, a video encoder 20, and an output interface 22. Destination device 14 includes an input interface 26, a storage medium 28 configured to store encoded video data, a video decoder 30, and a display device 32. In other examples, source device 12 and destination device 14 include other components or arrangements. For example, source device 12 may receive video data from an external video source, such as an external camera. Likewise, destination device 14 may interface with an external display device rather than including an integrated display device.

Video source 18 is a source of video data. The video data may comprise a series of pictures. Video source 18 may include a video capture device, such as a video camera, a video archive containing previously captured video, and/or a video feed interface for receiving video data from a video content provider. In some examples, video source 18 generates computer graphics-based video data, or a combination of live video, archived video, and computer-generated video. Storage medium 19 may be configured to store the video data. In each case, the captured, pre-captured, or computer-generated video may be encoded by video encoder 20.

Output interface 22 may output the encoded video information to computer-readable medium 16. Output interface 22 may comprise various types of components or devices. For example, output interface 22 may comprise a wireless transmitter, a modem, a wired networking component (e.g., an Ethernet card), or another physical component. In examples where output interface 22 comprises a wireless transmitter, output interface 22 may be configured to transmit data, such as encoded video data, modulated according to a cellular communication standard such as 4G, 4G-LTE, LTE Advanced, 5G, and the like. In some examples where output interface 22 comprises a wireless transmitter, output interface 22 may be configured to transmit data, such as encoded video data, modulated according to other wireless standards, such as an IEEE 802.11 specification, an IEEE 802.15 specification (e.g., ZigBee™), a Bluetooth™ standard, and the like. In some examples, circuitry of output interface 22 is integrated into circuitry of video encoder 20 and/or other components of source device 12. For example, video encoder 20 and output interface 22 may be parts of a system on a chip (SoC). The SoC may also include other components, such as a general-purpose microprocessor, a graphics processing unit, and so on.

Destination device 14 may receive the encoded video data to be decoded via computer-readable medium 16. Computer-readable medium 16 may comprise any type of medium or device capable of moving the encoded video data from source device 12 to destination device 14. In some examples, computer-readable medium 16 comprises a communication medium that enables source device 12 to transmit encoded video data directly to destination device 14 in real time. The communication medium may comprise any wireless or wired communication medium, such as a radio frequency (RF) spectrum or one or more physical transmission lines. The communication medium may form part of a packet-based network, such as a local area network, a wide-area network, or a global network such as the Internet. The communication medium may include routers, switches, base stations, or any other equipment that may be useful to facilitate communication from source device 12 to destination device 14. Destination device 14 may comprise one or more data storage media configured to store encoded video data and decoded video data.

In some examples, output interface 22 may output data, such as encoded video data, to an intermediate device, such as a storage device. Similarly, input interface 26 of destination device 14 may receive encoded data from the intermediate device. The intermediate device may include any of a variety of distributed or locally accessed data storage media, such as a hard drive, Blu-ray discs, DVDs, CD-ROMs, flash memory, volatile or non-volatile memory, or any other suitable digital storage media for storing encoded video data. In some examples, the intermediate device corresponds to a file server. Example file servers include web servers, FTP servers, network attached storage (NAS) devices, and local disk drives.

Destination device 14 may access the encoded video data through any standard data connection, including an Internet connection. This may include a wireless channel (e.g., a Wi-Fi connection), a wired connection (e.g., DSL, cable modem, etc.), or a combination of both that is suitable for accessing encoded video data stored on a file server. The transmission of encoded video data from the storage device may be a streaming transmission, a download transmission, or a combination thereof.

Computer-readable medium 16 may include transient media, such as a wireless broadcast or wired network transmission, or storage media (that is, non-transitory storage media), such as a hard disk, flash drive, compact disc, digital video disc, Blu-ray disc, or other computer-readable media. In some examples, a network server (not shown) may receive encoded video data from source device 12 and provide the encoded video data to destination device 14, e.g., via network transmission. Similarly, a computing device of a medium production facility, such as a disc stamping facility, may receive encoded video data from source device 12 and produce a disc containing the encoded video data. Therefore, computer-readable medium 16 may be understood to include one or more computer-readable media of various forms, in various examples.

Input interface 26 of destination device 14 receives data from computer-readable medium 16. Input interface 26 may comprise various types of components or devices. For example, input interface 26 may comprise a wireless receiver, a modem, a wired networking component (e.g., an Ethernet card), or another physical component. In examples where input interface 26 comprises a wireless receiver, input interface 26 may be configured to receive data, such as the bitstream, modulated according to a cellular communication standard such as 4G, 4G-LTE, LTE Advanced, 5G, and the like. In some examples where input interface 26 comprises a wireless receiver, input interface 26 may be configured to receive data, such as the bitstream, modulated according to other wireless standards, such as an IEEE 802.11 specification, an IEEE 802.15 specification (e.g., ZigBee™), a Bluetooth™ standard, and the like. In some examples, circuitry of input interface 26 may be integrated into circuitry of video decoder 30 and/or other components of destination device 14. For example, video decoder 30 and input interface 26 may be parts of an SoC. The SoC may also include other components, such as a general-purpose microprocessor, a graphics processing unit, and so on.

Storage medium 28 may be configured to store encoded video data, such as encoded video data (e.g., a bitstream) received by input interface 26. Display device 32 displays the decoded video data. Display device 32 may comprise any of a variety of display devices, such as a cathode ray tube (CRT), a liquid crystal display (LCD), a plasma display, an organic light emitting diode (OLED) display, or another type of display device.

Video encoder 20 and video decoder 30 may each be implemented as any of a variety of suitable circuitry, such as one or more microprocessors, digital signal processors (DSPs), application specific integrated circuits (ASICs), field programmable gate arrays (FPGAs), discrete logic, software, hardware, firmware, or any combination thereof. When the techniques are implemented partially in software, a device may store instructions for the software in a suitable non-transitory computer-readable medium and may execute the instructions in hardware using one or more processors to perform the techniques of this disclosure. Each of video encoder 20 and video decoder 30 may be included in one or more encoders or decoders, either of which may be integrated as part of a combined encoder/decoder (CODEC) in a respective device.

In some examples, video encoder 20 and video decoder 30 may operate according to a video coding standard. For example, video encoder 20 and video decoder 30 may encode and decode video data according to ITU-T H.261, ISO/IEC MPEG-1 Visual, ITU-T H.262 or ISO/IEC MPEG-2 Visual, ITU-T H.263, ISO/IEC MPEG-4 Visual, and ITU-T H.264 (also known as ISO/IEC MPEG-4 AVC), including its Scalable Video Coding (SVC) and Multi-View Video Coding (MVC) extensions, or another video coding standard or specification. In some examples, video encoder 20 and video decoder 30 encode and decode video data according to High Efficiency Video Coding (HEVC), known as ITU-T H.265, its range and screen content coding extensions, its 3D video coding extension (3D-HEVC), its multiview extension (MV-HEVC), or its scalable extension (SHVC).

This disclosure may generally refer to "signaling" certain information, such as syntax elements. The term "signaling" may generally refer to the communication of syntax elements and/or other data used to decode the encoded video data. Such communication may occur in real time or near-real time. Alternatively, such communication may occur over a span of time, such as might occur when storing syntax elements to a computer-readable storage medium in a bitstream at the time of encoding, which may then be retrieved by a decoding device at any time after being stored to this medium.

In HEVC and other video coding specifications, video data includes a series of pictures. Pictures may also be referred to as "frames." A picture may include one or more sample arrays. Each respective sample array of a picture may comprise an array of samples for a respective color component. A picture may include three sample arrays, denoted S_L, S_Cb, and S_Cr. S_L is a two-dimensional array (i.e., a block) of luma samples. S_Cb is a two-dimensional array of Cb chroma samples. S_Cr is a two-dimensional array of Cr chroma samples. In other instances, a picture may be monochrome and may include only an array of luma samples.

As part of encoding video data, video encoder 20 may encode pictures of the video data. In other words, video encoder 20 may generate encoded representations of the pictures of the video data. An encoded representation of a picture may be referred to herein as a "coded picture" or an "encoded picture."

To generate an encoded representation of a picture, video encoder 20 may encode blocks of the picture. Video encoder 20 may include, in a bitstream, an encoded representation of the video block. In some examples, to encode a block of the picture, video encoder 20 performs intra prediction or inter prediction to generate one or more predictive blocks. Additionally, video encoder 20 may generate residual data for the block. A residual block comprises residual samples. Each residual sample may indicate a difference between a sample of one of the generated predictive blocks and a corresponding sample of the block. Video encoder 20 may apply a transform to blocks of residual samples to generate transform coefficients. Furthermore, video encoder 20 may quantize the transform coefficients. In some examples, video encoder 20 may generate one or more syntax elements to represent a transform coefficient. Video encoder 20 may entropy encode one or more of the syntax elements representing the transform coefficient.

More specifically, when encoding video data according to HEVC or other video coding specifications, to generate an encoded representation of a picture, video encoder 20 may partition each sample array of the picture into coding tree blocks (CTBs) and encode the CTBs. A CTB may be an NxN block of samples in a sample array of a picture. In the HEVC main profile, the size of a CTB can range from 16x16 to 64x64, although technically 8x8 CTB sizes can be supported.
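
As a rough illustration of partitioning a sample array into CTBs, the following sketch counts how many NxN CTBs are needed to cover a sample array. The function name `ctb_grid` is hypothetical, and the sketch assumes that partial blocks at the right and bottom edges are counted as whole CTBs.

```python
def ctb_grid(width, height, ctb_size):
    """Return (columns, rows) of CTBs needed to cover a width x height sample array."""
    cols = -(-width // ctb_size)   # ceiling division
    rows = -(-height // ctb_size)
    return cols, rows


# A 1920x1080 luma sample array covered by 64x64 CTBs.
cols, rows = ctb_grid(1920, 1080, 64)
total_ctbs = cols * rows
```

With 64x64 CTBs, 1080 rows of samples do not divide evenly, so the last row of CTBs only partly overlaps the sample array under this assumption.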

A coding tree unit (CTU) of a picture may comprise one or more CTBs and may comprise syntax structures used to encode the samples of the one or more CTBs. For instance, each CTU may comprise a CTB of luma samples, two corresponding CTBs of chroma samples, and syntax structures used to encode the samples of the CTBs. In monochrome pictures or pictures having three separate color planes, a CTU may comprise a single CTB and syntax structures used to encode the samples of the CTB. A CTU may also be referred to as a "tree block" or a "largest coding unit" (LCU). In this disclosure, a "syntax structure" may be defined as zero or more syntax elements present together in a bitstream in a specified order. In some codecs, an encoded picture is an encoded representation containing all CTUs of the picture.

To encode a CTU of a picture, video encoder 20 may partition the CTBs of the CTU into one or more coding blocks. A coding block is an NxN block of samples. In some codecs, to encode a CTU of a picture, video encoder 20 may recursively perform quad-tree partitioning on the coding tree blocks of the CTU to partition the CTBs into coding blocks, hence the name "coding tree units." A coding unit (CU) may comprise one or more coding blocks and syntax structures used to encode samples of the one or more coding blocks. For example, a CU may comprise a coding block of luma samples and two corresponding coding blocks of chroma samples of a picture that has a luma sample array, a Cb sample array, and a Cr sample array, and syntax structures used to encode the samples of the coding blocks. In monochrome pictures or pictures having three separate color planes, a CU may comprise a single coding block and syntax structures used to code the samples of the coding block.
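
The recursive quad-tree partitioning described above can be sketched as follows. The split decision is supplied by the caller (`should_split` is a hypothetical callback; a real encoder would typically base it on a rate-distortion criterion), and the minimum block size of 8 is an assumption made for illustration.

```python
def quadtree_partition(x, y, size, should_split, min_size=8):
    """Recursively split a square block into four quadrants.

    Returns a list of (x, y, size) leaf blocks in z-scan order.
    """
    if size > min_size and should_split(x, y, size):
        half = size // 2
        blocks = []
        for dy in (0, half):
            for dx in (0, half):
                blocks.extend(
                    quadtree_partition(x + dx, y + dy, half, should_split, min_size)
                )
        return blocks
    return [(x, y, size)]


# Illustrative rule: split the 64x64 block, then split only its top-left 32x32 quadrant.
def split_rule(x, y, size):
    return size == 64 or (x == 0 and y == 0 and size == 32)


coding_blocks = quadtree_partition(0, 0, 64, split_rule)
```

Whatever rule is used, the leaf blocks tile the original block exactly, so their areas sum to the area of the CTB.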

Furthermore, video encoder 20 may encode CUs of a picture of the video data. In some codecs, as part of encoding a CU, video encoder 20 may partition a coding block of the CU into one or more prediction blocks. A prediction block is a rectangular (i.e., square or non-square) block of samples on which the same prediction is applied. A prediction unit (PU) of a CU may comprise one or more prediction blocks of the CU and syntax structures used to predict the one or more prediction blocks. For example, a PU may comprise a prediction block of luma samples, two corresponding prediction blocks of chroma samples, and syntax structures used to predict the prediction blocks. In monochrome pictures or pictures having three separate color planes, a PU may comprise a single prediction block and syntax structures used to predict the prediction block.

Video encoder 20 may generate a predictive block (e.g., a luma, Cb, or Cr predictive block) for a prediction block (e.g., a luma, Cb, or Cr prediction block) of a PU of a CU. Video encoder 20 may use intra prediction or inter prediction to generate a predictive block. If video encoder 20 uses intra prediction to generate a predictive block, video encoder 20 may generate the predictive block based on decoded samples of the picture that includes the CU. If video encoder 20 uses inter prediction to generate a predictive block of a PU of a current picture, video encoder 20 may generate the predictive block of the PU based on decoded samples of a reference picture (i.e., a picture other than the current picture). In HEVC, video encoder 20 generates a "prediction_unit" syntax structure within a "coding_unit" syntax structure for inter predicted PUs, but does not generate a "prediction_unit" syntax structure within a "coding_unit" syntax structure for intra predicted PUs. Rather, in HEVC, syntax elements related to intra predicted PUs are included directly in the "coding_unit" syntax structure.

Video encoder 20 may generate one or more residual blocks for a CU. For instance, video encoder 20 may generate a luma residual block for the CU. Each sample in the CU's luma residual block indicates a difference between a luma sample in one of the CU's predictive luma blocks and a corresponding sample in the CU's original luma coding block. In addition, video encoder 20 may generate a Cb residual block for the CU. Each sample in the CU's Cb residual block may indicate a difference between a Cb sample in one of the CU's predictive Cb blocks and a corresponding sample in the CU's original Cb coding block. Video encoder 20 may also generate a Cr residual block for the CU. Each sample in the CU's Cr residual block may indicate a difference between a Cr sample in one of the CU's predictive Cr blocks and a corresponding sample in the CU's original Cr coding block.
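
The residual computation can be sketched as a sample-wise difference between two equally sized blocks. The sign convention (original minus predictive) and the function name `residual_block` are assumptions made for illustration; the text above only states that each residual sample indicates a difference.

```python
def residual_block(original, predictive):
    """Sample-wise difference between an original coding block and a predictive block."""
    return [
        [o - p for o, p in zip(orig_row, pred_row)]
        for orig_row, pred_row in zip(original, predictive)
    ]


original_luma = [[52, 55], [61, 59]]
predictive_luma = [[50, 50], [60, 60]]
luma_residual = residual_block(original_luma, predictive_luma)
```

The same function would apply unchanged to Cb and Cr blocks, since each color component is handled as its own sample array.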

Furthermore, video encoder 20 may decompose the residual blocks of a CU into one or more transform blocks. For instance, video encoder 20 may use quad-tree partitioning to decompose the residual blocks of a CU into one or more transform blocks. A transform block is a rectangular (e.g., square or non-square) block of samples on which the same transform is applied. A transform unit (TU) of a CU may comprise one or more transform blocks. For example, a TU may comprise a transform block of luma samples, two corresponding transform blocks of chroma samples, and syntax structures used to transform the transform block samples. Thus, each TU of a CU may have a luma transform block, a Cb transform block, and a Cr transform block. The luma transform block of the TU may be a sub-block of the CU's luma residual block. The Cb transform block may be a sub-block of the CU's Cb residual block. The Cr transform block may be a sub-block of the CU's Cr residual block. In monochrome pictures or pictures having three separate color planes, a TU may comprise a single transform block and syntax structures used to transform the samples of that transform block.

Video encoder 20 may apply one or more transforms to a transform block of a TU to generate a coefficient block for the TU. A coefficient block may be a two-dimensional array of transform coefficients. A transform coefficient may be a scalar quantity. In some examples, the one or more transforms convert the transform block from a pixel domain to a frequency domain. Thus, in such examples, a transform coefficient may be a scalar quantity considered to be in a frequency domain. A transform coefficient level is an integer quantity representing a value associated with a particular two-dimensional frequency index in a decoding process prior to scaling for computation of a transform coefficient value.

In some examples, video encoder 20 skips application of the transforms to the transform block. In such examples, video encoder 20 may treat residual sample values in the same way as transform coefficients. Thus, in examples where video encoder 20 skips application of the transforms, the following discussion of transform coefficients and coefficient blocks may be applicable to transform blocks of residual samples.

After generating a coefficient block, video encoder 20 may quantize the coefficient block to possibly reduce the amount of data used to represent the coefficient block, potentially providing further compression. Quantization generally refers to a process in which a range of values is compressed to a single value. For example, quantization may be done by dividing a value by a constant and then rounding to the nearest integer. To quantize the coefficient block, video encoder 20 may quantize the transform coefficients of the coefficient block. Quantization may reduce the bit depth associated with some or all of the transform coefficients. For example, an n-bit transform coefficient may be rounded down to an m-bit transform coefficient during quantization, where n is greater than m. In some examples, video encoder 20 skips quantization.
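The divide-and-round form of quantization mentioned above can be sketched as follows. The function name `quantize`, the step value, and the choice to round ties away from zero are assumptions made for illustration; an actual codec would typically use integer arithmetic with scaling and offsets.

```python
import math


def quantize(coefficients, step):
    """Quantize a 2-D block by dividing each value by `step` and rounding to the nearest integer.

    Ties are rounded away from zero (an assumption of this sketch).
    """
    def q(c):
        magnitude = math.floor(abs(c) / step + 0.5)
        return magnitude if c >= 0 else -magnitude
    return [[q(c) for c in row] for row in coefficients]


# Hypothetical coefficient block and step size.
levels = quantize([[100, -37], [12, 3]], step=8)
```

Many input values map to the same level (for example, any value from 4 to 11 maps to level 1 when the step is 8), which is the sense in which a range of values is compressed to a single value.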

Video encoder 20 may generate syntax elements indicating some or all of the potentially quantized transform coefficients. Video encoder 20 may entropy encode one or more of the syntax elements indicating a quantized transform coefficient. For example, video encoder 20 may perform context-adaptive binary arithmetic coding (CABAC) on the syntax elements indicating the quantized transform coefficients. Thus, an encoded block (e.g., an encoded CU) may include the entropy encoded syntax elements indicating the quantized transform coefficients.

Video encoder 20 may output a bitstream that includes encoded video data. In other words, video encoder 20 may output a bitstream that includes an encoded representation of video data. The encoded representation of the video data may include an encoded representation of pictures of the video data. For example, the bitstream may comprise a sequence of bits that forms a representation of encoded pictures of the video data and associated data. In some examples, a representation of an encoded picture may include encoded representations of blocks of the picture.

Video decoder 30 may receive a bitstream generated by video encoder 20. As noted above, the bitstream may comprise an encoded representation of video data. Video decoder 30 may decode the bitstream to reconstruct pictures of the video data. As part of decoding the bitstream, video decoder 30 may obtain syntax elements from the bitstream. Video decoder 30 may reconstruct pictures of the video data based at least in part on the syntax elements obtained from the bitstream. The process to reconstruct pictures of the video data may be generally reciprocal to the process performed by video encoder 20.

For instance, as part of decoding a picture of the video data, video decoder 30 may use inter prediction or intra prediction to generate predictive blocks. Additionally, video decoder 30 may determine transform coefficients based on syntax elements obtained from the bitstream. In some examples, video decoder 30 inverse quantizes the determined transform coefficients. Inverse quantization maps a quantized value to a reconstructed value. For instance, video decoder 30 may inverse quantize a value by determining the value multiplied by a quantization step size. Furthermore, video decoder 30 may apply an inverse transform to the determined transform coefficients to determine values of residual samples. Video decoder 30 may reconstruct a block of the picture based on the residual samples and corresponding samples of the generated predictive blocks. For instance, video decoder 30 may add residual samples to corresponding samples of the generated predictive blocks to determine reconstructed samples of the block.
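A minimal sketch of the decoder-side steps above, assuming the transform is skipped so that the inverse-quantized values are used directly as residual samples. The function names `dequantize` and `reconstruct` are hypothetical, and clipping of reconstructed samples to the valid sample range is omitted for brevity.

```python
def dequantize(levels, step):
    """Inverse quantize: map each quantized level to level * step."""
    return [[lvl * step for lvl in row] for row in levels]


def reconstruct(predictive, residual):
    """Add residual samples to the corresponding predictive samples."""
    return [[p + r for p, r in zip(prow, rrow)]
            for prow, rrow in zip(predictive, residual)]


# Hypothetical quantized levels and predictive block (transform skip assumed).
residual = dequantize([[1, 0], [0, -1]], step=2)
block = reconstruct([[118, 123], [119, 120]], residual)
```

When a transform is in use, an inverse transform would be applied between the two calls to turn the inverse-quantized coefficients into residual samples.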

More specifically, in HEVC and other video coding specifications, video decoder 30 may use inter prediction or intra prediction to generate one or more predictive blocks for each PU of a current CU. In addition, video decoder 30 may inverse quantize coefficient blocks of TUs of the current CU. Video decoder 30 may perform inverse transforms on the coefficient blocks to reconstruct transform blocks of the TUs of the current CU. Video decoder 30 may reconstruct a coding block of the current CU based on samples of the predictive blocks of the PUs of the current CU and residual samples of the transform blocks of the TUs of the current CU. In some examples, video decoder 30 may reconstruct the coding blocks of the current CU by adding the samples of the predictive blocks for PUs of the current CU to corresponding decoded samples of the transform blocks of the TUs of the current CU. By reconstructing the coding blocks for each CU of a picture, video decoder 30 may reconstruct the picture.

A slice of a picture may include an integer number of blocks of the picture. For example, in HEVC and other video coding specifications, a slice of a picture may include an integer number of CTUs of the picture. The CTUs of a slice may be ordered consecutively in a scan order, such as a raster scan order. In HEVC, a slice is defined as an integer number of CTUs contained in one independent slice segment and all subsequent dependent slice segments (if any) that precede the next independent slice segment (if any) within the same access unit. Furthermore, in HEVC, a slice segment is defined as an integer number of CTUs ordered consecutively in the tile scan and contained in a single NAL unit. A tile scan is a specific sequential ordering of the CTBs partitioning a picture in which the CTBs are ordered consecutively in a CTB raster scan within a tile, whereas the tiles in a picture are ordered consecutively in a raster scan of the tiles of the picture. A tile is a rectangular region of CTBs within a particular tile column and a particular tile row in a picture.
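The tile scan defined above can be illustrated with a short sketch that lists CTB raster-scan addresses in tile-scan order. The function name `tile_scan_order` and its parameters (tile column widths and tile row heights, both in units of CTBs) are assumptions made for this example.

```python
def tile_scan_order(col_widths, row_heights):
    """Return CTB raster-scan addresses in tile-scan order.

    Tiles are visited in raster order over the tile grid; within each tile,
    CTBs are visited in raster order.
    """
    pic_width = sum(col_widths)  # picture width in CTBs
    order = []
    y0 = 0
    for tile_h in row_heights:
        x0 = 0
        for tile_w in col_widths:
            for y in range(y0, y0 + tile_h):
                for x in range(x0, x0 + tile_w):
                    order.append(y * pic_width + x)
            x0 += tile_w
        y0 += tile_h
    return order


# A 4x2-CTB picture split into two tile columns of width 2 and one tile row.
order = tile_scan_order([2, 2], [2])
```

For this layout the left tile's CTBs (raster addresses 0, 1, 4, 5) all precede the right tile's CTBs (2, 3, 6, 7), which differs from the plain raster order 0 through 7.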

As mentioned above, video encoder 20 and video decoder 30 may apply CABAC encoding and decoding to syntax elements as part of a video coding and compression scheme. To apply CABAC encoding to a syntax element, video encoder 20 may binarize the syntax element to form a series of one or more bits, which are referred to as "bins." In addition, video encoder 20 may identify a coding context. The coding context may identify initial probabilities of bins having particular values. For instance, a coding context may indicate a 0.7 probability of coding a 0-valued bin and a 0.3 probability of coding a 1-valued bin. After identifying the coding context, video encoder 20 may divide an interval into a lower sub-interval and an upper sub-interval. One of the sub-intervals may be associated with the value 0 and the other sub-interval may be associated with the value 1. The widths of the sub-intervals may be proportional to the probabilities indicated for the associated values by the identified coding context. If a bin of the syntax element has the value associated with the lower sub-interval, the encoded value may be equal to the lower boundary of the lower sub-interval. If the same bin of the syntax element has the value associated with the upper sub-interval, the encoded value may be equal to the lower boundary of the upper sub-interval. To encode the next bin of the syntax element, video encoder 20 may repeat these steps with the interval being the sub-interval associated with the value of the encoded bit. When video encoder 20 repeats these steps for the next bin, video encoder 20 may use modified probabilities based on the probabilities indicated by the identified coding context and the actual values of bins encoded.
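The interval subdivision described above can be sketched with exact rational arithmetic. This is a simplified illustration, not the CABAC engine itself: the function name `encode_bins` is hypothetical, the value 0 is assumed to be associated with the lower sub-interval, and the probability is held fixed rather than being modified after each bin.

```python
from fractions import Fraction


def encode_bins(bins, p_zero):
    """Narrow the interval [low, low + width) once per bin.

    Assumptions of this sketch: value 0 maps to the lower sub-interval and
    p_zero stays constant (a real coder adapts it after each bin).
    """
    low, width = Fraction(0), Fraction(1)
    for b in bins:
        lower_width = width * p_zero  # width of the sub-interval for value 0
        if b == 0:
            width = lower_width       # keep the lower sub-interval
        else:
            low += lower_width        # move to the upper sub-interval
            width -= lower_width
    return low, width


# Probability 0.7 for a 0-valued bin, as in the example in the text.
low, width = encode_bins([0, 1, 0], Fraction(7, 10))
```

Here `low` is the lower boundary of the final sub-interval, which serves as the encoded value in the sense used above. Practical coders use finite-precision integer ranges with renormalization rather than exact fractions.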

When video decoder 30 performs CABAC decoding on a syntax element, video decoder 30 may identify a coding context. Video decoder 30 may then divide an interval into a lower sub-interval and an upper sub-interval. One of the sub-intervals may be associated with the value 0 and the other sub-interval may be associated with the value 1. The widths of the sub-intervals may be proportional to the probabilities indicated for the associated values by the identified coding context. If the encoded value is within the lower sub-interval, video decoder 30 may decode a bin having the value associated with the lower sub-interval. If the encoded value is within the upper sub-interval, video decoder 30 may decode a bin having the value associated with the upper sub-interval. To decode the next bin of the syntax element, video decoder 30 may repeat these steps with the interval being the sub-interval that contains the encoded value. When video decoder 30 repeats these steps for the next bin, video decoder 30 may use modified probabilities based on the probabilities indicated by the identified coding context and the decoded bins. Video decoder 30 may then de-binarize the bins to recover the syntax element.
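The decoding steps above can be sketched in the same simplified setting: exact rational arithmetic, a fixed probability, and the value 0 assumed to be associated with the lower sub-interval. The function name `decode_bins` and the explicit `num_bins` argument are assumptions made for this illustration.

```python
from fractions import Fraction


def decode_bins(value, num_bins, p_zero):
    """Recover bins by locating `value` in successive sub-intervals.

    Assumptions of this sketch: value 0 maps to the lower sub-interval and
    p_zero stays constant (a real decoder adapts it after each bin).
    """
    low, width = Fraction(0), Fraction(1)
    bins = []
    for _ in range(num_bins):
        lower_width = width * p_zero
        if value < low + lower_width:  # encoded value lies in the lower sub-interval
            bins.append(0)
            width = lower_width
        else:                          # encoded value lies in the upper sub-interval
            bins.append(1)
            low += lower_width
            width -= lower_width
    return bins


# Decode three bins from the encoded value 0.49 with P(bin == 0) = 0.7.
bins = decode_bins(Fraction(49, 100), 3, Fraction(7, 10))
```

The decoder mirrors the encoder's subdivision exactly, which is why both sides must derive the same probabilities for each bin.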

Video encoder 20 may encode some bins using bypass CABAC coding. It may be computationally less expensive to perform bypass CABAC coding on a bin than to perform regular CABAC coding on the bin. Furthermore, performing bypass CABAC coding may allow for a higher degree of parallelization and throughput. Bins encoded using bypass CABAC coding may be referred to as "bypass bins." Grouping bypass bins together may increase the throughput of video encoder 20 and video decoder 30. The bypass CABAC coding engine may be able to code several bins in a single cycle, whereas the regular CABAC coding engine may be able to code only a single bin in a cycle. The bypass CABAC coding engine may be simpler because it does not select contexts and may assume a probability of 1/2 for both symbols (0 and 1). Consequently, in bypass CABAC coding, the intervals are split directly in half.
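As a rough illustration of why bypass coding is simpler, the sketch below halves the interval for every bin without consulting any context. The function name `bypass_encode` and the association of the value 0 with the lower half are assumptions for this example.

```python
from fractions import Fraction


def bypass_encode(bins):
    """Halve the interval per bin; no context selection or probability update."""
    low, width = Fraction(0), Fraction(1)
    for b in bins:
        width /= 2
        if b == 1:        # value 1 is assumed to select the upper half
            low += width
    return low, width


low, width = bypass_encode([1, 0, 1])
```

With equal halves, the lower boundary after n bins is simply the binary fraction formed by those bins (here 0.101 in binary, i.e. 5/8), so each bypass bin costs exactly one bit.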

FIG. 2 is a block diagram illustrating an example video encoder 20 that may implement the techniques of this disclosure. FIG. 2 is provided for purposes of explanation and should not be considered limiting of the techniques as broadly exemplified and described in this disclosure. The techniques of this disclosure may be applicable to various coding standards or methods.

Processing circuitry includes video encoder 20, and video encoder 20 is configured to perform one or more of the example techniques described in this disclosure. For instance, video encoder 20 includes integrated circuitry, and the various units illustrated in FIG. 2 may be formed as hardware circuit blocks that are interconnected with a circuit bus. These hardware circuit blocks may be separate circuit blocks, or two or more of the units may be combined into a common hardware circuit block. The hardware circuit blocks may be formed as a combination of electrical components that form operation blocks such as arithmetic logic units (ALUs) and elementary function units (EFUs), as well as logic blocks such as AND, OR, NAND, NOR, XOR, XNOR, and other similar logic blocks.

In some examples, one or more of the units illustrated in FIG. 2 may be software units executing on the processing circuitry. In such examples, the object code for these software units is stored in memory. An operating system may cause video encoder 20 to retrieve the object code and execute the object code, which causes video encoder 20 to perform operations to implement the example techniques. In some examples, the software units may be firmware that video encoder 20 executes at startup. Accordingly, video encoder 20 is a structural component having hardware that performs the example techniques or has software/firmware executing on the hardware to specialize the hardware to perform the example techniques.

In the example of FIG. 2, video encoder 20 includes a prediction processing unit 200, video data memory 201, a residual generation unit 102, a transform processing unit 204, a quantization unit 206, an inverse quantization unit 208, an inverse transform processing unit 210, a reconstruction unit 212, a filter unit 214, a decoded picture buffer 216, and an entropy encoding unit 218. Prediction processing unit 200 includes an inter prediction processing unit 220 and an intra prediction processing unit 226. Inter prediction processing unit 220 may include a motion estimation unit and a motion compensation unit (not shown).

Video data memory 201 may be configured to store video data to be encoded by the components of video encoder 20. The video data stored in video data memory 201 may be obtained, for example, from video source 18. Decoded picture buffer 216 may be a reference picture memory that stores reference video data for use in encoding video data by video encoder 20, e.g., in intra or inter coding modes. Video data memory 201 and decoded picture buffer 216 may be formed by any of a variety of memory devices, such as dynamic random access memory (DRAM), including synchronous DRAM (SDRAM), magnetoresistive RAM (MRAM), resistive RAM (RRAM), or other types of memory devices. Video data memory 201 and decoded picture buffer 216 may be provided by the same memory device or by separate memory devices. In various examples, video data memory 201 may be on-chip with other components of video encoder 20, or off-chip relative to those components. Video data memory 201 may be the same as, or part of, storage medium 19 of FIG. 1.

Video encoder 20 receives video data. Video encoder 20 may encode each CTU in a slice of a picture of the video data. Each of the CTUs may be associated with equally-sized luma coding tree blocks (CTBs) and corresponding CTBs of the picture. As part of encoding a CTU, prediction processing unit 200 may perform partitioning to divide the CTBs of the CTU into progressively smaller blocks. The smaller blocks may be coding blocks of CUs. For example, prediction processing unit 200 may partition a CTB associated with a CTU according to a tree structure.

Video encoder 20 may encode CUs of a CTU to generate encoded representations of the CUs (i.e., coded CUs). As part of encoding a CU, prediction processing unit 200 may partition the coding blocks associated with the CU among one or more PUs of the CU. Thus, each PU may be associated with a luma prediction block and corresponding chroma prediction blocks. Video encoder 20 and video decoder 30 may support PUs having various sizes. As indicated above, the size of a CU may refer to the size of the luma coding block of the CU, and the size of a PU may refer to the size of a luma prediction block of the PU. Assuming that the size of a particular CU is 2Nx2N, video encoder 20 and video decoder 30 may support PU sizes of 2Nx2N or NxN for intra prediction, and symmetric PU sizes of 2Nx2N, 2NxN, Nx2N, NxN, or similar for inter prediction. Video encoder 20 and video decoder 30 may also support asymmetric partitioning for PU sizes of 2NxnU, 2NxnD, nLx2N, and nRx2N for inter prediction.
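The partition modes named above can be made concrete with a small lookup that returns the (width, height) of each PU for a CU of a given size. The function name `pu_partitions` is hypothetical. The asymmetric modes are assumed to follow the usual HEVC convention, in which the smaller part occupies one quarter of the CU along the split direction.

```python
def pu_partitions(cu_size, mode):
    """Return a list of (width, height) luma prediction block sizes for a 2Nx2N CU.

    cu_size is 2N in luma samples. Asymmetric modes assume a 1/4 : 3/4 split.
    """
    s, half, quarter = cu_size, cu_size // 2, cu_size // 4
    table = {
        "2Nx2N": [(s, s)],
        "2NxN": [(s, half), (s, half)],
        "Nx2N": [(half, s), (half, s)],
        "NxN": [(half, half)] * 4,
        "2NxnU": [(s, quarter), (s, s - quarter)],  # small part on top
        "2NxnD": [(s, s - quarter), (s, quarter)],  # small part at bottom
        "nLx2N": [(quarter, s), (s - quarter, s)],  # small part on the left
        "nRx2N": [(s - quarter, s), (quarter, s)],  # small part on the right
    }
    return table[mode]


parts = pu_partitions(16, "2NxnU")
```

In every mode the PU areas sum to the CU area, which gives a quick sanity check on the table.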

Inter prediction processing unit 220 may generate predictive data for a PU. As part of generating the predictive data for a PU, inter prediction processing unit 220 performs inter prediction on the PU. The predictive data for the PU may include predictive blocks of the PU and motion information for the PU. Inter prediction processing unit 220 may perform different operations for a PU of a CU depending on whether the PU is in an I slice, a P slice, or a B slice. In an I slice, all PUs are intra predicted. Hence, if the PU is in an I slice, inter prediction processing unit 220 does not perform inter prediction on the PU. Thus, for blocks encoded in I-mode, the predicted block is formed using spatial prediction from previously encoded neighboring blocks within the same frame. If a PU is in a P slice, inter prediction processing unit 220 may use uni-directional inter prediction to generate a predictive block of the PU. If a PU is in a B slice, inter prediction processing unit 220 may use uni-directional or bi-directional inter prediction to generate a predictive block of the PU.

Intra prediction processing unit 226 may generate predictive data for a PU by performing intra prediction on the PU. The predictive data for the PU may include predictive blocks of the PU and various syntax elements. Intra prediction processing unit 226 may perform intra prediction on PUs in I slices, P slices, and B slices.

To perform intra prediction on a PU, intra prediction processing unit 226 may use multiple intra prediction modes to generate multiple sets of predictive data for the PU. Intra prediction processing unit 226 may use samples from sample blocks of neighboring PUs to generate a predictive block for a PU. The neighboring PUs may be above, above and to the right, above and to the left, or to the left of the PU, assuming a left-to-right, top-to-bottom encoding order for PUs, CUs, and CTUs. Intra prediction processing unit 226 may use various numbers of intra prediction modes, e.g., 33 directional intra prediction modes. In some examples, the number of intra prediction modes may depend on the size of the region associated with the PU.

Prediction processing unit 200 may select the predictive data for PUs of a CU from among the predictive data generated by inter prediction processing unit 220 for the PUs or the predictive data generated by intra prediction processing unit 226 for the PUs. In some examples, prediction processing unit 200 selects the predictive data for the PUs of the CU based on rate/distortion metrics of the sets of predictive data. The predictive blocks of the selected predictive data may be referred to herein as the selected predictive blocks.
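One common way to compare candidates on rate/distortion metrics is a Lagrangian cost J = D + λ·R. The disclosure does not specify the metric, so the cost function, the function name `select_predictive_data`, the candidate fields, and the numbers below are all assumptions for illustration.

```python
def select_predictive_data(candidates, lam):
    """Return the candidate with the lowest Lagrangian cost D + lam * R.

    Each candidate is a dict with hypothetical keys 'name', 'distortion', 'rate'.
    """
    return min(candidates, key=lambda c: c["distortion"] + lam * c["rate"])


# Hypothetical candidates: intra is cheap to signal, inter predicts better.
candidates = [
    {"name": "intra", "distortion": 100, "rate": 10},
    {"name": "inter", "distortion": 60, "rate": 40},
]
best_high_lambda = select_predictive_data(candidates, lam=2.0)
best_low_lambda = select_predictive_data(candidates, lam=0.5)
```

A larger λ penalizes bits more heavily and so favors the candidate that is cheaper to signal, while a smaller λ favors the candidate with lower distortion.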

Residual generation unit 102 may generate, based on the coding blocks (e.g., luma, Cb, and Cr coding blocks) for a CU and the selected predictive blocks (e.g., predictive luma, Cb, and Cr blocks) for the PUs of the CU, residual blocks (e.g., luma, Cb, and Cr residual blocks) for the CU. For instance, residual generation unit 102 may generate the residual blocks of the CU such that each sample in the residual blocks has a value equal to a difference between a sample in a coding block of the CU and a corresponding sample in a corresponding selected predictive block of a PU of the CU.

Transform processing unit 204 may partition the residual blocks of a CU into transform blocks of TUs of the CU. For instance, transform processing unit 204 may perform quad-tree partitioning to partition the residual blocks of the CU into transform blocks of TUs of the CU. Thus, a TU may be associated with a luma transform block and two chroma transform blocks. The sizes and positions of the luma and chroma transform blocks of TUs of a CU may or may not be based on the sizes and positions of prediction blocks of the PUs of the CU. A quad-tree structure known as a "residual quad-tree" (RQT) may include nodes associated with each of the regions. The TUs of a CU may correspond to leaf nodes of the RQT.
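Quad-tree partitioning of a residual block into transform blocks can be sketched recursively. The function name `quadtree_leaves`, the `should_split` callback, and the minimum size of 4 are assumptions for this example; in an encoder the split decision would typically come from a rate/distortion search, and in a decoder from signaled flags.

```python
def quadtree_leaves(x, y, size, should_split, min_size=4):
    """Return leaf blocks (x, y, size) of a quad-tree rooted at a square block."""
    if size > min_size and should_split(x, y, size):
        half = size // 2
        leaves = []
        for dy in (0, half):          # top row of children first, then bottom row
            for dx in (0, half):
                leaves.extend(
                    quadtree_leaves(x + dx, y + dy, half, should_split, min_size))
        return leaves
    return [(x, y, size)]


# Hypothetical decision: split the 32x32 root, then only its top-left 16x16 child.
def split_rule(x, y, size):
    return size == 32 or (size == 16 and x == 0 and y == 0)


leaves = quadtree_leaves(0, 0, 32, split_rule)
```

Each returned leaf corresponds to one transform block, mirroring the statement that TUs correspond to leaf nodes of the RQT.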

Transform processing unit 204 may generate transform coefficient blocks for each TU of a CU by applying one or more transforms to the transform blocks of the TU. Transform processing unit 204 may apply various transforms to a transform block associated with a TU. For example, transform processing unit 204 may apply a discrete cosine transform (DCT), a directional transform, or a conceptually similar transform to a transform block. In some examples, transform processing unit 204 does not apply transforms to a transform block. In such examples, the transform block may be treated as a transform coefficient block.
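As a non-limiting illustration of applying a DCT, the following sketch computes a one-dimensional, floating-point, orthonormal DCT-II and its inverse. This is an assumption made for clarity: practical codecs such as HEVC use two-dimensional integer approximations, and the names `dct_1d` and `idct_1d` are hypothetical.

```python
import math


def dct_1d(samples):
    """Orthonormal DCT-II of a 1-D list of samples."""
    n = len(samples)
    coeffs = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        total = sum(
            x * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i, x in enumerate(samples)
        )
        coeffs.append(scale * total)
    return coeffs


def idct_1d(coeffs):
    """Inverse of dct_1d (orthonormal DCT-III)."""
    n = len(coeffs)
    samples = []
    for i in range(n):
        total = 0.0
        for k, c in enumerate(coeffs):
            scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
            total += scale * c * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
        samples.append(total)
    return samples


# A flat row of residual samples concentrates all energy in the DC coefficient.
flat_row = [5.0, 5.0, 5.0, 5.0]
flat_coeffs = dct_1d(flat_row)
restored = idct_1d(flat_coeffs)
```

For smooth inputs most coefficients are near zero, so fewer nonzero values need to be quantized and entropy coded.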

Quantization unit 206 may quantize the transform coefficients in a coefficient block. Quantization unit 206 may quantize a coefficient block associated with a TU of a CU based on a quantization parameter (QP) value associated with the CU. Video encoder 20 may adjust the degree of quantization applied to the coefficient blocks associated with a CU by adjusting the QP value associated with the CU. Quantization may introduce loss of information. Thus, quantized transform coefficients may have lower precision than the original ones.
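As a non-limiting illustration of how a QP value controls the degree of quantization, the following sketch uses uniform scalar quantization with a step size that doubles every 6 QP units. The relation `2 ** ((qp - 4) / 6)` is the commonly cited approximation for H.264/HEVC and is an assumption here; it is not taken from this disclosure, and the function names are hypothetical.

```python
def q_step(qp):
    """Approximate quantization step size for a given QP (assumed relation)."""
    return 2.0 ** ((qp - 4) / 6.0)


def quantize(coeffs, qp):
    """Map transform coefficients to integer levels."""
    step = q_step(qp)
    return [int(round(c / step)) for c in coeffs]


def dequantize(levels, qp):
    """Map integer levels back to approximate coefficient values."""
    step = q_step(qp)
    return [level * step for level in levels]


coeffs = [100.0, -37.0, 12.0, 3.0]
levels_fine = quantize(coeffs, qp=4)      # step size 1.0
levels_coarse = quantize(coeffs, qp=28)   # step size 16.0
recon_coarse = dequantize(levels_coarse, qp=28)
```

With the coarse step, the coefficient 3.0 becomes level 0 and 100.0 is recovered as 96.0, which shows the loss of information that a higher QP introduces.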

Inverse quantization unit 208 and inverse transform processing unit 210 may apply inverse quantization and inverse transforms, respectively, to a coefficient block to reconstruct a residual block from the coefficient block. Reconstruction unit 212 may add the reconstructed residual block to corresponding samples from one or more prediction blocks generated by prediction processing unit 200 to produce a reconstructed transform block associated with a TU. By reconstructing the transform blocks for each TU of a CU in this way, video encoder 20 may reconstruct the coding blocks of the CU.

Filter unit 214 may perform one or more deblocking operations to reduce blocking artifacts in the coding blocks associated with a CU. Decoded picture buffer 216 may store the reconstructed coding blocks after filter unit 214 performs the one or more deblocking operations on the reconstructed coding blocks. Inter-prediction processing unit 220 may use a reference picture that contains the reconstructed coding blocks to perform inter prediction on PUs of other pictures. In addition, intra-prediction processing unit 226 may use the reconstructed coding blocks in decoded picture buffer 216 to perform intra prediction on other PUs in the same picture as the CU.

Entropy encoding unit 218 may receive data from other functional components of video encoder 20. For example, entropy encoding unit 218 may receive coefficient blocks from quantization unit 206 and may receive syntax elements from prediction processing unit 200. Entropy encoding unit 218 may perform one or more entropy encoding operations on the data to generate entropy-encoded data. For example, entropy encoding unit 218 may perform a CABAC operation or another type of entropy encoding operation on the data. Video encoder 20 may output a bitstream that includes the entropy-encoded data generated by entropy encoding unit 218. For instance, the bitstream may include data that represents values of transform coefficients for a CU.

FIG. 3 is a block diagram illustrating an example video decoder 30 configured to implement the techniques of this disclosure. FIG. 3 is provided for purposes of explanation and is not limiting on the techniques as broadly exemplified and described in this disclosure. For purposes of explanation, this disclosure describes video decoder 30 in the context of HEVC coding. However, the techniques of this disclosure may be applicable to other coding standards or methods.

Processing circuitry includes video decoder 30, and video decoder 30 is configured to perform one or more of the example techniques described in this disclosure. For instance, video decoder 30 includes integrated circuitry, and the various units illustrated in FIG. 3 may be formed as hardware circuit blocks that are interconnected with a circuit bus. These hardware circuit blocks may be separate circuit blocks, or two or more of the units may be combined into a common hardware circuit block. The hardware circuit blocks may be formed as combinations of electrical components that form operation blocks such as arithmetic logic units (ALUs) and elementary function units (EFUs), as well as logic blocks such as AND, OR, NAND, NOR, XOR, XNOR, and other similar logic blocks.

In some examples, one or more of the units illustrated in FIG. 3 may be software units executing on the processing circuitry. In such examples, the object code for these software units is stored in memory. An operating system may cause video decoder 30 to retrieve the object code and execute the object code, which causes video decoder 30 to perform operations to implement the example techniques. In some examples, the software units may be firmware that video decoder 30 executes at startup. Accordingly, video decoder 30 is a structural component having hardware that performs the example techniques, or has software/firmware executing on the hardware to specialize the hardware to perform the example techniques.

In the example of FIG. 3, video decoder 30 includes an entropy decoding unit 300, a video data memory 301, a prediction processing unit 302, an inverse quantization unit 304, an inverse transform processing unit 306, a reconstruction unit 308, a filter unit 310, and a decoded picture buffer 312. Prediction processing unit 302 includes a motion compensation unit 314 and an intra-prediction processing unit 316. In other examples, video decoder 30 may include more, fewer, or different functional components.

Video data memory 301 may store encoded video data, such as an encoded video bitstream, to be decoded by the components of video decoder 30. The video data stored in video data memory 301 may be obtained, for example, from computer-readable medium 16, e.g., from a local video source such as a camera, via wired or wireless network communication of video data, or by accessing physical data storage media. Video data memory 301 may form a coded picture buffer (CPB) that stores encoded video data from an encoded video bitstream. Decoded picture buffer 312 may be a reference picture memory that stores reference video data for use in decoding video data by video decoder 30, e.g., in intra- or inter-coding modes, or for output. Video data memory 301 and decoded picture buffer 312 may be formed by any of a variety of memory devices, such as dynamic random access memory (DRAM), including synchronous DRAM (SDRAM), magnetoresistive RAM (MRAM), resistive RAM (RRAM), or other types of memory devices. Video data memory 301 and decoded picture buffer 312 may be provided by the same memory device or by separate memory devices. In various examples, video data memory 301 may be on-chip with other components of video decoder 30, or off-chip relative to those components. Video data memory 301 may be the same as, or part of, storage medium 28 of FIG. 1.

Video data memory 301 receives and stores encoded video data (e.g., NAL units) of a bitstream. Entropy decoding unit 300 may receive the encoded video data (e.g., NAL units) from video data memory 301 and may parse the NAL units to obtain syntax elements. Entropy decoding unit 300 may entropy decode the entropy-encoded syntax elements in the NAL units. Prediction processing unit 302, inverse quantization unit 304, inverse transform processing unit 306, reconstruction unit 308, and filter unit 310 may generate decoded video data based on the syntax elements extracted from the bitstream. Entropy decoding unit 300 may perform a process generally reciprocal to that of the entropy encoding units.

In addition to obtaining syntax elements from the bitstream, video decoder 30 may perform a reconstruction operation on a non-partitioned CU. To perform the reconstruction operation on a CU, video decoder 30 may perform a reconstruction operation on each TU of the CU. By performing the reconstruction operation for each TU of the CU, video decoder 30 may reconstruct the residual blocks of the CU.

As part of performing a reconstruction operation on a TU of a CU, inverse quantization unit 304 may inverse quantize, i.e., de-quantize, the coefficient blocks associated with the TU. After inverse quantization unit 304 inverse quantizes a coefficient block, inverse transform processing unit 306 may apply one or more inverse transforms to the coefficient block to generate a residual block associated with the TU. For example, inverse transform processing unit 306 may apply an inverse DCT, an inverse integer transform, an inverse Karhunen-Loeve transform (KLT), an inverse rotational transform, an inverse directional transform, or another inverse transform to the coefficient block.

Inverse quantization unit 304 may perform particular techniques of this disclosure. For example, for at least one respective quantization group of a plurality of quantization groups within a CTB of a CTU of a picture of the video data, inverse quantization unit 304 may derive, based at least in part on local quantization information signaled in the bitstream, a respective quantization parameter for the respective quantization group. Additionally, in this example, inverse quantization unit 304 may inverse quantize, based on the respective quantization parameter for the respective quantization group, at least one transform coefficient of a transform block of a TU of a CU of the CTU. In this example, the respective quantization group is defined as a group of CUs or coding blocks that are successive in coding order, such that the boundaries of the respective quantization group must be boundaries of the CUs or coding blocks and the size of the respective quantization group is greater than or equal to a threshold. Video decoder 30 (e.g., inverse transform processing unit 306, reconstruction unit 308, and filter unit 310) may reconstruct, based on the inverse quantized transform coefficients of the transform block, a coding block of the CU.

If a PU is encoded using intra prediction, intra-prediction processing unit 316 may perform intra prediction to generate the predictive blocks of the PU. Intra-prediction processing unit 316 may use an intra prediction mode to generate the predictive blocks of the PU based on samples of spatially neighboring blocks. Intra-prediction processing unit 316 may determine the intra prediction mode for the PU based on one or more syntax elements obtained from the bitstream.

If a PU is encoded using inter prediction, entropy decoding unit 300 may determine motion information for the PU. Motion compensation unit 314 may determine one or more reference blocks based on the motion information of the PU. Motion compensation unit 314 may generate predictive blocks (e.g., predictive luma, Cb, and Cr blocks) for the PU based on the one or more reference blocks.

Reconstruction unit 308 may use the transform blocks (e.g., luma, Cb, and Cr transform blocks) for the TUs of a CU and the predictive blocks (e.g., luma, Cb, and Cr blocks) of the PUs of the CU, i.e., either intra-prediction data or inter-prediction data, as applicable, to reconstruct the coding blocks (e.g., luma, Cb, and Cr coding blocks) for the CU. For example, reconstruction unit 308 may add samples of the transform blocks (e.g., luma, Cb, and Cr transform blocks) to corresponding samples of the predictive blocks (e.g., luma, Cb, and Cr predictive blocks) to reconstruct the coding blocks (e.g., luma, Cb, and Cr coding blocks) of the CU.
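As a non-limiting illustration of adding residual samples to corresponding predictive samples, the following sketch reconstructs a block. Clipping the result to the valid sample range for an assumed 8-bit depth is an addition made here for illustration and is not stated in the paragraph above; the function name `reconstruct_block` is hypothetical.

```python
def reconstruct_block(predictive_block, residual_block, bit_depth=8):
    """Add residual samples to predictive samples and clip to the sample range."""
    max_value = (1 << bit_depth) - 1
    return [
        [min(max(pred + res, 0), max_value) for pred, res in zip(p_row, r_row)]
        for p_row, r_row in zip(predictive_block, residual_block)
    ]


# Hypothetical 2x2 blocks (illustrative values only).
predicted = [[118, 131], [125, 250]]
decoded_residual = [[2, -1], [0, 10]]
reconstructed = reconstruct_block(predicted, decoded_residual)
```

The last sample shows the clipping assumption at work: 250 + 10 exceeds the 8-bit maximum and is limited to 255.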

Filter unit 310 may perform a deblocking operation to reduce blocking artifacts associated with the coding blocks of a CU. Video decoder 30 may store the coding blocks of the CU in decoded picture buffer 312. Decoded picture buffer 312 may provide reference pictures for subsequent motion compensation, intra prediction, and presentation on a display device, such as display device 32 of FIG. 1. For instance, video decoder 30 may perform, based on the blocks in decoded picture buffer 312, intra prediction or inter prediction operations for PUs of other CUs.

Arithmetic coding is a fundamental tool used in data compression. See, e.g., I. H. Witten, R. M. Neal, and J. G. Cleary, “Arithmetic coding for data compression,” Commun. ACM, vol. 30, no. 6, pp. 520-540, June 1987 (hereinafter, “Reference 1”); A. Said, “Arithmetic Coding,” in Lossless Compression Handbook, K. Sayood, Ed., Academic Press, chapter 5, pp. 101-152, 2003 (hereinafter, “Reference 2”); and A. Said, “Introduction to arithmetic coding - theory and practice,” Hewlett Packard Laboratories, Palo Alto, CA, USA, Technical Report HPL-2004-76, Apr. 2004 (http://www.hpl.hp.com/techreports/2004/HPL-2004-76.pdf) (hereinafter, “Reference 3”).

Arithmetic coding was optional in the AVC/H.264 video compression standard. See D. Marpe, H. Schwarz, and T. Wiegand, “Context-based adaptive binary arithmetic coding in the H.264/AVC video compression standard,” IEEE Trans. Circuits Syst. Video Technol., vol. 13, no. 7, pp. 620-636, Jul. 2003 (hereinafter, “Reference 4”); and I. E. Richardson, The H.264 Advanced Video Compression Standard, 2nd ed., John Wiley and Sons Ltd., 2010 (hereinafter, “Reference 5”).

Arithmetic coding became the only entropy coding technique of the video coding standards HEVC/H.265 and VP9. See V. Sze and M. Budagavi, “High throughput CABAC entropy coding in HEVC,” IEEE Trans. Circuits Syst. Video Technol., vol. 22, no. 12, pp. 1778-1791, Dec. 2012 (hereinafter, “Reference 6”); V. Sze and D. Marpe, “Entropy coding in HEVC,” in High Efficiency Video Coding (HEVC): Algorithms and Architectures, V. Sze, M. Budagavi, and G. J. Sullivan, Eds., chapter 8, pp. 209-274, Springer, 2014 (hereinafter, “Reference 7”); M. Wien, High Efficiency Video Coding: Coding Tools and Specification, Springer-Verlag, 2015 (hereinafter, “Reference 8”); and D. Mukherjee, J. Bankoski, R. S. Bultje, A. Grange, J. Han, J. Koleszar, P. Wilkins, and Y. Xu, “The latest open-source video codec VP9 - an overview and preliminary results,” in Proc. 30th Picture Coding Symp., San Jose, CA, Dec. 2013 (hereinafter, “Reference 9”).

Because of its superior compression efficiency, arithmetic coding is expected to remain the only entropy coding technique used in future video coding standards. However, one of the main problems of using entropy coding in practical applications is that the most effective methods are designed to be optimal for stationary data sources, whereas real data from complex signals (such as video) are far from stationary. Current solutions use data classification and adaptive coding methods to address this problem, and the techniques of this disclosure may increase the efficiency of the adaptation techniques.

The techniques of this disclosure may improve compression efficiency by exploiting the fact that, even when the data is finely divided into many classes (coding contexts), there is still much variability in the statistics of the data assigned to each class. Thus, instead of using a single “universal” adaptation technique for all classes, this disclosure proposes changing the adaptation parameters according to each class and, within each class, further changing the adaptation parameters according to expected or observed probability values, or measured variations in the estimates.

In addition, this disclosure describes example techniques for improving binary probability estimation, which is adapted for binary arithmetic coding. The example techniques may enable better compression by using discrete-time signal analysis to improve methods of estimating probability with recursive equations, and they include special features that enable low complexity and stability when low-precision implementations are used.

As described in References 1, 2, and 3, modern video coding standards adopt the strategy of decomposing entropy coding into modeling and actual coding. Accordingly, the binary arithmetic coding process used in modern video compression standards is divided into three main stages. Video encoder 20 may perform the operations of these stages, and video decoder 30 may perform the inverse operations of these stages.

(a) Binarization: Each data element (or syntax element) to be coded is first decomposed into a sequence of binary data symbols (bins). Because binary symbol probabilities depend on the data element and on the binary symbol's position in its decomposition, a bin context (or simply context) is assigned to each type of binary symbol, uniquely identifying the probability estimate to be used for its entropy coding.

(b) Adaptive probability estimation: Because all bins assigned to a given context are assumed to have similar, but not exactly equal, probabilities, the encoder and decoder update their probability estimates based on the bin values that have been previously encoded or decoded.

(c) Arithmetic coding: The value of each binary symbol (0 or 1) is entropy coded using the estimated probability of its value, which is defined by the bin's corresponding context.
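As a non-limiting illustration, the three stages above can be sketched as follows. Several simplifying assumptions are made that are not part of this disclosure: binarization uses a truncated unary code, each bin position has its own context, the adaptive estimator is a simple smoothed counter, and the arithmetic coder is replaced by the ideal code length of -log2(p) bits per bin. All names are hypothetical.

```python
import math


def binarize_unary(value, max_value):
    """Stage (a): truncated unary code, `value` ones then a terminating zero."""
    bins = [1] * value
    if value < max_value:
        bins.append(0)
    return bins


class CountingContext:
    """Stage (b): toy adaptive estimator of P(bin == 1) from smoothed counts."""

    def __init__(self):
        self.ones = 1
        self.total = 2

    def prob_one(self):
        return self.ones / self.total

    def update(self, bin_value):
        self.ones += bin_value
        self.total += 1


def ideal_code_length(values, max_value):
    """Stage (c) stand-in: sum of -log2(p) over all bins, one context per position."""
    contexts = [CountingContext() for _ in range(max_value)]
    bits = 0.0
    for value in values:
        for position, bin_value in enumerate(binarize_unary(value, max_value)):
            ctx = contexts[position]
            p_one = ctx.prob_one()
            p = p_one if bin_value == 1 else 1.0 - p_one
            bits += -math.log2(p)
            ctx.update(bin_value)  # uses only already-coded bins, so a decoder can mirror it
    return bits


bins_for_3 = binarize_unary(3, max_value=5)
bits_skewed = ideal_code_length([0] * 50, max_value=5)
```

In this sketch, fifty identical symbols cost about log2(51) bits in total rather than 50 bits, because the estimate for the first bin position adapts toward the observed value.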

The techniques of this disclosure may obtain better compression by improving the adaptive probability estimation.

Example techniques used in practice for binary probability estimation are presented in the following references:

F. T. Leighton and R. L. Rivest, “Estimating a probability using finite memory,” IEEE Trans. Inf. Theory, vol. 32, no. 6, pp. 733-742, Nov. 1986 (hereinafter, “Reference 10”).

W. B. Pennebaker and J. L. Mitchell, “Probability estimation for the Q-Coder,” IBM J. Res. Develop., vol. 32, no. 6, pp. 737-752, Nov. 1988 (hereinafter, “Reference 11”).

P. G. Howard and J. S. Vitter, “Practical implementations of arithmetic coding,” in Image and Text Compression, J. A. Storer, Ed., chapter 4, pp. 85-112, Kluwer Academic Publishers, Norwell, MA, 1992 (hereinafter, “Reference 12”).

E. Meron and M. Feder, “Finite-memory universal prediction of individual sequences,” IEEE Trans. Inf. Theory, vol. 50, no. 7, pp. 1506-1523, Nov. 2004 (hereinafter, “Reference 13”).

E. Belyaev, M. Gilmutdinov, and A. Turlikov, “Binary arithmetic coding system with adaptive probability estimation by ‘virtual sliding window’,” in Proc. IEEE Int. Symp. Consumer Electronics, St. Petersburg, Russia, June 2006 (hereinafter, “Reference 14”).

A. Alshin, E. Alshina, and J.-H. Park, “High precision probability estimation for CABAC,” in Proc. IEEE Visual Commun. Image Process. Conf., Kuching, Malaysia, Nov. 2013 (hereinafter, “Reference 15”).

A. V. Oppenheim and R. W. Schafer, Discrete-Time Signal Processing, Prentice-Hall, Inc., Upper Saddle River, NJ, 3rd ed., Aug. 2009 (hereinafter, “Reference 16”).

S. K. Mitra, Digital Signal Processing: A Computer-based Approach, McGraw-Hill Publishing Co., New York, NY, 4th ed., 2010 (hereinafter, “Reference 17”).

To meet the practical requirement of very low computational complexity, probability estimation is commonly done using some type of finite state machine (FSM). For ease of explanation of the techniques of this disclosure, this disclosure does not cover the details of the implementation of the FSMs of References 10-14, but it is useful to define the proper terminology, and some examples are presented at the end of this section.

도 4 는 예시적인 일반적 FSM (400) 의 블록도이다. 보다 구체적으로, 도 4 는 일반적인 유한 상태 머신의 상태 천이들, 입력들, 및 출력들의 시퀀스의 그래픽적 표현이다. 도 4 에서, 시퀀스들 은, 정수 엘리먼트들을 갖는 벡터들 (벡터 차원들 및 허용되는 엘리먼트 값들의 그것들의 세트들은 잘 정의되어야 하지만, 본 논의를 위해 중요하지 않다) 인 FSM (400) 의 상태들, 입력들, 및 출력들을 각각 나타낸다.Figure 4 is a block diagram of an exemplary generic FSM 400. More specifically, Figure 4 is a graphical representation of the sequence of state transitions, inputs, and outputs of a typical finite state machine. In Figure 4, sequences represents the states, inputs, and outputs of FSM 400, which are vectors with integer elements (the vector dimensions and their sets of allowed element values should be well-defined, but are not important for the present discussion). Each is indicated.

Following the definitions above, the arrows in the diagram of FIG. 4 represent the state updating equation and the output equation, which are

$$s_n = T(i_n, s_{n-1}), \qquad o_n = P(i_n, s_{n-1}), \tag{1}$$

where T is the state updating function and P is the output function.
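The general FSM described above can be sketched in a few lines. This is a minimal illustration, assuming the indexing convention in which the output for step n is computed from the current input and the previous state, and the state is then advanced; the names `run_fsm`, `toy_update`, and `toy_output` are hypothetical and do not come from the disclosure.

```python
def run_fsm(update, output, initial_state, inputs):
    """Run a generic FSM over a sequence of inputs.

    Assumed convention: o_n = P(i_n, s_{n-1}) and s_n = T(i_n, s_{n-1}).
    """
    state = initial_state
    outputs = []
    for i_n in inputs:
        outputs.append(output(i_n, state))  # output uses the previous state
        state = update(i_n, state)          # then the state is advanced
    return state, outputs


# Toy instance (hypothetical): a counter of ones that saturates at 3.
def toy_update(i_n, s):
    return min(s + i_n, 3)


def toy_output(i_n, s):
    return s  # this toy output ignores the current input


final_state, outs = run_fsm(toy_update, toy_output, 0, [1, 1, 0, 1, 1])
```

A probability estimation FSM is one instance of this pattern in which the inputs are bin values and the outputs are probability estimates.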

In probability estimation FSMs, the inputs are bin values and the outputs are bin probability estimates. The use of FSMs during entropy encoding and decoding is shown in FIG. 5, FIG. 6, FIG. 7, and FIG. 8, where, by convention, a binary probability estimation FSM is called a coding context.

As shown in the examples of FIG. 5 and FIG. 6, in practical video coding the entropy coding stage may be implemented using a large number of coding contexts. The coding contexts are selected during encoding and decoding depending on the type (or class) of the bin being encoded or decoded.

FIG. 7 is an example block diagram for context-based binary arithmetic encoding, considering a single selected context. In FIG. 7, arithmetic encoding unit 700 includes arithmetic encoder 702, bit buffer 704, state determination unit 706, and probability estimation FSM unit 708. Arithmetic encoding unit 700 may, in some examples, receive a bin stream from binarization unit 712. Arithmetic encoding unit 700 and binarization unit 712 may form part of entropy encoding unit 218 of FIG. 2. Binarization unit 712 encodes each data element (e.g., syntax element) into a sequence of binary data symbols (bins). The sequence of binary data symbols may be referred to as a "bin stream." Additionally, arithmetic encoding unit 700 may receive a context reinitialization signal. For instance, arithmetic encoding unit 700 may receive a context reinitialization signal when it starts encoding a different type of binary symbol.

Furthermore, in FIG. 7, in response to receiving a context reinitialization signal, state determination unit 706 may reinitialize the state of the probability estimation FSM. In general, reinitialization refers to resetting the probability estimates to the initial probability estimates associated with a coding context. For example, based on the type of binary symbol to be encoded, state determination unit 706 may look up initial probability estimates in a predefined table. The predefined table may be defined by a video coding standard, such as HEVC. State determination unit 706 may provide the determined initial probability estimates to bin probability estimation FSM unit 708. For the first bin of the bin stream, bin probability estimation FSM unit 708 provides the initial probability estimates to arithmetic encoder 702. Additionally, bin probability estimation FSM unit 708 updates the probability estimates based on the actual value of the first bin of the bin stream. For each subsequent bin, until state determination unit 706 resets the probability estimates, bin probability estimation FSM unit 708 updates the probability estimates according to a state updating function, e.g., as shown in equation (1).

For each bin of the bin stream, arithmetic encoder 702 may use the probability estimates provided by bin probability estimation FSM unit 708 to encode the bin, as described elsewhere in this disclosure with respect to CABAC. Bit buffer 704 may store bins encoded by arithmetic encoder 702. In FIG. 7, delay box 710 signifies that the bin probabilities generated by bin probability estimation FSM unit 708 are based on the bins that precede the bin currently being encoded by arithmetic encoder 702.
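The role of the delay can be illustrated without a full arithmetic coder. The sketch below is only an illustration under stated assumptions: it replaces the arithmetic encoder with the ideal code length, −log2 of the probability assigned to each bin, and it assumes a simple shift-based estimator with 15-bit scaled probabilities. The function name, the shift value, and the clamping of the estimate are all hypothetical choices made for this sketch.

```python
import math

PROB_BITS = 15            # assumed precision of scaled probabilities
ONE = 1 << PROB_BITS


def ideal_code_length(bins, init_p, shift=4):
    """Sum of -log2(probability assigned to each bin).

    The estimate used for a bin depends only on earlier bins (the "delay"):
    the cost is taken first, and only then is the state updated.
    """
    p = init_p                              # state set at (re)initialization
    total_bits = 0.0
    for b in bins:
        prob_of_one = p / ONE
        prob_of_bin = prob_of_one if b == 1 else 1.0 - prob_of_one
        total_bits += -math.log2(prob_of_bin)
        p += ((b << PROB_BITS) - p) >> shift
        p = min(max(p, 1), ONE - 1)         # sketch-only clamp, avoids log2(0)
    return total_bits, p


bits, p_after = ideal_code_length([1], ONE // 2)
skewed_bits, _ = ideal_code_length([1] * 20, ONE // 2)
```

Because a decoder can apply the same update after it recovers each bin, it can reproduce the same sequence of estimates.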

FIG. 8 is an example block diagram for context-based binary arithmetic decoding, considering a single selected context. In FIG. 8, arithmetic decoding unit 800 includes arithmetic decoder 802, bit buffer 804, state determination unit 806, and probability estimation FSM unit 808. Arithmetic decoding unit 800 generates a bin stream that, in some examples, may be received by inverse binarization unit 812. Arithmetic decoding unit 800 and inverse binarization unit 812 may form part of entropy decoding unit 300 of FIG. 3. Inverse binarization unit 812 converts the bin stream into a series of one or more syntax elements.

In FIG. 8, arithmetic decoding unit 800 receives a byte stream, which may be parsed from a bitstream received by video decoder 30. Additionally, arithmetic decoding unit 800 may receive a context reinitialization signal. For instance, arithmetic decoding unit 800 may receive a context reinitialization signal when it starts decoding a different type of binary symbol. Furthermore, in FIG. 8, in response to receiving a context reinitialization signal, state determination unit 806 may reinitialize the state of the probability estimation FSM. For example, based on the type of binary symbol to be decoded, state determination unit 806 may look up initial probability estimates in a predefined table. The predefined table may be defined by a video coding standard, such as HEVC. The predefined table may be the same as the table used by state determination unit 706 (FIG. 7). State determination unit 806 may provide the determined initial probability estimates to bin probability estimation FSM unit 808. For the first bin of the bin stream, bin probability estimation FSM unit 808 provides the initial probability estimates to arithmetic decoder 802. Additionally, bin probability estimation FSM unit 808 updates the probability estimates based on the actual value of the first bin of the bin stream. For each subsequent bin, until state determination unit 806 resets the probability estimates, bin probability estimation FSM unit 808 updates the probability estimates according to a state updating function, e.g., as shown in equation (1).

For each bin of the bin stream, arithmetic decoder 802 may use the probability estimates provided by bin probability estimation FSM unit 808 to decode the bin, as described elsewhere in this disclosure with respect to CABAC. Bit buffer 804 may store bins to be decoded by arithmetic decoder 802. In FIG. 8, delay box 810 signifies that the bin probabilities generated by bin probability estimation FSM unit 808 are based on the bins that precede the bin currently being decoded by arithmetic decoder 802.

FIG. 7 and FIG. 8 are simplified diagrams that consider the example case in which a single context is selected. FIG. 7 and FIG. 8 also show one feature that is always present in practical applications, which is the need to periodically resynchronize the encoder and decoder states, using a shared table with data that is converted to encoder states. For example, in the HEVC standard, the contexts are periodically reinitialized with a table defining, for each context, how to convert a compression-quality parameter (known as the quantization step, or quantization parameter (QP) value) (see References 7 and 8) into FSM states.
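As a concrete illustration of such a table-driven conversion, the sketch below follows the general shape of the HEVC context initialization rule, in which a per-context table entry is split into a slope and an offset that are applied to the slice QP. The function and variable names are illustrative, and the standard itself remains the normative reference for the exact process.

```python
def clip3(lo, hi, x):
    """Clamp x to the inclusive range [lo, hi]."""
    return max(lo, min(hi, x))


def init_context_state(init_value, slice_qp):
    """Map a table entry and the slice QP to (state index, most probable symbol)."""
    slope_idx = init_value >> 4        # upper 4 bits of the table entry
    offset_idx = init_value & 15       # lower 4 bits of the table entry
    m = slope_idx * 5 - 45             # QP-dependent slope
    n = (offset_idx << 3) - 16         # QP-independent offset
    pre_state = clip3(1, 126, ((m * clip3(0, 51, slice_qp)) >> 4) + n)
    val_mps = 0 if pre_state <= 63 else 1
    p_state_idx = pre_state - 64 if val_mps else 63 - pre_state
    return p_state_idx, val_mps


flat_entry = init_context_state(154, 30)   # slope 0: result does not depend on QP
qp_entry = init_context_state(139, 32)     # negative slope: result depends on QP
```

Since the encoder and decoder share the table and the QP, both arrive at the same starting state for every context.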

In the HEVC standard, the FSM functions are implemented using only table look-up methods. See References 7 and 8. In a recent draft of the Joint Exploration Model (JEM) produced by the ITU-T/MPEG JVET, the FSM has been implemented using two forms of discrete-time infinite impulse response (IIR) filters. The first is of the form (see References 13 and 14):

$$p[k+1] = p[k] + \frac{2^{15}\, b[k] - p[k]}{2^{a}},$$

where $p[k]$ is an integer sequence of scaled probability estimates (FSM states and outputs), $b[k]$ is the binary sequence of bin values (FSM inputs), and $a$ is a positive integer, which enables the multiplications and divisions to be implemented with bit shifts.
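A minimal sketch of this single-estimate update is shown below. It assumes 15-bit scaled probabilities and the update p[k+1] = p[k] + ((2^15 · b[k] − p[k]) >> a); the constant and function names are hypothetical. Note that Python's right shift on a negative integer rounds toward negative infinity, which may differ from the rounding used in a particular codec implementation.

```python
PROB_BITS = 15  # assumed precision: probabilities are scaled by 2**15


def update_single_track(p, bin_value, a):
    """One step of the shift-based IIR probability update.

    p is the scaled estimate of P(bin == 1); a is the shift (window) parameter.
    """
    target = bin_value << PROB_BITS     # 2**15 if the bin is 1, else 0
    return p + ((target - p) >> a)      # division by 2**a as a bit shift


p0 = 1 << (PROB_BITS - 1)               # start at probability 0.5
p1 = update_single_track(p0, 1, 4)      # a 1 pulls the estimate upward
p2 = update_single_track(p1, 0, 4)      # a 0 pulls it back down
```

A smaller `a` makes the estimate react faster to recent bins, while a larger `a` averages over a longer history.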

The second form uses the method proposed by A. Alshin, E. Alshina, and J.-H. Park, "High precision probability estimation for CABAC," in Proc. IEEE Visual Commun. Image Process. Conf., Kuching, Malaysia, Nov. 2013 (hereinafter, "Reference 15"), and is defined by the following equations:

$$q_1[k+1] = q_1[k] + \frac{2^{15}\, b[k] - q_1[k]}{2^{4}}, \qquad q_2[k+1] = q_2[k] + \frac{2^{15}\, b[k] - q_2[k]}{2^{7}}, \qquad p[k+1] = \frac{q_1[k+1] + q_2[k+1]}{2}. \tag{2}$$

In this case, the probability estimation FSM inputs and outputs are still the sequences $b[k]$ and $p[k]$, respectively, but the state is defined by the pairs $(q_1[k], q_2[k])$. The estimation here may also be called "two-track estimation," since both $q_1$ and $q_2$ are probability estimates that are updated and stored.
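The two-track idea can be sketched as follows. This is an illustration under assumptions, not a codec implementation: it assumes 15-bit scaled probabilities, a fast track with shift 4, a slow track with shift 7, and an output that is the average of the two tracks; all names are hypothetical.

```python
PROB_BITS = 15
FAST_SHIFT = 4   # assumed shift of the quickly adapting track
SLOW_SHIFT = 7   # assumed shift of the slowly adapting track


def two_track_step(state, bin_value):
    """Advance the state pair (q1, q2) by one bin and return the new estimate."""
    q1, q2 = state
    target = bin_value << PROB_BITS
    q1 += (target - q1) >> FAST_SHIFT
    q2 += (target - q2) >> SLOW_SHIFT
    p = (q1 + q2) >> 1                  # output: average of the two tracks
    return (q1, q2), p


half = 1 << (PROB_BITS - 1)
state, p = two_track_step((half, half), 1)
```

After a single bin equal to 1, the fast track moves much further than the slow track, and the averaged output lies between them.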

Since entropy coding is designed to be optimal for stationary data sources, its practical effectiveness has depended on classifying data elements so that the statistics in each class are approximately stationary, which makes it possible to use probability estimation FSMs that are nearly "universal," i.e., that adapt equally well for all data elements. As shown in FIG. 7 and FIG. 8, based on that assumption, the probability estimation FSMs are not changed, and only their states are periodically reinitialized.

This disclosure addresses the fact that there are no truly "universally optimal" probability estimation FSMs for use in CABAC, since not only do the probabilities change for each context, but the speed and magnitude of the changes also differ among contexts. This disclosure describes a solution to this problem that exploits the fact that the variation occurs according to either context or estimated probability, but that the best strategy is to use both for determining the optimal FSM. The techniques of this disclosure also cover the option of choosing FSM parameters by evaluating the evolution of FSM states, for example, by measuring the difference between more than one estimate.

In accordance with one or more techniques of this disclosure, the conventional definition of probability estimation FSMs is changed: the techniques of this disclosure also define an FSM parameter vector $h$, which can be used to change the FSM responses. With this definition, equation (1) can be rewritten as

$$s_n = T(i_n, s_{n-1}, h), \qquad o_n = P(i_n, s_{n-1}, h), \tag{3}$$

where T is now the parameterized state updating function and P is the parameterized output function. In other words, the state updating and output equations may be redefined as shown in equation (3). Such probability estimation FSMs may be referred to herein as "parameterized-context FSMs."

With this definition, two factors that define the coding performance of a parameterized-context FSM can be identified:

FSM states ($s_n$): contain the numerical or logical information used directly to compute the probabilities of the binary symbols, and are the only data changed by reinitialization in previous standards;

FSM parameters ($h$): define state updating and how probability estimates are computed from the states; the invention improves compression by modifying these parameters during coding or during reinitialization.

For example, the probability estimation in equation (2) can be changed to use positive integers $(a, b)$ as parameters in the equations:

$$q_1[k+1] = q_1[k] + \frac{2^{15}\, b[k] - q_1[k]}{2^{a}}, \qquad q_2[k+1] = q_2[k] + \frac{2^{15}\, b[k] - q_2[k]}{2^{b}}, \qquad p[k+1] = \frac{q_1[k+1] + q_2[k+1]}{2}. \tag{4}$$

Because parameters $a$ and $b$ are used to determine the estimated probability for the next bin (i.e., $p[k+1]$), the parameters $a$ and $b$ in equation (4) may be considered to be for the next bin. In equation (4), the parameters $(a, b)$ may change the state transitions, but not the output equation. This distinction is used because, even though it is mathematically possible to define the FSM parameters as part of the FSM state, the distinction represents a practical difference.
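The practical distinction between state and parameters can be sketched as a small class in which the two track estimates form the state and the shifts (a, b) are parameters that may be replaced at any time without touching the state. This is a minimal sketch under assumptions (15-bit scaled probabilities, shift-based updates, averaged output); the class and method names are hypothetical.

```python
from dataclasses import dataclass

PROB_BITS = 15
HALF = 1 << (PROB_BITS - 1)


@dataclass
class TwoTrackContext:
    q1: int = HALF   # state: fast-track estimate
    q2: int = HALF   # state: slow-track estimate
    a: int = 4       # parameter: shift used by the first track
    b: int = 7       # parameter: shift used by the second track

    def probability(self):
        """Output equation: does not depend on (a, b)."""
        return (self.q1 + self.q2) >> 1

    def update(self, bin_value):
        """State updating equation: depends on (a, b)."""
        target = bin_value << PROB_BITS
        self.q1 += (target - self.q1) >> self.a
        self.q2 += (target - self.q2) >> self.b

    def set_parameters(self, a, b):
        """Change how future updates behave; the state is left as it is."""
        self.a, self.b = a, b


ctx = TwoTrackContext()
ctx.update(1)                 # uses the initial parameters (4, 7)
ctx.set_parameters(5, 8)      # parameters for the next bin
ctx.update(1)                 # uses the new parameters (5, 8)
```

Keeping the parameters out of the state means a reinitialization can reset `q1` and `q2` from one table while `a` and `b` are chosen by a separate rule.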

FIG. 9 and FIG. 10 show how the techniques of this disclosure may be integrated into an arithmetic coding process. In the example of FIG. 9, arithmetic encoding unit 900 includes arithmetic encoder 902, bit buffer 904, state determination unit 906, probability estimation FSM unit 908, and FSM parameter determination unit 912. Arithmetic encoding unit 900 may, in some examples, receive a bin stream from binarization unit 913. Arithmetic encoding unit 900 and binarization unit 913 may form part of entropy encoding unit 218 of FIG. 2. Binarization unit 913, arithmetic encoder 902, bit buffer 904, state determination unit 906, and delay box 910 may operate in a manner similar to binarization unit 712, arithmetic encoder 702, bit buffer 704, state determination unit 706, and delay box 710 of FIG. 7.

In the example of FIG. 10, arithmetic decoding unit 1000 includes arithmetic decoder 1002, bit buffer 1004, state determination unit 1006, probability estimation FSM unit 1008, and FSM parameter determination unit 1012. Arithmetic decoding unit 1000 generates a bin stream that, in some examples, may be received by inverse binarization unit 1013. Arithmetic decoding unit 1000 and inverse binarization unit 1013 may form part of entropy decoding unit 300 of FIG. 3. Inverse binarization unit 1013 converts the bin stream into a series of one or more syntax elements. Inverse binarization unit 1013, arithmetic decoder 1002, bit buffer 1004, state determination unit 1006, and delay box 1010 may operate in a manner similar to inverse binarization unit 812, arithmetic decoder 802, bit buffer 804, state determination unit 806, and delay box 810 of FIG. 8.

The main difference from FIG. 7 and FIG. 8 (shown in dashed lines in FIG. 9 and FIG. 10) is the FSM parameter determination units 912, 1012 included in FIG. 9 and FIG. 10. FSM parameter determination units 912, 1012 determine FSM parameters (e.g., a and b in equation (4)). FSM parameter determination units 912, 1012 may determine the FSM parameters during coding, in response to context reinitialization events, or in other situations. Thus, in FIG. 9 and FIG. 10, the parameterized-context FSMs can be modified during coding or reinitialization by FSM parameter determination units 912, 1012, using probability values, quality factors, and other data. The data fed to FSM parameter determination units 912, 1012 may comprise or consist of reinitialization parameters and also current states (e.g., bin probabilities).

Thus, in accordance with one or more techniques of this disclosure, video encoder 20 may receive video data. The video data may comprise one or more pictures. Furthermore, prediction processing unit 200, quantization unit 206, and potentially other components of video encoder 20 may generate syntax elements based on the video data. In this example, entropy encoding unit 218 may determine an offset value by applying binary arithmetic encoding to one of the generated syntax elements. As part of applying the binary arithmetic encoding, entropy encoding unit 218 may generate a bin stream by binarizing one or more syntax elements. Furthermore, for at least one respective bin of the bin stream (e.g., a particular bin of the bin stream, each bin of the bin stream, each bin of the bin stream other than the last bin of the bin stream, etc.), entropy encoding unit 218 may determine, based on a state for the respective bin, an interval for the respective bin, and a value of the respective bin, an interval for a next bin of the bin stream. Additionally, entropy encoding unit 218 may determine one or more FSM parameters for the next bin of the bin stream. Entropy encoding unit 218 may also determine, based on the state for the respective bin, the one or more FSM parameters for the next bin of the bin stream, and the value of the respective bin, a state for the next bin of the bin stream. In this example, the offset value may be equal to a value in the interval for the last bin of the bin stream. Video encoder 20 may output a bitstream comprising the offset value.

Furthermore, in accordance with one or more techniques of this disclosure, entropy decoding unit 300 of video decoder 30 may determine a decoded syntax element by applying binary arithmetic decoding to an offset value included in a bitstream. As part of applying the binary arithmetic decoding, entropy decoding unit 300 may generate a bin stream. As part of generating the bin stream, entropy decoding unit 300 may, for at least one respective bin of the bin stream (e.g., a particular bin of the bin stream, each bin of the bin stream, each bin of the bin stream other than the last bin of the bin stream, etc.), determine a value of the respective bin based on a state for the respective bin, an interval for the respective bin, and the offset value. Additionally, entropy decoding unit 300 may determine one or more FSM parameters for a next bin of the bin stream. The next bin of the bin stream follows the respective bin in the bin stream. Furthermore, entropy decoding unit 300 may determine, based on the state for the respective bin, the one or more FSM parameters for the next bin of the bin stream, and the value of the respective bin, a state for the next bin of the bin stream. Entropy decoding unit 300 may debinarize the bin stream to form the decoded syntax element. Other components of video decoder 30 may reconstruct a picture of the video data based in part on the decoded syntax element. In other words, video decoder 30 may use the decoded syntax element in a process to reconstruct the picture.

In some examples, FSM parameter determination units 912, 1012 use different functions for different contexts. For instance, each context may use a different function in FSM parameter determination units 912, 1012. FSM parameter determination units 912, 1012 use the function for a context to determine the FSM parameters used by bin probability estimation FSM units 908, 1008. For instance, each of FSM parameter determination units 912, 1012 may access a predefined table that maps different contexts to different FSM parameters. In another example, different contexts may be associated with different predefined tables. In this example, each of FSM parameter determination units 912, 1012 may access the predefined table associated with a current context and look up an entry in that table based on one or more additional pieces of information. In this example, the additional pieces of information may include information regarding coding modes of neighboring blocks, information regarding the position of a last non-zero coefficient, and so on. In some examples, the function for a context is a mapping from the context to predetermined values of the FSM parameters. Thus, in some examples, each of FSM parameter determination units 912, 1012 is configured with a table that maps contexts to predetermined values of FSM parameters.

In some examples, even for the same context, the FSM parameters (e.g., (a, b) for the two-track arithmetic coder) may also depend on slice types and/or quantization parameters, and/or coded mode information (e.g., any type of information that is coded as side information to the decoder, for example, in a slice header). For example, each of FSM parameter determination units 912, 1012 may be configured with a table that maps combinations of factors (e.g., context, slice type, quantization parameters, coded mode information, etc.) to FSM parameters.
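Such a table can be sketched as a dictionary keyed by a combination of factors. Everything in the sketch below is hypothetical and chosen only for illustration: the context names, the split of QP values into two bands at a threshold of 30, the default parameters, and the (a, b) values themselves are not taken from the disclosure.

```python
DEFAULT_PARAMS = (4, 7)   # hypothetical fallback (a, b)

# Hypothetical table: (context, slice type, QP band) -> (a, b)
PARAM_TABLE = {
    ("sig_coeff", "I", "low_qp"): (4, 7),
    ("sig_coeff", "B", "low_qp"): (5, 8),
    ("sig_coeff", "B", "high_qp"): (5, 7),
    ("split_flag", "I", "low_qp"): (4, 6),
}


def qp_band(qp, threshold=30):
    """Coarsely quantize the QP so the table stays small (illustrative choice)."""
    return "low_qp" if qp < threshold else "high_qp"


def lookup_fsm_params(context, slice_type, qp):
    """Return (a, b) for a combination of factors, or the default if unlisted."""
    return PARAM_TABLE.get((context, slice_type, qp_band(qp)), DEFAULT_PARAMS)


low = lookup_fsm_params("sig_coeff", "B", 22)
high = lookup_fsm_params("sig_coeff", "B", 37)
fallback = lookup_fsm_params("unknown_ctx", "P", 30)
```

Because every key component is known to both encoder and decoder before the bin is coded, both sides can perform the same lookup without extra signaling.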

FSM parameter determination units 912, 1012 may modify the FSM parameters in various ways. In other words, FSM parameter determination units 912, 1012 may change which FSM parameters are used by bin probability estimation FSM units 908, 1008 in response to various events and based on various types of information. Some possibilities are listed below, and they may be applicable for each context.

In one example, the FSM parameters are modified during reinitialization according to a state reinitialization parameter, such as a quantization step or QP value. For example, a video coder may be configured with a predefined table that maps values of the state reinitialization parameter to FSM parameters. In this example, the video coder may use the table to look up the FSM parameters based on the state reinitialization parameter.

In some examples, the FSM parameters are modified according to estimated probability values, either during reinitialization or at periodic intervals shorter than the reinitialization interval (e.g., for each CTU, after each bin, etc.). For example, a video coder may be configured with a predefined table that maps estimated probability values to FSM parameters. In this example, the video coder may use the table to look up FSM parameters based on the estimated probability values, and may do so during context reinitialization. For instance, the video coder may use the probability estimates specified by a coding context to look up the FSM parameters during context reinitialization. Furthermore, in some examples, bin probability estimation FSM units 908 or 1008 may determine probability estimates for bins that follow the first bin after context reinitialization. In such examples, the video coder may use the probability estimates determined by bin probability estimation FSM units 908 or 1008 to look up the FSM parameters in the predefined table.
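The per-bin variant can be sketched as follows. This is an illustration under assumptions: the thresholds, the (a, b) values, and the idea of indexing by the distance of the estimate from one half are hypothetical choices for the sketch, as are the function names. The two-track update with 15-bit scaled probabilities is also assumed.

```python
PROB_BITS = 15
HALF = 1 << (PROB_BITS - 1)


def params_from_probability(p):
    """Hypothetical rule: pick (a, b) from how skewed the current estimate is."""
    skew = abs(p - HALF)
    if skew < 4096:
        return (4, 7)     # near 0.5: adapt quickly
    if skew < 12288:
        return (5, 7)
    return (5, 8)         # highly skewed: adapt slowly


def run_context(bins, q1=HALF, q2=HALF):
    """Re-select the parameters before every bin, then apply the update."""
    history = []
    for bin_value in bins:
        p = (q1 + q2) >> 1                      # estimate from earlier bins only
        a, b_shift = params_from_probability(p)
        target = bin_value << PROB_BITS
        q1 += (target - q1) >> a
        q2 += (target - q2) >> b_shift
        history.append((a, b_shift))
    return (q1, q2), history


state_after_one, used_first = run_context([1])
_, used_long_run = run_context([1] * 300)
```

The parameter choice depends only on data available before the bin is coded, so a decoder running the same rule selects the same parameters at every step.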

In some examples, the FSM parameters are modified based on a measure of past probability variation. A video coder may compute the measure of past probability variation by summing the absolute differences between probabilities. In different examples, the video coder may estimate the probabilities using different techniques. In one example, using the FSM defined by equation (4), a video coder may use an "estimation variation measure" to determine the FSM parameters. In this example, the estimation variation measure may be defined according to the following equation:

(5)

In the equation above, σ[k+1] is the estimation variation measure for bin k+1, σ[k] is the estimation variation measure for bin k, q₁[k] and q₂[k] are defined in equation (4), and c is a parameter (e.g., c may be a constant). In this example, the video coder may access a predefined table that maps different values of the estimation variation measure σ[k] to different sets of FSM parameters. Thus, in this example, the video coder may use the table to look up FSM parameters based on the estimation variation measure. In some examples, the video coder may access different predefined tables for different coding contexts. In such examples, different entries in the predefined table for a coding context may map different values of σ[k] to values of the FSM parameters. In some examples in which the video coder accesses different predefined tables for different coding contexts, different entries in the table for a coding context may map different combinations of values of σ[k] and additional information to values of the FSM parameters. In such examples, the additional information may include slice types, quantization parameters, and so on.
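The following sketch illustrates how a variation measure might drive a parameter lookup. Equation (5) is not reproduced in this text, so the smoothing rule in `update_variation` is an assumed form (a running average of |q₁[k] − q₂[k]| controlled by c), not the disclosed equation. The table `VARIATION_TABLE`, the adaptation rates, and all function names are likewise hypothetical.

```python
# Hypothetical table: (upper bound on sigma, (a, b) shift pair). Illustrative values.
VARIATION_TABLE = [(0.02, (5, 8)), (0.10, (4, 7)), (1.00, (4, 6))]


def update_variation(sigma: float, q1: float, q2: float, c: float) -> float:
    """Assumed update: move sigma toward |q1 - q2| by a fraction 1/c."""
    return sigma + (abs(q1 - q2) - sigma) / c


def params_from_variation(sigma: float, table=VARIATION_TABLE):
    """Return the parameters of the first entry whose bound is >= sigma."""
    for bound, params in table:
        if sigma <= bound:
            return params
    return table[-1][1]


def run(bins, alpha_fast=1 / 16, alpha_slow=1 / 128, c=32.0) -> float:
    """Track a fast and a slow estimator and return the final variation measure."""
    q1 = q2 = 0.5
    sigma = 0.0
    for b in bins:
        sigma = update_variation(sigma, q1, q2, c)
        q1 += alpha_fast * (b - q1)   # fast estimator
        q2 += alpha_slow * (b - q2)   # slow estimator
    return sigma
```

Under these assumptions, a source whose statistics keep switching leaves the two estimators far apart and yields a large σ, which selects shorter windows, while a steady source yields a small σ and longer windows.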

In some examples, a video coder may use any of the techniques defined above to modify the FSM parameters but, instead of using data generated since the previous reinitialization, the video coder may use data from other parts of the video sequence. For example, FIG. 11A is a block diagram illustrating that FSM parameters may be derived from neighboring blocks (e.g., CTUs, CUs) in the same picture, following the scan order. For instance, in the example of FIG. 11A, FSM parameter determination units 912, 1012 may determine the FSM parameters based at least in part on information from one or more neighboring blocks. The information may include prediction modes, quantization parameters, and so on. In some examples, the information may include data derived from some measure of probability variability. For example, a single bit may be used, with 1 indicating that significant changes in the probability estimates were observed and 0 indicating otherwise.

FIG. 11B is a block diagram illustrating that the FSM parameters used in blocks of a current picture 1110 may be determined based on information associated with blocks in a previously coded picture 1112. The blocks in previously coded picture 1112 may have the same spatial locations as the blocks in current picture 1110. For instance, in the example of FIG. 11B, FSM parameter determination units 912, 1012 may determine the FSM parameters based at least in part on information from one or more blocks in the previously coded picture. The information may include prediction modes, quantization parameters, and so on. In some examples, the information may include data derived from some measure of probability variability. For example, a single bit may be used, with 1 indicating that significant changes in the probability estimates were observed and 0 indicating otherwise.

Binary probability estimation, i.e., the process of estimating the probabilities of a random binary data source, is related to the well-known Bernoulli (or binomial) trials and has therefore been studied for many decades. However, its use for entropy coding is still under development because, in practical applications, the estimation method must take into account two conflicting objectives: (1) compression efficiency improves with higher accuracy of the probability estimates and with the ability to change the estimates quickly while preserving accuracy, which requires higher computational complexity; and (2) because arithmetic coding speed can severely limit the throughput (in Mbits/sec) of a compression and decompression system, it is desirable for arithmetic coding to be performed with low computational complexity in order to increase throughput.

Because of the preference in practical video coding systems for low complexity, most methods of probability estimation for binary arithmetic coding have been based on FSMs, as described in reference 10, since the first practical implementations of arithmetic coding, as described in reference 11. The basic notation for these techniques, which may be used in video coding applications, is defined below.

Assume there is a sequence of N symbols, {b[k]}, k = 1, …, N, from a binary random data source (i.e., b[k] ∈ {0, 1}), with an unknown sequence of true probabilities for symbol 1, i.e., Prob(b[k] = 1). Binary probability estimation is the problem of finding a sequence of estimated probabilities {p[k]}, k = 1, …, N, that best approximates the true probabilities under the causality condition, i.e., the condition that p[n] may depend only on the set of "past" bins {b[1], b[2], …, b[n−1]}.

In coding applications, the variables and equations are modified to use only integer arithmetic. Commonly, probabilities are scaled by a power of two. For instance, if an integer scaling factor is used and capital letters are used to denote the corresponding integer values, the scaled values of the probabilities and bins are written P[k] and B[k], i.e., p[k] and b[k] multiplied by the scaling factor, with P[k] rounded to an integer.

One particular type of probability estimation FSM, which has been adopted, "rediscovered," and renamed several times over the past decades, has an adaptation parameter α, with 0 < α < 1, and the recursive form

$$p[k+1] = \alpha\, b[k] + (1 - \alpha)\, p[k], \qquad k = 1, 2, \ldots, N. \tag{6}$$

In a practical coding application, both video encoder 20 and video decoder 30 start with the same initial probability estimate p[1] (commonly from a shared fixed table). Each bin b[k] is then sequentially and optimally encoded and decoded using the probability estimate p[k], and each probability estimate is updated with equation (6) after the corresponding bin is encoded or decoded. Because this is a recursive equation, each probability value depends on all previously encoded or decoded bins.
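The lockstep behavior of encoder and decoder can be illustrated with a toy arithmetic coder. This is a minimal sketch, not a practical coder: it uses exact rational arithmetic instead of finite-precision intervals with renormalization, and the adaptation rate `ALPHA`, the initial estimate of 1/2, and the function names are assumptions made for illustration.

```python
from fractions import Fraction

ALPHA = Fraction(1, 16)  # assumed adaptation parameter


def update(p: Fraction, b: int) -> Fraction:
    """Recursive estimate of Prob(bin = 1): p <- alpha * b + (1 - alpha) * p."""
    return p + ALPHA * (b - p)


def encode(bins, p_init=Fraction(1, 2)) -> Fraction:
    """Narrow the interval [low, low + width) once per bin and return a point in it."""
    low, width, p = Fraction(0), Fraction(1), p_init
    for b in bins:
        split = width * (1 - p)          # sub-interval for bin value 0 comes first
        if b:
            low, width = low + split, width - split
        else:
            width = split
        p = update(p, b)                 # same update the decoder will perform
    return low + width / 2               # any value inside the final interval works


def decode(value: Fraction, n: int, p_init=Fraction(1, 2)):
    """Recover n bins by repeating the encoder's interval subdivision."""
    low, width, p = Fraction(0), Fraction(1), p_init
    out = []
    for _ in range(n):
        split = width * (1 - p)
        b = 1 if value >= low + split else 0
        if b:
            low, width = low + split, width - split
        else:
            width = split
        out.append(b)
        p = update(p, b)                 # estimate stays in step with the encoder
    return out
```

Because the decoder updates its estimate with each decoded bin exactly as the encoder did, both sides subdivide the interval identically and the bins are recovered without transmitting any probabilities.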

Signal processing theory, as described in references 16 and 17, shows that when equation (6) is used, the estimated probability values depend on the previously coded bins with exponentially decreasing weights. For that reason, this probability estimation technique was called exponential aging in reference 12, which proposed a specific value of the adaptation parameter for practical coding applications. The CABAC arithmetic coding method adopted in the AVC/H.264 and HEVC/H.265 video coding standards also uses this approach with its own value of the adaptation parameter [4, 6, 7]; one of the differences from earlier implementations relates to its use of a finite state machine based on table look-up.

More recently, the same approach has been called exponentially decaying memory in references 13 and 14. Reference 14 called it a "virtual sliding window" technique because the equivalent form

$$p[k+1] = \frac{b[k]}{W} + \left(1 - \frac{1}{W}\right) p[k], \tag{7}$$

with W = 1/α, is related to a randomized algorithm for probability estimation that uses a "sliding window" of W bins. Furthermore, references 13 and 14 showed that it can be implemented efficiently with integer arithmetic when W is a power of two, in the form

$$P[k+1] = P[k] + \big( (B[k] - P[k]) \gg \log_2 W \big), \tag{8}$$

because the high-complexity division can be replaced by efficient integer bit shifts.
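The shift-based update of equation (8) can be sketched in a few lines. The 15-bit probability scale and the function name `update_shift` are assumptions for illustration. Python's `>>` on negative integers is an arithmetic (flooring) shift, which is the behavior assumed here.

```python
PROB_BITS = 15            # assumed scale: 32768 represents probability 1.0
ONE = 1 << PROB_BITS


def update_shift(P: int, bin_val: int, log2_w: int) -> int:
    """Integer form of p <- p + (b - p) / W with W = 2**log2_w."""
    B = ONE if bin_val else 0          # scaled bin value
    return P + ((B - P) >> log2_w)     # division by W replaced by a right shift
```

For example, starting from P = 16384 (probability 1/2) with W = 16, a bin equal to 1 moves the estimate to 17408, which matches 0.5 + (1 − 0.5)/16 = 0.53125 on the same scale. Because the shift discards low-order bits, the estimate stops a few units short of `ONE` under a long run of ones.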

One practical problem with the preceding estimation formulas is that different window values W may be needed depending on the binary data of a given context. If the probabilities of the bin values change slowly, or if the bin value probabilities are very different (i.e., one bin value is much more probable than the other), more compression is obtained with larger values of W, because they average over a larger number of previous bin values. On the other hand, small values of W may be beneficial when the probabilities change quickly and frequently.

One solution to this problem, described in reference 15, is to define a number M of adaptation parameters α₁, …, α_M and weights ω₁, …, ω_M such that

$$\sum_{i=1}^{M} \omega_i = 1, \tag{9}$$

to use several probability estimators of the same recursive form as equation (6),

$$q_i[k+1] = \alpha_i\, b[k] + (1 - \alpha_i)\, q_i[k], \qquad i = 1, \ldots, M, \tag{10}$$

and then to compute the final estimate as the weighted average:

$$p[k] = \sum_{i=1}^{M} \omega_i\, q_i[k].$$

This approach has proven to be more efficient in video coding applications and, for that reason, the current ITU/MPEG experimental video compression program uses the following three equations (using integer arithmetic) for probability estimation:

(11)

In this case, the probability estimation FSM uses Q₁ and Q₂ as state elements, which makes it difficult to use the common technique of estimating only the probability of the least probable symbol (LPS), because Q₁ > 1/2 and Q₂ < 1/2 may occur at the same time, or vice versa.
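A two-estimator FSM of this kind can be sketched as below. Equation (11) is not reproduced in this text, so the details are assumptions: a 15-bit probability scale, shifts of 4 and 7 (the values commonly cited for the two-window estimator in the JEM experimental software), and a rounding average of the two state elements. The function names are hypothetical.

```python
PROB_BITS = 15            # assumed scale: 32768 represents probability 1.0
ONE = 1 << PROB_BITS
HALF = ONE >> 1


def update_pair(Q1: int, Q2: int, bin_val: int, a: int = 4, b: int = 7):
    """Update a fast (shift a) and a slow (shift b) estimator with one bin."""
    B = ONE if bin_val else 0
    Q1 += (B - Q1) >> a
    Q2 += (B - Q2) >> b
    return Q1, Q2


def combined(Q1: int, Q2: int) -> int:
    """Equal-weight average of the two state elements, with rounding."""
    return (Q1 + Q2 + 1) >> 1


def states_after_ones(n: int, start: int = ONE // 4):
    """Feed n bins equal to 1, starting both estimators at probability 1/4."""
    Q1 = Q2 = start
    for _ in range(n):
        Q1, Q2 = update_pair(Q1, Q2, 1)
    return Q1, Q2
```

After ten consecutive ones from a starting probability of 1/4, the fast estimator is already above 1/2 while the slow one is still below it. Neither state element alone identifies the least probable symbol, which is the difficulty noted above.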

Using discrete-time signal processing as described in references 16 and 17, and the definition of the z-transform, the result is:

(12)

By convention, this disclosure uses capital letters for z-transforms, but they can be recognized by the use of parentheses instead of brackets. For example, P[k] denotes the scaled integer version of p[k], whereas P(z) denotes the z-transform of p[k].

Using these definitions, and with α denoting the adaptation parameter of equation (6), equation (6) corresponds to

$$z\,P(z) = \alpha\, B(z) + (1 - \alpha)\, P(z) \tag{13}$$

or

$$P(z) = \frac{\alpha}{z - (1 - \alpha)}\, B(z), \tag{14}$$

which means that the probability estimate is the output of an infinite impulse response (IIR) filter with response

$$H(z) = \frac{\alpha}{z - (1 - \alpha)} = \frac{\alpha\, z^{-1}}{1 - (1 - \alpha)\, z^{-1}} \tag{15}$$

applied to the sequence of bin values, as shown in FIG. 12. For instance, FIG. 12 illustrates one example of the probability estimation defined by equation (6).
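The filter interpretation can be checked numerically: with a zero initial state, the recursion of equation (6) produces the same values as convolving the bins with the exponentially decaying impulse response α(1 − α)ⁿ. The sketch below is illustrative; the function names and the zero initial state are assumptions.

```python
def recursive_estimates(bins, alpha: float, p_init: float = 0.0):
    """Run p <- alpha * b + (1 - alpha) * p and collect the updated estimates."""
    p = p_init
    out = []
    for b in bins:
        p = alpha * b + (1 - alpha) * p
        out.append(p)
    return out


def convolution_estimates(bins, alpha: float):
    """Same estimates as a weighted sum of past bins with weights alpha * (1 - alpha)**n."""
    out = []
    for k in range(len(bins)):
        out.append(sum(alpha * (1 - alpha) ** n * bins[k - n] for n in range(k + 1)))
    return out
```

The weight applied to a bin coded n steps earlier is α(1 − α)ⁿ, which is the exponential decay behind the names "exponential aging" and "exponentially decaying memory".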

The probability estimator of equation (10) corresponds to the parallel filter implementation shown in FIG. 13, which uses two factors and equal weights. That is, in FIG. 13, the probability estimation filter defined by equation (10) includes two estimators with equal weights.

Accordingly, in some examples, video encoder 20 may implement the operations of the probability estimation filters illustrated in FIGS. 12 and 13 to determine the bin probability (e.g., p[k]) of a particular bin (e.g., b[k]) used for arithmetic encoding. For video encoder 20, b[k] may be based on binarized syntax elements or, more generally, on the video data that is encoded into the bitstream. Similarly, video decoder 30 may implement the operations of the probability estimation filters illustrated in FIGS. 12 and 13 to determine the bin probability (e.g., p[k]) of a particular bin (e.g., b[k]) used for arithmetic decoding. For video decoder 30, b[k] may be included in the bitstream from which the video data is decoded.

Because entropy coding (e.g., arithmetic coding) is designed to be optimal for stationary data sources (e.g., non-moving content), its practical effectiveness depends on classifying data elements so that the statistics within each class are approximately stationary. In practice, each class is represented by a coding context (or a bin context for binary alphabets), and even though each context corresponds to a different set of symbol probabilities, it is assumed that, if the symbol probabilities change, the way the changes occur is similar enough that a single adaptation method suffices for all contexts.

In reality, however, in practical video coding, the data in each of the streams formed by the classification also has distinct higher-order variations. For example, the classes may correspond to the following cases:

The data is truly stationary;

The symbol probabilities change frequently, but by small amounts;

The symbol probabilities do not change often, but when they do, the probability values change very significantly.

In addition, there may also be variability in how often the data symbol probabilities change, in the speed of change, and in the average magnitude of the changes. To handle all of the different cases, it may be desirable to have good control of the estimation process. This disclosure describes how this can be done by increasing the order of the estimation and by adjusting the parameters of the new estimators to best match the statistics of the data source. In addition, the techniques described in this disclosure may address the problem of FSM estimators that do not have the probability estimate as part of their state, which makes it difficult to use the common technique of estimating only the probability of the least probable symbol (LPS).

To allow more degrees of freedom in the responses of the predictors, the techniques may use higher-order filters. An IIR filter response is commonly defined using a polynomial representation, as in:

(16)

However, in the probability estimation problem, the parameters of the polynomial representation commonly do not allow for minimal computational complexity. In addition, the best responses are known to be numerically unstable, which further complicates implementation.

This disclosure describes using the following product form as a starting point from which the operations used for probability estimation may be determined:

(17)

where the parameters appearing in the product are the poles and zeros of H(z), respectively, and the constant γ is defined so that the condition H(1) = 1 is satisfied, i.e., so that the probability estimate is properly scaled.
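The role of γ can be illustrated numerically. Equation (17) is not reproduced in this text, so the sketch assumes a generic pole-zero product, H(z) = γ · Π(z − zero) / Π(z − pole); the function names and the example pole and zero values are hypothetical.

```python
from math import prod


def dc_gain_constant(poles, zeros) -> float:
    """Return gamma such that H(1) = 1 for the assumed pole-zero product form.

    No zero may equal 1, since that would force H(1) = 0.
    """
    return prod(1 - p for p in poles) / prod(1 - z for z in zeros)


def transfer(z, poles, zeros, gamma):
    """Evaluate the assumed H(z) = gamma * prod(z - zero) / prod(z - pole)."""
    return gamma * prod(z - zr for zr in zeros) / prod(z - p for p in poles)
```

The condition H(1) = 1 means that a constant input passes through unchanged in steady state: if every bin equals 1, the estimate converges to 1 rather than to some scaled value. With a single pole at 1 − α and no zeros, `dc_gain_constant` returns α, which agrees with the first-order estimator.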

The next step is to take into account that the poles and zeros are expected to be close to unity, and that, while a few integer multiplications may be acceptable, it may be useful to replace all divisions by bit shifts. Under these conditions, the filter may be defined using equation (18):

(18)

where, to satisfy the condition H(1) = 1, the following must hold:

(19)

With the definition

(20)

the filter may be written as:

(21)

In this cascade (product) form, one can exploit the fact that the individual filters may be arranged in any order.

With infinite precision, the order of the filters is irrelevant; the order matters only when finite precision and integer operations are used. For instance, when the parameter γ has factors other than 2, it may simply be moved to the first stage, and the integer sequence B[k] may be rescaled to avoid multiplications or divisions.

In some examples disclosed herein, the parameters in the set are all small integers, in order to minimize the complexity of the arithmetic operations (which may be useful for custom hardware) and to simplify implementations.

A diagram of the implementation is shown in FIG. 14, which illustrates an example of a probability estimation filter using cascaded filters. For example, FIG. 14 shows a plurality of filters, each performing the operations of a transfer function H_i(z), to determine the respective bin probabilities of the bins of a bin stream. As one example, the bin stream may be the binarized bin stream generated in video encoder 20 for purposes of encoding. The bin stream may also be the bin stream that video decoder 30 generates and then debinarizes to produce the actual video data (e.g., syntax elements).

For video decoder 30, a first filter of the plurality of filters receives values based on the bitstream (e.g., the bitstream generated by video encoder 20). A last filter of the plurality of filters outputs the bin probability. Each of the other filters receives values from its immediately preceding filter and determines values for its next filter based on the received values and the respective parameter values for that filter. The last filter receives values from its immediately preceding filter and determines the bin probability based on the received values and the parameter values for the last filter. As one example, each filter may have four respective parameter values, indexed by the filter index i, although fewer or more parameter values are possible.

For video encoder 20, a first filter of the plurality of filters receives values based on the bin stream (generated by binarization of the syntax elements). A last filter of the plurality of filters outputs the bin probability. Each of the other filters receives values from its immediately preceding filter and determines values for its next filter based on the received values and the respective parameter values for that filter. The last filter receives values from its immediately preceding filter and determines the bin probability based on the received values and the parameter values for the last filter. As one example, each filter may have four respective parameter values, indexed by the filter index i, although fewer or more parameter values are possible.
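The data flow through a filter cascade can be sketched as follows. The per-stage transfer functions of equations (18) to (21) are not reproduced in this text, so each stage here is an assumed first-order, shift-based smoother with a single parameter, not the four-parameter stage of the disclosure. The class name, function name, and 15-bit scale are hypothetical.

```python
PROB_BITS = 15            # assumed scale: 32768 represents probability 1.0
ONE = 1 << PROB_BITS
HALF = ONE >> 1


class FirstOrderStage:
    """Assumed stage: state <- state + ((input - state) >> shift)."""

    def __init__(self, shift: int, state: int = HALF):
        self.shift = shift
        self.state = state

    def step(self, x: int) -> int:
        self.state += (x - self.state) >> self.shift
        return self.state


def cascade_step(stages, bin_val: int) -> int:
    """Feed one bin to the first stage; the last stage's output is the estimate."""
    x = ONE if bin_val else 0            # scaled bin value enters the first filter
    for stage in stages:
        x = stage.step(x)                # each filter feeds the next one
    return x
```

Each stage has unit gain for constant input, so the cascade as a whole does too. With integer shifts the result depends on stage order and rounding, which is the finite-precision effect discussed above.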

A video coder (e.g., video encoder 20 or video decoder 30) may determine the per-filter parameter values in various ways. For example, the video coder may access one or more predefined tables that map various types of information to the parameter values. For example, the video coder may use a table associated with a particular context to look up the parameter values. In these examples, the one or more tables may be determined based on empirical analysis of video data.

In some examples, the techniques of this disclosure may be implemented using entropy encoding unit 218 (FIG. 2). For example, video encoder 20 may receive video data. The video data may include one or more pictures. In addition, prediction processing unit 200, quantization unit 206, and potentially other components of video encoder 20 may generate syntax elements based on the video data. In this example, entropy encoding unit 218 may receive a syntax element based on the video data and apply binary arithmetic encoding to the syntax element. In some examples, applying the binary arithmetic encoding may include generating a bin stream by binarizing the syntax element, and determining a bin probability for at least one bin of the bin stream with a plurality of filters. A first filter of the plurality of filters receives values based on the bin stream, and a last filter of the plurality of filters outputs the bin probability. Each of the other filters receives values from its immediately preceding filter and determines values for its next filter based on the received values and the respective parameter values for that filter. The last filter receives values from its immediately preceding filter and determines the bin probability based on the received values and the parameter values for the last filter. Applying the binary arithmetic encoding may also include generating a bitstream based on the at least one bin and the bin probability. Entropy encoding unit 218 may output the bitstream.

Moreover, in some examples, the techniques of this disclosure may be implemented using entropy decoding unit 300 (FIG. 3). For example, entropy decoding unit 300 may determine a decoded syntax element by applying binary arithmetic decoding to a bitstream. Applying the binary arithmetic decoding includes generating a bin stream, and generating the bin stream includes determining a bin probability for at least one bin of the bin stream with a plurality of filters. A first filter of the plurality of filters receives values based on the bitstream, and a last filter of the plurality of filters outputs the bin probability. Each of the other filters receives values from its immediately preceding filter and determines values for its next filter based on the received values and the respective parameter values for that filter. The last filter receives values from its immediately preceding filter and determines the bin probability based on the received values and the parameter values for the last filter. Entropy decoding unit 300 may debinarize the bin stream to form the decoded syntax element. Prediction processing unit 302 may reconstruct a picture of the video data based in part on the decoded syntax element.

The order of operations may also be important when direct access to the probability estimate is useful for quickly determining the least probable symbol. For example, the following equations define a probability estimation FSM parameterized by positive integer constants a and b, in which the probability sequence P[k] is part of the minimal FSM state.

(22)

A video coder may determine the values of the FSM parameters a and b using any of the examples provided elsewhere in this disclosure. Moreover, in some examples, a video coder may use equation (22) in place of equation (4).

FIG. 15 is a block diagram illustrating an example entropy encoding unit. For example, FIG. 15 illustrates one example of entropy encoding unit 218 of FIG. 2 in greater detail. Entropy encoding unit 218 may include additional or fewer components, and the specific interconnections illustrated in FIG. 15 are merely for ease of understanding and should not be considered limiting. Entropy encoding unit 218 may include binarization circuit 1500, which receives video data (e.g., syntax elements) and performs a binarization process (e.g., producing a sequence of binary data symbols, also called bin values). Probability estimation circuit 1502 may receive the bin values generated by binarization circuit 1500 and may determine the probability for each bin using the example techniques described in this disclosure, such as the cascade filtering technique illustrated and described with respect to FIG. 14. Arithmetic encoder circuit 1504 may entropy encode the binarized data generated by binarization circuit 1500, based on the coding contexts (e.g., probabilities) generated by probability estimation circuit 1502, to generate the bitstream that video encoder 20 outputs.
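The three-stage data flow of FIG. 15 can be mimicked with a small sketch. Everything here is a simplifying assumption rather than the disclosed design: unary binarization stands in for binarization circuit 1500, a single shared shift-based estimator stands in for probability estimation circuit 1502, and an ideal code-length accumulator (−log₂ of the probability assigned to each bin) stands in for arithmetic encoder circuit 1504.

```python
import math

PROB_BITS = 15            # assumed scale: 32768 represents probability 1.0
ONE = 1 << PROB_BITS


def binarize_unary(value: int) -> list[int]:
    """Unary binarization: value n becomes n ones followed by a terminating zero."""
    return [1] * value + [0]


def encode_cost(values, shift: int = 4) -> float:
    """Return the ideal number of bits needed to code the binarized values."""
    P = ONE >> 1                                   # initial estimate of Prob(bin = 1)
    bits = 0.0
    for v in values:
        for b in binarize_unary(v):                # stage 1: binarization
            p1 = min(max(P, 1), ONE - 1) / ONE     # stage 2: estimate, kept off 0 and 1
            bits += -math.log2(p1 if b else 1.0 - p1)  # stage 3: ideal coding cost
            P += ((ONE if b else 0) - P) >> shift  # update after the bin is coded
    return bits
```

A real coder would use separate contexts for different bin positions and a finite-precision arithmetic coder. The point of the sketch is only that the estimator sits between binarization and coding and is updated after every bin.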

As an example, probability estimation circuit 1502 may determine a bin probability for at least one bin of the bin stream with a plurality of filters (e.g., H_i(z)). A first filter of the plurality of filters (e.g., H_0(z)) receives values based on the bin stream. A last filter of the plurality of filters (e.g., H_F(z)) outputs the bin probability. Each of the other filters (e.g., H_1(z) through H_{F-1}(z)) receives values from its immediately preceding filter and determines values for its next filter based on the received values and the respective parameter values for that filter (e.g., four parameters subscripted i, whose symbols are not legible in this copy). The last filter receives values from its immediately preceding filter and determines the bin probability based on the received values and the parameter values for the last filter (e.g., the corresponding four parameters subscripted F).

The transfer function that the first filter applies to generate its values comprises [expression missing in this copy], where [symbol missing in this copy] is a parameter for the first filter and z is the variable of the z-transform. The transfer function applied by each of the filters subsequent to the first filter comprises [expression missing in this copy], where i represents the relative order of the filter and the four parameters subscripted i are the respective parameter values for the filters. Each of these four parameters may be an integer and, in some examples, each may be a relatively small integer.
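As a non-limiting illustration of the cascade arrangement described above, the following Python sketch chains several filter stages so that the first stage receives bin values, each later stage receives the output of its immediately preceding stage, and the last stage outputs the bin probability estimate. The transfer functions themselves are not legible in this section, so the first-order stage used here (`FirstOrderStage`, with a hypothetical integer `shift` parameter) is a stand-in chosen for simplicity and is not the filter of this disclosure.

```python
from typing import List, Sequence


class FirstOrderStage:
    """Hypothetical stand-in stage: y[k] = y[k-1] + (x[k] - y[k-1]) / 2**shift."""

    def __init__(self, shift: int, initial: float = 0.5) -> None:
        self.shift = shift  # assumed small-integer parameter of this stage
        self.y = initial    # stage memory, initialised to a neutral estimate

    def step(self, x: float) -> float:
        self.y += (x - self.y) / (1 << self.shift)
        return self.y


def cascade_probability(bins: Sequence[int], shifts: Sequence[int]) -> List[float]:
    """Feed each bin through the stages in order; the last stage's output
    after each bin is taken as the running estimate that a bin equals 1."""
    stages = [FirstOrderStage(s) for s in shifts]
    estimates = []
    for b in bins:
        value = float(b)          # the first stage receives the bin value
        for stage in stages:      # each later stage receives the previous output
            value = stage.step(value)
        estimates.append(value)   # the last stage outputs the estimate
    return estimates


if __name__ == "__main__":
    print(cascade_probability([1, 1, 1, 1], shifts=[1, 1]))
```

With small integer shifts, every division is by a power of two, which is one reason small integer parameters can be attractive in hardware; that motivation is an assumption of this sketch rather than a statement from the text.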

FIG. 16 is a block diagram illustrating an example entropy decoding unit. For example, FIG. 16 shows one example of entropy decoding unit 300 of FIG. 3 in more detail. Entropy decoding unit 300 may include additional or fewer components, and the specific interconnections illustrated in FIG. 16 are for ease of understanding only and should not be considered limiting. Entropy decoding unit 300 may include an arithmetic decoder circuit 1604 that receives a bitstream (e.g., encoded syntax elements) and performs arithmetic decoding based on coding contexts (e.g., probabilities) determined by probability estimation circuit 1606. As one example, probability estimation circuit 1606 may perform the operations illustrated in FIG. 14 or some form of the inverse of those operations. Inverse binarization circuit 1600 receives the output of arithmetic decoder circuit 1604 and performs an inverse binarization process to generate the video data used to reconstruct pictures.

As an example, probability estimation circuit 1606 determines a bin probability for at least one bin of the bin stream with a plurality of filters (e.g., H_i(z)). A first filter of the plurality of filters (e.g., H_0(z)) receives values based on the bitstream. A last filter of the plurality of filters (e.g., H_F(z)) outputs the bin probability. Each of the other filters (e.g., H_1(z) through H_{F-1}(z)) receives values from its immediately preceding filter and determines values for its next filter based on the received values and the respective parameter values for that filter (e.g., four parameters subscripted i, whose symbols are not legible in this copy). The last filter receives values from its immediately preceding filter and determines the bin probability based on the received values and the parameter values for the last filter (e.g., the corresponding four parameters subscripted F).

The following numbered paragraphs describe specific examples according to the techniques of this disclosure.

Example 1. A method of decoding video data, the method comprising: determining a decoded syntax element by applying binary arithmetic decoding to a bitstream, wherein applying the binary arithmetic decoding comprises: generating a bin stream, wherein generating the bin stream comprises determining a bin probability for at least one bin of the bin stream with a plurality of filters, wherein a first filter of the plurality of filters receives values based on the bin stream, a last filter of the plurality of filters outputs the bin probability, each of the other filters receives values from its immediately preceding filter and determines values for its next filter based on the received values and respective parameter values for each of the filters, and the last filter receives values from its immediately preceding filter and determines the bin probability based on the received values and parameter values for the last filter; and debinarizing the bin stream to form the decoded syntax element; and reconstructing a picture of the video data based in part on the decoded syntax element.

Example 2. The method of Example 1, wherein the plurality of filters are arranged in a cascade configuration.

Example 3. The method of Examples 1 and 2, wherein the transfer function that the first filter applies to generate its values comprises [expression missing in this copy], where [symbol missing in this copy] is a parameter for the first filter and z is the variable of the z-transform.

Example 4. The method of any of Examples 1 to 3, wherein the transfer function applied by each of the filters subsequent to the first filter comprises [expression missing in this copy], where i represents the relative order of the filter and the four parameters subscripted i are the respective parameter values for the filters.

Example 5. The method of Example 4, wherein each of the four parameters subscripted i is an integer.

Example 6. The method of Example 4, wherein each of the four parameters subscripted i is a relatively small integer.

Example 7. The method of any of Examples 1 to 6, further comprising determining a minimum-likelihood symbol based at least in part on performing the operations of [operations missing in this copy].

Example 8. A method of encoding video data, the method comprising: receiving a syntax element based on the video data; applying binary arithmetic encoding to the syntax element, wherein applying the binary arithmetic encoding comprises: generating a bin stream by binarizing the syntax element; determining a bin probability for at least one bin of the bin stream with a plurality of filters, wherein a first filter of the plurality of filters receives values based on the bin stream, a last filter of the plurality of filters outputs the bin probability, each of the other filters receives values from its immediately preceding filter and determines values for its next filter based on the received values and respective parameter values for each of the filters, and the last filter receives values from its immediately preceding filter and determines the bin probability based on the received values and parameter values for the last filter; and generating a bitstream based on the at least one bin and the bin probability; and outputting the bitstream.

Example 9. The method of Example 8, wherein the plurality of filters are arranged in a cascade configuration.

Example 10. The method of Examples 8 and 9, wherein the transfer function that the first filter applies to generate its values comprises [expression missing in this copy], where [symbol missing in this copy] is a parameter for the first filter and z is the variable of the z-transform.

Example 11. The method of any of Examples 8 to 10, wherein the transfer function applied by each of the filters subsequent to the first filter comprises [expression missing in this copy], where i represents the relative order of the filter and the four parameters subscripted i are the respective parameter values for the filters.

Example 12. The method of Example 11, wherein each of the four parameters subscripted i is an integer.

Example 13. The method of Example 11, wherein each of the four parameters subscripted i is a relatively small integer.

Example 14. The method of any of Examples 8 to 13, [remainder of this example missing in this copy].

FIG. 17 is a flowchart illustrating an example operation of video encoder 20, in accordance with one or more techniques of this disclosure. The flowcharts of this disclosure are provided as examples. Other examples in accordance with the techniques of this disclosure may involve more, fewer, or different actions. Moreover, in some examples, certain actions may be performed in different orders or in parallel.

In the example of FIG. 17, video encoder 20 may generate a syntax element based on the video data (1700). For example, video encoder 20 may generate a syntax element indicating whether a residual value is greater than 1, a syntax element indicating whether a residual value is greater than 2, or another type of syntax element.

Additionally, video encoder 20 may determine an offset value, at least in part, by applying binary arithmetic encoding (e.g., CABAC encoding) to the syntax element (1702). As part of applying binary arithmetic encoding to the syntax element, video encoder 20 may generate a bin stream, at least in part, by binarizing the syntax element (1704). Video encoder 20 may binarize the syntax element in various ways. For example, video encoder 20 may binarize the syntax element using a Truncated Rice binarization process, a k-th order Exp-Golomb binarization process, a fixed-length binarization process, or another type of binarization process. In some examples, video encoder 20 uses different binarization processes for different types of syntax elements. In some examples, the bin stream may include bins generated by binarizing multiple syntax elements.
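As a non-limiting illustration of two of the binarization processes named above, the following Python sketch maps a non-negative integer to a list of bins with a fixed-length process and with a k-th order Exp-Golomb process. The function names are hypothetical, and the Exp-Golomb variant shown (a prefix of ones terminated by a zero, followed by a suffix whose length grows with the prefix) is one common convention assumed here; this section does not specify which convention is used.

```python
from typing import List


def fixed_length_binarize(value: int, num_bits: int) -> List[int]:
    """Write `value` as exactly `num_bits` bins, most significant bin first."""
    if not 0 <= value < (1 << num_bits):
        raise ValueError("value does not fit in num_bits")
    return [(value >> i) & 1 for i in range(num_bits - 1, -1, -1)]


def exp_golomb_binarize(value: int, k: int) -> List[int]:
    """k-th order Exp-Golomb binarization (assumed convention: 1...1 0 suffix)."""
    if value < 0 or k < 0:
        raise ValueError("value and k must be non-negative")
    bins: List[int] = []
    # Prefix: emit a 1 and widen the suffix while the value exceeds the range.
    while value >= (1 << k):
        bins.append(1)
        value -= 1 << k
        k += 1
    bins.append(0)  # terminator of the prefix
    # Suffix: the remaining value in k bins, most significant bin first.
    bins.extend((value >> i) & 1 for i in range(k - 1, -1, -1))
    return bins


if __name__ == "__main__":
    print(fixed_length_binarize(5, 4))
    print(exp_golomb_binarize(3, 0))
```

Small values receive short bin strings under Exp-Golomb binarization, which suits syntax elements whose small values are most frequent, whereas the fixed-length process spends the same number of bins on every value.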

Furthermore, for at least one respective bin of the bin stream, video encoder 20 may determine an interval for the next bin of the bin stream (1706). In this example, video encoder 20 may determine the interval for the next bin based on a state for the respective bin, an interval for the respective bin, and a value of the respective bin. The state for the respective bin corresponds to an estimate of the probability that the respective bin has a first value and an estimate of the probability that the respective bin has a second, different value. If the respective bin is the first bin after a reinitialization, the state for the respective bin is equal to the initial probability estimates associated with a coding context.

In some examples, video encoder 20 may perform the following actions for each respective bin of the bin stream as part of determining the interval for the next bin. In particular, video encoder 20 may divide, based on the state for the respective bin, the interval for the respective bin into an interval associated with a first symbol and an interval associated with a second symbol. Additionally, video encoder 20 may set one of an upper bound or a lower bound of the interval for the next bin based on whether the value of the respective bin is equal to the first symbol or equal to the second symbol. In response to determining that the value of the respective bin is equal to the first symbol, the upper bound of the interval for the next bin is set to the upper bound of the interval associated with the first symbol, and the lower bound of the interval for the next bin is unchanged. In response to determining that the value of the respective bin is equal to the second symbol, the lower bound of the interval for the next bin is set to the lower bound of the interval associated with the second symbol, and the upper bound of the interval for the next bin is unchanged.

For example, if the respective bin is the first bin after a reinitialization, the interval for the respective bin is 0 to 1. In this example, if the state for the first bin indicates that symbol 0 has a probability of 0.6 and symbol 1 has a probability of 0.4, video encoder 20 may divide the interval for the first bin into an interval from 0 to 0.6 and an interval from 0.6 to 1. If the value of the first bin is 0, video encoder 20 may set the upper bound of the interval for the second bin to 0.6 and may set the lower bound of the interval for the second bin to 0. If the value of the first bin is 1, video encoder 20 may set the lower bound of the interval for the second bin to 0.6 and may set the upper bound of the interval for the second bin to 1. Subsequently, if the state for the second bin indicates that symbol 0 has a probability of 0.7 and symbol 1 has a probability of 0.3, and the interval for the second bin is 0 to 0.6, video encoder 20 may divide the interval for the second bin into an interval from 0 to 0.42 and an interval from 0.42 to 0.6 (0.6 × 0.7 = 0.42). If the value of the second bin is 0, video encoder 20 may set the upper bound of the interval for the third bin to 0.42 and may set the lower bound of the interval for the third bin to 0. If the value of the second bin is 1, video encoder 20 may set the lower bound of the interval for the third bin to 0.42 and may set the upper bound of the interval for the third bin to 0.6. In some examples, video encoder 20 may use integer values instead of values between 0 and 1.
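As a non-limiting illustration, the interval subdivision in the worked example above can be sketched in Python as follows. The function names are hypothetical, the per-bin probabilities of symbol 0 are supplied directly rather than derived from a state, and exact fractions are used in place of the integer arithmetic a practical coder would likely use.

```python
from fractions import Fraction
from typing import Sequence, Tuple

Interval = Tuple[Fraction, Fraction]


def narrow_interval(low: Fraction, high: Fraction,
                    p_zero: Fraction, bin_value: int) -> Interval:
    """Split [low, high) at the probability of symbol 0 and keep one side."""
    split = low + (high - low) * p_zero
    if bin_value == 0:
        return low, split   # upper bound moves, lower bound unchanged
    return split, high      # lower bound moves, upper bound unchanged


def encode_interval(bins: Sequence[int],
                    p_zero_per_bin: Sequence[Fraction]) -> Interval:
    """Return the interval that remains after processing all bins in order."""
    low, high = Fraction(0), Fraction(1)  # interval after reinitialization
    for b, p in zip(bins, p_zero_per_bin):
        low, high = narrow_interval(low, high, p, b)
    return low, high


if __name__ == "__main__":
    probs = [Fraction(3, 5), Fraction(7, 10)]   # 0.6, then 0.7
    print(encode_interval([0, 0], probs))       # interval 0 to 0.42
    print(encode_interval([0, 1], probs))       # interval 0.42 to 0.6
```

Any value inside the final interval identifies the whole sequence of bins, which is why an offset value taken from that interval suffices for decoding.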

Video encoder 20 may also determine one or more FSM parameters for the next bin of the bin stream (1708). The one or more FSM parameters for the next bin control how probability estimates for the next bin are computed from the state for the respective bin. In some instances, the one or more FSM parameters for a first bin of the bin stream (e.g., the current bin) differ from the one or more FSM parameters for a second bin of the bin stream (e.g., the next bin).

Video encoder 20 may determine the one or more FSM parameters according to any of the examples provided elsewhere in this disclosure. For instance, in one example, as part of determining the one or more FSM parameters for the next bin of the bin stream, video encoder 20 may reinitialize the one or more FSM parameters for the next bin of the bin stream according to a state reinitialization parameter.

In some examples, video encoder 20 modifies the FSM parameters for the next bin of the bin stream according to estimated probability values. In some examples, video encoder 20 modifies the FSM parameters for the next bin of the bin stream based on a measure of past probability variation. In some examples, video encoder 20 determines the one or more FSM parameters for the next bin of the bin stream based on one or more neighboring blocks in the same frame or in a previously decoded frame.

In some examples, video encoder 20 performs the process to determine the one or more FSM parameters for each bin of the bin stream. In other examples, video encoder 20 performs the process to determine the one or more FSM parameters only for particular bins of the bin stream. For instance, in one example, video encoder 20 may perform the process only to determine the one or more FSM parameters used in determining the probability estimates for the second bin after a reinitialization, and may continue to use the same one or more FSM parameters until the next reinitialization event. In some examples, video encoder 20 may perform the process to determine the one or more FSM parameters at other times, such as at block boundaries, slice boundaries, and so on.

Video encoder 20 may also use a parameterized state updating function to determine a state for the next bin of the bin stream (1710). The parameterized state updating function takes as input the state for the respective bin, the one or more FSM parameters for the next bin of the bin stream, and the value of the respective bin. For example, video encoder 20 may determine the state for the next bin using equation (4) or equation (22).
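As a non-limiting illustration of a parameterized state updating function with the three inputs named above, consider the following Python sketch. Equations (4) and (22) are not reproduced in this section, so the update rule shown is a generic shift-based exponential update assumed purely for illustration; the 15-bit state precision, the name `update_state`, and the single `shift` parameter standing in for the FSM parameters are all hypothetical.

```python
PROB_BITS = 15  # assumed precision: the state is P(bin == 1) scaled by 2**15


def update_state(state: int, bin_value: int, shift: int) -> int:
    """Move the state toward the observed bin value.

    `shift` plays the role of an FSM parameter in this sketch: a smaller
    shift adapts faster, a larger shift gives a steadier estimate.
    """
    target = (1 << PROB_BITS) if bin_value == 1 else 0
    # Python's >> floors for negative operands, so decreases round away from
    # the current state by at most one step.
    new_state = state + ((target - state) >> shift)
    # Keep the estimate strictly inside (0, 1) so neither symbol becomes
    # impossible; this clamp is a design choice of the sketch.
    return min(max(new_state, 1), (1 << PROB_BITS) - 1)


if __name__ == "__main__":
    s = 1 << (PROB_BITS - 1)          # start at probability 0.5
    for b in (1, 1, 0, 1):
        s = update_state(s, b, shift=4)
        print(s / (1 << PROB_BITS))
```

Because the parameter is an argument rather than a constant, a different value can be supplied for each bin, which mirrors the idea that the FSM parameters for one bin may differ from those for the next.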

In the example of FIG. 17, the offset value may be equal to a value in the interval for the last bin of the bin stream. Video encoder 20 may output a bitstream that includes the offset value (1712). For instance, video encoder 20 may store the bitstream on, or transmit the bitstream to, computer-readable medium 16 (FIG. 1).

FIG. 18 is a flowchart illustrating an example operation of video decoder 30, in accordance with one or more techniques of this disclosure. In the example of FIG. 18, video decoder 30 may receive a bitstream and may determine one or more decoded syntax elements by applying binary arithmetic decoding to an offset value included in the bitstream (1800). As part of determining the one or more decoded syntax elements, video decoder 30 may generate a bin stream. This bin stream may be a portion of a longer bin stream or may be an entire bin stream. The bitstream may include an encoded representation of the video data.

Furthermore, as part of generating the bin stream, video decoder 30 may perform the following actions for at least one respective bin of the bin stream. In particular, video decoder 30 may determine a value of the respective bin (1802). Video decoder 30 may determine the value of the respective bin based on a state for the respective bin, an interval for the respective bin, and the offset value. For instance, in one example, for each respective bin of the bin stream, video decoder 30 may determine the value of the respective bin, at least in part, by dividing, based on the state for the respective bin, the interval for the respective bin into an interval associated with a first symbol and an interval associated with a second symbol. Additionally, video decoder 30 may determine the value of the respective bin based on whether the offset value is in the interval associated with the first symbol or in the interval associated with the second symbol. In this example, in response to determining that the offset value is in the interval associated with the first symbol, the value of the respective bin is equal to the first symbol. Moreover, in this example, in response to determining that the offset value is in the interval associated with the second symbol, the value of the respective bin is equal to the second symbol.

For instance, in one specific example, the interval associated with a first bin may be from 0 to 1. In this example, the state for the first bin indicates that symbol 0 has a probability of 0.6 and symbol 1 has a probability of 0.4. Accordingly, in this example, video decoder 30 may divide the interval associated with the first bin into an interval associated with symbol 0 that ranges from 0 to 0.6 and an interval associated with symbol 1 that ranges from 0.6 to 1. In this example, if the offset value is between 0 and 0.6, video decoder 30 determines that the value of the first bin is equal to 0. If the offset value is between 0.6 and 1, video decoder 30 determines that the value of the first bin is equal to 1. In this example, if the value of the first bin is 0, video decoder 30 may determine that the interval for the second bin is 0 to 0.6. If the value of the first bin is 1, video decoder 30 may determine that the interval for the second bin is 0.6 to 1. Furthermore, assuming that the value of the first bin is equal to 1 and the state for the second bin indicates that the second bin has a probability of 0.7 of being symbol 0 and a probability of 0.3 of being symbol 1, video decoder 30 may divide the interval from 0.6 to 1 into an interval from 0.6 to 0.88 corresponding to symbol 0 and an interval from 0.88 to 1 corresponding to symbol 1 (0.6 + 0.7 × 0.4 = 0.88). Thus, if the offset value is between 0.6 and 0.88, video decoder 30 determines that the value of the second bin is 0. If the offset value is between 0.88 and 1, video decoder 30 determines that the value of the second bin is 1. Video decoder 30 may continue this process for each bin of the bin stream. In some examples, video decoder 30 may use integer values instead of values between 0 and 1.
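As a non-limiting illustration, the decoding procedure in the worked example above can be sketched in Python as follows. The function name is hypothetical, and for simplicity the probability of symbol 0 for each bin is supplied as a fixed list; in an actual decoder the state for each bin would be updated from the previously decoded bins. A value exactly on a split point is assigned to symbol 1 here, which is an assumption of the sketch.

```python
from fractions import Fraction
from typing import List, Sequence


def decode_bins(offset: Fraction, p_zero_per_bin: Sequence[Fraction]) -> List[int]:
    """Recover bin values by locating `offset` within successive intervals."""
    low, high = Fraction(0), Fraction(1)   # interval for the first bin
    bins: List[int] = []
    for p_zero in p_zero_per_bin:
        split = low + (high - low) * p_zero
        if offset < split:
            bins.append(0)   # offset lies in the interval for symbol 0
            high = split
        else:
            bins.append(1)   # offset lies in the interval for symbol 1
            low = split
    return bins


if __name__ == "__main__":
    probs = [Fraction(3, 5), Fraction(7, 10)]    # 0.6, then 0.7
    print(decode_bins(Fraction(7, 10), probs))   # 0.7 is in 0.6 to 0.88
    print(decode_bins(Fraction(9, 10), probs))   # 0.9 is in 0.88 to 1
```

The decoder performs the same interval split as the encoder at every step, so both sides stay synchronized as long as they derive the same probabilities for each bin.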

In the example of FIG. 18, video decoder 30 may determine one or more FSM parameters for the next bin of the bin stream (1804). The one or more FSM parameters for the next bin control how probability estimates for the next bin are computed from the state for the respective bin. The next bin of the bin stream follows the respective bin in the bin stream. In some instances, the one or more FSM parameters for a first bin of the bin stream (e.g., the current bin) differ from the one or more FSM parameters for a second bin of the bin stream (e.g., the next bin).

Video decoder 30 may determine the one or more FSM parameters according to any of the examples described elsewhere in this disclosure. For example, as part of determining the one or more FSM parameters for the next bin, video decoder 30 may reinitialize the FSM parameters for the next bin of the bin stream according to a state reinitialization parameter. In some examples, as part of determining the one or more FSM parameters for the next bin, video decoder 30 may modify the FSM parameters for the next bin of the bin stream according to estimated probability values. In some examples, as part of determining the one or more FSM parameters for the next bin, video decoder 30 modifies the FSM parameters for the next bin of the bin stream based on a measure of past probability variation. In some examples, as part of determining the one or more FSM parameters for the next bin of the bin stream, video decoder 30 may determine the one or more FSM parameters for the next bin based on one or more neighboring blocks in the same frame or in a previously decoded frame.

In some examples, video decoder 30 performs the process to determine the one or more FSM parameters for each bin of the bin stream. In other examples, video decoder 30 performs the process to determine the one or more FSM parameters only for particular bins of the bin stream. For instance, in one example, video decoder 30 may perform the process only to determine the one or more FSM parameters used in determining the probability estimates for the second bin after a reinitialization, and may continue to use the same one or more FSM parameters until the next reinitialization event. In some examples, video decoder 30 may perform the process to determine the one or more FSM parameters at other times, such as at block boundaries, slice boundaries, and so on.

Furthermore, video decoder 30 may determine a state for the next bin of the bin stream (1806). Video decoder 30 may use a parameterized state updating function that takes as input the state for the respective bin, the one or more FSM parameters for the next bin of the bin stream, and the value of the respective bin. For example, video decoder 30 may determine the state for the next bin using equation (4) or equation (22).
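
As a rough illustration of what a parameterized state updating function can look like, the sketch below assumes a two-estimator form in which the state is a 15-bit fixed-point probability and the two FSM parameters `a` and `b` are shift amounts that control adaptation speed. Equations (4) and (22) are not reproduced in this section, so the exact form, the 15-bit scale, and every name here (`update_state`, `q1`, `q2`) are assumptions made for illustration only.

```python
PROB_BITS = 15            # assumed fixed-point precision of the state
ONE = 1 << PROB_BITS      # fixed-point representation of probability 1.0


def update_state(q1, q2, bin_value, a, b):
    """Return (q1_next, q2_next, p_next) for the next bin.

    q1, q2    : two running probability estimates (assumed form)
    bin_value : value of the respective bin, 0 or 1
    a, b      : FSM parameters, used here as shift amounts (assumption)
    """
    target = ONE * bin_value
    # Each estimate moves toward the observed bin value; a smaller shift
    # means faster adaptation. Python's >> floors for negative operands.
    q1_next = q1 + ((target - q1) >> a)
    q2_next = q2 + ((target - q2) >> b)
    # The combined state is taken as the average of the two estimates.
    p_next = (q1_next + q2_next) >> 1
    return q1_next, q2_next, p_next
```

Because `a` and `b` are ordinary arguments, a caller can pass different values for different bins, which is the sense in which the update is parameterized.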

Additionally, video decoder 30 may debinarize the bin stream to form one or more decoded syntax elements (1808). As noted elsewhere in this disclosure, syntax elements may be binarized using various processes, such as a Truncated Rice binarization process, a k-th order Exp-Golomb binarization process, a fixed-length binarization process, or another type of binarization process. These processes map values of syntax elements to binary codes. To debinarize the bin stream, video decoder 30 may look up the values that correspond to the binary codes in the bin stream.
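
To make the mapping between syntax element values and binary codes concrete, the following sketch binarizes and debinarizes non-negative integers with the textbook k-th order Exp-Golomb code (a prefix of zeros followed by the binary digits of the shifted value). The function names are hypothetical, and the bit polarity of the prefix used by a particular video coding standard may differ from this textbook form.

```python
def exp_golomb_encode(value, k=0):
    """Binarize a non-negative integer into a list of bins."""
    shifted = value + (1 << k)
    n = shifted.bit_length()
    prefix = [0] * (n - k - 1)                       # run of leading zeros
    body = [(shifted >> i) & 1 for i in range(n - 1, -1, -1)]
    return prefix + body


def exp_golomb_decode(bins, k=0):
    """Debinarize one codeword from the front of `bins`.

    Returns (value, number_of_bins_consumed).
    """
    zeros = 0
    while bins[zeros] == 0:                          # count the zero prefix
        zeros += 1
    n = zeros + k + 1                                # length of the body
    shifted = 0
    for bit in bins[zeros:zeros + n]:
        shifted = (shifted << 1) | bit
    return shifted - (1 << k), zeros + n
```

For example, with `k=0` the value 3 is binarized as `[0, 0, 1, 0, 0]`, and decoding those five bins returns the value 3 again.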

In the example of FIG. 18, video decoder 30 may reconstruct a picture of the video data based in part on the one or more decoded syntax elements (1810). For example, if the decoded syntax elements indicate remainder values for residual data, video decoder 30 may use the remainder values to determine values of residual samples. In this example, video decoder 30 may use the residual samples and corresponding predictive samples to reconstruct sample values of the picture, as described elsewhere in this disclosure. In some examples, the decoded syntax elements may include syntax elements indicating whether blocks are encoded using intra prediction or inter prediction. In such examples, video decoder 30 may use those syntax elements to determine whether to reconstruct the blocks using intra prediction or inter prediction.
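
The use of residual samples together with predictive samples can be pictured with the minimal sketch below, which adds each residual sample to the co-located predictive sample. The clipping to the valid sample range, the default bit depth, and the function name are assumptions for illustration; the section itself does not spell out these details.

```python
def reconstruct_block(pred, resid, bit_depth=8):
    """Add residual samples to predictive samples, row by row.

    pred, resid : lists of rows of integers with the same shape
    bit_depth   : assumed sample bit depth used for clipping
    """
    max_val = (1 << bit_depth) - 1
    return [
        [min(max(p + r, 0), max_val) for p, r in zip(pred_row, resid_row)]
        for pred_row, resid_row in zip(pred, resid)
    ]
```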

Certain aspects of this disclosure have been described with respect to extensions of the HEVC standard for purposes of illustration. However, the techniques described in this disclosure may be useful for other video coding processes, including other standard or proprietary video coding processes not yet developed.

A video coder, as described in this disclosure, may refer to a video encoder or a video decoder. Similarly, a video coding unit may refer to a video encoder or a video decoder. Likewise, video coding may refer to video encoding or video decoding, as applicable. In this disclosure, the phrase "based on" may indicate based only on, based at least in part on, or based in some way on. This disclosure may use the term "video unit," "video block," or "block" to refer to one or more sample blocks and the syntax structures used to code samples of the one or more sample blocks. Example types of video units may include CTUs, CUs, PUs, transform units (TUs), macroblocks, macroblock partitions, and so on. In some contexts, discussion of PUs may be interchanged with discussion of macroblocks or macroblock partitions. Example types of video blocks may include coding tree blocks, coding blocks, and other types of blocks of video data.

The techniques of this disclosure may be applied to video coding in support of any of a variety of multimedia applications, such as over-the-air television broadcasts, cable television transmissions, satellite television transmissions, Internet streaming video transmissions such as dynamic adaptive streaming over HTTP (DASH), digital video that is encoded onto a data storage medium, decoding of digital video stored on a data storage medium, or other applications.

It is to be recognized that, depending on the example, certain acts or events of any of the techniques described herein can be performed in a different sequence, and may be added, merged, or left out altogether (e.g., not all described acts or events are necessary for the practice of the techniques). Moreover, in certain examples, acts or events may be performed concurrently, e.g., through multi-threaded processing, interrupt processing, or multiple processors, rather than sequentially.

In one or more examples, the functions described may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions may be stored on or transmitted over a computer-readable medium as one or more instructions or code and executed by a hardware-based processing unit. Computer-readable media may include computer-readable storage media, which correspond to tangible media such as data storage media, or communication media, including any medium that facilitates transfer of a computer program from one place to another, e.g., according to a communication protocol. In this manner, computer-readable media generally may correspond to (1) tangible computer-readable storage media that are non-transitory or (2) a communication medium such as a signal or carrier wave. Data storage media may be any available media that can be accessed by one or more computers or one or more processing circuits to retrieve instructions, code, and/or data structures for implementation of the techniques described in this disclosure. A computer program product may include a computer-readable medium.

By way of example, and not limitation, such computer-readable storage media can include RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, flash memory, or any other medium that can be used to store desired program code in the form of instructions or data structures and that can be accessed by a computer. Also, any connection is properly termed a computer-readable medium. For example, if software is transmitted from a website, server, or other remote source using a coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium. It should be understood, however, that computer-readable storage media and data storage media do not include connections, carrier waves, signals, or other transitory media, but are instead directed to non-transitory, tangible storage media. Disk and disc, as used herein, include compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), floppy disk, and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.

The functionality described in this disclosure may be performed by fixed-function and/or programmable processing circuitry. For instance, instructions may be executed by fixed-function and/or programmable processing circuitry. Such processing circuitry may include one or more processors, such as one or more digital signal processors (DSPs), general purpose microprocessors, application specific integrated circuits (ASICs), field programmable logic arrays (FPGAs), or other equivalent integrated or discrete logic circuitry. Accordingly, the term "processor," as used herein, may refer to any of the foregoing structures or any other structure suitable for implementation of the techniques described herein. In addition, in some aspects, the functionality described herein may be provided within dedicated hardware and/or software modules configured for encoding and decoding, or incorporated in a combined codec. Also, the techniques could be fully implemented in one or more circuits or logic elements. Processing circuitry may be coupled to other components in various ways. For example, processing circuitry may be coupled to other components via an internal device interconnect, a wired or wireless network connection, or another communication medium.

The techniques of this disclosure may be implemented in a wide variety of devices or apparatuses, including a wireless handset, an integrated circuit (IC), or a set of ICs (e.g., a chip set). Various components, modules, or units are described in this disclosure to emphasize functional aspects of devices configured to perform the disclosed techniques, but do not necessarily require realization by different hardware units. Rather, as described above, various units may be combined in a codec hardware unit or provided by a collection of interoperative hardware units, including one or more processors as described above, in conjunction with suitable software and/or firmware.

Various examples have been described. These and other examples are within the scope of the following claims.

Claims

A method of decoding video data, the method comprising:
determining a decoded syntax element by applying binary arithmetic decoding to an offset value included in a bitstream, wherein applying the binary arithmetic decoding comprises:
generating a bin stream, wherein generating the bin stream comprises:
for at least one respective bin of the bin stream:
determining a value of the respective bin based on a state for the respective bin, an interval for the respective bin, and the offset value;
determining one or more finite state machine (FSM) parameters for a next bin of the bin stream, the one or more FSM parameters for the next bin controlling how probability estimates for the next bin are computed from the state for the respective bin, the next bin of the bin stream following the respective bin in the bin stream; and
determining a state for the next bin of the bin stream using a parameterized state updating function that takes as input the state for the respective bin, the one or more FSM parameters for the next bin of the bin stream, and the value of the respective bin, wherein determining the state for the next bin of the bin stream comprises determining the state for the next bin according to the following equations:

where p[k+1] is the state for the next bin of the bin stream, p[k] is the state for the respective bin, b[k] is the value of the respective bin, a is a first parameter of the one or more FSM parameters for the next bin, and b is a second parameter of the one or more FSM parameters for the next bin; and
debinarizing the bin stream to form the decoded syntax element; and
reconstructing a picture of the video data based in part on the decoded syntax element.

The method of claim 1,
wherein determining the one or more FSM parameters for the next bin of the bin stream comprises reinitializing the FSM parameters for the next bin of the bin stream according to a state reinitialization parameter.

The method of claim 1,
wherein determining the one or more FSM parameters for the next bin of the bin stream comprises modifying the FSM parameters for the next bin of the bin stream according to estimated probability values.

The method of claim 1,
wherein determining the one or more FSM parameters for the next bin of the bin stream comprises modifying the FSM parameters for the next bin of the bin stream based on a measure of past probability variation.

The method of claim 4,
wherein the measure of past probability variation is computed by summing absolute differences between probabilities estimated using an estimated variance measure for a particular bin defined as:

where σ[k+1] is the estimated variance measure for the next bin of the bin stream, σ[k] is the estimated variance measure for the particular bin, q₁[k] is a first probability estimate for the particular bin, q₂[k] is a second probability estimate for the particular bin, and c is a parameter.
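
The defining equation is not reproduced in this text. Purely as an illustration of how a running measure built from the named quantities could behave, the sketch below assumes an exponentially weighted average of the absolute difference between the two probability estimates, with the parameter c used as a shift amount. This assumed form, the integer arithmetic, and the function name are not taken from the claim.

```python
def update_variation(sigma, q1, q2, c):
    """One step of an assumed running measure of probability variation.

    sigma  : current measure, sigma[k]
    q1, q2 : first and second probability estimates for the bin
    c      : parameter, used here as a shift amount (assumption)
    """
    # Move sigma a fraction 1 / 2**c of the way toward |q1 - q2|.
    return sigma + ((abs(q1 - q2) - sigma) >> c)
```

Under this assumption the measure grows when the two estimates disagree and decays when they agree, which is the kind of signal that could drive a change of FSM parameters.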

The method of claim 1,
wherein determining the one or more FSM parameters for the next bin of the bin stream comprises determining the one or more FSM parameters for the next bin of the bin stream based on one or more neighboring blocks in the same frame or in a previously decoded frame.

(Deleted)

The method of claim 1,
wherein, for each respective bin of the bin stream, determining the value of the respective bin comprises:
dividing, based on the state for the respective bin, the interval for the respective bin into an interval associated with a first symbol and an interval associated with a second symbol; and
determining the value of the respective bin based on whether the offset value is in the interval associated with the first symbol or in the interval associated with the second symbol,
wherein, in response to determining that the offset value is in the interval associated with the first symbol, the value of the respective bin is equal to the first symbol, and
wherein, in response to determining that the offset value is in the interval associated with the second symbol, the value of the respective bin is equal to the second symbol.
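
The interval subdivision described in this claim can be sketched as follows. The example assumes that the state is a 15-bit fixed-point probability that the bin equals 1, that the symbol 1 is assigned the lower sub-interval, and that intervals are plain integers with no renormalization; all of these, and the function name, are illustrative assumptions rather than details stated in the claim.

```python
PROB_BITS = 15  # assumed fixed-point precision of the state


def decode_bin(low, rng, offset, p_one):
    """Decode one bin from the interval [low, low + rng), with rng >= 2.

    Returns (bin_value, new_low, new_rng).
    """
    # Split the interval in proportion to the probability of symbol 1.
    split = (rng * p_one) >> PROB_BITS
    split = min(max(split, 1), rng - 1)    # keep both sub-intervals non-empty
    if offset < low + split:
        return 1, low, split               # offset lies in the sub-interval for 1
    return 0, low + split, rng - split     # offset lies in the sub-interval for 0
```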

The method of claim 1,
further comprising receiving the bitstream.

The method of claim 1,
wherein the one or more FSM parameters for a first bin of the bin stream are different from the one or more FSM parameters for a second bin of the bin stream.

A method of encoding video data, the method comprising:
generating a syntax element based on the video data;
determining an offset value at least in part by applying binary arithmetic encoding to the syntax element, wherein applying the binary arithmetic encoding comprises generating a bin stream at least in part by:
binarizing the syntax element; and
for at least one respective bin of the bin stream:
determining an interval for a next bin of the bin stream based on a state for the respective bin, an interval for the respective bin, and a value of the respective bin;
determining one or more finite state machine (FSM) parameters for the next bin of the bin stream, the one or more FSM parameters for the next bin controlling how probability estimates for the next bin are computed from the state for the respective bin; and
determining a state for the next bin of the bin stream using a parameterized state updating function that takes as input the state for the respective bin, the one or more FSM parameters for the next bin of the bin stream, and the value of the respective bin, wherein determining the state for the next bin of the bin stream comprises determining the state for the next bin of the bin stream according to the following equations:

where p[k+1] is the state for the next bin of the bin stream, p[k] is the state for the respective bin, b[k] is the value of the respective bin, a is a first parameter of the one or more FSM parameters for the next bin of the bin stream, and b is a second parameter of the one or more FSM parameters for the next bin of the bin stream,
wherein the offset value is equal to a value in the interval for a last bin of the bin stream; and
outputting a bitstream that includes the offset value.

The method of claim 11,
wherein determining the one or more FSM parameters for the next bin of the bin stream comprises reinitializing the FSM parameters for the next bin of the bin stream according to a state reinitialization parameter.

The method of claim 11,
wherein determining the one or more FSM parameters for the next bin of the bin stream comprises modifying the FSM parameters for the next bin of the bin stream according to estimated probability values.

The method of claim 11,
wherein determining the one or more FSM parameters for the next bin of the bin stream comprises modifying the FSM parameters for the next bin of the bin stream based on a measure of past probability variation.

The method of claim 14,
wherein the measure of past probability variation is computed by summing absolute differences between probabilities estimated using an estimated variance measure for a particular bin defined as:

where σ[k+1] is the estimated variance measure for the next bin of the bin stream, σ[k] is the estimated variance measure for the particular bin, q₁[k] is a first probability estimate for the particular bin, q₂[k] is a second probability estimate for the particular bin, and c is a parameter.

The method of claim 11,
wherein determining the one or more FSM parameters for the next bin of the bin stream comprises determining the one or more FSM parameters for the next bin of the bin stream based on one or more neighboring blocks in the same frame or in a previously decoded frame.

(Deleted)

The method of claim 11,
wherein, for each respective bin of the bin stream, determining the interval for the next bin of the bin stream comprises:
dividing, based on the state for the respective bin, the interval for the respective bin into an interval associated with a first symbol and an interval associated with a second symbol; and
setting one of an upper bound or a lower bound of the interval for the next bin based on whether the value of the respective bin is equal to the first symbol or equal to the second symbol,
wherein, in response to determining that the value of the respective bin is equal to the first symbol, the upper bound of the interval for the next bin is set to the upper bound of the interval associated with the first symbol and the lower bound of the interval for the next bin is unchanged, and
wherein, in response to determining that the value of the respective bin is equal to the second symbol, the lower bound of the interval for the next bin is set to the lower bound of the interval associated with the second symbol and the upper bound of the interval for the next bin is unchanged.
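
The bound-setting rule in this claim can be sketched as below. The example treats the symbol 1 as the first symbol and gives it the lower sub-interval, represents the state as a 15-bit fixed-point probability of 1, and works on a half-open integer interval without renormalization; these choices and the function name are assumptions made only for illustration.

```python
PROB_BITS = 15  # assumed fixed-point precision of the state


def encode_bin(lower, upper, bin_value, p_one):
    """Return (lower, upper) of the interval for the next bin.

    The current interval is [lower, upper) and must hold at least two values.
    """
    width = upper - lower
    split = (width * p_one) >> PROB_BITS
    split = min(max(split, 1), width - 1)  # keep both sub-intervals non-empty
    boundary = lower + split
    if bin_value == 1:
        return lower, boundary             # upper bound moves, lower unchanged
    return boundary, upper                 # lower bound moves, upper unchanged
```

After the last bin, any value inside the final interval can serve as the offset value; a decoder that performs the same subdivision and tests which sub-interval contains the offset recovers the bins in order.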

The method of claim 11,
wherein the one or more FSM parameters for a first bin of the bin stream are different from the one or more FSM parameters for a second bin of the bin stream.

A device for decoding video data, the device comprising:
one or more storage media configured to store video data; and
one or more processors configured to:
determine a decoded syntax element by applying binary arithmetic decoding to an offset value included in a bitstream, wherein, as part of applying the binary arithmetic decoding, the one or more processors are configured to:
generate a bin stream, wherein, as part of generating the bin stream, the one or more processors are configured to:
for at least one respective bin of the bin stream:
determine a value of the respective bin based on a state for the respective bin, an interval for the respective bin, and the offset value;
determine one or more finite state machine (FSM) parameters for a next bin of the bin stream, the one or more FSM parameters for the next bin controlling how probability estimates for the next bin are computed from the state for the respective bin, the next bin of the bin stream following the respective bin in the bin stream; and
determine a state for the next bin of the bin stream using a parameterized state updating function that takes as input the state for the respective bin, the one or more FSM parameters for the next bin of the bin stream, and the value of the respective bin, wherein the one or more processors are configured to determine the state for the next bin according to the following equations:

where p[k+1] is the state for the next bin of the bin stream, p[k] is the state for the respective bin, b[k] is the value of the respective bin, a is a first parameter of the one or more FSM parameters for the next bin, and b is a second parameter of the one or more FSM parameters for the next bin; and
debinarize the bin stream to form the decoded syntax element; and
reconstruct a picture of the video data based in part on the decoded syntax element.

The device of claim 20,
wherein, as part of determining the one or more FSM parameters for the next bin of the bin stream, the one or more processors are configured to reinitialize the FSM parameters for the next bin of the bin stream according to a state reinitialization parameter.

The device of claim 20,
wherein, as part of determining the one or more FSM parameters for the next bin of the bin stream, the one or more processors are configured to modify the FSM parameters for the next bin of the bin stream according to estimated probability values.

The device of claim 20,
wherein, as part of determining the one or more FSM parameters for the next bin, the one or more processors are configured to modify the FSM parameters for the next bin based on a measure of past probability variation.

The device of claim 23,
wherein the one or more processors are configured to compute the measure of past probability variation by summing absolute differences between probabilities estimated using an estimated variance measure for a particular bin defined as:

where σ[k+1] is the estimated variance measure for the next bin of the bin stream, σ[k] is the estimated variance measure for the particular bin, q₁[k] is a first probability estimate for the particular bin, q₂[k] is a second probability estimate for the particular bin, and c is a parameter.

The device of claim 20,
wherein, as part of determining the one or more FSM parameters for the next bin of the bin stream, the one or more processors are configured to determine the one or more FSM parameters for the next bin of the bin stream based on one or more neighboring blocks in the same frame or in a previously decoded frame.

(Deleted)

The device of claim 20,
wherein, for each respective bin of the bin stream, as part of determining the value of the respective bin, the one or more processors are configured to:
divide, based on the state for the respective bin, the interval for the respective bin into an interval associated with a first symbol and an interval associated with a second symbol; and
determine the value of the respective bin based on whether the offset value is in the interval associated with the first symbol or in the interval associated with the second symbol,
wherein, in response to determining that the offset value is in the interval associated with the first symbol, the value of the respective bin is equal to the first symbol, and
wherein, in response to determining that the offset value is in the interval associated with the second symbol, the value of the respective bin is equal to the second symbol.

The device of claim 20,
wherein the one or more processors are further configured to receive the bitstream.

The device of claim 20,
wherein the one or more FSM parameters for a first bin of the bin stream are different from the one or more FSM parameters for a second bin of the bin stream.

The device of claim 20,
wherein the device comprises:
an integrated circuit,
a microprocessor, or
a wireless communication device.

A device for encoding video data, the device comprising:
one or more storage media configured to store video data; and
one or more processing circuits coupled to the one or more storage media, the one or more processing circuits configured to:
generate a syntax element based on the video data;
determine an offset value at least in part by applying binary arithmetic encoding to the syntax element, wherein, as part of applying the binary arithmetic encoding, the one or more processors are configured to generate a bin stream at least in part by:
binarizing the syntax element; and
for at least one respective bin of the bin stream:
determining an interval for a next bin of the bin stream based on a state for the respective bin, an interval for the respective bin, and a value of the respective bin;
determining one or more finite state machine (FSM) parameters for the next bin of the bin stream, the one or more FSM parameters for the next bin controlling how probability estimates for the next bin are computed from the state for the respective bin; and
determining a state for the next bin of the bin stream using a parameterized state updating function that takes as input the state for the respective bin, the one or more FSM parameters for the next bin of the bin stream, and the value of the respective bin, wherein, as part of determining the state for the next bin of the bin stream, the one or more processors are configured to determine the state for the next bin of the bin stream according to the following equations:

where p[k+1] is the state for the next bin of the bin stream, p[k] is the state for the respective bin, b[k] is the value of the respective bin, a is a first parameter of the one or more FSM parameters for the next bin of the bin stream, and b is a second parameter of the one or more FSM parameters for the next bin of the bin stream,
wherein the offset value is equal to a value in the interval for a last bin of the bin stream; and
output a bitstream that includes the offset value.

The device of claim 31,
wherein, as part of determining the one or more FSM parameters for the next bin of the bin stream, the one or more processors are configured to reinitialize the FSM parameters for the next bin of the bin stream according to a state reinitialization parameter.

The device of claim 31,
wherein, as part of determining the one or more FSM parameters for the next bin of the bin stream, the one or more processors are configured to modify the FSM parameters for the next bin of the bin stream according to estimated probability values.

The device of claim 31,
wherein, as part of determining the one or more FSM parameters for the next bin of the bin stream, the one or more processors are configured to modify the FSM parameters for the next bin of the bin stream based on a measure of past probability variation.

The device of claim 34,
wherein the one or more processors are configured to compute the measure of past probability variation by summing absolute differences between probabilities estimated using an estimated variance measure for a particular bin defined as:

where σ[k+1] is the estimated variance measure for the next bin of the bin stream, σ[k] is the estimated variance measure for the particular bin, q₁[k] is a first probability estimate for the particular bin, q₂[k] is a second probability estimate for the particular bin, and c is a parameter.

The device of claim 31,
wherein, as part of determining the one or more FSM parameters for the next bin of the bin stream, the one or more processors are configured to determine the one or more FSM parameters for the next bin of the bin stream based on one or more neighboring blocks in the same frame or in a previously decoded frame.

(Deleted)

The apparatus of claim 31,
wherein, for each respective bin of the bin stream, as part of determining the interval for the next bin of the bin stream, the one or more processors are configured to:
divide, based on the state for the respective bin, the interval for the respective bin into an interval associated with a first symbol and an interval associated with a second symbol; and
set either an upper limit or a lower limit of the interval for the next bin based on whether the value of the respective bin is equal to the first symbol or to the second symbol,
wherein, in response to determining that the value of the respective bin is equal to the first symbol, the upper limit of the interval for the next bin is set to the upper limit of the interval associated with the first symbol and the lower limit of the interval for the next bin is left unchanged, and
wherein, in response to determining that the value of the respective bin is equal to the second symbol, the lower limit of the interval for the next bin is set to the lower limit of the interval associated with the second symbol and the upper limit of the interval for the next bin is left unchanged.
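
By way of non-limiting illustration, the interval subdivision recited above might be sketched as follows. The sketch assumes a half-open integer interval [low, high), a state expressed as the probability of the first symbol scaled by 2^15 and strictly between 0 and 2^15, and no renormalization; the function and parameter names are hypothetical.

```python
PRECISION_BITS = 15  # assumed scaling of the probability state


def next_interval(low: int, high: int, p_first: int,
                  bin_value: int, first_symbol: int = 0) -> tuple:
    """Return (low, high) of the interval for the next bin.

    The current interval is divided at `split`: the lower part is
    associated with the first symbol, the upper part with the second.
    """
    width = high - low
    split = low + ((width * p_first) >> PRECISION_BITS)
    if bin_value == first_symbol:
        # Upper limit moves to the first symbol's upper limit; lower limit unchanged.
        return low, split
    # Lower limit moves to the second symbol's lower limit; upper limit unchanged.
    return split, high
```

Only one limit moves per bin, so each step narrows the interval in proportion to the estimated probability of the symbol that actually occurred.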

The apparatus of claim 31,
wherein the one or more FSM parameters for a first bin of the bin stream are different from the one or more FSM parameters for a second bin of the bin stream.

The apparatus of claim 31,
wherein the apparatus comprises:
an integrated circuit,
a microprocessor, or
a wireless communication device.

An apparatus for decoding video data, the apparatus comprising:
means for determining a decoded syntax element by applying binary arithmetic decoding to an offset value included in a bitstream, wherein applying the binary arithmetic decoding comprises:
generating a bin stream, wherein generating the bin stream comprises:
for at least one respective bin of the bin stream:
determining a value of the respective bin based on a state for the respective bin, an interval for the respective bin, and the offset value;
determining one or more finite state machine (FSM) parameters for a next bin of the bin stream, the one or more FSM parameters for the next bin controlling how probability estimates for the next bin are computed from the state for the respective bin, the next bin of the bin stream following the respective bin in the bin stream; and
determining a state for the next bin of the bin stream using a parameterized state updating function that takes as input the state for the respective bin, the one or more FSM parameters for the next bin of the bin stream, and the value of the respective bin, wherein determining the state for the next bin of the bin stream comprises determining the state for the next bin according to the following equations:

where p[k+1] is the state for the next bin of the bin stream, p[k] is the state for the respective bin, b[k] is the value of the respective bin, a is a first parameter of the one or more FSM parameters for the next bin, and b is a second parameter of the one or more FSM parameters for the next bin; and
debinarizing the bin stream to form the decoded syntax element; and
means for reconstructing a picture of the video data based in part on the decoded syntax element.
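
By way of non-limiting illustration, a parameterized state updating function of the kind recited above might be sketched as follows. The equations themselves are not reproduced in this text, so the two-rate form used here (two estimates q1 and q2 updated with shifts a and b, then averaged into the state p) is an assumption, as are the 15-bit scaling and all names.

```python
SCALE_BITS = 15  # assumed: probabilities are integers scaled by 2**15


def update_state(q1: int, q2: int, bin_value: int, a: int, b: int) -> tuple:
    """Return (q1, q2, p) for the next bin under the ASSUMED update rule.

    Assumed equations:
        q1[k+1] = q1[k] + (2**15 * b[k] - q1[k]) / 2**a
        q2[k+1] = q2[k] + (2**15 * b[k] - q2[k]) / 2**b
        p[k+1]  = (q1[k+1] + q2[k+1]) / 2
    Divisions are arithmetic right shifts (floor division).
    """
    target = bin_value << SCALE_BITS
    q1 = q1 + ((target - q1) >> a)
    q2 = q2 + ((target - q2) >> b)
    p = (q1 + q2) >> 1
    return q1, q2, p
```

Because a and b are inputs rather than constants, the adaptation speed can differ from one bin to the next, which is the sense in which the FSM parameters control how the probability estimate is computed from the state.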

An apparatus for encoding video data, the apparatus comprising:
means for generating a syntax element based on the video data;
means for determining an offset value at least in part by applying binary arithmetic encoding to the syntax element, wherein applying the binary arithmetic encoding comprises generating a bin stream by:
binarizing the syntax element; and
for at least one respective bin of the bin stream:
determining an interval for a next bin of the bin stream based on a state for the respective bin, an interval for the respective bin, and a value of the respective bin;
determining one or more finite state machine (FSM) parameters for the next bin of the bin stream, the one or more FSM parameters for the next bin controlling how probability estimates for the next bin are computed from the state for the respective bin; and
determining a state for the next bin of the bin stream using a parameterized state updating function that takes as input the state for the respective bin, the one or more FSM parameters for the next bin of the bin stream, and the value of the respective bin, wherein determining the state for the next bin of the bin stream comprises determining the state for the next bin of the bin stream according to the following equations:

where p[k+1] is the state for the next bin of the bin stream, p[k] is the state for the respective bin, b[k] is the value of the respective bin, a is a first parameter of the one or more FSM parameters for the next bin of the bin stream, and b is a second parameter of the one or more FSM parameters for the next bin of the bin stream,
wherein the offset value is equal to a value in the interval for the last bin of the bin stream; and
means for outputting a bitstream that includes the offset value.
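
By way of non-limiting illustration, the relationship between the encoder's offset value and the decoder's bin stream might be sketched with exact rational arithmetic as follows. This is a toy model, not a practical coder: it omits binarization, renormalization, and bitstream packing, it reuses an assumed two-rate state update, and it clamps the probability away from 0 and 1 so that no interval becomes empty. All names are hypothetical; `params[k]` holds the pair (a, b) applied in the update that follows bin k.

```python
from fractions import Fraction

SCALE_BITS = 15
ONE = 1 << SCALE_BITS


def step(q1, q2, bin_value, a, b):
    """ASSUMED two-rate update of the estimator pair (q1, q2)."""
    target = bin_value << SCALE_BITS
    return q1 + ((target - q1) >> a), q2 + ((target - q2) >> b)


def split_point(low, high, q1, q2):
    """Boundary between the part for bin 0 (lower) and bin 1 (upper)."""
    p_one = min(max((q1 + q2) >> 1, 1), ONE - 1)  # P(bin == 1), clamped
    return low + (high - low) * Fraction(ONE - p_one, ONE)


def encode(bins, params):
    """Return an offset lying in the interval for the last bin."""
    low, high = Fraction(0), Fraction(1)
    q1 = q2 = ONE >> 1
    for bin_value, (a, b) in zip(bins, params):
        split = split_point(low, high, q1, q2)
        if bin_value == 0:
            high = split
        else:
            low = split
        q1, q2 = step(q1, q2, bin_value, a, b)
    return low  # any value in [low, high) identifies the same bins


def decode(offset, params):
    """Recover the bins by repeating the encoder's subdivisions."""
    low, high = Fraction(0), Fraction(1)
    q1 = q2 = ONE >> 1
    bins = []
    for a, b in params:
        split = split_point(low, high, q1, q2)
        bin_value = 0 if offset < split else 1
        if bin_value == 0:
            high = split
        else:
            low = split
        q1, q2 = step(q1, q2, bin_value, a, b)
        bins.append(bin_value)
    return bins
```

The decoder recovers each bin only because it applies the same state and the same FSM parameters as the encoder at every step, so both sides compute identical split points.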

A computer-readable storage medium storing instructions that, when executed, cause one or more processors to:
determine a decoded syntax element by applying binary arithmetic decoding to an offset value included in a bitstream, wherein, as part of causing the one or more processors to apply the binary arithmetic decoding, execution of the instructions causes the one or more processors to:
generate a bin stream, wherein, as part of causing the one or more processors to generate the bin stream, execution of the instructions causes the one or more processors to:
for at least one respective bin of the bin stream:
determine a value of the respective bin based on a state for the respective bin, an interval for the respective bin, and the offset value;
determine one or more finite state machine (FSM) parameters for a next bin of the bin stream, the one or more FSM parameters for the next bin controlling how probability estimates for the next bin are computed from the state for the respective bin, the next bin of the bin stream following the respective bin in the bin stream; and
determine a state for the next bin of the bin stream using a parameterized state updating function that takes as input the state for the respective bin, the one or more FSM parameters for the next bin of the bin stream, and the value of the respective bin, wherein, as part of causing the one or more processors to determine the state for the next bin of the bin stream, execution of the instructions causes the one or more processors to determine the state for the next bin according to the following equations:

where p[k+1] is the state for the next bin of the bin stream, p[k] is the state for the respective bin, b[k] is the value of the respective bin, a is a first parameter of the one or more FSM parameters for the next bin, and b is a second parameter of the one or more FSM parameters for the next bin; and
debinarize the bin stream to form the decoded syntax element; and
reconstruct a picture of the video data based in part on the decoded syntax element.

A computer-readable storage medium storing instructions that, when executed, cause one or more processors to:
generate a syntax element based on video data;
determine an offset value at least in part by applying binary arithmetic encoding to the syntax element, wherein, as part of causing the one or more processors to apply the binary arithmetic encoding, execution of the instructions causes the one or more processors to generate a bin stream at least in part by causing the one or more processors to:
binarize the syntax element; and
for at least one respective bin of the bin stream:
determine an interval for a next bin of the bin stream based on a state for the respective bin, an interval for the respective bin, and a value of the respective bin;
determine one or more finite state machine (FSM) parameters for the next bin of the bin stream, the one or more FSM parameters for the next bin controlling how probability estimates for the next bin are computed from the state for the respective bin; and
determine a state for the next bin of the bin stream using a parameterized state updating function that takes as input the state for the respective bin, the one or more FSM parameters for the next bin of the bin stream, and the value of the respective bin, wherein, as part of causing the one or more processors to determine the state for the next bin of the bin stream, execution of the instructions causes the one or more processors to determine the state for the next bin of the bin stream according to the following equations:

where p[k+1] is the state for the next bin of the bin stream, p[k] is the state for the respective bin, b[k] is the value of the respective bin, a is a first parameter of the one or more FSM parameters for the next bin of the bin stream, and b is a second parameter of the one or more FSM parameters for the next bin of the bin stream,
wherein the offset value is equal to a value in the interval for the last bin of the bin stream; and
output a bitstream that includes the offset value.