KR20130140190A

KR20130140190A - Methods and devices for coding and decoding the position of the last significant coefficient

Info

Publication number: KR20130140190A
Application number: KR1020137030056A
Authority: KR
Inventors: Dake He; Jing Wang
Original assignee: BlackBerry Limited
Priority date: 2011-04-15
Filing date: 2011-04-15
Publication date: 2013-12-23
Also published as: CA2832086C; EP2697974B1; KR101571618B1; CA2832086A1; EP2697974A4; EP2697974A2; WO2012139192A2; WO2012139192A3; EP3229473B1; EP3229473A1; CN103597838B; CN103597838A

Abstract

Methods and devices are described for entropy coding quantized transform domain coefficient data using an entropy coder. Last significant coefficient information is signaled in the bitstream using the two-dimensional coordinates of the last significant coefficient. The context for the bins of one of the coordinates is based in part on the value of the other coordinate. In one case, instead of signaling the last significant coefficient information, the number of non-zero coefficients is binarized and entropy encoded.

Description

Methods and devices for coding and decoding the position of the last significant coefficient

The present application generally relates to data compression and, in particular, to an encoder, a decoder and methods for coding and decoding the last significant transform coefficient.

Data compression, whether lossy or lossless, often uses entropy coding to encode a decorrelated signal as a sequence of bits, i.e. a bitstream. Efficient data compression has a wide range of applications, such as image, audio and video encoding. The conventional video encoding technology is the ITU-T H.264/MPEG AVC video coding standard. It defines a number of different profiles for different applications, including the Main profile, the Baseline profile and others. A next-generation video encoding standard is currently under development through a joint initiative of MPEG and ITU: High Efficiency Video Coding (HEVC).

There are a number of standards for encoding/decoding images and video, including H.264, that use lossy compression processes to produce binary data. For example, H.264 includes a prediction operation to obtain residual data, followed by a DCT transform and quantization of the DCT coefficients. The resulting data, including quantized coefficients, motion vectors, coding mode and other related data, is then entropy coded to generate a bitstream of data for transmission or storage on a computer-readable medium. HEVC is expected to have these features as well.

A number of coding schemes have been developed to encode binary data. For example, JPEG images may be encoded using Huffman codes. The H.264 standard allows for two possible entropy coding processes: Context Adaptive Variable Length Coding (CAVLC) or Context Adaptive Binary Arithmetic Coding (CABAC). CABAC achieves greater compression than CAVLC, but is more computationally demanding. In either case, the coding scheme operates on the binary data to produce a serial bitstream of encoded data. At the decoder, the decoding scheme receives the bitstream and entropy decodes the serial bitstream to reconstruct the binary data.

It would be advantageous to provide an improved encoder, decoder, and method of entropy coding and decoding.

Reference will now be made, by way of example, to the accompanying drawings, which show example embodiments of the present application.
FIG. 1 shows, in block diagram form, an encoder for encoding video.
FIG. 2 shows, in block diagram form, a decoder for decoding video.
FIG. 3 shows a block diagram of an encoding process.
FIG. 4 shows a simplified block diagram of an example embodiment of an encoder.
FIG. 5 shows a simplified block diagram of an example embodiment of a decoder.
FIG. 6 illustrates a zig-zag coding order for a 4x4 block of coefficients.
FIG. 7 schematically shows a portion of a bitstream.
FIG. 8 shows, in flowchart form, an example method of entropy encoding last significant coefficient information.
FIG. 9 shows, in flowchart form, an example method of entropy decoding a bitstream of encoded data to reconstruct quantized transform domain coefficient data.
FIG. 10 shows, in flowchart form, an example method of encoding a significance map.
FIG. 11 illustrates the anti-diagonal grouping of coefficients in a 4x4 block.
Similar reference numerals may be used in different figures to denote similar components.

The present application describes devices, methods and processes for encoding and decoding binary data. In particular, it describes methods and devices for coding and decoding the last significant coefficient position in a block-based coding scheme.

In one aspect, the present application describes a method of encoding quantized transform domain coefficient data that includes last significant coefficient information. The method includes binarizing each of the two positions of the two-dimensional coordinates of the last significant coefficient; determining a context for each bin of one of the positions; determining a context for each bin of the other of the positions, wherein the context of each bin of the other of the positions is based in part on said one of the positions; and entropy encoding the binarized positions, based on the context determined for each of their bins, to produce encoded data.
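The steps of this aspect can be sketched in Python. This is only an illustration under stated assumptions: truncated unary binarization, the three-way classification of `x`, and the tuple-shaped context labels are hypothetical choices made for the example and are not taken from the application. The names `unary_binarize` and `contexts_for_last_position` are likewise invented here.

```python
def unary_binarize(value, max_value):
    """Truncated unary code (an assumed binarization): `value` ones followed
    by a terminating zero, which is omitted when value == max_value."""
    bins = [1] * value
    if value < max_value:
        bins.append(0)
    return bins


def contexts_for_last_position(x, y, size):
    """Pair every bin of the binarized (x, y) position with a context label.

    The labels are hypothetical tuples; a real coder would map them to
    entries in a table of adaptive probability states.
    """
    x_bins = unary_binarize(x, size - 1)
    y_bins = unary_binarize(y, size - 1)

    # Bins of the first coordinate: context depends on the bin index only.
    x_ctx = [(b, ("x", i)) for i, b in enumerate(x_bins)]

    # Bins of the second coordinate: context depends on the bin index and,
    # in part, on the value of the first coordinate (coarsely classified).
    if x == 0:
        x_class = 0
    elif x < size // 2:
        x_class = 1
    else:
        x_class = 2
    y_ctx = [(b, ("y", i, x_class)) for i, b in enumerate(y_bins)]
    return x_ctx, y_ctx
```

Each `(bin, context)` pair would then be handed to a context-adaptive binary entropy coder. A decoder can derive the same contexts for the `y` bins provided it decodes `x` first.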

In another aspect, the present application describes a method of decoding a bitstream of encoded data to reconstruct quantized transform domain coefficient data. The method includes entropy decoding a portion of the encoded data to produce two binarized positions defining the two-dimensional coordinates of the last significant coefficient, wherein entropy decoding the portion of the data includes determining a context for each bin of one of the positions and determining a context for each bin of the other of the positions, and wherein the context of each bin of the other of the positions is based in part on said one of the positions; entropy decoding a significant coefficient sequence based on the two-dimensional coordinates of the last significant coefficient; entropy decoding level information based on the significant coefficient sequence; and reconstructing the quantized transform domain coefficient data using the level information and the significant coefficient sequence.

In yet another aspect, the present application describes a computer-readable medium storing computer-executable program instructions which, when executed, configure a processor to perform the described methods of encoding and/or decoding.

Other aspects and features of the present application will be understood by those of ordinary skill in the art from a review of the following description of examples in conjunction with the accompanying figures.

The following description relates to data compression in general and, in particular, to the efficient encoding and decoding of finite alphabet sources, such as a binary source. Many of the examples given below describe particular applications of such an encoding and decoding scheme; for example, many of the illustrations refer to video coding. It will be appreciated that the present application is not necessarily limited to video coding or image coding. It may be applicable to any type of data that is block-based and subject to a context-based data encoding scheme that involves signaling the position of the last significant bit or symbol in the block.

The example embodiments described herein relate to data compression of a finite alphabet source. Accordingly, the description often refers to "symbols", which are the elements of the alphabet. In some cases, the description refers to binary sources, and the symbols are then called bits. At times, the terms "symbol" and "bit" may be used interchangeably for a given example. It will be appreciated that a binary source is only one example of a finite alphabet source. The present application is not limited to binary sources.

In the description that follows, example embodiments are described with reference to the H.264 standard. Those of ordinary skill in the art will understand that the present application is not limited to H.264 and may be applicable to other video coding/decoding standards, including possible future standards such as HEVC. It will also be appreciated that the present application is not necessarily limited to video coding/decoding and may be applicable to the coding/decoding of any finite alphabet source.

Reference is now made to FIG. 1, which shows, in block diagram form, an encoder 10 for encoding video. Reference is also made to FIG. 2, which shows a block diagram of a decoder 50 for decoding video. It will be appreciated that the encoder 10 and decoder 50 described herein may each be implemented on an application-specific or general purpose computing device containing one or more processing elements and memory. The operations performed by the encoder 10 or decoder 50, as the case may be, may be implemented by way of an application-specific integrated circuit (ASIC), for example, or by way of stored program instructions executable by a general purpose processor. The device may include additional software, including, for example, an operating system for controlling basic device functions. The range of devices and platforms within which the encoder 10 or decoder 50 may be implemented will be appreciated by those of ordinary skill in the art having regard to the following description.

The encoder 10 receives a video source 12 and produces an encoded bitstream 14. The decoder 50 receives the encoded bitstream 14 and outputs a decoded video frame 16. The encoder 10 and decoder 50 may be configured to operate in conformance with a number of video compression standards. For example, the encoder 10 and decoder 50 may be H.264/AVC compliant. In other embodiments, the encoder 10 and decoder 50 may conform to other video compression standards, including evolutions of the H.264/AVC standard such as HEVC.

The encoder 10 includes a spatial predictor 21, a coding mode selector 20, a transform processor 22, a quantizer 24, and an entropy coder 26. As will be appreciated by those of ordinary skill in the art, the coding mode selector 20 determines the appropriate coding mode for the video source, for example whether the subject frame/slice is of I, P, or B type, and whether particular macroblocks (or coding units) within the frame/slice are inter-coded or intra-coded. The transform processor 22 performs a transform upon the spatial domain data. In particular, the transform processor 22 applies a block-based transform to convert spatial domain data to spectral components. For example, in many embodiments a discrete cosine transform (DCT) is used. In some instances other transforms, such as a discrete sine transform, may be used. Applying the block-based transform to a block of pixel data results in a set of transform domain coefficients. The set of transform domain coefficients is quantized by the quantizer 24. The quantized coefficients and associated information (motion vectors, quantization parameters, etc.) are then encoded by the entropy coder 26.

Intra-coded frames/slices (i.e. type I) are encoded without reference to other frames/slices. In other words, they do not employ temporal prediction. However, intra-coded frames do rely upon spatial prediction within the frame/slice, performed by the spatial predictor 21 as illustrated in FIG. 1. That is, when encoding a particular block, the data in the block may be compared with the data of nearby pixels within blocks already encoded for that frame/slice. Using a prediction algorithm, the source data of the block may be converted to residual data. The transform processor 22 then encodes the residual data. H.264, for example, prescribes nine spatial prediction modes for 4x4 transform blocks. In some embodiments, each of the nine modes may be used to independently process a block, and rate-distortion optimization is then used to select the best mode.

The H.264 standard also prescribes the use of motion prediction/compensation to take advantage of temporal prediction. Accordingly, the encoder 10 has a feedback loop that includes a dequantizer 28, an inverse transform processor 30, and a deblocking processor 32. These elements mirror the decoding process implemented by the decoder 50 to reproduce the frame/slice. A frame store 34 is used to store the reproduced frames. In this manner, the motion prediction is based on what will be the reconstructed frames at the decoder 50 and not on the original frames, which may differ from the reconstructed frames due to the lossy compression involved in encoding/decoding. A motion predictor 36 uses the frames/slices stored in the frame store 34 as source frames/slices for comparison with a current frame for the purpose of identifying similar blocks. Accordingly, for macroblocks to which motion prediction is applied, the "source data" that the transform processor 22 encodes is the residual data that comes out of the motion prediction process. The residual data is pixel data that represents the differences (if any) between the reference block and the current block. Information regarding the reference frame and/or motion vector may not be processed by the transform processor 22 and/or quantizer 24, but instead may be supplied to the entropy coder 26 for encoding as part of the bitstream along with the quantized coefficients.

Those of ordinary skill in the art will appreciate the details and possible variations for implementing H.264 encoders.

The decoder 50 includes an entropy decoder 52, a dequantizer 54, an inverse transform processor 56, a spatial compensator 57, and a deblocking processor 60. A frame buffer 58 supplies reconstructed frames for use by a motion compensator 62 in applying motion compensation. The spatial compensator 57 represents the operation of recovering the video data for a particular intra-coded block from a previously decoded block.

The bitstream 14 is received and decoded by the entropy decoder 52 to recover the quantized coefficients. Side information may also be recovered during the entropy decoding process, some of which may be supplied to the motion compensation loop for use in motion compensation, if applicable. For example, the entropy decoder 52 may recover motion vectors and/or reference frame information for inter-coded macroblocks.

The quantized coefficients are then dequantized by the dequantizer 54 to produce the transform domain coefficients, which are then inverse transformed by the inverse transform processor 56 to recreate the "video data". It will be appreciated that, in some cases, such as with an intra-coded macroblock, the recreated "video data" is the residual data for use in spatial compensation relative to a previously decoded block within the frame. The spatial compensator 57 generates the video data from the residual data and the pixel data from a previously decoded block. In other cases, such as inter-coded macroblocks, the recreated "video data" from the inverse transform processor 56 is the residual data for use in motion compensation relative to a reference block from a different frame. Both spatial and motion compensation may be referred to herein as "prediction operations".

The motion compensator 62 locates, within the frame buffer 58, the reference block specified for a particular inter-coded macroblock. It does so based on the reference frame information and motion vector specified for the inter-coded macroblock. It then supplies the reference block pixel data for combination with the residual data to arrive at the recreated video data for that macroblock.

A deblocking process may then be applied to the reconstructed frame/slice, as indicated by the deblocking processor 60. After deblocking, the frame/slice is output as the decoded video frame 16, for example for display on a display device. It will be understood that a video playback machine, such as a computer, set-top box, DVD or Blu-ray player, and/or mobile handheld device, may buffer decoded frames in a memory prior to display on an output device.

Entropy coding is a fundamental part of all lossless and lossy compression schemes, including the video compression described above. The purpose of entropy coding is to represent a presumably decorrelated signal, often modeled by an independent but not identically distributed process, as a sequence of bits. The technique used to achieve this must not depend on how the decorrelated signal was generated, but may rely on relevant probability estimates for each upcoming symbol.

There are two common approaches to entropy coding used in practice. The first is variable-length coding (VLC), which identifies input symbols or input sequences by codewords. The second is range (or arithmetic) coding, which encapsulates a sequence of subintervals of the [0, 1) interval to arrive at a single interval, from which the original sequence can be reconstructed using the probability distributions that defined those intervals. Typically, range coding methods tend to offer better compression, while VLC methods have the potential to be faster. In either case, the symbols of the input sequence are from a finite alphabet.
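The interval narrowing behind range (arithmetic) coding can be illustrated with a short Python sketch. It is an idealized model only: exact rational arithmetic replaces the finite-precision integer registers and renormalization a practical coder would use, and the function names and per-bit probability list are assumptions made for the example.

```python
from fractions import Fraction


def narrow_interval(bits, p_one):
    """Shrink [0, 1) once per bit; p_one[i] is the estimated P(bit i == 1).

    Returns the final interval as (low, high). Any number inside it
    identifies the whole bit sequence, given the same probabilities.
    """
    low, width = Fraction(0), Fraction(1)
    for bit, p1 in zip(bits, p_one):
        p0 = 1 - p1
        if bit == 0:
            width *= p0            # keep the lower part of the interval
        else:
            low += width * p0      # skip the part reserved for "0"
            width *= p1            # keep the upper part
    return low, low + width


def recover_bits(point, p_one):
    """Invert narrow_interval: replay the same subdivisions and record
    which side of each split `point` falls on."""
    low, width = Fraction(0), Fraction(1)
    bits = []
    for p1 in p_one:
        split = low + width * (1 - p1)
        if point < split:
            bits.append(0)
            width *= 1 - p1
        else:
            bits.append(1)
            low = split
            width *= p1
    return bits
```

For example, with `p_one = [Fraction(1, 4), Fraction(1, 4)]`, the bits `[1, 0]` map to the interval [3/4, 15/16), and `recover_bits(Fraction(3, 4), p_one)` yields `[1, 0]` again.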

A special case of entropy coding is when the input alphabet is restricted to binary symbols. Here VLC schemes must group input symbols together to have any potential for compression, but since the probability distribution can change after each bit, efficient code construction is difficult. Accordingly, range encoding is considered to offer greater compression owing to its greater flexibility, but practical applications are hindered by the higher computational requirements of arithmetic codes.

A common challenge for both of these encoding approaches is that they are inherently serial in nature. In some important practical applications, such as high-quality video decoding, the entropy decoder has to reach very high output speed, which can pose a problem for devices with limited processing power or speed.

One of the techniques used together with some entropy coding schemes, such as CAVLC and CABAC (both of which are used in H.264/AVC), is context modeling. With context modeling, each bit of the input sequence has a context, where the context may be given by a certain subset of the other bits, such as the bits that preceded it, or by side information, or both. In a first-order context model, the context may depend entirely upon the previous bit (symbol). In many cases, the context models may be adaptive, such that the probabilities associated with symbols for a given context may change as further bits of the sequence are processed. In yet other cases, the context of a given bit may depend upon its position in the sequence, e.g. the position or ordinal of a coefficient in a matrix or block of coefficients.
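A minimal sketch of an adaptive first-order context model follows. It assumes simple frequency counting with both counts initialised to one, which is an illustrative choice and not the state-machine probability estimator used by CABAC. The class and function names are hypothetical.

```python
from fractions import Fraction


class AdaptiveBitModel:
    """Per-context counts of zeros and ones, updated after every bit."""

    def __init__(self):
        self.counts = {}  # context -> [count_of_zeros, count_of_ones]

    def p_one(self, ctx):
        """Current estimate of P(bit == 1) in this context."""
        c0, c1 = self.counts.get(ctx, (1, 1))
        return Fraction(c1, c0 + c1)

    def update(self, ctx, bit):
        """Adapt: record the bit that was actually observed."""
        self.counts.setdefault(ctx, [1, 1])[bit] += 1


def model_sequence(bits):
    """First-order model: the context of each bit is the previous bit
    (assumed to be 0 before the first bit)."""
    model = AdaptiveBitModel()
    prev = 0
    out = []
    for b in bits:
        ctx = prev
        out.append((b, ctx, model.p_one(ctx)))  # estimate BEFORE seeing b
        model.update(ctx, b)
        prev = b
    return out
```

The estimate is read before the update so that a decoder, which has seen only the earlier bits, can form the same estimate at the same point in the sequence.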

Reference is made to FIG. 3, which shows a block diagram of an example encoding process 100. The encoding process 100 includes a context modeling component 104 and an entropy coder 106. The context modeling component 104 receives the input sequence x 102, which in this example is a bit sequence (b₀, b₁, …, b_n). In this example illustration, the context modeling component 104 determines a context for each bit b_i, possibly based on one or more previous bits in the sequence or on side information, and determines, based on the context, a probability p_i associated with that bit b_i, where the probability is the probability that the bit will be the Least Probable Symbol (LPS). The LPS may be "0" or "1" in a binary embodiment, depending on the convention or application. The determination of the probability may itself depend upon the previous bits/symbols for that same context.

The context modeling component outputs the input sequence, i.e. the bits (b₀, b₁, …, b_n), along with their respective probabilities (p₀, p₁, …, p_n). The probabilities are estimated probabilities determined by the context model. This data is then input to the entropy coder 106, which encodes the input sequence using the probability information. For example, the entropy coder 106 may be a binary arithmetic coder. The entropy coder 106 outputs a bitstream 108 of encoded data.
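To see why the entropy coder benefits from the probabilities p_i, one can compute the ideal (information-theoretic) cost of a sequence: a bit that turns out to be the LPS costs -log2(p_i) bits, and any other bit costs -log2(1 - p_i). The Python sketch below computes that bound. It is not an entropy coder, and the function name and argument layout are assumptions made for this example.

```python
import math


def ideal_code_length(is_lps, p_lps):
    """Ideal cost in bits of a binary sequence.

    is_lps[i] is True when bit i equals the least probable symbol;
    p_lps[i] is the estimated probability of the LPS for bit i.
    """
    total = 0.0
    for lps, p in zip(is_lps, p_lps):
        total += -math.log2(p if lps else 1.0 - p)
    return total
```

With p_i = 0.25 for every bit, two non-LPS bits followed by one LPS bit cost about 2.83 bits in total, less than the 3 bits needed without modeling. A good arithmetic coder approaches this bound closely.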

It will be appreciated that each bit of the input sequence is processed serially to update the context model, and that the serial bits and probability information are supplied to the entropy coder 106, which then serially entropy codes the bits to create the bitstream 108. Those of ordinary skill in the art will appreciate that, in some embodiments, explicit probability information may not be passed from the context modeling component 104 to the entropy coder 106. Rather, in some instances, for each bit the context modeling component 104 may send the entropy coder 106 an index or other indicator that reflects the probability estimate made by the context modeling component 104 based on the context model and the current context of the input sequence 102. The index or other indicator is indicative of the probability estimate associated with its corresponding bit.

In some embodiments, the entropy coder 106 may have a parallel processing architecture for encoding the input sequence 102. In such an embodiment, the entropy coder 106 may include a plurality of entropy coders, each processing a portion of the input sequence 102. In some cases, the input sequence may be demultiplexed and allocated among the parallel entropy coders on the basis of the estimated probabilities associated with the respective bits. In other words, a bit from the input sequence 102 is allocated to one of the parallel entropy coders on the basis of its estimated probability.

At the decoder, the encoded bitstream is decoded using the reverse process. In particular, the decoder performs the same context modeling and probability estimation process to determine the context of the next reconstructed symbol of the reconstructed sequence. On the basis of the context determined for the next reconstructed symbol, an estimated probability is determined. The encoded bitstream, which may be made up of codewords output by the entropy coder(s), is decoded to obtain decoded symbols. The determination of context/probability is interleaved with the decoding of codewords to obtain the decoded symbols corresponding to those estimated probabilities.

In a parallel encoding embodiment, the decoder may be configured to demultiplex the encoded bitstream into a plurality of decoded subsequences, each associated with an estimated probability. The context modeling and probability estimation then result in the selection of a reconstructed symbol from the associated decoded subsequence. It will be appreciated that, in such an implementation, the decoding of the encoded bitstream may be considered de-interleaved from the context modeling and probability estimation.

It will be appreciated from the detailed description below that the present application is applicable to either serial or parallel entropy coding and decoding.

The present application proposes an encoding and decoding process in which the last significant coefficient position is encoded and in which the context for one axis of the position depends on the other axis.

The examples below may refer specifically to video encoding and, in particular, to the encoding of the sequences sig[i, j] and last[i, j] defined in CABAC as specified in the ITU-T H.264/AVC standard. It will be appreciated that the present application is not limited to the encoding and decoding of these two specific sequences within CABAC, nor is it limited to video encoding and decoding or to the H.264/AVC standard. The present application describes encoding and decoding methods and processes that may be applied to other data sequences, including video, image, and, in some instances, audio. The methods and processes herein are applicable to encoding and decoding processes that encode a last significant coefficient position and that involve a context model, where the position is two-dimensional or can be modeled as two-dimensional. The examples below may refer to a binary source as an example, although the present application is more generally applicable to any finite alphabet source.

As described above, example video and image encoding and decoding processes employ a block-based transform to convert residual data from the pixel domain to the transform domain. Example block-based transforms are a 4x4 DCT or an 8x8 DCT. In some applications, other sizes or types of transforms (DST or DFT, etc.) may be used. The matrix or set of transform data is then quantized by a quantizer to produce a matrix or set of quantized transform domain coefficients. The present application may refer to the matrix of quantized transform domain coefficients as a matrix, a set, or a block, meaning an ordered set of data in which the position of any one of the coefficients may be specified by two-dimensional coordinates [x, y].

The entropy encoding of the block of quantized transform domain coefficients is based upon a context model. For example, in H.264/AVC the block is entropy encoded by first encoding a significance map. The significance map includes two sequences: last[i, j] and sig[i, j]. The sequence sig[i, j] is a binary sequence indicating whether there is a non-zero coefficient at each position in the DCT block. The other sequence, last[i, j], is a binary sequence mapped to the non-zero coefficients of the DCT block; it indicates whether each non-zero coefficient is the last non-zero coefficient of the DCT block (in the zigzag scan order used in H.264 and other transform domain image or video encoding schemes). Note that the indices [i, j] are not a two-dimensional coordinate position in the block: index i is an index to the block, and index j is an index to the one-dimensional coefficient position in the zigzag scan order described below.

The H.264 standard specifies a zigzag scan order for encoding the coefficients of a DCT block. For example, with reference to a 4x4 DCT block, the H.264 standard encodes the 16 coefficients in the zigzag order illustrated in FIG. 6. As shown in FIG. 6, the scan order begins in the upper left corner of the block and follows a zigzag pattern to the lower right corner. Once arranged in this coding order, the resulting sequence of coefficients for the i-th block is X(i, 0), ..., X(i, 15).
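As an illustration of the scan just described, the following Python sketch enumerates the positions of an N x N block in zigzag order by walking its anti-diagonals in alternating directions. The function name `zigzag_order` and the (x, y) convention (x as column, y as row) are assumptions made for this example; FIG. 6 is not reproduced here, so the sketch follows the commonly cited 4x4 frame zigzag rather than the figure itself.

```python
def zigzag_order(n):
    """Return the (x, y) positions of an n x n block in zigzag scan order.

    Assumed convention: x is the column, y is the row, and the scan starts
    at the upper left corner (0, 0) and ends at the lower right corner.
    """
    order = []
    for a in range(2 * n - 1):          # a = x + y indexes the anti-diagonal
        lo = max(0, a - (n - 1))        # smallest coordinate that stays in the block
        hi = min(a, n - 1)              # largest coordinate that stays in the block
        if a % 2 == 1:
            # Odd diagonals are walked from top right to bottom left (y increases).
            order.extend((a - y, y) for y in range(lo, hi + 1))
        else:
            # Even diagonals are walked from bottom left to top right (x increases).
            order.extend((x, a - x) for x in range(lo, hi + 1))
    return order


if __name__ == "__main__":
    for j, (x, y) in enumerate(zigzag_order(4)):
        print(j, (x, y))
```

Index j in X(i, j) then corresponds to the j-th entry of the returned list.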

The H.264/AVC standard uses a scheme in which the blocks are encoded in sequence. In other words, each sequence X(i, 0), ..., X(i, 15) is encoded in turn. The context model used in H.264/AVC includes determining the two binary sequences sig[i, j] and last[i, j] for each vector X[i, j]. The actual values of the coefficients (referred to as levels) are then also encoded.

To illustrate by way of example, consider the following example coefficient sequences X:

3, 5, 2, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0    i = 0

6, 4, 0, 3, 3, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0    i = 1

4, 2, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0    i = 2

The sequences sig[i, j] and last[i, j] for these coefficient sequences are as follows:

sig[0, j] = 1, 1, 1, 0, 1, 1, 0, 1

last[0, j] = 0, 0, 0, 0, 0, 1

sig[1, j] = 1, 1, 0, 1, 1, 0, 1

last[1, j] = 0, 0, 0, 0, 1

sig[2, j] = 1, 1, 0, 0, 1

last[2, j] = 0, 0, 1

It will be appreciated that the last[i, j] sequence only includes values where sig[i, j] is non-zero, and that both sequences terminate after the last non-zero coefficient. Accordingly, last[i, j] will not necessarily include a bit for every bit j in sig[i, j]. It will also be appreciated that the lengths of these sequences may vary depending on the coefficient values. Finally, it will be appreciated that knowing whether a given bit of the sig[i, j] sequence is non-zero tells one whether there is a corresponding bit in the last[i, j] sequence, which means that the encoding and decoding of these sequences is interleaved by bit position.
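The derivation of the two sequences can be sketched in a few lines of Python. This is a minimal illustration, not the standard's procedure: the function name `significance_map` is hypothetical, and the input is assumed to be a list of coefficients already arranged in zigzag scan order.

```python
def significance_map(coeffs):
    """Derive the sig and last sequences for one block.

    coeffs: quantized coefficients of one block in scan order (assumed input).
    Returns (sig, last) as lists of 0/1 values; both are empty when the
    block has no non-zero coefficient.
    """
    nonzero_positions = [j for j, c in enumerate(coeffs) if c != 0]
    if not nonzero_positions:
        return [], []
    last_j = nonzero_positions[-1]
    # sig has one bit per scan position up to the last non-zero coefficient.
    sig = [1 if c != 0 else 0 for c in coeffs[: last_j + 1]]
    # last has one bit per non-zero coefficient; only the final one is 1.
    last = [1 if j == last_j else 0 for j in nonzero_positions]
    return sig, last


if __name__ == "__main__":
    x0 = [3, 5, 2, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0]
    print(significance_map(x0))
```

Applied to the three example sequences above, the sketch reproduces the listed sig and last values, including the fact that last is shorter than sig whenever the block contains zeros before its last non-zero coefficient.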

In the H.264/AVC example, the probability (sometimes termed the "state") for a bit is determined on the basis of its context. Specifically, the bit history for that same context determines the probability that is selected or assigned to the bit. For example, for a bit in the j-th position of a given sequence, its probability is selected from among 64 possible probabilities on the basis of the history of the bits in the j-th position of previous sequences (i-1, etc.).

As described above, the bits are encoded on the basis of their probabilities. In some example embodiments, parallel encoding may be used and, in some instances, the parallel encoding may include an entropy coder dedicated to each probability. In other example embodiments, serial entropy encoding may be used. In either case, the entropy coder encodes symbols on the basis of their associated probabilities.

At the decoder, the same context modeling and probability estimation are performed to reconstruct the sequences. The probability estimation is used to decode the encoded bitstream to obtain reconstructed symbols. Symbols are selected from the decoded portions of the encoded bitstream and interleaved to form the reconstructed sequence of symbols, on the basis of their associated probabilities and the context model.

The syntax for a residual block in H.264/AVC using CABAC is set out in the table below:

The table above provides pseudocode for entropy decoding a bitstream that contains sig[i, j], last[i, j], and level information. The descriptor "ae(v)" indicates that bits from the bitstream are entropy decoded to obtain the values indicated in that row of the table.

Note that the bits of sig[i, j] and last[i, j] are interleaved in this syntax. For example, the "if(coded_block_flag)" loop contains the decoding of a bit from the sig[i, j] sequence (referred to as significant_coeff_flag[i]), followed by the decoding of a bit from the last[i, j] sequence (referred to as last_significant_coeff_flag[i]) if significant_coeff_flag[i] is non-zero.

It will also be appreciated that this syntax relies on a sequence to signal the one-dimensional position of the last significant coefficient in the zigzag scan order for the block.

In accordance with one aspect of the present application, this syntax is modified to signal the two-dimensional coordinates of the last significant coefficient within the block. For example, in a 4x4 block the position of the last significant coefficient has an x-coordinate and a y-coordinate, where x and y each range from 0 to 3. This coordinate pair may be communicated in the syntax in place of the last[i, j] sequence.

The question that arises is how to code the two-dimensional coordinates of the last significant coefficient efficiently and effectively. A further question is whether modifications should be made to the context model for encoding this parameter.

In accordance with one aspect of the present application, the two-dimensional coordinates are encoded in turn, with the value of the first coordinate being used, in part, to determine the context for encoding the second coordinate. This concept is based on the empirical observation that there is a degree of correlation between the values of the two coordinates in a pair. The value of the x-coordinate tends to have a significant impact on the probability of the value of the corresponding y-coordinate. This relationship can be exploited to improve the efficiency of the encoding.

In one embodiment in accordance with an aspect of the present application, a fixed-length code is used to binarize the x-coordinate and y-coordinate values. In other embodiments, other binarization schemes may be used.

In accordance with another aspect of the present application, the syntax for encoding a residual block is further modified to include a flag for signaling that the last significant coefficient is the DC coefficient at [0,0]. This case is common in practical implementations. When it is signaled in the bitstream, the bitstream may omit the x-coordinate and y-coordinate values for the last significant coefficient, thereby improving compression efficiency.

In the description herein, the term "position" may at times be used to refer to the x-coordinate or the y-coordinate, as the case may be.

Although the example embodiment described below specifies that the context for encoding the bits of the y-position depends, in part, upon the value of the x-position, the order is arbitrary. In another embodiment, the context for encoding the x-position may depend, in part, upon the value of the y-position.

One example embodiment of the syntax for encoding a residual block, in accordance with an aspect of the present application, is set out in the following table:

From the pseudocode above, it will be noted that the decoder initializes the values for the x-position (last_pos_x = 0) and the y-position (last_pos_y = 0), and then reads the flag (last_0_flag) that signals whether the last significant coefficient is the DC coefficient. The loop that immediately follows is performed only if last_0_flag does not indicate that the last significant coefficient is at [0,0]. In that case, the decoder then reads the values for last_pos_x and last_pos_y from the bitstream.

It will also be noted that, in this embodiment, if last_pos_x is 0 then the y-position is encoded as its value reduced by 1: because of the flag setting, the two-dimensional position cannot be [0,0] in that circumstance, so the y-position must be 1 or greater. Because of this encoding syntax, when last_pos_x is 0 the decoded value of last_pos_y is incremented by 1 to restore its actual value. Reducing last_pos_y by 1 for encoding in this particular circumstance is done for efficiency in the entropy encoding.
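The signaling rules described above (the DC flag and the reduced y-position when x is 0) can be sketched as a round trip over syntax element values. This is only an illustration under stated assumptions: entropy coding and binarization are left out, and the helper names `encode_last_position` and `decode_last_position`, as well as the use of a dictionary to stand in for the bitstream, are hypothetical.

```python
def encode_last_position(x, y):
    """Map the last significant coefficient position to syntax element values."""
    if (x, y) == (0, 0):
        # The DC case is signaled by the flag alone; no coordinates are sent.
        return {"last_0_flag": 1}
    # When x is 0 the position cannot be [0,0], so y >= 1 and y - 1 is sent.
    coded_y = y - 1 if x == 0 else y
    return {"last_0_flag": 0, "last_pos_x": x, "last_pos_y": coded_y}


def decode_last_position(elements):
    """Recover (last_pos_x, last_pos_y) from the syntax element values."""
    last_pos_x, last_pos_y = 0, 0          # initialized as in the described syntax
    if elements["last_0_flag"]:
        return last_pos_x, last_pos_y
    last_pos_x = elements["last_pos_x"]
    last_pos_y = elements["last_pos_y"]
    if last_pos_x == 0:
        last_pos_y += 1                    # undo the reduction applied by the encoder
    return last_pos_x, last_pos_y


if __name__ == "__main__":
    for pos in [(0, 0), (0, 1), (2, 3)]:
        elems = encode_last_position(*pos)
        print(pos, elems, decode_last_position(elems))
```

With x equal to 0, the coded y-value spans one fewer possibility than it otherwise would, which is the efficiency gain the text refers to.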

After the two-dimensional coordinates have been read, the decoder goes on to read the significant_coeff_flag[i] sequence from the bitstream. The index [i] sets the current pos_x and pos_y values using a zigzag table that maps the index to a coordinate position. When the pos_x and pos_y values match the last_pos_x and last_pos_y values, respectively, the last_significent_coeff flag is set so as to stop the reading of the sig[i, j] sequence, which is read bit by bit as significant_coeff_flag[i].

Reference is now made to FIG. 7, which diagrammatically shows the structure of a bitstream 200 created in accordance with an aspect of the present application. The portion of the bitstream 200 illustrated in FIG. 7 contains the data relating to a residual block. The portion of the bitstream 200 is shown as it appears before entropy encoding or after entropy decoding. The entropy encoding may include CABAC, CAVLC, or another context-based entropy coding scheme.

The leading flag is coded_block_flag. This flag is followed by last_0_flag, which signals a last significant coefficient coordinate of [0,0]. Then, assuming that last_0_flag is not set, the bitstream 200 contains the last_pos_x and last_pos_y values. These are followed by the sig[i, j] sequence, i.e., the significant coefficient sequence. Finally, the portion of the bitstream 200 contains the level information. It will be appreciated that when last_0_flag is set, last_pos_x, last_pos_y, and the significant coefficient sequence are omitted.

In one embodiment, the last_pos_x and last_pos_y values are binarized using fixed-length binarization. The length of these binary values depends on the size of the transform matrix, i.e., the size of the block of quantized transform domain coefficients.

Reference is now made to FIG. 8, which shows an example method 300 of encoding last significant coefficient data when encoding a residual block. The example method 300 includes an operation 302 of determining the last significant coefficient coordinates. For an NxN transform block size, these are an x-coordinate and a y-coordinate each ranging from 0 to N-1. The coordinates may be referred to as the x-position and the y-position.

In operation 304, the two positions are binarized. As noted above, the positions may be binarized using fixed-length binarization. The length of a binarized position may be log₂(N). In other embodiments, other binarization schemes may be used.
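As a concrete illustration of the fixed-length option, the sketch below binarizes a position in the range 0 to N-1 into log₂(N) bins. The function names, the most-significant-bin-first ordering, and the restriction to power-of-two block sizes are assumptions made for this example; the text does not specify a bin order.

```python
def binarize_fixed_length(value, n):
    """Binarize a position in [0, n-1] into log2(n) bins, most significant first.

    Assumes n is a power of two (e.g. 4, 8, 16), as is typical for transform sizes.
    """
    if n <= 0 or n & (n - 1):
        raise ValueError("n must be a power of two")
    if not 0 <= value < n:
        raise ValueError("value must be in the range 0 .. n-1")
    num_bins = n.bit_length() - 1           # log2(n) for a power of two
    return [(value >> k) & 1 for k in range(num_bins - 1, -1, -1)]


def debinarize_fixed_length(bins):
    """Rebuild the position from its bins (inverse of binarize_fixed_length)."""
    value = 0
    for b in bins:
        value = (value << 1) | b
    return value


if __name__ == "__main__":
    print(binarize_fixed_length(2, 4))      # two bins for a 4x4 block
    print(binarize_fixed_length(5, 8))      # three bins for an 8x8 block
```

Each bin produced this way would then be assigned its own context in operations 306 and 308.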

Entropy encoding the binarized positions involves determining a context for each bin of the binarized positions. Accordingly, in operation 306, the context for each bin of one of the positions is determined. For the purposes of this example embodiment, the x-position may be considered the first position to be encoded. The context for each bin of the binarized x-position may be based on a number of factors. For example, in one embodiment the context of each bin of the x-position may be based on the size of the transform matrix. Previous bins of the x-position, if any, may also affect the context for subsequent bins of that x-position.

In operation 308, the context for each bin of the other position (in this example, the y-position) is then determined. The context for the bins of the y-position depends, in part, upon the value of the x-position. It may also depend, in part, upon the size of the transform matrix and upon previous bins, if any, of the y-position.

In operation 310, the binarized positions are then entropy encoded in accordance with their associated contexts as determined in operations 306 and 308. The entropy encoding may include CABAC, CAVLC, or any other suitable context-based entropy encoding scheme.

Reference is now made to FIG. 9, which shows, in flowchart form, a method 400 of decoding a bitstream of encoded data to reconstruct quantized transform domain coefficient data. The example method 400 shown in FIG. 9 involves processing a bitstream of encoded data that has been encoded using a syntax similar to that described herein. In operation 402, a portion of the bitstream is entropy decoded to recover the two binarized positions that define the two-dimensional coordinates of the last significant coefficient. This operation includes determining a context for each bin of the first of the positions (for example, the x-position in one embodiment). That context may depend, for example, upon the size of the transform matrix and upon previously decoded bins, if any, of the x-position. On the basis of the context for each bin, and therefore the associated estimated probability, entropy decoding of the bitstream determines that bin, and the binarized x-position is thereby reconstructed.

Operation 402 further includes determining the context for each bin of the other position (in this example, the y-position). The context for each bin of the y-position may depend upon the size of the transform matrix and upon previous bins, if any, of the y-position, but it also depends upon the value of the x-position. Entropy decoding of the bitstream in accordance with the context determined for each bin, and the resulting estimated probability, reconstructs the binarized y-position.

In operation 404, the x-position and y-position are used to entropy decode the significant coefficient sequence from the bitstream. In operation 406, the level information is recovered from the bitstream using entropy decoding. In operation 408, the significant coefficient sequence and the level information are used together to reconstruct the quantized transform domain coefficient data.

In one example implementation, the context for the bins of the x-position (where it is the first of the two-dimensional coordinates to be encoded) is given by:

ctxIdxInc = binCtxOffset + binCtxInc

In this expression, ctxIdxInc is the context index for a given bin of last_pos_x. The variable binCtxOffset is the context offset based on the transform size. In one embodiment, the offset is determined in accordance with the following table:

In this example, log2TrafoSize is the binary logarithm of the transform matrix size, i.e., log₂(N).

The other variable in the expression for the context index above is binCtxInc, which represents the context index increment applied on the basis of the values of previous bins, if any, in last_pos_x. The binCtxInc variable may be determined, for example, in accordance with the following table:

In this example, binIdx is the index of the bin in last_pos_x, and b₀ and b₁ are the bins in the last_pos_x binary sequence at index 0 and index 1, respectively.

In one example embodiment, the context for the bins of last_pos_y may be determined in accordance with the following expression:

ctxIdxInc = binCtxOffset0 + 3*binCtxOffset1 + binCtxInc

In this case, the context for a bin of last_pos_y depends upon the size of the transform matrix, the previous bins, if any, of last_pos_y, and the value of last_pos_x. Specifically, the transform matrix size influences the context through the variable binCtxOffset0, which in some embodiments may be determined in accordance with the following table:

The decoded bins of last_pos_y may influence the context through the variable binCtxInc, which in some embodiments may be determined in accordance with the following table:

Note that, in this example, only the first bin, at index 0, affects the context for any further bins of last_pos_y.

Finally, the value of last_pos_x may influence the context for a bin of last_pos_y through the variable binCtxOffset1, which may be determined as follows:

If last_pos_x == 0,

    binCtxOffset1 = 0

Otherwise,

    binCtxOffset1 = Floor(Log2(last_pos_x)) + 1
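The rule for binCtxOffset1 and its use in the last_pos_y context expression can be sketched as follows. This is a hedged illustration: the function names are hypothetical, and because the tables for binCtxOffset0 and binCtxInc are not reproduced in this text, those two values are simply taken as caller-supplied arguments rather than computed.

```python
def bin_ctx_offset1(last_pos_x):
    """Context offset contributed by the value of last_pos_x."""
    if last_pos_x == 0:
        return 0
    # For an integer v >= 1, v.bit_length() equals floor(log2(v)) + 1.
    return last_pos_x.bit_length()


def ctx_idx_inc_for_y(bin_ctx_offset0, last_pos_x, bin_ctx_inc):
    """Evaluate ctxIdxInc = binCtxOffset0 + 3*binCtxOffset1 + binCtxInc.

    bin_ctx_offset0 and bin_ctx_inc come from the tables referenced in the
    text, which are not available here, so they are passed in by the caller.
    """
    return bin_ctx_offset0 + 3 * bin_ctx_offset1(last_pos_x) + bin_ctx_inc


if __name__ == "__main__":
    for x in range(8):
        print(x, bin_ctx_offset1(x))
```

The effect is that x-positions are grouped logarithmically (0, 1, 2 to 3, 4 to 7, and so on), so nearby large x-values share the same set of contexts for the bins of last_pos_y.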

It will be appreciated that the foregoing is only one example implementation in which the context for the bins of last_pos_y depends upon the value of last_pos_x. It will also be appreciated that various other implementations may be realized, with particular tables and context offsets empirically designed to suit particular applications.

The meaning of a context index may depend on the value of the quantization parameter (QP). That is, different contexts may be used to code the same syntax element at different QPs, yet these contexts may share the same context index. In one example embodiment, last_pos_x may be coded with a Huffman code. The corresponding Huffman tree may depend on the value of the QP. The contexts for coding the bins of last_pos_x may depend on the Huffman tree and therefore differ at different QPs. For example, if one context is used for each QP, these contexts may share the same index 0 but have different meanings.

In an alternative implementation, instead of encoding the two-dimensional Cartesian coordinates x and y, the last significant coefficient position is represented by the anti-diagonal line on which the coefficient lies and the relative position of the coefficient on that line. Reference is made to FIG. 11, which diagrammatically shows a 4x4 coefficient block 600. Anti-diagonal lines 602 are shown on the block 600. Note that there are seven anti-diagonal lines, indexed from 0 to 6. Lines 0 and 6 each have only a single position, corresponding to x,y coordinates [0,0] and [3,3], respectively. The other anti-diagonal lines 602 have between two and four positions each. The positions may be indexed following the convention of the scan order; that is, the indexing of positions on the anti-diagonal lines may alternate in direction from line to line. Accordingly, each coordinate in the 4x4 block can be expressed as an anti-diagonal-based two-dimensional coordinate [a, b] as follows:

[0,0] [1,0] [2,2] [3,0]

[1,1] [2,1] [3,1] [4,2]

[2,0] [3,2] [4,1] [5,0]

[3,3] [4,0] [5,1] [6,0]

Note that the last positions [0,0] and [6,0] can be encoded using the anti-diagonal line number alone, since there is no need to specify a second coordinate for the position on the line.
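The mapping between Cartesian and anti-diagonal coordinates can be sketched as follows. The index rule used here (even lines indexed along x, odd lines along y, both counted from the first position of the line that lies inside the block) is an assumption inferred from the 4x4 table above; the function names are hypothetical, and other indexing conventions are equally possible.

```python
def to_antidiagonal(x, y, n):
    """Map Cartesian (x, y) in an n x n block to anti-diagonal coordinates (a, b)."""
    a = x + y                      # index of the anti-diagonal line, 0 .. 2n-2
    skipped = max(0, a - (n - 1))  # positions of the line that fall outside the block
    # The direction alternates per line: even lines count along x, odd lines along y.
    b = (x if a % 2 == 0 else y) - skipped
    return a, b


def from_antidiagonal(a, b, n):
    """Inverse mapping: recover Cartesian (x, y) from (a, b)."""
    skipped = max(0, a - (n - 1))
    if a % 2 == 0:
        x = b + skipped
        y = a - x
    else:
        y = b + skipped
        x = a - y
    return x, y


# Rows are indexed by y and columns by x, matching the layout of the table.
table = [[to_antidiagonal(x, y, 4) for x in range(4)] for y in range(4)]
```

For N = 4 this reproduces the table row by row, and `from_antidiagonal` shows how a decoder could recover x and y once a and b are known.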

The [a, b] values may be encoded in a manner similar to that described above for the two-dimensional coordinates x and y. The total number of coded bins for the anti-diagonal index a is log₂(2N-1). The number of coded bins for the position b of the coefficient on the line depends on the value of a: it is 1+log₂(a) for a < N and 1+log₂(2(N-1)-a) for a ≥ N. The context for encoding/decoding each bin is based on the values of the previously encoded/decoded bins.

In yet another alternative implementation, rather than encoding two-dimensional coordinates for the last significant coefficient position, the encoder encodes a one-dimensional coordinate for the last significant position that takes the coefficient scan order into account. The coefficient scan order may be the zigzag order of H.264, or any adaptive or non-adaptive scan order used in other transform-domain image or video encoding schemes. In this implementation, instead of last_pos_x and last_pos_y, the encoder encodes only last_pos, which ranges from 0 to (N*N-1), where N is the size of the block of transform domain coefficients. If last_0_flag is used in the syntax, the range of last_pos need not include 0. In some cases, last_pos may be automatically decremented, in which case the decoder knows that it must add 1 to the last_pos value to recover the actual one-dimensional coordinate of the last significant coefficient position.
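A minimal sketch of the one-dimensional variant is given below. It assumes a zigzag scan purely for illustration (the text allows any scan order), and the helper names `zigzag_order`, `encode_last_pos`, and `decode_last_pos`, as well as the `last_0_flag_used` parameter, are hypothetical.

```python
def zigzag_order(n):
    """Return the (x, y) positions of an n x n block in a zigzag scan order."""
    order = []
    for s in range(2 * n - 1):                        # s = x + y, one anti-diagonal
        xs = range(max(0, s - (n - 1)), min(s, n - 1) + 1)
        if s % 2 == 1:
            xs = reversed(xs)                         # odd diagonals run toward smaller x
        order.extend((x, s - x) for x in xs)
    return order


def encode_last_pos(x, y, n, last_0_flag_used=False):
    """Convert the last significant position (x, y) to the value of last_pos."""
    last_pos = zigzag_order(n).index((x, y))
    if last_0_flag_used:
        # Position 0 is signalled by last_0_flag, so the range is shifted down by one.
        if last_pos == 0:
            raise ValueError("position [0,0] is signalled by last_0_flag instead")
        last_pos -= 1
    return last_pos


def decode_last_pos(value, n, last_0_flag_used=False):
    """Recover (x, y) from a decoded last_pos value."""
    if last_0_flag_used:
        value += 1                                    # undo the encoder-side decrement
    return zigzag_order(n)[value]
```

For example, `encode_last_pos(2, 1, 4)` gives the scan index of position [2,1] in a 4x4 block, and passing `last_0_flag_used=True` on both sides shows the decrement on encoding and the matching increment on decoding.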

The last_pos value may be encoded in a manner similar to that described above for x, one of the two-dimensional coordinates. The total number of coded bins for the one-dimensional coordinate last_pos is 2log₂(N). The context for encoding/decoding each bin is based on the values of the previously encoded/decoded bins.

Note that, in the example implementations, the worst-case number of coded bins for the two-dimensional (or one-dimensional) last significant coefficient position is on the order of log₂(N), which is far smaller than N*N, the worst-case number of coded bins for the conventional last[i, j] sequence. This reduces the complexity of the entropy coding engine implementation.
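The difference in worst-case bin counts can be made concrete with a short calculation. The sketch assumes that N is a power of two and that each coordinate is binarized as a fixed-length code of log₂(N) bins; the function name and dictionary keys are chosen only for this example.

```python
def worst_case_bins(n):
    """Worst-case coded bins for an n x n block under the stated assumptions."""
    if n < 2 or n & (n - 1):
        raise ValueError("n is assumed to be a power of two, at least 2")
    bits_per_coordinate = n.bit_length() - 1           # exact log2(n) for powers of two
    return {
        "two_dimensional": 2 * bits_per_coordinate,    # last_pos_x plus last_pos_y
        "one_dimensional": 2 * bits_per_coordinate,    # last_pos, 2*log2(N) bins
        "conventional_last_flags": n * n,              # N*N, as stated in the text
    }


summary = {n: worst_case_bins(n) for n in (4, 8, 16, 32)}
```

For a 16x16 block, for instance, this gives 8 bins for the position-based schemes against 256 for the conventional sequence.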

In another aspect of the present application, it may be advantageous to use the last position encoding process described above only for certain blocks. In particular, the process may be more advantageous for blocks having more than a preset number of coefficients. Blocks with few coefficients may be encoded more efficiently using the conventional interleaved sig[i, j] and last[i, j] syntax described above. Accordingly, in one embodiment, the encoder determines the number of non-zero coefficients (NNZ) in the block and, if NNZ is less than a threshold, encodes the sig[i, j] and last[i, j] sequences using the conventional syntax. The threshold may be preset to 2, 3, or any other suitable value. If NNZ is greater than or equal to the threshold, the encoder uses the two-dimensional (or one-dimensional) last significant coefficient position encoding process described above.

The syntax may be adapted to include a flag that signals to the decoder whether the NNZ for the block is less than the threshold. This flag also signals which significance map encoding process was used by the encoder. In some cases, last_0_flag may be eliminated, since signaling a DC-only block is relatively efficient in the conventional significance map encoding process.

Reference is now made to FIG. 10, which shows, in simplified flowchart form, a method 500 for encoding a significance map. The method 500 is performed for each block of quantized transform domain coefficients. The method includes determining the NNZ in the block in operation 502. In operation 504, the NNZ is compared with a preset threshold. If the NNZ is less than the threshold, the encoder uses the conventional interleaved significance map encoding, as indicated in operation 506. If the NNZ is greater than or equal to the threshold, then in operation 508 the encoder encodes the last position coordinates as described above.
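The decision made in operations 502 to 508 can be sketched as below. The block is assumed to be a list of rows of quantized coefficients; the function names, the returned labels, and the default threshold of 2 are illustrative assumptions, with 2 being one of the example values mentioned above.

```python
def count_nonzero(block):
    """Operation 502: number of non-zero coefficients (NNZ) in the block."""
    return sum(1 for row in block for coeff in row if coeff != 0)


def choose_significance_map_coding(block, threshold=2):
    """Operations 504-508: select the significance map coding process for a block."""
    nnz = count_nonzero(block)
    if nnz < threshold:
        return "interleaved_sig_last"          # operation 506, conventional syntax
    return "last_position_coordinates"         # operation 508, last position coding


dc_only_block = [[5, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
busy_block = [[9, 3, 0, 1], [2, 0, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0]]

dc_choice = choose_significance_map_coding(dc_only_block)
busy_choice = choose_significance_map_coding(busy_block)
```

In a complete encoder the selected mode would also be signalled to the decoder, for example through the flag described above.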

It will be appreciated that the foregoing example methods illustrate a specific example application, namely encoding and decoding significance maps such as those prescribed in H.264. The present application is not limited to that specific example application.

Reference is now made to FIG. 4, which shows a simplified block diagram of an example embodiment of an encoder 900. The encoder 900 includes a processor 902, a memory 904, and an encoding application 906. The encoding application 906 may include a computer program or application stored in the memory 904 and containing instructions for configuring the processor 902 to perform steps or operations such as those described herein. For example, the encoding application 906 may encode a bitstream in accordance with the last significant coefficient position encoding process described herein and output the encoded bitstream. The encoding application 906 may include an entropy encoder 26 configured to entropy encode input sequences and output a bitstream using one or more of the processes described herein. It will be understood that the encoding application 906 may be stored on a computer readable medium, such as a compact disc, flash memory device, random access memory, hard drive, etc.

In some embodiments, the processor 902 in the encoder 900 may be a single processing unit configured to implement the instructions of the encoding application 906. It will further be appreciated that, in some instances, some or all of the operations of the encoding application 906 and one or more processing units may be implemented by way of an application-specific integrated circuit (ASIC), etc.

Reference is now also made to FIG. 5, which shows a simplified block diagram of an example embodiment of a decoder 1000. The decoder 1000 includes a processor 1002, a memory 1004, and a decoding application 1006. The decoding application 1006 may include a computer program or application stored in the memory 1004 and containing instructions for configuring the processor 1002 to perform steps or operations such as those described herein. The decoding application 1006 may include an entropy decoder 1008 configured to receive a bitstream encoded in accordance with the last significant coefficient position encoding process described herein, and to reconstruct quantized transform domain coefficient data using the last significant coefficient position context modeling process to decode the bitstream, as described herein. It will be understood that the decoding application 1006 may be stored on a computer readable medium, such as a compact disc, flash memory device, random access memory, hard drive, etc.

In some embodiments, the processor 1002 in the decoder 1000 may be a single processing unit configured to implement the instructions of the decoding application 1006. In some other embodiments, the processor 1002 may include two or more processing units capable of executing instructions in parallel. The multiple processing units may be logically or physically separate processing units. It will further be appreciated that, in some instances, some or all of the operations of the decoding application 1006 and one or more processing units may be implemented by way of an application-specific integrated circuit (ASIC), etc.

It will be appreciated that the decoder and/or encoder according to the present application may be implemented in a number of computing devices, including, without limitation, servers, suitably programmed general purpose computers, set-top television boxes, television broadcast equipment, and mobile devices. The decoder or encoder may be implemented by way of software containing instructions for configuring a processor to carry out the functions described herein. The software instructions may be stored on any suitable computer-readable memory, including CDs, RAM, ROM, flash memory, etc.

It will be understood that the encoder and decoder described herein, and the modules, routines, processes, threads, or other software components implementing the described methods/processes for configuring the encoder, may be realized using standard computer programming techniques and languages. The present application is not limited to particular processors, computer languages, computer programming conventions, data structures, or other such implementation details. Those skilled in the art will recognize that the described processes may be implemented as part of computer-executable code stored in volatile or non-volatile memory, as part of an application-specific integrated chip (ASIC), etc.

Certain adaptations and modifications of the described embodiments can be made. Therefore, the above-discussed embodiments are considered to be illustrative and not restrictive.

Claims

1. A method for encoding quantized transform domain coefficient data including last significant coefficient information, the method comprising:
binarizing each of the two positions of two-dimensional coordinates of the last significant coefficient;
determining a context for each bin of one of the positions;
determining a context for each bin of the other of the positions, wherein the context of each bin of the other of the positions is based in part upon the one of the positions; and
entropy encoding the binarized positions based upon the context determined for each of the bins of the binarized positions to produce encoded data.

2. The method of claim 1,
wherein the binarizing comprises encoding each of the two positions as a fixed-length binary code.

3. The method of claim 1 or 2,
wherein determining the context for each bin of the other of the positions includes calculating a context index that specifies the determined context, and wherein the context index is calculated based upon the one of the positions, the transform block size, and, if any, previous bins in the other of the positions.

4. The method of any one of claims 1 to 3,
wherein the two positions comprise an x-position and a y-position within a block.

5. The method of claim 1,
wherein determining the context for each bin of the other of the positions includes calculating a context index that specifies the determined context, and wherein the context index is based upon a binary logarithm of the one of the positions.

6. The method of any one of claims 1 to 5,
further comprising, prior to performing the binarizing, the determining of the context for each bin of each position, and the entropy encoding, determining that the two-dimensional coordinates are not [0,0].

7. The method of any one of claims 1 to 6,
further comprising, prior to the binarizing, the determining of a context for each bin of one of the positions, and the determining of a context for each bin of the other of the positions, counting the number of non-zero coefficients in the quantized transform domain coefficient data and determining that the number meets or exceeds a threshold.

8. An encoder for encoding quantized transform domain coefficient data, the encoder comprising:
a processor;
a memory; and
an encoding application stored in the memory and containing instructions for configuring the processor to encode the quantized transform domain coefficient data by performing the method of any one of claims 1 to 7.

9. A method for decoding a bitstream of encoded data to reconstruct quantized transform domain coefficient data, the method comprising:
entropy decoding a portion of the encoded data to produce two binarized positions defining two-dimensional coordinates of a last significant coefficient, wherein entropy decoding the portion of the data includes
determining a context for each bin of one of the positions, and
determining a context for each bin of the other of the positions,
wherein the context of each bin of the other of the positions is based in part upon the one of the positions;
entropy decoding a significant coefficient sequence based upon the two-dimensional coordinates of the last significant coefficient;
entropy decoding level information based upon the significant coefficient sequence; and
reconstructing the quantized transform domain coefficient data using the level information and the significant coefficient sequence.

10. The method of claim 9,
wherein each of the two binarized positions comprises a fixed-length binary code.

11. The method of claim 9 or 10,
wherein entropy decoding the significant coefficient sequence includes converting the two-dimensional coordinates of the last significant coefficient into a one-dimensional index indicating the end of the significant coefficient sequence.

12. The method of any one of claims 9 to 11,
wherein determining the context for each bin of the other of the positions includes calculating a context index that specifies the determined context, and wherein the context index is calculated based upon the one of the positions, the transform block size, and, if any, previous bins in the other of the positions.

13. The method of any one of claims 9 to 12,
wherein the two positions comprise an x-position and a y-position within a block.

14. The method of claim 9,
wherein determining the context for each bin of the other of the positions includes calculating a context index that specifies the determined context, and wherein the context index is based upon a binary logarithm of the one of the positions.

15. The method of any one of claims 9 to 14,
further comprising, prior to decoding the portion, entropy decoding a zero flag and determining from the zero flag that the two-dimensional coordinate position is not [0,0].

16. The method of any one of claims 9 to 15,
further comprising entropy decoding a non-zero coefficient flag, wherein the entropy decoding of the portion is performed based upon the value of the non-zero coefficient flag.

17. A decoder for decoding a bitstream of encoded data to reconstruct a sequence of symbols, the symbols belonging to a finite alphabet, the decoder comprising:
a processor;
a memory; and
a decoding application stored in the memory and containing instructions for configuring the processor to decode the bitstream by performing the method of any one of claims 9 to 16.

18. A computer readable medium storing computer-executable instructions which, when executed by a processor, configure the processor to perform the method of any one of claims 1 to 7 or 9 to 16.