KR101166732B1

KR101166732B1 - Video coding mode selection using estimated coding costs

Info

Publication number: KR101166732B1
Application number: KR1020097025315A
Authority: KR
Inventors: 시타라만 가나파티 수브라마니아; 팡 시; 페이쑹 천; 세이풀라 할리트 오구즈; 스콧 티 스와지; 비노드 카우시크
Original assignee: Qualcomm Incorporated
Priority date: 2007-05-04
Filing date: 2007-05-04
Publication date: 2012-07-19
Also published as: CN101663895A; KR20100005240A; JP2010526515A; EP2156672A1; WO2008136828A1; KR20120031529A; CN101663895B

Abstract

This disclosure describes techniques for video coding mode selection using estimated coding costs. To provide high compression efficiency, an encoding device may attempt to select, for a block of pixels, a coding mode that codes the data of the block with high efficiency. To this end, the encoding device may perform coding mode selection based on an estimate of the coding cost for at least some of the possible modes. According to the techniques described herein, the encoding device estimates the coding cost for the different modes without actually coding the blocks. In fact, in some aspects, the encoding device may estimate the coding cost for the modes without quantizing the data of the block for each mode. In this way, the coding cost estimation techniques of this disclosure reduce the amount of computationally intensive calculation required to perform effective mode selection.

Coding mode selection, coding cost estimation, transform coefficients, quantization

Description

VIDEO CODING MODE SELECTION USING ESTIMATED CODING COSTS

Technical Field

This disclosure relates to video coding and, more particularly, to estimating the coding cost of coding a video sequence.

Background

Digital video capabilities can be incorporated into a wide range of devices, including digital televisions, digital direct broadcast systems, wireless communication devices, personal digital assistants (PDAs), laptop computers, desktop computers, video game consoles, digital cameras, digital recording devices, cellular or satellite radio telephones, and the like. Digital video devices can provide significant improvements over conventional analog video systems in processing and transmitting video sequences.

Different video coding standards have been established for coding digital video sequences. The Moving Picture Experts Group (MPEG), for example, has developed a number of standards including MPEG-1, MPEG-2 and MPEG-4. Other examples include the International Telecommunication Union (ITU)-T H.263 standard, and the ITU-T H.264 standard and its counterpart, ISO/IEC MPEG-4, Part 10, i.e., Advanced Video Coding (AVC). These video coding standards support improved transmission efficiency of video sequences by coding data in a compressed manner.

Many current techniques use block-based coding. In block-based coding, the frames of a multimedia sequence are divided into discrete blocks of pixels, and the blocks of pixels are coded based on differences from other blocks, which may be located in the same frame or in a different frame. Some blocks of pixels, often referred to as "macroblocks," comprise a grouping of sub-blocks of pixels. As an example, a 16×16 macroblock may comprise four 8×8 sub-blocks. The sub-blocks may be coded separately. For example, the H.264 standard permits coding of blocks with a variety of different sizes, e.g., 16×16, 16×8, 8×16, 8×8, 4×4, 8×4, and 4×8. Further, by extension, sub-blocks of any size may be included within a macroblock, e.g., 2×16, 16×2, 2×2, 4×16, and 8×2.

Summary

In certain aspects of this disclosure, a method of processing digital video data comprises identifying one or more transform coefficients for residual data of a block of pixels that will remain non-zero when quantized, estimating a number of bits associated with coding of the residual data based at least on the identified transform coefficients, and estimating a coding cost for coding the block of pixels based at least on the estimated number of bits associated with coding of the residual data.

In certain aspects, an apparatus for processing digital video data comprises a transform module that generates transform coefficients for residual data of a block of pixels, a bit estimation module that identifies one or more of the transform coefficients that will remain non-zero when quantized and estimates a number of bits associated with coding of the residual data based at least on the identified transform coefficients, and a control module that estimates a coding cost for coding the block of pixels based at least on the estimated number of bits associated with coding of the residual data.

In certain aspects, an apparatus for processing digital video data comprises means for identifying one or more transform coefficients for residual data of a block of pixels that will remain non-zero when quantized, means for estimating a number of bits associated with coding of the residual data based at least on the identified transform coefficients, and means for estimating a coding cost for coding the block of pixels based at least on the estimated number of bits associated with coding of the residual data.

In certain aspects, a computer program product for processing digital video data comprises a computer-readable medium having instructions thereon. The instructions include code for identifying one or more transform coefficients for residual data of a block of pixels that will remain non-zero when quantized, code for estimating a number of bits associated with coding of the residual data based at least on the identified transform coefficients, and code for estimating a coding cost for coding the block of pixels based at least on the estimated number of bits associated with coding of the residual data.
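
For illustration only, the three steps recited in the aspects above (identify the coefficients that will survive quantization, estimate the residual bits, estimate the coding cost) might be sketched as follows. The magnitude threshold, the linear bit model, its constants, and all function names are assumptions made for this sketch; this section does not specify how the identification or the bit estimate is actually computed.

```python
def surviving_coefficients(coeffs, threshold):
    """Return the coefficients assumed to stay non-zero after quantization.

    Assumption: a coefficient survives when its magnitude reaches a
    threshold tied to the quantization step; no quantization is performed.
    """
    return [c for c in coeffs if abs(c) >= threshold]


def estimate_residual_bits(coeffs, threshold,
                           bits_per_nonzero=4.0, bits_per_level=0.5):
    """Estimate residual bits from the surviving coefficients only.

    Assumption: a hypothetical linear model in the number of survivors
    and in their approximate quantized magnitudes.
    """
    survivors = surviving_coefficients(coeffs, threshold)
    level_sum = sum(abs(c) / threshold for c in survivors)
    return bits_per_nonzero * len(survivors) + bits_per_level * level_sum


def estimate_coding_cost(distortion, residual_bits, overhead_bits, lam):
    """Combine distortion and estimated rate into one cost value.

    Assumption: a Lagrangian form, cost = distortion + lam * rate.
    """
    return distortion + lam * (residual_bits + overhead_bits)
```

A caller would evaluate `estimate_coding_cost` once per candidate mode and compare the results; no entropy coding or quantization is run in any of these helpers.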

The details of one or more examples are set forth in the accompanying drawings and the description below. Other features, objects, and advantages will be apparent from the description and drawings, and from the claims.

Brief Description of the Drawings

FIG. 1 is a block diagram illustrating a video coding system that employs the coding cost estimation techniques described herein.

FIG. 2 is a block diagram illustrating an exemplary encoding module in further detail.

FIG. 3 is a block diagram illustrating another exemplary encoding module in further detail.

FIG. 4 is a flow diagram illustrating exemplary operation of an encoding module selecting an encoding mode based on estimated coding costs.

FIG. 5 is a flow diagram illustrating exemplary operation of an encoding module estimating the number of bits associated with coding the residual data of a block without quantizing or encoding the residual data.

FIG. 6 is a flow diagram illustrating exemplary operation of an encoding module estimating the number of bits associated with coding the residual data of a block without encoding the residual data.

Detailed Description

This disclosure describes techniques for video coding mode selection using estimated coding costs. To provide high compression efficiency, an encoding device may attempt to select, for a block of pixels, a coding mode that codes the data of the block with high efficiency. To this end, the encoding device may perform coding mode selection based at least on estimates of the coding cost for at least some of the possible modes. According to the techniques described herein, the encoding device estimates the coding cost for the different modes without actually coding the blocks. In fact, in some aspects, the encoding device may estimate the coding cost for the modes without quantizing the data of the block for each mode. In this way, the coding cost estimation techniques of this disclosure reduce the amount of computationally intensive calculation required to perform effective mode selection.

FIG. 1 is a block diagram illustrating a multimedia coding system 10 that employs coding cost estimation techniques as described herein. Coding system 10 includes an encoding device 12 and a decoding device 14 connected by a transmission channel 16. Encoding device 12 encodes one or more sequences of digital multimedia data and transmits the encoded sequences over transmission channel 16 to decoding device 14 for decoding and, possibly, presentation to a user of decoding device 14. Transmission channel 16 may comprise any wired or wireless medium, or a combination thereof.

Encoding device 12 may form part of a broadcast network component used to broadcast one or more channels of multimedia data. As an example, encoding device 12 may form part of a wireless base station, server, or any infrastructure node that is used to broadcast one or more channels of encoded multimedia data to wireless devices. In this case, encoding device 12 may transmit the encoded data to a plurality of wireless devices, such as decoding device 14. For simplicity, however, only a single decoding device 14 is illustrated in FIG. 1. Alternatively, encoding device 12 may comprise a handset that transmits locally captured video for video telephony or other similar applications.

Decoding device 14 may comprise a user device that receives the encoded multimedia data transmitted by encoding device 12 and decodes the multimedia data for presentation to a user. By way of example, decoding device 14 may be implemented as part of a digital television, a wireless communication device, a gaming device, a portable digital assistant (PDA), a laptop computer or desktop computer, a digital music and video device such as those sold under the trademark "iPod," a radiotelephone such as a cellular, satellite or terrestrial-based radiotelephone, or another wireless mobile terminal equipped for video and/or audio streaming, video telephony, or both. Decoding device 14 may be associated with a mobile or stationary device. In a broadcast application, encoding device 12 may transmit encoded video and/or audio to multiple decoding devices 14 associated with multiple users.

In some aspects, for two-way communication applications, multimedia coding system 10 may support video telephony or video streaming according to the Session Initiation Protocol (SIP), the International Telecommunication Union Standardization Sector (ITU-T) H.323 standard, the ITU-T H.324 standard, or other standards. For one-way or two-way communication, encoding device 12 may generate encoded multimedia data according to a video compression standard, such as Moving Picture Experts Group (MPEG)-2, MPEG-4, ITU-T H.263, or ITU-T H.264, which corresponds to MPEG-4, Part 10, Advanced Video Coding (AVC). Although not shown in FIG. 1, encoding device 12 and decoding device 14 may be integrated with an encoder and a decoder, respectively, and may include appropriate multiplexer-demultiplexer (MUX-DEMUX) modules, or other hardware, firmware, or software, to handle encoding of both audio and video in a common data sequence or in separate data sequences. If applicable, the MUX-DEMUX modules may conform to the ITU H.223 multiplexer protocol, or to other protocols such as the User Datagram Protocol (UDP).

In certain aspects, this disclosure contemplates application to Enhanced H.264 video coding for delivering real-time multimedia services in terrestrial mobile multimedia multicast (TM3) systems using the Forward Link Only (FLO) Air Interface Specification, "Forward Link Only Air Interface Specification for Terrestrial Mobile Multimedia Multicast," published as Technical Standard TIA-1099 in August 2006 (the "FLO Specification"). However, the coding cost estimation techniques described in this disclosure are not limited to any particular type of broadcast, multicast, unicast, or point-to-point system.

As illustrated in FIG. 1, encoding device 12 includes an encoding module 18 and a transmitter 20. Encoding module 18 receives one or more input multimedia sequences, which, in the case of video encoding, may include one or more frames of data, and selectively encodes the frames of the received multimedia sequences. Encoding module 18 receives the input multimedia sequences from one or more sources (not shown in FIG. 1). In some aspects, encoding module 18 may receive the input multimedia sequences from one or more video content providers, e.g., via satellite. As another example, encoding module 18 may receive the multimedia sequences from an image capture device (not shown in FIG. 1) integrated within, or coupled to, encoding device 12. Alternatively, encoding module 18 may receive the multimedia sequences from a memory or archive (not shown in FIG. 1) within, or coupled to, encoding device 12. The multimedia sequences may comprise live real-time or near real-time video, audio, or video and audio sequences to be coded and transmitted as a broadcast or on-demand, or may comprise pre-recorded and stored video, audio, or video and audio sequences to be coded and transmitted as a broadcast or on-demand. In some aspects, at least a portion of the multimedia sequences may be computer-generated, such as in the case of gaming.

In any case, encoding module 18 encodes a plurality of frames and transmits the coded frames to decoding device 14 via transmitter 20. Encoding module 18 may encode the frames of the input multimedia sequences as intra-coded frames, inter-coded frames, or a combination thereof. Frames encoded using intra-coding techniques are coded without reference to other frames and are often referred to as intra ("I") frames. Frames encoded using inter-coding techniques are coded with reference to one or more other frames. Inter-coded frames may include one or more predictive ("P") frames, bi-directional ("B") frames, or a combination thereof. P frames are encoded with reference to at least one temporally prior frame, while B frames are encoded with reference to at least one temporally future frame. In some cases, B frames may be encoded with reference to at least one temporally future frame and at least one temporally prior frame.

Encoding module 18 may be further configured to partition a frame into a plurality of blocks and encode each of the blocks separately. As an example, encoding module 18 may partition the frame into a plurality of 16×16 blocks. Some blocks, often referred to as "macroblocks," comprise a grouping of sub-partition blocks (referred to herein as "sub-blocks"). As an example, a 16×16 macroblock may comprise four 8×8 sub-blocks, or other sub-partition blocks. For example, the H.264 standard permits encoding of blocks with a variety of different sizes, e.g., 16×16, 16×8, 8×16, 8×8, 4×4, 8×4, and 4×8. Further, by extension, sub-blocks of any size may be included within a macroblock, e.g., 2×16, 16×2, 2×2, 4×16, 8×2 and so on. Thus, encoding module 18 may be configured to divide the frame into several blocks and encode each of the blocks of pixels as an intra-coded block or an inter-coded block, each of which may be referred to generally as a block.
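
The partitioning described above can be pictured with a short Python sketch. It is a minimal illustration, assuming frame dimensions that are exact multiples of the block size and representing each block only by its position and size; the function names are hypothetical and do not come from the disclosure.

```python
def split_into_blocks(width, height, block_w, block_h):
    """Tile a width x height area with (x, y, w, h) blocks in raster order."""
    if width % block_w or height % block_h:
        raise ValueError("area must be a multiple of the block size")
    return [(x, y, block_w, block_h)
            for y in range(0, height, block_h)
            for x in range(0, width, block_w)]


def split_macroblock(macroblock, sub_w, sub_h):
    """Split one macroblock (x, y, w, h) into sub-blocks of sub_w x sub_h."""
    x0, y0, w, h = macroblock
    return [(x0 + x, y0 + y, sw, sh)
            for (x, y, sw, sh) in split_into_blocks(w, h, sub_w, sub_h)]


# A 64x32 frame holds eight 16x16 macroblocks; each splits into four 8x8 sub-blocks.
macroblocks = split_into_blocks(64, 32, 16, 16)
sub_blocks = split_macroblock(macroblocks[0], 8, 8)
```

The same helper covers the rectangular sizes listed above, for example `split_macroblock(mb, 16, 8)` for a 16×8 partition.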

Encoding module 18 may support a plurality of coding modes. Each of the modes may correspond to a different combination of block size and coding technique. In the case of the H.264 standard, for example, there are seven inter modes and thirteen intra modes. The seven variable block-size inter modes include a SKIP mode, 16×16 mode, 16×8 mode, 8×16 mode, 8×8 mode, 8×4 mode, 4×8 mode, and 4×4 mode. The thirteen intra modes include an INTRA 4×4 mode, for which there are nine possible interpolation directions, and an INTRA 16×16 mode, for which there are four possible interpolation directions.

To provide high compression efficiency, in accordance with various aspects of this disclosure, encoding module 18 attempts to select the mode that codes the data of a block with high efficiency. To this end, encoding module 18 estimates, for each of the blocks, a coding cost for at least some of the modes. Encoding module 18 estimates the coding cost as a function of rate and distortion. According to the techniques described herein, encoding module 18 estimates the coding cost for the modes without actually coding the block to determine the rate and distortion metrics. In this way, encoding module 18 may select one of the modes based at least on the coding cost without performing computationally complex coding of the data of the block for each mode. Conventional mode selection requires actual coding of the data using each of the modes in order to determine which mode to select. The present techniques thus save time and computational resources by selecting a mode based on the coding cost without actually coding the data for each of the modes. In fact, in some aspects, encoding module 18 may estimate the coding cost for the modes without quantizing the data of the block for each mode. In this way, the coding cost estimation techniques of this disclosure reduce the amount of computationally intensive calculation required to perform effective mode selection.
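
As a rough sketch of mode selection driven by estimated cost, the Python below picks the candidate mode with the lowest cost. The passage states only that the cost is a function of rate and distortion; the Lagrangian form `distortion + lam * bits`, the mode names, and the numeric estimates are assumptions chosen for illustration.

```python
def select_mode(candidates, lam):
    """Pick the mode with the lowest estimated rate-distortion cost.

    candidates maps a mode name to (estimated_distortion, estimated_bits).
    Assumption: cost = distortion + lam * bits; both inputs are estimates,
    so no block is actually coded here.
    """
    best_mode, best_cost = None, float("inf")
    for mode, (distortion, bits) in candidates.items():
        cost = distortion + lam * bits
        if cost < best_cost:
            best_mode, best_cost = mode, cost
    return best_mode, best_cost


# Hypothetical per-mode estimates for one block.
estimates = {
    "INTER_16x16": (120, 30),
    "INTER_8x8": (90, 55),
    "INTRA_4x4": (80, 70),
}
mode, cost = select_mode(estimates, lam=1.0)
```

Raising `lam` weights rate more heavily, which in this example shifts the choice toward the mode with the fewest estimated bits.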

Encoding device 12 applies the selected mode to code the blocks of the frames and transmits the coded frames of data via transmitter 20. Transmitter 20 may include appropriate modem and driver circuitry, software and/or firmware to transmit encoded multimedia over transmission channel 16. For wireless applications, transmitter 20 includes RF circuitry to transmit wireless data carrying the encoded multimedia data.

Decoding device 14 includes a receiver 22 and a decoding module 24. Decoding device 14 receives the encoded data from encoding device 12 via receiver 22. Like transmitter 20, receiver 22 may include appropriate modem and driver circuitry, software and/or firmware to receive encoded multimedia over transmission channel 16, and may include RF circuitry to receive wireless data carrying the encoded multimedia data in wireless applications. Decoding module 24 decodes the coded frames of data received via receiver 22. Decoding device 14 may further present the decoded frames of data to a user via a display (not shown), which may be integrated within decoding device 14 or provided as a separate device coupled to decoding device 14 via a wireless connection.

In some examples, encoding device 12 and decoding device 14 may each include reciprocal transmit and receive circuitry so that each may serve as both a transmitting device and a receiving device for encoded multimedia and other information transmitted over transmission channel 16. In this case, both encoding device 12 and decoding device 14 may transmit and receive multimedia sequences and thus participate in two-way communication. In other words, the illustrated components of coding system 10 may be integrated as part of an encoder/decoder (CODEC).

The components in encoding device 12 and decoding device 14 are exemplary of those applicable to implement the techniques described herein. Encoding device 12 and decoding device 14, however, may include many other components, if desired. For example, encoding device 12 may include a plurality of encoding modules, each of which receives one or more sequences of multimedia data and encodes the respective sequences of multimedia data in accordance with the techniques described herein. In this case, encoding device 12 may further include at least one multiplexer to combine the segments of data for transmission. In addition, encoding device 12 and decoding device 14 may include appropriate modulation, demodulation, frequency conversion, filtering, and amplifier components for transmission and reception of encoded video, including radio frequency (RF) wireless components and antennas, as applicable. For ease of illustration, however, such components are not shown in FIG. 1.

FIG. 2 is a block diagram illustrating an exemplary encoding module 30 in further detail. Encoding module 30 may, for example, represent encoding module 18 of encoding device 12 of FIG. 1. As illustrated in FIG. 2, encoding module 30 includes a control module 32 that receives input frames of multimedia data of one or more multimedia sequences from one or more sources and processes the frames of the received multimedia sequences. In particular, control module 32 analyzes the incoming frames of the multimedia sequences and determines, based on the analysis of the frames, whether to encode or skip the incoming frames. In some aspects, encoding device 12 may encode the information contained in the multimedia sequences at a reduced frame rate using frame skipping to conserve bandwidth across transmission channel 16.

Moreover, for the incoming frames that are to be encoded, control module 32 may also be configured to determine whether to encode the frames as I frames, P frames, or B frames. Control module 32 may determine to encode an incoming frame as an I frame at the start of a multimedia sequence, at a scene change within the sequence, for use as a channel switch frame, or for use as an intra refresh frame. Otherwise, control module 32 encodes the frame as an inter-coded frame (i.e., a P frame or a B frame) to reduce the amount of bandwidth associated with coding the frame.

Control module 32 may be further configured to partition the frames into a plurality of blocks and select, for each of the blocks, a coding mode, such as one of the H.264 coding modes described above. As will be described in more detail below, encoding module 30 may estimate the coding cost for at least some of the modes to assist in selecting the most efficient of the coding modes. After selecting the coding mode to use for coding one of the blocks, encoding module 30 generates residual data for the block. For a block selected to be intra-coded, spatial prediction module 34 generates the residual data for the block. Spatial prediction module 34 may, for example, generate a predicted version of the block via interpolation using one or more adjacent blocks and an interpolation direction corresponding to the selected intra-coding mode. Spatial prediction module 34 may then compute the difference between the block of the input frame and the predicted block. This difference is referred to as residual data or residual coefficients.
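
A minimal Python sketch of intra residual generation follows. It assumes a single prediction direction, vertical, in which each column of the predicted block repeats the reconstructed pixel directly above the block; the other directions, the block values, and the function names are not taken from the disclosure and are shown only to make the prediction-then-difference step concrete.

```python
def predict_vertical(above_row, size):
    """Build a size x size prediction by repeating the pixels above the block."""
    return [list(above_row[:size]) for _ in range(size)]


def residual(block, predicted):
    """Residual data: element-wise difference between the input and predicted block."""
    return [[b - p for b, p in zip(block_row, pred_row)]
            for block_row, pred_row in zip(block, predicted)]


# Hypothetical 4x4 block and the row of neighboring pixels directly above it.
above = [10, 20, 30, 40]
block = [
    [11, 20, 29, 40],
    [12, 21, 30, 41],
    [10, 20, 30, 40],
    [9, 19, 31, 42],
]
res = residual(block, predict_vertical(above, 4))
```

When the block content follows the prediction direction closely, most residual values are near zero, which is what makes the residual cheaper to code than the block itself.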

For a block selected to be inter coded, motion estimation module 36 and motion compensation module 38 generate the residual data for the block. In particular, motion estimation module 36 identifies at least one reference frame and searches the reference frame for the block that best matches the block in the input frame. Motion estimation module 36 calculates a motion vector that represents the offset between the position of the block in the input frame and the position of the identified block in the reference frame. Motion compensation module 38 calculates the difference between the block of the input frame and the identified block in the reference frame to which the motion vector points. This difference is the residual data for the block.
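The search and subtraction described above can be sketched as follows. This is a minimal illustration only: the exhaustive search pattern, the use of SAD as the matching criterion, and the names `sad`, `crop`, and `full_search` are assumptions made for the example and are not specified for motion estimation module 36 or motion compensation module 38.

```python
def sad(block, candidate):
    """Sum of absolute differences between two equally sized 2-D blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block, candidate)
               for a, b in zip(row_a, row_b))


def crop(frame, top, left, size):
    """Return the size x size block whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in frame[top:top + size]]


def full_search(block, ref_frame, top, left, search_range):
    """Exhaustively search ref_frame around (top, left) for the best match.

    Returns the motion vector (dy, dx) and the residual block, i.e. the
    input block minus the matched reference block.
    """
    size = len(block)
    height, width = len(ref_frame), len(ref_frame[0])
    best_cost, best_mv = None, None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            y, x = top + dy, left + dx
            # Skip candidates that fall outside the reference frame.
            if 0 <= y <= height - size and 0 <= x <= width - size:
                cost = sad(block, crop(ref_frame, y, x, size))
                if best_cost is None or cost < best_cost:
                    best_cost, best_mv = cost, (dy, dx)
    matched = crop(ref_frame, top + best_mv[0], left + best_mv[1], size)
    residual = [[a - b for a, b in zip(row_a, row_b)]
                for row_a, row_b in zip(block, matched)]
    return best_mv, residual
```

The sketch assumes the co-located position (top, left) itself lies inside the reference frame, so at least one candidate is always evaluated.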

Encoding module 30 also includes transform module 40, quantization module 46, and entropy encoder 48. Transform module 40 transforms the residual data of the block according to a transform function. In some aspects, transform module 40 applies an integer transform, such as a 4×4 or 8×8 integer transform, or a discrete cosine transform (DCT) to the residual data to generate transform coefficients for the residual data. Quantization module 46 quantizes the transform coefficients and provides the quantized transform coefficients to entropy encoder 48. Entropy encoder 48 encodes the quantized transform coefficients using a context-adaptive coding technique such as context-adaptive variable-length coding (CAVLC) or context-adaptive binary arithmetic coding (CABAC). As described in more detail below, entropy encoder 48 applies the selected mode to code the data of the block.

Entropy encoder 48 may also encode additional data associated with the block. For example, in addition to the residual data, entropy encoder 48 may encode one or more motion vectors of the block, an identifier indicating the coding mode of the block, one or more reference frame indices, quantization parameter (QP) information, slice information of the block, and the like. Entropy encoder 48 may receive this additional block data from other modules within encoding module 30. For example, motion vector information may be received from motion estimation module 36, while block mode information may be received from control module 32. In some aspects, entropy encoder 48 may code at least some of this additional information using a fixed length coding (FLC) technique or a universal variable length coding (VLC) technique such as exponential-Golomb ("Exp-Golomb") coding. Alternatively, entropy encoder 48 may encode some of the additional block data using the context-adaptive coding techniques described above, i.e., CABAC or CAVLC.

To support the selection of a mode for a block, control module 32 estimates the coding cost for at least some of the possible modes. In certain aspects, control module 32 may estimate the cost of coding the block in each of the possible coding modes. The cost may be estimated, for example, in terms of the number of bits associated with coding the block in a given mode versus the amount of distortion produced in that mode. In the case of the H.264 standard, for example, control module 32 may estimate the coding cost for 22 different coding modes (inter coding modes and intra coding modes) for a block selected for inter coding, and for 13 different coding modes for a block selected for intra coding. In other aspects, control module 32 may first use another mode selection technique to reduce the set of possible modes, and then use the techniques of this disclosure to estimate the coding cost for the modes remaining in the set. In other words, in some aspects, control module 32 narrows the number of mode possibilities before applying the cost estimation technique.

Advantageously, encoding module 30 estimates the coding cost for the modes without actually coding the data of the blocks in the different modes, which reduces the computational overhead associated with the coding decision. In fact, in the example shown in FIG. 2, encoding module 30 may estimate the coding cost without quantizing the data of the block for the different modes. In this manner, the coding cost estimation techniques of this disclosure reduce the amount of computationally intensive calculation needed to determine the coding cost. In particular, it is not necessary to encode the blocks using the various coding modes in order to select one of the modes.

As described in more detail herein, control module 32 estimates the coding cost of each analyzed mode according to equation (1).

In equation (1), J is the estimated coding cost, D is the distortion metric of the block, λ_mode is the Lagrange multiplier of the respective mode, and R is the rate metric of the block. The distortion metric D may include, for example, a sum of absolute differences (SAD), a sum of squared differences (SSD), a sum of absolute transform differences (SATD), a sum of squared transform differences (SSTD), or the like. The rate metric R may be, for example, the number of bits associated with coding the data in a given block. As described above, different types of block data may be coded using different coding techniques. Accordingly, equation (1) may be rewritten in the form of equation (2).

In equation (2), R_context represents the rate metric for block data coded using a context-adaptive coding technique, and R_non_context represents the rate metric for block data coded using a non-context-adaptive coding technique. In the H.264 standard, for example, residual data may be coded using context-adaptive coding such as CAVLC or CABAC. Other block data, such as motion vectors and block modes, may be coded using FLC or a universal VLC technique such as Exp-Golomb coding. In this case, equation (2) may be rewritten in the form of equation (3).

In equation (3), R_residual represents the rate metric for coding the residual data using a context-adaptive coding technique, e.g., the number of bits associated with coding the residual data, and R_other represents the rate metric for coding the other block data using FLC or a universal VLC technique, e.g., the number of bits associated with coding the other block data.
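Written out from the definitions of J, D, λ_mode, R, R_context, R_non_context, R_residual, and R_other given above, equations (1) through (3) take the following form. This is a reconstruction from the surrounding text, in which each later equation only splits the rate term of the previous one.

```latex
\begin{align}
J &= D + \lambda_{mode} \cdot R \tag{1} \\
J &= D + \lambda_{mode} \cdot \left( R_{context} + R_{non\_context} \right) \tag{2} \\
J &= D + \lambda_{mode} \cdot \left( R_{residual} + R_{other} \right) \tag{3}
\end{align}
```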

When calculating the estimated coding cost J, encoding module 30 can determine relatively easily the number of bits associated with block data coded using FLC or universal VLC, i.e., R_other. Encoding module 30 may, for example, use a code table to identify the number of bits associated with coding block data using FLC or universal VLC. The code table may include, for example, a plurality of codewords and the number of bits associated with coding each codeword. Determining the number of bits associated with coding the residual data (R_residual), however, is a much more difficult task, because context-adaptive coding adapts as a function of the context of the data. To determine the exact number of bits associated with coding the residual data, or whatever data is being context-adaptive coded, encoding module 30 would have to transform the residual data, quantize the transformed residual data, and encode the transformed and quantized residual data. According to the techniques of this disclosure, however, bit estimation module 42 may estimate the number of bits associated with coding the residual data using a context-adaptive coding technique without actually coding the residual data.
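For Exp-Golomb codes in particular, the codeword length follows directly from the value being coded, so a lookup of the kind described above can be sketched in a few lines. The function names and the choice of which syntax elements to sum in `r_other` are illustrative assumptions, not part of the described encoder.

```python
def ue_bits(code_num):
    """Bit length of the unsigned Exp-Golomb codeword for code_num >= 0.

    The codeword is the binary form of (code_num + 1), preceded by one
    leading zero for every bit after the first.
    """
    return 2 * (code_num + 1).bit_length() - 1


def se_bits(value):
    """Bit length of the signed Exp-Golomb codeword for an integer value.

    Positive k maps to code number 2k - 1, non-positive k maps to -2k.
    """
    code_num = 2 * value - 1 if value > 0 else -2 * value
    return ue_bits(code_num)


def r_other(mv_diff, mode_code_num, ref_idx):
    """Hypothetical R_other: bits for one motion vector difference
    (two signed components), a mode identifier, and a reference index."""
    dx, dy = mv_diff
    return se_bits(dx) + se_bits(dy) + ue_bits(mode_code_num) + ue_bits(ref_idx)
```

Because the length depends only on the value, these bit counts can also be tabulated once and reused for every block and mode.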

In the example shown in FIG. 2, bit estimation module 42 estimates the number of bits associated with coding the residual data using the transform coefficients for the residual data. Thus, for each mode to be analyzed, encoding module 30 only needs to compute the transform coefficients for the residual data in order to estimate the number of bits associated with coding the residual data. Because encoding module 30 neither quantizes the transform coefficients nor encodes quantized transform coefficients for each of the modes, it reduces the time and computational resources required to determine the number of bits associated with coding the residual data.

Bit estimation module 42 analyzes the transform coefficients output by transform module 40 to identify one or more transform coefficients that will remain non-zero after quantization. In particular, bit estimation module 42 compares each of the transform coefficients with a corresponding threshold. In some aspects, the corresponding thresholds may be calculated as a function of the QP of encoding module 30. Bit estimation module 42 identifies the transform coefficients that are greater than or equal to their corresponding thresholds as the transform coefficients that will remain non-zero after quantization.

Bit estimation module 42 estimates the number of bits associated with coding the residual data based at least on the transform coefficients identified as remaining non-zero after quantization. In particular, bit estimation module 42 may determine the number of non-zero transform coefficients that will survive quantization. Bit estimation module 42 also sums at least some of the absolute values of the transform coefficients identified as surviving quantization. Bit estimation module 42 then uses equation (4) to estimate the rate metric for the residual data, i.e., the number of bits associated with coding the residual data.
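Equation (4) combines the two statistics just described with three fitted coefficients. Assuming the model is linear in SATD and NZ_est with a constant term, which is the natural reading of three coefficients obtained by least squares, it can be written as follows. The linear form is an assumption of this reconstruction.

```latex
\begin{equation}
R_{residual} = a_1 \cdot SATD + a_2 \cdot NZ_{est} + a_3 \tag{4}
\end{equation}
```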

In equation (4), SATD is the sum of at least some of the absolute values of the non-zero transform coefficients predicted to survive quantization, NZ_est is the estimated number of non-zero transform coefficients predicted to survive quantization, and a₁, a₂, and a₃ are coefficients. The coefficients a₁, a₂, and a₃ may be computed, for example, using least squares estimation. Although the sum of the transform coefficients in the example of equation (4) is a sum of absolute transform differences (SATD), other difference metrics such as SSTD may be used.
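One way the coefficients a₁, a₂, and a₃ could be obtained by least squares is sketched below. It assumes a linear model with a constant term and a set of training samples (SATD, NZ_est, measured bits) gathered offline, for example by actually entropy coding a corpus of blocks; both the model form and the training procedure are assumptions of this sketch, and `fit_rate_model` is a hypothetical name.

```python
def fit_rate_model(samples):
    """Least-squares fit of bits ~ a1 * satd + a2 * nz + a3.

    samples: iterable of (satd, nz, bits) tuples.
    Returns (a1, a2, a3). Requires at least three samples whose
    (satd, nz) points are not collinear, otherwise the system is singular.
    """
    rows = [([float(s), float(n), 1.0], float(b)) for s, n, b in samples]
    # Normal equations (X^T X) a = X^T y, with rows of X = [satd, nz, 1].
    xtx = [[sum(x[i] * x[j] for x, _ in rows) for j in range(3)]
           for i in range(3)]
    xty = [sum(x[i] * y for x, y in rows) for i in range(3)]
    aug = [xtx[i] + [xty[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting on the 3x4 system.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(3):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return tuple(aug[i][3] / aug[i][i] for i in range(3))
```

Fitting is done once, ahead of time; at encoding time only the resulting three numbers are needed.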

An exemplary calculation of R_residual for a 4×4 block is described below. Similar calculations may be performed for blocks of different sizes. Encoding module 30 computes a matrix of transform coefficients for the residual data. An exemplary matrix of transform coefficients is shown below.

The number of rows of the matrix A of transform coefficients equals the number of rows of pixels in the block, and the number of columns of the matrix equals the number of columns of pixels in the block. Thus, in the example above, the dimensions of the matrix of transform coefficients are 4×4, corresponding to a 4×4 block. Each entry A(i,j) of the matrix of transform coefficients is the transform of the respective residual coefficient.

During quantization, the transform coefficients of matrix A that have smaller values tend to become zero. Accordingly, encoding module 30 compares the matrix A of residual transform coefficients with a matrix of thresholds to predict which transform coefficients of matrix A will remain non-zero after quantization. An exemplary matrix of thresholds is shown below.

Matrix C may be calculated as a function of the QP value. The dimensions of matrix C are the same as the dimensions of matrix A. In the case of the H.264 standard, for example, the entries of matrix C may be calculated based on equation (5).
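If the quantizer is assumed to compute each level as (|A(i,j)| · Level_Scale + Level_Offset) shifted right by QBITS bits, a coefficient yields a non-zero level exactly when its magnitude reaches the value below. Equation (5) is therefore reconstructed here as that threshold; the assumed quantizer form is the one commonly used in H.264 encoders and is an assumption of this reconstruction.

```latex
\begin{equation}
C(i,j) = \frac{2^{QBITS\{QP\}} - Level\_Offset(i,j)\{QP\}}{Level\_Scale(i,j)\{QP\}} \tag{5}
\end{equation}
```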

In equation (5), QBITS{QP} is a parameter that determines the scaling as a function of QP, Level_Offset(i,j){QP} is the dead zone parameter for the entry in row i and column j of the matrix and is also a function of QP, Level_Scale(i,j){QP} is the multiplication factor for the entry in row i and column j of the matrix and is also a function of QP, i corresponds to the row of the matrix, j corresponds to the column of the matrix, and QP corresponds to the quantization parameter of encoding module 30. In exemplary equation (5), the variables may be defined in the H.264 coding standard as functions of the operating QP. Other equations may be used to determine which of the coefficients will survive quantization, and other equations may be defined for other coding standards based on the quantization method adopted by the particular standard. In some aspects, encoding module 30 may be configured to operate within a range of QP values. In that case, encoding module 30 may precompute a plurality of comparison matrices, one for each of the QP values within the range. Encoding module 30 then selects the comparison matrix that corresponds to its QP and compares it with the transform coefficient matrix.
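The precomputation of one comparison matrix per QP value can be sketched as follows. The threshold formula assumes a quantizer of the form (|A| · Level_Scale + Level_Offset) >> QBITS, and `params_for_qp` is a hypothetical callable standing in for whatever tables the encoder uses to obtain QBITS, Level_Offset, and Level_Scale for a given QP.

```python
def threshold_matrix(qbits, level_offset, level_scale):
    """Threshold C(i, j) below which a coefficient quantizes to zero.

    Assumes the quantized level is (|A| * scale + offset) >> qbits, so the
    level is non-zero exactly when |A| >= (2**qbits - offset) / scale.
    """
    return [[(2 ** qbits - off) / scale
             for off, scale in zip(off_row, scale_row)]
            for off_row, scale_row in zip(level_offset, level_scale)]


def precompute_thresholds(qp_values, params_for_qp):
    """Build a {qp: threshold matrix} table for every QP in qp_values.

    params_for_qp(qp) is a hypothetical callable that returns the tuple
    (qbits, level_offset_matrix, level_scale_matrix) for that QP.
    """
    return {qp: threshold_matrix(*params_for_qp(qp)) for qp in qp_values}
```

With the table built once, selecting the comparison matrix for the current QP is a single dictionary lookup per block.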

The result of comparing the matrix A of transform coefficients with the matrix C of thresholds is a matrix of ones and zeros. In the example above, the comparison yields the matrix of ones and zeros shown below:

A one indicates the position of a transform coefficient identified as likely to survive quantization, i.e., likely to remain non-zero, and a zero indicates the position of a transform coefficient that is unlikely to survive quantization, i.e., likely to become zero. As described above, a transform coefficient is identified as remaining non-zero when the absolute value of the transform coefficient in matrix A is greater than or equal to the corresponding threshold in matrix C.

Using the resulting matrix of ones and zeros, bit estimation module 42 determines the number of transform coefficients that will survive quantization. In other words, bit estimation module 42 determines the number of transform coefficients identified as remaining non-zero after quantization. Bit estimation module 42 may determine this number according to equation (6).
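Since the count is taken over the matrix M of ones and zeros, equation (6) amounts to summing the entries of M. For the 4×4 example it can be written as follows (index range assumed to be 0 to 3; a reconstruction from the surrounding definitions).

```latex
\begin{equation}
NZ_{est} = \sum_{i=0}^{3} \sum_{j=0}^{3} M(i,j) \tag{6}
\end{equation}
```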

In equation (6), NZ_est is the estimated number of non-zero transform coefficients and M(i,j) is the value of matrix M at row i and column j. In the example described above, NZ_est equals 8.

Bit estimation module 42 also calculates the sum of at least some of the absolute values of the transform coefficients estimated to survive quantization. In certain aspects, bit estimation module 42 may calculate the sum of at least some of the absolute values of the transform coefficients according to equation (7):
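Using the mask M to keep only the coefficients predicted to survive, equation (7) for the 4×4 example can be written as follows (index range assumed to be 0 to 3; a reconstruction from the definitions of M, A, and abs given for this equation).

```latex
\begin{equation}
SATD = \sum_{i=0}^{3} \sum_{j=0}^{3} M(i,j) \cdot abs\left( A(i,j) \right) \tag{7}
\end{equation}
```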

In equation (7), SATD is the sum of the absolute values of the transform coefficients identified as remaining non-zero after quantization, M(i,j) is the value of matrix M at row i and column j, A(i,j) is the value of matrix A at row i and column j, and abs(x) is the absolute value function that computes the absolute value of x. In the example described above, SATD equals 2361. Other difference metrics, such as SSTD, may be used for the transform coefficients.

Using these values, bit estimation module 42 approximates the number of bits associated with coding the residual coefficients using equation (4) above. Control module 32 may use the estimate of R_residual to compute an estimate of the total coding cost of the mode. Encoding module 30 may estimate the total coding cost for one or more other possible modes in the same manner and then select the mode with the lowest coding cost. Encoding module 30 then applies the selected coding mode to code the block or blocks of the frame.
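The steps from threshold comparison to the residual rate estimate can be sketched end to end as follows. The linear rate model and the coefficient values passed in are assumptions of this sketch, and the small matrices used to exercise it would be illustrative values only, not the example matrices of this description.

```python
def survivor_mask(coeffs, thresholds):
    """Matrix M: 1 where |A(i, j)| >= C(i, j), otherwise 0."""
    return [[1 if abs(a) >= c else 0 for a, c in zip(row_a, row_c)]
            for row_a, row_c in zip(coeffs, thresholds)]


def estimate_residual_bits(coeffs, thresholds, a1, a2, a3):
    """Estimate R_residual without quantizing or entropy coding.

    Returns (estimated bits, NZ_est, SATD), where NZ_est counts the
    coefficients predicted to stay non-zero and SATD sums their
    absolute values. Assumes bits ~ a1 * SATD + a2 * NZ_est + a3.
    """
    mask = survivor_mask(coeffs, thresholds)
    nz_est = sum(sum(row) for row in mask)
    satd = sum(m * abs(a)
               for row_m, row_a in zip(mask, coeffs)
               for m, a in zip(row_m, row_a))
    return a1 * satd + a2 * nz_est + a3, nz_est, satd
```

Only comparisons, additions, and two multiplications are needed per block and mode, which is where the saving over a full quantize-and-encode pass comes from.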

The foregoing techniques may be implemented individually, or two or more of the techniques, or all of them, may be implemented together in encoding device 12. The components in encoding module 30 are examples of components suitable for implementing the techniques described herein. Encoding module 30 may, however, include many other components, as well as fewer components that combine the functionality of one or more of the modules described above, if desired. The components of encoding module 30 may be implemented as one or more processors, digital signal processors, application specific integrated circuits (ASICs), field programmable gate arrays (FPGAs), discrete logic, software, hardware, firmware, or any combination thereof. The depiction of different features as modules is intended to highlight different functional aspects of encoding module 30 and does not necessarily imply that such modules must be realized by separate hardware or software components. Rather, the functionality associated with one or more modules may be integrated within common or separate hardware or software components.

FIG. 3 is a block diagram illustrating another exemplary encoding module 50. Encoding module 50 of FIG. 3 substantially conforms to encoding module 30 of FIG. 2, except that bit estimation module 52 of encoding module 50 estimates the number of bits associated with coding the residual data after the transform coefficients for the residual data have been quantized. In particular, after quantization of the transform coefficients, bit estimation module 52 estimates the number of bits associated with coding the residual coefficients using equation (8):
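By analogy with the estimate made before quantization, and again assuming a linear model with a constant term, equation (8) can be written in terms of the two statistics of the quantized coefficients as follows. The linear form is an assumption of this reconstruction.

```latex
\begin{equation}
R_{residual} = a_1 \cdot SATQD + a_2 \cdot NZ_{TQ} + a_3 \tag{8}
\end{equation}
```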

In equation (8), SATQD is the sum of the absolute values of the non-zero quantized transform coefficients, NZ_TQ is the number of non-zero quantized transform coefficients, and a₁, a₂, and a₃ are coefficients. The coefficients a₁, a₂, and a₃ may be computed, for example, using least squares estimation. Although encoding module 50 quantizes the transform coefficients before estimating the number of bits associated with coding the residual data, encoding module 50 still estimates the coding cost for the modes without actually coding the data of the blocks. The amount of computationally intensive calculation is therefore still reduced.

FIG. 4 is a flow diagram illustrating exemplary operation of an encoding module, such as encoding module 30 of FIG. 2 and/or encoding module 50 of FIG. 3, that selects a coding mode based at least on estimated coding costs. For purposes of illustration, however, FIG. 4 is discussed with respect to encoding module 30. Encoding module 30 selects a mode for which to estimate the coding cost (60). Encoding module 30 generates a distortion metric for the current block (62). For example, encoding module 30 may compute the distortion metric based on a comparison between the block and at least one reference block. For a block selected to be intra coded, the reference block may be an adjacent block within the same frame. For a block selected to be inter coded, on the other hand, the reference block may be a block from an adjacent frame. The distortion metric may be, for example, SAD, SSD, SATD, SSTD, or another similar distortion metric.

In the example of FIG. 4, encoding module 30 determines the number of bits associated with coding the portion of the data that is coded using non-context-adaptive coding techniques (64). As described above, this data may include one or more motion vectors of the block, an identifier indicating the coding mode of the block, one or more reference frame indices, QP information, slice information of the block, and the like. Encoding module 30 may, for example, use a code table to identify the number of bits associated with coding the data using FLC, universal VLC, or another non-context-adaptive coding technique.

Encoding module 30 estimates and/or calculates the number of bits associated with coding the portion of the data that is coded using context-adaptive coding techniques (66). In the context of the H.264 standard, for example, encoding module 30 may estimate the number of bits associated with coding the residual data using context-adaptive coding. Encoding module 30 may estimate the number of bits associated with coding the residual data without actually coding the residual data. In certain aspects, encoding module 30 may estimate the number of bits associated with coding the residual data without quantizing the residual data. For example, encoding module 30 may compute the transform coefficients for the residual data and identify the transform coefficients that are likely to remain non-zero after quantization. Using these identified transform coefficients, encoding module 30 estimates the number of bits associated with coding the residual data. In other aspects, encoding module 30 may quantize the transform coefficients and estimate the number of bits associated with coding the residual data based at least on the quantized transform coefficients. In either case, encoding module 30 saves time and processing resources by estimating the required number of bits. If sufficient computing resources are available, encoding module 30 may calculate the actual number of bits required instead of estimating it.

Encoding module 30 estimates and/or calculates the total coding cost for coding the block in the selected mode (68). Encoding module 30 may estimate the total coding cost for coding the block based on the distortion metric, the bits associated with coding the portion of the data coded using non-context-adaptive coding, and the bits associated with coding the portion of the data coded using context-adaptive coding. For example, encoding module 30 may estimate the total coding cost for coding the block in the selected mode using equation (2) or equation (3) above.

Encoding module 30 determines whether there are any other coding modes for which to estimate the coding cost (70). As described above, encoding module 30 estimates the coding cost for at least a portion of the possible modes. In certain aspects, encoding module 30 may estimate the cost of coding the block in each of the possible coding modes. In the context of the H.264 standard, for example, encoding module 30 may estimate the coding cost for 22 different coding modes (inter coding modes and intra coding modes) for a block selected for inter coding, and for 13 different coding modes for a block selected for intra coding. In other aspects, encoding module 30 may first use another mode selection technique to reduce the set of possible modes, and then use the techniques of this disclosure to estimate the coding cost for the reduced set of coding modes.

When there are more coding modes for which to estimate the coding cost, encoding module 30 selects the next coding mode and estimates the coding cost of the data in the selected coding mode. When no coding modes remain for which to estimate the coding cost, encoding module 30 selects one of the modes to use for coding the block based at least on the estimated coding costs (72). In one example, encoding module 30 may select the coding mode that has the lowest estimated coding cost. Upon selecting the mode, encoding module 30 may apply the selected mode to code the particular block (74). The process may continue for additional blocks in a given frame. As an example, the process may continue until all the blocks in the frame have been coded using a coding mode selected in accordance with the techniques described herein. The process may also continue until the blocks of a plurality of frames have been coded using high-efficiency modes.
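The loop over candidate modes (steps 68 to 72) can be illustrated with a minimal Python sketch. Equations (2) and (3) are not reproduced in this part of the description, so the cost form used here, distortion plus a weighted sum of the estimated bit counts, is an assumption, as are the names `total_cost`, `select_mode`, and `lam`.

```python
from typing import Callable, Dict, Hashable, Iterable, Tuple

# Hypothetical per-mode estimate: (distortion, non_residual_bits, residual_bits).
ModeEstimate = Tuple[float, float, float]


def total_cost(distortion: float, non_residual_bits: float,
               residual_bits: float, lam: float) -> float:
    """Assumed cost form: distortion plus lambda-weighted estimated bits."""
    return distortion + lam * (non_residual_bits + residual_bits)


def select_mode(modes: Iterable[Hashable],
                estimate: Callable[[Hashable], ModeEstimate],
                lam: float) -> Tuple[Hashable, Dict[Hashable, float]]:
    """Estimate a cost for every candidate mode, then pick the cheapest."""
    costs: Dict[Hashable, float] = {}
    for mode in modes:  # continue until no candidate modes remain (70)
        distortion, nr_bits, r_bits = estimate(mode)
        costs[mode] = total_cost(distortion, nr_bits, r_bits, lam)  # (68)
    best = min(costs, key=costs.get)  # lowest estimated cost (72)
    return best, costs
```

The block would then be coded once, in the returned mode only (74); a reduced candidate set produced by some other pre-selection technique can be passed as `modes` without changing the loop.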

FIG. 5 is a flow diagram illustrating exemplary operation of an encoding module, such as encoding module 30 of FIG. 2, estimating the number of bits associated with coding the residual coefficients of a block. After selecting one of the coding modes for which to estimate the coding cost, encoding module 30 generates the residual data of the block for the selected mode (80). For a block selected to be intra-coded, for example, spatial prediction module 34 generates the residual data for the block based on a comparison of the block with a predicted version of the block. Alternatively, for a block selected to be inter-coded, motion estimation module 36 and motion compensation module 38 compute the residual data for the block based on a comparison between the block and a corresponding block in a reference frame. In some aspects, the residual data may already have been computed in order to generate the distortion metric of the block. In that case, encoding module 30 may retrieve the residual data from memory.
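As a minimal sketch of step 80, the residual can be formed as the element-wise difference between the block and its prediction, and kept so that it is not recomputed when the distortion metric has already required it. The sum of absolute differences (SAD) is used here only as an example distortion metric; the metric, the `ResidualCache` class, and the function names are assumptions rather than part of the description.

```python
from typing import Dict, List

Block = List[List[int]]


def residual_block(block: Block, prediction: Block) -> Block:
    """Element-wise difference between the source block and its prediction."""
    return [[b - p for b, p in zip(row_b, row_p)]
            for row_b, row_p in zip(block, prediction)]


def sad(residual: Block) -> int:
    """Sum of absolute differences, one possible distortion metric."""
    return sum(abs(v) for row in residual for v in row)


class ResidualCache:
    """Keeps the residual computed for each mode so it is generated only once."""

    def __init__(self) -> None:
        self._store: Dict[str, Block] = {}

    def get(self, mode: str, block: Block, prediction: Block) -> Block:
        if mode not in self._store:
            self._store[mode] = residual_block(block, prediction)
        return self._store[mode]
```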

Transform module 40 transforms the residual coefficients of the block in accordance with a transform function to generate transform coefficients for the residual data (82). Transform module 40 may, for example, apply a 4×4 or 8×8 integer transform or a DCT transform to the residual data to generate the transform coefficients for the residual data. Bit estimation module 42 compares one of the transform coefficients with a corresponding threshold to determine whether the transform coefficient is greater than or equal to the threshold (84). The threshold corresponding to the transform coefficient may be computed as a function of the QP of encoding module 30. If the transform coefficient is greater than or equal to the corresponding threshold, bit estimation module 42 identifies the transform coefficient as a coefficient that will remain non-zero after quantization (86). If the transform coefficient is less than the corresponding threshold, bit estimation module 42 identifies the transform coefficient as a coefficient that will become zero after quantization (88).
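How a threshold might be derived from the QP can be sketched as follows. The sketch assumes a simplified uniform quantizer with a rounding offset, level = floor(|W| / Qstep + f), under which a coefficient survives quantization exactly when |W| >= (1 - f) * Qstep, and it uses the H.264 step-size progression in which Qstep doubles every 6 QP values. A practical integer-transform encoder would additionally fold position-dependent scaling into each threshold; that detail, the default offset, and the function names are assumptions not taken from the description.

```python
from typing import Dict, Iterable, List

# H.264 quantizer step sizes for QP 0..5; the step doubles for every +6 in QP.
QSTEP_BASE = [0.625, 0.6875, 0.8125, 0.875, 1.0, 1.125]


def qstep(qp: int) -> float:
    """Quantizer step size for a QP in the H.264 range 0..51."""
    if not 0 <= qp <= 51:
        raise ValueError("QP must be in the range 0..51")
    return QSTEP_BASE[qp % 6] * (2 ** (qp // 6))


def threshold_matrix(qp: int, size: int = 4,
                     rounding_offset: float = 1 / 3) -> List[List[float]]:
    """Smallest magnitude that stays non-zero under the assumed quantizer."""
    t = (1.0 - rounding_offset) * qstep(qp)
    return [[t] * size for _ in range(size)]


def precompute_threshold_sets(qps: Iterable[int]) -> Dict[int, List[List[float]]]:
    """One threshold set per QP, so selection at run time is a table lookup."""
    return {qp: threshold_matrix(qp) for qp in qps}
```

Precomputing one set per QP trades a small table for the cost of recomputing thresholds for every block.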

Bit estimation module 42 determines whether there are any additional transform coefficients for the residual data of the block (90). If there are additional transform coefficients of the block, bit estimation module 42 selects another one of the coefficients and compares it with its corresponding threshold. If there are no additional transform coefficients to analyze, bit estimation module 42 determines the number of coefficients identified as remaining non-zero after quantization (92). Bit estimation module 42 also sums at least a portion of the absolute values of the transform coefficients identified as remaining non-zero after quantization (94). Bit estimation module 42 estimates the number of bits associated with coding the residual data using the determined number of non-zero coefficients and the sum of at least a portion of the non-zero coefficients (96). Bit estimation module 42 may, for example, estimate the number of bits associated with coding the residual data using equation (4) above. In this manner, encoding module 30 estimates the number of bits associated with coding the residual data of the block in the selected mode without quantizing or encoding the residual data.
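Steps 84 to 96 can be sketched in Python as below. Equation (4) is not reproduced in this part of the description, so the linear model `a1 * abs_sum + a2 * nz_count + a3` and its coefficients are placeholders, and comparing the coefficient magnitude (rather than the signed value) against the threshold is likewise an assumption of this sketch.

```python
from typing import List, Sequence, Tuple


def estimate_residual_bits(coeffs: Sequence[Sequence[float]],
                           thresholds: Sequence[Sequence[float]],
                           a1: float = 1.0, a2: float = 1.0,
                           a3: float = 0.0) -> Tuple[int, float, float]:
    """Estimate residual bits from unquantized transform coefficients.

    Returns (nz_count, abs_sum, estimated_bits). No quantization is performed;
    a coefficient is only classified by comparison with its threshold.
    """
    nz_count = 0
    abs_sum = 0
    for row_c, row_t in zip(coeffs, thresholds):
        for c, t in zip(row_c, row_t):
            if abs(c) >= t:          # would remain non-zero (86)
                nz_count += 1        # count of surviving coefficients (92)
                abs_sum += abs(c)    # sum of their magnitudes (94)
            # otherwise the coefficient would quantize to zero (88)
    bits = a1 * abs_sum + a2 * nz_count + a3  # placeholder linear model (96)
    return nz_count, abs_sum, bits
```

For example, with coefficients `[[10, -3], [2, -8]]` and a threshold of 4 at every position, two coefficients survive and their magnitudes sum to 18.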

FIG. 6 is a flow diagram illustrating exemplary operation of an encoding module, such as encoding module 50 of FIG. 3, estimating the number of bits associated with coding the residual coefficients of a block. After selecting one of the coding modes for which to estimate the coding cost, encoding module 50 generates the residual coefficients of the block (100). For a block selected to be intra-coded, for example, spatial prediction module 34 computes the residual data for the block based on a comparison of the block with a predicted version of the block. Alternatively, for a block selected to be inter-coded, motion estimation module 36 and motion compensation module 38 compute the residual data for the block based on a comparison between the block and a corresponding block in a reference frame. In some aspects, the residual coefficients may already have been computed in order to generate the distortion metric of the block.

Transform module 40 transforms the residual coefficients of the block in accordance with a transform function to generate transform coefficients for the residual data (102). Transform module 40 may, for example, apply a 4×4 or 8×8 integer transform or a DCT transform to the residual data to generate the transformed residual coefficients. Quantization module 46 quantizes the transform coefficients in accordance with the QP of encoding module 50 (104).
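One of the transform options named above, the 4×4 integer transform, can be sketched with the H.264 forward core transform W = Cf · X · Cfᵀ. The sketch omits the post-scaling that H.264 folds into quantization, and the helper names are illustrative only.

```python
from typing import List

Matrix = List[List[int]]

# H.264 4x4 forward core transform matrix.
CF: Matrix = [
    [1, 1, 1, 1],
    [2, 1, -1, -2],
    [1, -1, -1, 1],
    [1, -2, 2, -1],
]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain integer matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(m: Matrix) -> Matrix:
    return [list(row) for row in zip(*m)]


def forward_transform_4x4(residual: Matrix) -> Matrix:
    """Unscaled transform coefficients W = CF * X * CF^T for a 4x4 residual."""
    return matmul(matmul(CF, residual), transpose(CF))
```

A flat residual block concentrates all of its energy in the DC position: a 4×4 block of ones yields 16 at position (0, 0) and zeros elsewhere.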

Bit estimation module 52 determines the number of quantized transform coefficients that are non-zero (106). Bit estimation module 52 also sums the non-zero levels, that is, the absolute values of the quantized transform coefficients (108). Bit estimation module 52 estimates the number of bits associated with coding the residual data using the computed number of non-zero quantized transform coefficients and the sum of the non-zero quantized transform coefficients (110). Bit estimation module 52 may, for example, estimate the number of bits associated with coding the residual coefficients using equation (4) above. In this manner, the encoding module estimates the number of bits associated with coding the residual data of the block in the selected mode without encoding the residual data.
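The FIG. 6 variant, in which the coefficients are quantized first and the estimate is taken from the resulting levels (steps 104 to 110), can be sketched as follows. The uniform quantizer with a rounding offset and the linear bit model are stand-ins chosen for illustration; the description's equation (4) and the actual quantizer of quantization module 46 are not reproduced in this part of the text.

```python
import math
from typing import List, Sequence, Tuple


def quantize(coeffs: Sequence[Sequence[float]], qstep: float,
             rounding_offset: float = 1 / 3) -> List[List[int]]:
    """Assumed uniform quantizer: level = sign(c) * floor(|c| / qstep + offset)."""
    return [[int(math.copysign(math.floor(abs(c) / qstep + rounding_offset), c))
             for c in row]
            for row in coeffs]


def estimate_bits_from_levels(levels: Sequence[Sequence[int]],
                              a1: float = 1.0, a2: float = 1.0,
                              a3: float = 0.0) -> Tuple[int, int, float]:
    """Count non-zero levels (106), sum their magnitudes (108), estimate bits (110)."""
    nz_count = sum(1 for row in levels for v in row if v != 0)
    abs_sum = sum(abs(v) for row in levels for v in row)
    bits = a1 * abs_sum + a2 * nz_count + a3  # placeholder linear model
    return nz_count, abs_sum, bits
```

Compared with the threshold-based approach, this variant pays for quantization in every candidate mode but still avoids entropy coding the residual data.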

Based on the teachings described herein, it should be apparent that an aspect disclosed herein may be implemented independently of any other aspects, and that two or more of these aspects may be combined in various ways. The techniques described herein may be implemented in hardware, software, firmware, or any combination thereof. If implemented in hardware, the techniques may be realized using digital hardware, analog hardware, or a combination thereof. If implemented in software, the techniques may be realized at least in part by a computer program product that includes a computer-readable medium on which instructions or code are stored. The instructions or code associated with the computer-readable medium of the computer program product may be executed by a computer, for example, by one or more processors, such as one or more digital signal processors (DSPs), general purpose microprocessors, ASICs, FPGAs, or other equivalent integrated or discrete logic circuitry.

By way of example, and not limitation, such computer-readable media may comprise RAM such as synchronous dynamic random access memory (SDRAM), read-only memory (ROM), non-volatile random access memory (NVRAM), electrically erasable programmable read-only memory (EEPROM), FLASH memory, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other tangible medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer.

A number of aspects and examples have been described. However, various modifications to these examples are possible, and the principles presented herein may be applied to other aspects as well. These and other aspects are within the scope of the following claims.

Claims

A method of processing digital video data, the method comprising:

Identifying, for each of a plurality of coding modes, one or more transform coefficients for residual data of a block of pixels that will remain non-zero when quantized, wherein the identified one or more transform coefficients are non-zero, unquantized transform coefficients;

Estimating, for each of the plurality of coding modes, the number of bits associated with the coding of the residual data in each of the plurality of coding modes based on at least the identified transform coefficients for each of the plurality of coding modes;

Estimating, for each of the plurality of coding modes, a coding cost for coding the block of pixels in each of the plurality of coding modes based at least on the estimated number of bits associated with the coding of the residual data in each of the plurality of coding modes; And

Selecting and applying a coding mode having the smallest coding cost of the estimated coding costs among the plurality of coding modes to code the block of pixels,

Wherein estimating the number of bits associated with the coding of the residual data is performed without quantizing the identified non-zero, unquantized transform coefficients.

The method of claim 1,

Identifying the transform coefficients includes comparing each of the transform coefficients to a corresponding one of a plurality of thresholds to identify the transform coefficients that will remain non-zero when quantized, wherein each of the plurality of thresholds is calculated as a function of a quantization parameter (QP).

The method of claim 2,

Comparing each of the transform coefficients with a corresponding one of the plurality of thresholds to identify the transform coefficients that will remain non-zero when quantized includes identifying, as the transform coefficients that will remain non-zero when quantized, transform coefficients that are greater than or equal to the corresponding threshold.

The method of claim 2,

Precomputing a plurality of sets of thresholds, each of the sets of thresholds corresponding to a different value of the QP; And

Selecting one of the plurality of sets of thresholds based on the value of the QP used to encode the block of pixels.

The method of claim 1,

Estimating the number of bits associated with the coding of the residual data,

Determining the number of transform coefficients identified as being left non-zero when quantized;

Summing the absolute values of at least one of the transform coefficients identified as being left non-zero when quantized; And

Estimating the number of bits associated with the coding of the residual data based at least on the determined number of non-zero transform coefficients and the sum of the absolute values of the at least one non-zero transform coefficient.

delete

The method of claim 1,

Estimating the coding cost,

Calculating a distortion metric for the block of pixels;

Calculating a number of bits associated with coding of non-residual data of the block of pixels; And

Estimating a coding cost for coding the block of pixels based at least on the distortion metric, the number of bits associated with coding of the non-residual data, and the number of bits associated with coding of the residual data.

The method of claim 1,

Quantizing the transform coefficients for the residual data after the step of selecting the coding mode;

Encoding the quantized transform coefficients for the residual data; And

Transmitting the encoded coefficients for the residual data.

The method of claim 1,

Generating a matrix of transform coefficients, wherein the number of rows of the matrix of transform coefficients is equal to the number of rows of pixels in the block, and the number of columns of the matrix of transform coefficients is equal to the number of columns of pixels in the block;

Comparing the matrix of transform coefficients to a matrix of thresholds, the matrix of thresholds having the same dimensions as the matrix of transform coefficients, wherein the result of the comparison is a matrix of 1s and 0s in which 0 represents a position in the matrix of transform coefficients that will become zero after quantization and 1 represents a position in the matrix of transform coefficients that will remain non-zero after quantization;

Summing the number of 1s in the matrix of 1s and 0s to calculate the number of transform coefficients identified as being non-zero when quantized;

Summing the absolute values of at least one of the transform coefficients in the matrix of transform coefficients that correspond to the positions of the 1s in the matrix of 1s and 0s; And

Estimating the number of bits associated with coding of the residual data based at least on the number of non-zero transform coefficients and the sum of the at least one non-zero transform coefficient.

An apparatus for processing digital video data,

Means for identifying, for each of the plurality of coding modes, one or more transform coefficients for the residual data of the block of pixels that will remain non-zero when quantized;

Means for estimating, for each of the plurality of coding modes, the number of bits associated with the coding of the residual data in each of the plurality of coding modes based on at least the identified transform coefficients for each of the plurality of coding modes;

Means for estimating, for each of the plurality of coding modes, a coding cost for coding the block of pixels in each of the plurality of coding modes based at least on the estimated number of bits associated with the coding of the residual data in each of the plurality of coding modes; And

Means for selecting and applying a coding mode having the smallest coding cost of an estimated coding cost among the plurality of coding modes to code the block of pixels,

Wherein estimating the number of bits associated with the coding of the residual data is performed prior to quantization of any of at least the identified transform coefficients.

The method of claim 11, wherein

A transform module for generating transform coefficients for the residual data of the block of pixels,

The means for identifying the one or more transform coefficients for each of the plurality of coding modes and the means for estimating the number of bits for each of the plurality of coding modes comprise a bit estimation module that identifies, for each of the plurality of coding modes, the one or more transform coefficients that will remain non-zero when quantized, and that estimates, for each of the plurality of coding modes, the number of bits associated with the coding of the residual data in each of the plurality of coding modes based on at least the identified transform coefficients for each of the plurality of coding modes,

The means for estimating the coding cost for each of the plurality of coding modes comprises a control module that estimates, for each of the plurality of coding modes, a coding cost for coding the block of pixels in each of the plurality of coding modes based at least on the estimated number of bits associated with the coding of the residual data in each of the plurality of coding modes.

13. The method of claim 12,

The bit estimation module compares each of the transform coefficients with a corresponding one of a plurality of thresholds to identify transform coefficients that will remain non-zero when quantized, each of the plurality of thresholds being calculated as a function of a quantization parameter (QP).

The method of claim 13,

The bit estimation module identifies transform coefficients that are greater than or equal to the corresponding threshold as the transform coefficients that will remain non-zero when quantized.

The method of claim 13,

The bit estimation module precomputes a plurality of sets of thresholds, each of the sets of thresholds corresponding to a different value of the QP, and selects one of the plurality of sets of thresholds based on the value of the QP used to encode the block of pixels.

13. The method of claim 12,

The bit estimation module determines the number of the transform coefficients identified as remaining non-zero when quantized, sums the absolute values of at least one of the transform coefficients identified as remaining non-zero when quantized, and estimates the number of bits associated with the coding of the residual data based at least on the determined number of non-zero transform coefficients and the sum of the absolute values of the at least one non-zero transform coefficient.

delete

13. The method of claim 12,

The control module calculates a distortion metric for the block of pixels, calculates the number of bits associated with the coding of non-residual data of the block of pixels, and estimates a coding cost for coding the block of pixels based at least on the distortion metric, the number of bits associated with the coding of the non-residual data, and the number of bits associated with the coding of the residual data.

13. The method of claim 12,

A quantization module for quantizing the transform coefficients for the residual data after selecting the coding mode;

An entropy encoding module for encoding the quantized transform coefficients for the residual data; And

And a transmitter for transmitting the encoded coefficients for the residual data.

13. The method of claim 12,

The transform module generates a matrix of transform coefficients, wherein the number of rows of the matrix of transform coefficients is equal to the number of rows of pixels in the block, and the number of columns of the matrix of transform coefficients is equal to the number of columns of pixels in the block, and

The bit estimation module compares the matrix of transform coefficients with a matrix of thresholds, the matrix of thresholds having the same dimensions as the matrix of transform coefficients, wherein the result of the comparison is a matrix of 1s and 0s in which 0 indicates a position in the matrix of transform coefficients that will become zero after quantization and 1 indicates a position in the matrix of transform coefficients that will remain non-zero after quantization,

The bit estimation module also calculates the number of transform coefficients identified as remaining non-zero when quantized by summing the number of 1s in the matrix of 1s and 0s, sums the absolute values of at least one of the transform coefficients in the matrix of transform coefficients that correspond to the positions of the 1s in the matrix of 1s and 0s, and estimates the number of bits associated with the coding of the residual data based at least on the number of non-zero transform coefficients and the sum of the at least one non-zero transform coefficient.

The method of claim 11, wherein

The identifying means compares each of the transform coefficients with a corresponding one of a plurality of thresholds to identify transform coefficients that will remain non-zero when quantized, each of the plurality of thresholds being calculated as a function of a quantization parameter (QP).

23. The method of claim 22,

The identifying means identifies transform coefficients that are greater than or equal to the corresponding threshold as the transform coefficients that will remain non-zero when quantized.

23. The method of claim 22,

Means for precomputing a plurality of sets of thresholds, each of the sets of thresholds corresponding to a different value of the QP; And

And means for selecting one of the plurality of sets of thresholds based on a value of a QP used to encode the block of pixels.

22. The method of claim 21,

The bit estimating means,

Determines the number of transform coefficients identified as remaining non-zero when quantized, sums the absolute values of at least one of the transform coefficients identified as remaining non-zero when quantized, and estimates the number of bits associated with the coding of the residual data based at least on the determined number of non-zero transform coefficients and the sum of the absolute values of the at least one non-zero transform coefficient.

delete

22. The method of claim 21,

The coding cost estimation means calculates a distortion metric for the block of pixels, calculates the number of bits associated with coding of non-residual data of the block of pixels, and estimates a coding cost for coding the block of pixels based at least on the distortion metric, the number of bits associated with coding of the non-residual data, and the number of bits associated with coding of the residual data.

22. The method of claim 21,

Means for quantizing the transform coefficients for the residual data after selecting the coding mode;

Means for encoding the quantized transform coefficients for the residual data; And

And means for transmitting the encoded coefficients for the residual data.

22. The method of claim 21,

Means for generating a matrix of transform coefficients, wherein the number of rows of the matrix of transform coefficients is equal to the number of rows of pixels in the block, and the number of columns of the matrix of transform coefficients is equal to the number of columns of pixels in the block,

The identifying means compares the matrix of transform coefficients with a matrix of thresholds, the matrix of thresholds having the same dimensions as the matrix of transform coefficients, wherein the result of the comparison is a matrix of 1s and 0s in which 0 indicates a position in the matrix of transform coefficients that will become zero after quantization and 1 indicates a position in the matrix of transform coefficients that will remain non-zero after quantization; And

The bit estimating means calculates the number of transform coefficients identified as remaining non-zero when quantized by summing the number of 1s in the matrix of 1s and 0s, sums the absolute values of at least one of the transform coefficients in the matrix of transform coefficients that correspond to the positions of the 1s in the matrix of 1s and 0s, and estimates the number of bits associated with the coding of the residual data based at least on the number of non-zero transform coefficients and the sum of the at least one non-zero transform coefficient.

A computer readable medium having instructions for processing digital video data, comprising:

The instructions comprising:

Code for identifying, for each of the plurality of coding modes, one or more transform coefficients for the residual data of the block of pixels that will remain non-zero when quantized;

Code for estimating, for each of the plurality of coding modes, the number of bits associated with the coding of the residual data in each of the plurality of coding modes based on at least the identified transform coefficients for each of the plurality of coding modes;

Code for estimating, for each of the plurality of coding modes, a coding cost for coding the block of pixels in each of the plurality of coding modes based at least on the estimated number of bits associated with the coding of the residual data in each of the plurality of coding modes; And

Code for selecting and applying a coding mode having the smallest coding cost among estimated coding costs among the plurality of coding modes, for coding the block of pixels,

Wherein estimating the number of bits associated with coding of the residual data is performed without quantizing at least the identified transform coefficients.

32. The method of claim 31,

The code for identifying the transform coefficients includes code for comparing each of the transform coefficients with a corresponding one of a plurality of thresholds to identify transform coefficients that will remain non-zero when quantized, wherein each of the plurality of thresholds is calculated as a function of a quantization parameter (QP).

33. The method of claim 32,

The code for comparing each of the transform coefficients with a corresponding one of the plurality of thresholds to identify the transform coefficients that will remain non-zero when quantized includes code for identifying, as the transform coefficients that will remain non-zero when quantized, transform coefficients that are greater than or equal to the corresponding threshold.

33. The method of claim 32,

Code for precomputing a plurality of sets of thresholds, each of the sets of thresholds corresponding to a different value of the QP; And

And code for selecting one of the plurality of sets of thresholds based on a value of a QP used to encode the block of pixels.

32. The method of claim 31,

The code for estimating the number of bits associated with coding of the residual data comprises:

Code for determining the number of transform coefficients identified as being left non-zero when quantized;

Code for summing the absolute values of at least one of the transform coefficients identified as being left non-zero when quantized; And

Code for estimating the number of bits associated with coding of the residual data based at least on the determined number of non-zero transform coefficients and the sum of the absolute values of the at least one non-zero transform coefficient.

delete

32. The method of claim 31,

The code for estimating the coding cost,

Code for calculating a distortion metric for the block of pixels;

Code for calculating the number of bits associated with coding of non-residual data of the block of pixels; And

Code for estimating a coding cost for coding the block of pixels based at least on the distortion metric, the number of bits associated with coding of the non-residual data, and the number of bits associated with coding of the residual data.

32. The method of claim 31,

Code for quantizing transform coefficients for the residual data after selecting the coding mode;

Code for encoding the quantized transform coefficients for the residual data; And

And code for transmitting the encoded coefficients for the residual data.

32. The method of claim 31,

Code for generating a matrix of transform coefficients, wherein the number of rows of the matrix of transform coefficients is equal to the number of rows of pixels in the block, and the number of columns of the matrix of transform coefficients is equal to the number of columns of pixels in the block;

Code for comparing the matrix of transform coefficients to a matrix of thresholds, the matrix of thresholds having the same dimensions as the matrix of transform coefficients, wherein the result of the comparison is a matrix of 1s and 0s in which 0 represents a position in the matrix of transform coefficients that will become zero after quantization and 1 represents a position in the matrix of transform coefficients that will remain non-zero after quantization;

Code for summing the number of 1s in the matrix of 1s and 0s to calculate the number of transform coefficients identified as remaining non-zero when quantized;

Code for summing the absolute values of at least one of the transform coefficients in the matrix of transform coefficients that correspond to the positions of the 1s in the matrix of 1s and 0s; And

And code for estimating the number of bits associated with coding of the residual data based at least on the number of non-zero transform coefficients and the sum of the at least one non-zero transform coefficients.